WO2020062980A1 - File access tracking method, device, storage medium and terminal - Google Patents

File access tracking method, device, storage medium and terminal

Info

Publication number
WO2020062980A1
Authority
WO
WIPO (PCT)
Prior art keywords
preset
file access
file
access information
function
Application number
PCT/CN2019/093511
Other languages
French (fr)
Chinese (zh)
Inventor
周明君
方攀
陈岩
Original Assignee
上海瑾盛通信科技有限公司
Application filed by 上海瑾盛通信科技有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/17 Details of further file system functions

Definitions

  • The embodiments of the present application relate to the technical field of terminals, for example, to a file access tracking method, device, storage medium, and terminal.
  • A large amount of data and information is stored in the form of files.
  • One or more types of files are accessed frequently, so optimizing file access is an important part of system optimization.
  • The embodiments of the present application provide a file access tracking method, device, storage medium, and terminal, which can optimize the file access tracking scheme.
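  • An embodiment of the present application provides a file access tracking method, including:
  • when a preset file access event is triggered, determining whether preset program code written based on a preset virtual machine exists in kernel space before the function to be called corresponding to the preset file access event;
  • in response to a judgment result that the preset program code exists, obtaining file access information corresponding to the function to be called through the preset program code; and
  • storing the file access information in a storage format corresponding to the preset virtual machine for user space to read.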
  • An embodiment of the present application provides a file access tracking device, including:
  • a judging module configured to, when a preset file access event is triggered, judge whether preset program code written based on a preset virtual machine exists in kernel space before the function to be called corresponding to the preset file access event;
  • an access information acquisition module configured to, in response to a judgment result that the preset program code written based on the preset virtual machine exists, acquire file access information corresponding to the function to be called through the preset program code; and
  • an access information storage module configured to store the file access information in a storage format corresponding to the preset virtual machine for user space to read.
  • An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the file access tracking method according to the embodiments of the present application is implemented.
  • An embodiment of the present application provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable by the processor. When the processor executes the computer program, the file access tracking method according to the embodiments of the present application is implemented.
  • FIG. 1 is a schematic flowchart of a file access tracking method according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of another file access tracking method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of another file access tracking method according to an embodiment of the present application.
  • FIG. 4 is a structural block diagram of a file access tracking device according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of still another terminal provided by an embodiment of the present application.
  • Some exemplary embodiments herein are described as processes or methods depicted as flowcharts. Although the flowcharts describe multiple steps as sequential processing, many of these steps can be performed in parallel, concurrently, or simultaneously. In addition, the order of one or more steps may be rearranged. A process may be terminated when its operations are completed, but may also have additional steps not included in the drawings. A process may correspond to a method, function, procedure, subroutine, subprogram, and so on.
  • FIG. 1 is a schematic flowchart of a file access tracking method according to an embodiment of the present application.
  • The method may be performed by a file access tracking device, where the device may be implemented by software and/or hardware and may generally be integrated in a terminal.
  • the method includes:
  • Step 110 When a preset file access event is triggered, determine whether there is a preset program code written based on a preset virtual machine in a kernel space before a function to be called corresponding to the preset file access event.
  • The terminal in the embodiments of the present application may include a device provided with an operating system, such as a mobile phone, a tablet computer, a notebook computer, a computer, or a smart home appliance.
  • The type of the operating system is not limited in the embodiments of the present application and may include, for example, the Android operating system, the Windows operating system, and an Apple operating system.
  • For ease of description, the embodiments of this application take the Android operating system as an example in the subsequent description.
  • In a terminal running the Android operating system, one or more types of files are accessed very frequently, so optimizing file access is an important part of system optimization.
  • Classification is generally based on existing features provided by the file system, such as file name or type (for example, text or picture). However, this classification is relatively simple and crude and is inaccurate in many scenarios; for example, two different text files may be used in different ways.
  • The use may include the access order: reading from the beginning of a file to its end is one access order, and reading from the middle of a file to its end is another.
  • File access tracking can be achieved by tracing the kernel information of file accesses.
  • Linux kernel-level tracing frameworks such as ftrace are generally used for this purpose.
  • The earliest ftrace was a function tracer that could only record the function call process of the kernel.
  • ftrace has since developed into a framework that lets developers add more kinds of trace functions as plugins.
  • The trace function helps developers understand the runtime behavior of the Linux kernel for fault debugging or performance analysis, so it can be used to trace kernel information for file access.
  • When a tracing framework such as ftrace is used for file access tracing, the collected information is written into a ring buffer, and the information cannot be filtered or counted during the collection process.
  • In the embodiments of the present application, file access tracking may instead be implemented based on a preset virtual machine implemented in kernel space.
  • The preset file access event may include at least one of file read (read), file write (write), file synchronization (fsync), and file data synchronization (fdatasync).
  • read means reading a file, and write means writing a file.
  • fsync means synchronizing all modified file data in memory to the storage device. In addition to the modified content (dirty pages) of the file, fsync also synchronizes the file's description information (metadata), including size, access time, and modification time (st_atime and st_mtime).
  • fdatasync means flushing file data to disk. It functions similarly to fsync but synchronizes metadata only when necessary, which can save one input/output (IO) write operation.
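As a rough user-space illustration of the four preset file access events, the following sketch issues a write, an fsync, an fdatasync, and a read on a temporary file. It is only an illustration of what triggers the events, not part of the described method; the helper name `exercise_file_events` and the file name are hypothetical, and `os.fdatasync` is assumed to be available (as on Linux), with a fallback to `os.fsync` otherwise.

```python
import os
import tempfile


def exercise_file_events(path: str) -> bytes:
    """Trigger write, fsync, fdatasync and read on one file; return the data read."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.write(fd, b"hello")   # write: modified data sits in memory (dirty pages)
        os.fsync(fd)             # fsync: flush file data and metadata to storage
        os.write(fd, b" world")
        # fdatasync: flush file data, metadata only when necessary.
        # Fall back to fsync on platforms without fdatasync.
        getattr(os, "fdatasync", os.fsync)(fd)
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 64)   # read: read the file back from the start
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    data = exercise_file_events(os.path.join(tmp, "demo.txt"))
```

Each of these calls enters kernel space through the corresponding system call interface, which is the point at which a preset file access event can be considered triggered.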
  • The Android operating system is generally implemented on Linux, and the bottom layer of the system is generally the Linux kernel.
  • The system is divided into kernel space and user space. Different operating systems may use different division methods or produce different division results.
  • User space generally refers to the memory area where user processes are located.
  • Application programs run in user space, and user process data is stored in user space.
  • Kernel space is the memory area occupied by the operating system.
  • The operating system and drivers run in kernel space.
  • Operating system data is stored in kernel space. In this way, user data and system data are isolated to ensure system stability.
  • User space and kernel space interact through system calls.
  • System calls can be understood as the set of all calls provided by the operating system implementation, that is, a program interface or application programming interface (API) between applications and the system.
  • The function of the operating system is to manage hardware resources and provide a good environment for application developers so that applications are more compatible.
  • To this end, the kernel provides a series of kernel functions with predetermined functionality and presents them to the user through a set of interfaces called system calls.
  • A system call passes the application's request to the kernel, calls the corresponding kernel function to complete the required processing, and returns the processing result to the application.
  • Kernel space is accessed through system calls, that is, the corresponding system call interface is called to access kernel space. Therefore, whether the system call interface corresponding to a preset file access event is called may be used to determine whether the preset file access event is triggered. If the interface is called, the preset file access event may be considered to be triggered.
  • When a preset file access event is triggered, a corresponding function in kernel space is also called to implement the file access; this function may be called the function to be called.
  • Each preset file access event has its own function to be called, used to implement reading a file, writing a file, synchronizing a file, or synchronizing file data.
  • The specific implementation form of the function may differ between operating systems and is not limited in the embodiments of the present application.
  • Preset program code may be written in advance based on a preset virtual machine and inserted before the function to be called in kernel space; the preset program code is used to obtain file access information.
  • When the preset file access event is triggered, it can be determined whether preset program code written based on the preset virtual machine exists, and hence whether the preset program code is executed first or the code of the function to be called is executed directly. Inserting the preset program code based on the preset virtual machine can ensure the stability of the system.
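The check-then-execute flow described above can be modeled in user space as follows. This is a conceptual sketch only, not kernel code: the registry `attached_programs`, the stand-in function `vfs_read_model`, and the recorder `record_access` are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: name of a function to be called -> program attached at its entry.
attached_programs: Dict[str, Callable[..., None]] = {}
trace_log: List[Tuple[str, int, int]] = []


def vfs_read_model(file_name: str, count: int, pos: int) -> int:
    """Stand-in for the function to be called; returns the number of bytes 'read'."""
    return count


def record_access(file_name: str, count: int, pos: int) -> None:
    """Stand-in for the preset program code: records file access information."""
    trace_log.append((file_name, pos, count))


def on_file_access_event(func: Callable[..., int], *args) -> int:
    """Run the attached program first if one exists, then the function itself."""
    program = attached_programs.get(func.__name__)
    if program is not None:
        program(*args)
    return func(*args)


# No program attached yet: the function to be called runs directly.
r1 = on_file_access_event(vfs_read_model, "a.txt", 100, 0)

# Attach the program: it now runs before the function to be called.
attached_programs["vfs_read_model"] = record_access
r2 = on_file_access_event(vfs_read_model, "a.txt", 50, 100)
```

In this model the function to be called is unchanged whether or not a program is attached, which mirrors the point that inserting the preset program code does not alter the existing code path.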
  • Step 120 In response to a judgment result that the preset program code written based on the preset virtual machine exists, obtain file access information corresponding to the function to be called through the preset program code.
  • The preset program code may be executed first, followed by the function to be called. The preset program code is used to obtain the file-related information involved when the function to be called is executed.
  • This information is referred to as file access information in the embodiments of the present application.
  • the file access information may include detailed information reflecting the file access process, such as what files are accessed, where the files are stored, attribute information of the files, and specific access methods, etc.
  • the file access information includes offset information, where the offset information is used to indicate a file access location.
  • the access location of the file can include, for example, the location where the access starts and the location where the access ends.
  • the file access information may further include at least one of a file name, a file path, and a file size.
  • Step 130 Store the file access information in a storage format corresponding to the preset virtual machine for user space reading.
  • The obtained file access information is stored in the storage format corresponding to the preset virtual machine rather than in a kernel standard format, which can save storage space.
  • User space does not need to read the file access information constantly; it can read the required file access information periodically or on demand, which effectively reduces the number of interactions between kernel space and user space.
  • Because the storage format of the preset virtual machine reduces the amount of information read and transmitted, the amount of data exchanged is also reduced, which reduces the burden of file access tracking on the system and improves system stability.
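To illustrate why keyed, in-place storage can reduce the data that user space has to read, the sketch below contrasts a log-style buffer (one record per event) with a map-style table (one entry per file). The event list and the per-file counters are made-up illustrative data, not measurements from the described system.

```python
from collections import deque

# Hypothetical stream of read events: (file name, bytes read).
events = [("a.txt", 4096), ("b.txt", 512), ("a.txt", 4096), ("a.txt", 1024)]

# Log-style tracing: every event becomes a record that user space must drain.
ring_buffer = deque(maxlen=1024)
for name, nbytes in events:
    ring_buffer.append((name, nbytes))

# Map-style tracing: one entry per file, updated in place as events arrive.
stats = {}
for name, nbytes in events:
    reads, total = stats.get(name, (0, 0))
    stats[name] = (reads + 1, total + nbytes)
```

Here the log holds four records while the table holds two entries; the gap grows with the number of repeated accesses to the same file.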
  • In the file access tracking method provided in the embodiments of the present application, when a preset file access event is triggered, if it is determined that preset program code written based on a preset virtual machine exists in kernel space before the function to be called corresponding to the preset file access event, the file access information corresponding to the function to be called is obtained through the preset program code and stored in the storage format corresponding to the preset virtual machine for user space to read.
  • With this technical solution, preset program code can be inserted before the function to be called corresponding to a preset file access event in kernel space and used to obtain and store file access information for user space to read. This can reduce the interaction between kernel space and user space, reduce the burden of file access tracking on the system, and improve system stability.
  • In some embodiments, determining whether preset program code written based on a preset virtual machine exists in kernel space before the function to be called corresponding to the preset file access event includes: determining whether, in the virtual file system layer in kernel space, preset program code written based on the preset virtual machine exists at the starting position of the function to be called corresponding to the preset file access event.
  • The advantage of this setting is that it does not destroy existing code in kernel space.
  • The virtual file system layer can be understood as a file abstraction layer. The function to be called for implementing file access is generally in this layer.
  • A program based on the preset virtual machine can be written at the starting position of the function to be called and used to obtain file access information. This makes the timing and location for determining whether the preset program code exists clearer and ensures that the preset program code can be executed successfully.
  • In some embodiments, acquiring the file access information corresponding to the function to be called through the preset program code includes: acquiring, through the preset program code, the function parameter content corresponding to the function to be called and/or the content of the kernel data structure corresponding to the function to be called; and determining the file access information corresponding to the function to be called according to the function parameter content and/or the content of the kernel data structure.
  • The advantage of this setting is that file access information can be obtained successfully and accurately.
  • Some file access information exists in the function parameters, such as file path, file size, or offset information, and some may exist in the corresponding kernel data structures, such as file names.
  • The form in which the information exists is not limited. After the function parameter content or the kernel data structure content is obtained, a conversion may be needed to obtain the file access information finally required.
  • In some embodiments, the preset virtual machine includes the extended Berkeley Packet Filter (eBPF), and the storage format corresponding to the preset virtual machine includes a hash table.
  • eBPF is a virtual machine implemented in the kernel. It was originally designed to filter network packets and can now insert and execute virtual machine code at almost any location in the kernel. Virtual machine code generally undergoes extensive checks before it is inserted, to ensure that system stability is not affected.
  • The storage formats specified in eBPF include a hash table.
  • A hash table is a data structure that is accessed directly by key: it maps each key to a position in the table where the record is stored, which speeds up lookups. The advantage of this setting is that it can further ensure system stability and reduce interaction between kernel space and user space.
  • In some embodiments, storing the file access information in the storage format corresponding to the preset virtual machine includes: storing the file access information with the file identifier corresponding to the file access information as the key of the hash table, where the file identifier includes the device number to which the file belongs and the index node (inode) of the file.
  • The advantage of this setting is that the file access information can be stored concisely and accurately and is convenient to query.
  • The device number may be understood as a label of a hardware or software partition in the terminal, such as a label of the data area, the system area, or the memory card area.
  • Inodes can be used to identify files. However, files on two different devices may have the same inode number. Therefore, in the embodiments of the present application, the device number and the inode number may be combined into a key and stored in the hash table.
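The (device number, inode) key can be illustrated from user space with `os.stat`, whose `st_dev` and `st_ino` fields expose the two values. This is a minimal sketch of the keying idea only; the table layout, the `record` helper, and the file names are hypothetical, and the real storage would be an eBPF hash table in kernel space rather than a Python dict.

```python
import os
import tempfile


def file_key(path: str) -> tuple:
    """Build a key from the device number and inode number of a file."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


# Hypothetical table: (device, inode) -> stored file access information.
access_table = {}


def record(path: str, offset: int, length: int) -> None:
    """Store one access (offset, length) under the file's (device, inode) key."""
    entry = access_table.setdefault(
        file_key(path),
        {"path": path, "size": os.path.getsize(path), "accesses": []},
    )
    entry["accesses"].append((offset, length))


with tempfile.TemporaryDirectory() as tmp:
    p1 = os.path.join(tmp, "one.txt")
    p2 = os.path.join(tmp, "two.txt")
    for p in (p1, p2):
        with open(p, "wb") as f:
            f.write(b"12345")
    record(p1, 0, 3)
    record(p1, 3, 2)
    record(p2, 0, 1)
    key1, key2 = file_key(p1), file_key(p2)
```

Because the inode number alone is only unique within one device, pairing it with the device number keeps entries for different partitions from colliding.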
  • In some embodiments, before storing the file access information in the storage format corresponding to the preset virtual machine, the method further includes: obtaining a preset hash table, where the preset hash table is passed from user space to kernel space and stores filter condition information; and filtering the file access information according to the filter condition information in the preset hash table.
  • Storing the file access information in the storage format corresponding to the preset virtual machine then includes storing the filtered file access information in that storage format.
  • Filtering can be forward filtering or reverse filtering. The advantage of this setting is that the file access information can be selectively filtered before it is stored, which further reduces the amount stored.
  • Because the preset hash table is passed from user space to kernel space, users can set it themselves, for example by selecting an application or the files under a path that they care about as filter condition information, to instruct kernel space to filter out the file access information corresponding to these files for storage.
  • For example, the application name or path can be used as the key and the filtering method (such as retain or filter out) as the corresponding stored content; alternatively, the filtering method can be used as the key and the application name or path as the corresponding stored content.
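One possible shape for such a filter table is sketched below, with the path as the key and the filtering method as the stored content. The longest-prefix matching rule, the `KEEP`/`DROP` labels, the default action, and the example paths are all assumptions made for illustration; the source does not specify how a path is matched against the table.

```python
KEEP, DROP = "keep", "drop"

# Hypothetical filter table prepared in user space: path prefix -> filtering method.
filter_table = {
    "/data/app/": KEEP,
    "/data/app/cache/": DROP,
}


def passes_filter(path: str, table: dict, default: str = DROP) -> bool:
    """Apply the filtering method of the longest matching prefix (assumed rule)."""
    best = max((p for p in table if path.startswith(p)), key=len, default=None)
    action = table[best] if best is not None else default
    return action == KEEP


records = [
    {"path": "/data/app/db/main.db", "offset": 0},
    {"path": "/data/app/cache/tmp.bin", "offset": 0},
    {"path": "/system/lib/libc.so", "offset": 4096},
]

# Only records that pass the filter would go on to be stored.
stored = [r for r in records if passes_filter(r["path"], filter_table)]
```

Swapping the default from `DROP` to `KEEP` would turn this forward (allow-list) filter into a reverse (deny-list) filter.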
  • Before the preset hash table is obtained, the method may further include: receiving a filtering condition setting operation through user space, and generating the preset hash table according to the filtering condition setting operation; and passing the preset hash table from user space to kernel space.
  • In some embodiments, after storing the file access information in the storage format corresponding to the preset virtual machine for user space to read, the method further includes: when a preset read request from user space is received in kernel space, querying the hash table storing the file access information according to the preset read request, and feeding back the query result to user space.
  • Further operations such as analysis or statistics may be performed on the query result in user space. For example, the files that are read most frequently can be identified and treated as hot files; or the access order of multiple files can be analyzed to determine which types of files are accessed from beginning to end and which are accessed randomly, in order to analyze the user's file access habits.
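The kind of user-space analysis mentioned above might look like the following sketch, which ranks files by access count and labels each file's access order. The query result layout (a key mapped to a list of (offset, length) pairs), the sample values, and the definition of "sequential" used here are assumptions for illustration.

```python
def classify_access(accesses):
    """Return 'sequential' if each access starts where the previous one ended."""
    expected = None
    for offset, length in accesses:
        if expected is not None and offset != expected:
            return "random"
        expected = offset + length
    return "sequential"


# Hypothetical query result: (device number, inode) -> list of (offset, length).
query_result = {
    (8, 101): [(0, 4096), (4096, 4096), (8192, 4096)],
    (8, 202): [(8192, 512), (0, 512)],
}

# Hot files: those with the most recorded accesses come first.
hot = sorted(query_result, key=lambda k: len(query_result[k]), reverse=True)

# Access order per file.
patterns = {key: classify_access(acc) for key, acc in query_result.items()}
```

Since this analysis runs entirely on the fed-back query result, it adds no further interaction with kernel space.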
  • FIG. 2 is a schematic flowchart of another file access tracking method according to an embodiment of the present application. Taking a preset virtual machine as an eBPF as an example, the method includes the following steps:
  • Step 201 It is detected that a preset file access event is triggered.
  • whether a preset file access event is triggered may be determined according to whether a system call interface corresponding to the preset file access event is called. If it is called, the preset file access event may be considered to be triggered.
  • the preset file access events may include read, write, fsync, and fdatasync.
  • Step 202 Determine whether there is preset program code written based on eBPF at the start position of the function to be called corresponding to the preset file access event in the virtual file system layer in the kernel space; if yes, perform step 203; otherwise, go to step 207.
  • The corresponding function to be called may be, for example, the vfs_read function.
  • The function prototype may be ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos).
  • Step 203 Obtain the content of the function parameters corresponding to the function to be called and the content of the kernel data structure corresponding to the function to be called through the preset program code.
  • The preset program code can be executed first, followed by the function to be called. Some file access information may exist in the function parameters, and some may exist in the corresponding kernel data structures. Therefore, the preset program code may be used to obtain the contents of the function parameters and kernel data structures involved in the execution of the function to be called.
  • Step 204 Determine file access information corresponding to the function to be called according to the content of the function parameters and the content of the kernel data structure.
  • the file access information may include file name, path, size, and offset information, and the offset information is used to indicate a file access location.
  • For example, in vfs_read the parameter buf points to a memory address in user space, while the file being accessed is identified by the parameter file, from which the file path in the file access information can be obtained.
  • By combining the function parameters with the kernel data structures, the above file access information can be obtained comprehensively.
  • Some kernel data structures may hold a file descriptor or number used to indicate the file name. Because file descriptors or numbers correspond one-to-one to file names, they can be converted to the required file names.
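The derivation of file access information from function parameters and kernel data structures can be modeled roughly as below. The classes `FileModel` and `InodeModel` are simplified, hypothetical stand-ins and do not reproduce the layout of the kernel's struct file or struct inode; the field names and sample values are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass
class InodeModel:
    """Simplified stand-in for per-file kernel data (not the real struct inode)."""
    s_dev: int    # device number the file belongs to
    i_ino: int    # inode number
    i_size: int   # file size in bytes


@dataclass
class FileModel:
    """Simplified stand-in for the 'file' parameter (not the real struct file)."""
    path: str
    inode: InodeModel


def access_info_from_read_args(file: FileModel, count: int, pos: int) -> dict:
    """Combine parameter content and data-structure content into one record."""
    return {
        "key": (file.inode.s_dev, file.inode.i_ino),
        "name": file.path.rsplit("/", 1)[-1],   # converted from the path
        "path": file.path,
        "size": file.inode.i_size,
        "offset_start": pos,                    # where the access starts
        "offset_end": pos + count,              # where the access would end
    }


info = access_info_from_read_args(
    FileModel("/data/app/db/main.db", InodeModel(s_dev=2049, i_ino=1234, i_size=65536)),
    count=4096,
    pos=8192,
)
```

The start and end offsets together give the file access location that the offset information is meant to indicate.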
  • Step 205 Use the file identifier corresponding to the file access information as a key value of the hash table to store the file access information.
  • the file identifier includes a device number to which the file belongs and an i-node of the file.
  • the device number and the i-node can be combined into a continuous string to form a key value of the hash table.
  • Step 206 When a preset read request from the user space is received in the kernel space, a hash table storing file access information is queried according to the preset read request, and the query result is fed back to the user space.
  • Step 207 Execute the function to be called.
  • The file access tracking method provided by the embodiments of the present application uses the eBPF framework to implement real-time file access statistics in the kernel. An eBPF program is inserted at the starting position of the function to be called to obtain file access information and store it in the hash table.
  • When user space needs the information, the kernel space queries the hash table according to the preset read request and feeds back the query result to user space.
  • Practice has shown that the amount of data in each transmission can be reduced to tens of KB. The solution provided in the embodiments of the present application can therefore effectively reduce the interaction between kernel space and user space, reduce the burden of file access tracking on the system, and improve system stability.
  • FIG. 3 is a schematic flowchart of another file access tracking method according to an embodiment of the present application. The method includes:
  • Step 301 Receive a filtering condition setting operation through user space, and generate a first hash table according to the filtering condition setting operation.
  • the file tracking setting interface may be displayed to the user in the terminal, and the user may enter a filtering condition setting operation based on the setting interface, such as selecting an application or a storage path that he or she cares about as a target application or a target storage path.
  • the user space generates a first hash table according to the filtering condition setting operation input by the user, and is used to instruct the kernel space to track the files in the target application or the files in the target storage path in real time.
  • Step 302 Pass the first hash table from user space to kernel space.
  • Step 303 It is detected that a preset file access event is triggered.
  • Step 304 In the virtual file system layer in the kernel space, a preset program code written based on eBPF is detected at a start position of a function to be called corresponding to a preset file access event.
  • Step 305 Obtain the content of the function parameters corresponding to the function to be called and the content of the kernel data structure corresponding to the function to be called through the preset program code.
  • Step 306 Determine file access information corresponding to the function to be called according to the content of the function parameters and the content of the kernel data structure.
  • the file access information may include file name, path, size, and offset information, and the offset information is used to indicate a file access location.
  • Step 307 Filter the file access information according to the filter condition information in the first hash table.
  • Step 308 Use the file identifier corresponding to the filtered file access information as a key value of the hash table, store the filtered file access information, and obtain a second hash table.
  • the file identifier includes a device number to which the file belongs and an i-node of the file.
  • the device number and the i-node can be combined into a continuous string to form a key value of the second hash table.
  • Step 309 When a preset read request from the user space is received in the kernel space, a second hash table is queried according to the preset read request, and the query result is fed back to the user space.
  • The file access tracking method provided by the embodiments of the present application lets the user set in advance which files need to be tracked. After the file access information is obtained through the eBPF program in kernel space, it is filtered according to the user's settings and then stored in the hash table. This not only further reduces the amount stored but also makes file access tracking more targeted and personalized. When user space needs to read the file access information, the amount of interaction between kernel space and user space can be further reduced. Such lightweight file tracking facilitates deployment to mass-production systems.
  • FIG. 4 is a structural block diagram of a file access tracking device according to an embodiment of the present application.
  • The device may be implemented by software and/or hardware and is generally integrated in a terminal.
  • File access tracking may be performed by executing a file access tracking method.
  • the device includes:
  • the judging module 401 is configured to, when a preset file access event is triggered, judge whether preset program code written based on a preset virtual machine exists in kernel space before the function to be called corresponding to the preset file access event;
  • the access information acquisition module 402 is configured to acquire the file access information corresponding to the function to be called through the preset program code when the judgment result of the judging module is that the preset program code exists;
  • the access information storage module 403 is configured to store the file access information in a storage format corresponding to the preset virtual machine for user space to read.
  • In the file access tracking device provided in the embodiments of the present application, when a preset file access event is triggered, if it is determined that preset program code written based on a preset virtual machine exists in kernel space before the function to be called corresponding to the preset file access event, the file access information corresponding to the function to be called is obtained through the preset program code and stored in the storage format corresponding to the preset virtual machine for user space to read.
  • With this technical solution, preset program code can be inserted before the function to be called corresponding to a preset file access event in kernel space and used to obtain and store file access information for user space to read. This can reduce the interaction between kernel space and user space, reduce the burden of file access tracking on the system, and improve system stability.
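  • the judging module 401 is configured to: determine whether, in the virtual file system layer in kernel space, preset program code written based on the preset virtual machine exists at the starting position of the function to be called corresponding to the preset file access event.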
  • the file access information includes offset information, where the offset information is used to indicate a file access location.
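  • the access information acquisition module 402 is configured to: acquire, through the preset program code, the function parameter content corresponding to the function to be called and/or the content of the kernel data structure corresponding to the function to be called; and determine the file access information corresponding to the function to be called according to the function parameter content and/or the content of the kernel data structure.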
  • the preset virtual machine includes an extended Berkeley packet filter eBPF, and the storage format corresponding to the preset virtual machine includes a hash table.
  • the access information storage module 403 is configured to:
  • the file identification information corresponding to the file access information is used as the key of the hash table to store the file access information, where the file identification information includes the device number to which the file belongs and the index node (inode) of the file.
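The keying scheme described above can be sketched in user-space Python. This is a minimal illustration under stated assumptions, not the kernel-side program: a plain `dict` stands in for the hash table, `os.stat` supplies the device number and inode that the in-kernel code would read itself, and the names `access_table`, `file_key`, `record_access`, and the record layout are hypothetical.

```python
import os
import tempfile

# Hypothetical stand-in for the hash table: key -> list of access records.
access_table = {}

def file_key(path):
    """Return (device number, inode number), which together identify a file."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

def record_access(path, event, offset, length):
    """Append one access record under the file's (device, inode) key."""
    access_table.setdefault(file_key(path), []).append(
        {"event": event, "offset": offset, "length": length}
    )

with tempfile.NamedTemporaryFile() as tmp:
    record_access(tmp.name, "read", 0, 4096)
    record_access(tmp.name, "read", 4096, 4096)
    key = file_key(tmp.name)
    records = access_table[key]
```

Using the device number together with the inode, rather than the path, keeps the key fixed-size and unaffected by renames, which is presumably why the text chooses it.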
  • the device further includes:
  • the preset hash table acquisition module is configured to obtain a preset hash table before the file access information is stored in the storage format corresponding to the preset virtual machine, wherein the preset hash table is passed from the user space to the kernel space, and filter condition information is stored in the preset hash table.
  • the filtering module is configured to filter the file access information according to the filtering condition information in the preset hash table.
  • the access information storage module 403 is configured to:
  • the filtered file access information is stored in a storage format corresponding to the preset virtual machine.
  • the device further includes:
  • the query result feedback module is configured to, after the file access information is stored in the storage format corresponding to the preset virtual machine for user space to read, receive in the kernel space a preset read request from the user space, query the hash table storing the file access information according to the preset read request, and feed the query result back to the user space.
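The query-and-feedback step can be modeled as follows. This is only a sketch: the source does not specify the shape of the preset read request, so the request dictionary, the `handle_read_request` function, and the sample keys are all assumptions made for illustration.

```python
# Hypothetical table contents: (device, inode) -> list of access records.
access_table = {
    (2049, 131): [{"event": "read", "offset": 0, "length": 4096}],
    (2049, 245): [{"event": "write", "offset": 512, "length": 128}],
}

def handle_read_request(table, request):
    """Answer one read request with a copy of the matching records.

    If the request names a key, only that file's records are returned;
    otherwise a snapshot of the whole table is returned.
    """
    key = request.get("key")
    if key is None:
        return {k: list(v) for k, v in table.items()}
    return {key: list(table.get(key, []))}

single = handle_read_request(access_table, {"key": (2049, 131)})
everything = handle_read_request(access_table, {})
missing = handle_read_request(access_table, {"key": (1, 1)})
```

Because the records stay in the table until requested, user space can ask once for what it needs instead of draining a stream.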
  • the device may further include:
  • a preset hash table generating module configured to receive a filtering condition setting operation through user space, and generate a preset hash table according to the filtering condition setting operation;
  • the preset hash table transfer module is configured to transfer the preset hash table from the user space to the kernel space.
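The filtering flow (build the preset hash table from a user setting, hand it to the kernel side, and filter records against it) can be sketched as below. The source does not say what the filter conditions contain, so filtering by event type and device number, and the names `build_filter_table` and `passes_filter`, are hypothetical choices.

```python
def build_filter_table(settings):
    """User-space side: turn a filtering-condition setting into a lookup table."""
    return {
        "events": set(settings.get("events", [])),
        "devices": set(settings.get("devices", [])),
    }

def passes_filter(filter_table, record):
    """Kernel side: keep a record only if it matches every non-empty condition."""
    events = filter_table["events"]
    devices = filter_table["devices"]
    if events and record["event"] not in events:
        return False
    if devices and record["dev"] not in devices:
        return False
    return True

# The table is built once in user space and then handed over unchanged.
filter_table = build_filter_table({"events": ["read"], "devices": [2049]})

records = [
    {"event": "read", "dev": 2049, "ino": 131, "offset": 0},
    {"event": "write", "dev": 2049, "ino": 131, "offset": 0},
    {"event": "read", "dev": 2050, "ino": 77, "offset": 0},
]
kept = [r for r in records if passes_filter(filter_table, r)]
```

Filtering before storage means that records user space would discard anyway never occupy table space or cross the kernel/user boundary.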
  • An embodiment of the present application further provides a storage medium including computer-executable instructions, which are used to execute a file access tracking method when executed by a computer processor, and the method includes:
  • the file access information is stored in a storage format corresponding to the preset virtual machine for user space reading.
  • Storage medium: any one or more types of memory devices or storage devices.
  • the term "storage medium" is intended to include: installation media, such as Compact Disc Read-Only Memory (CD-ROM), floppy disks, or magnetic tape devices; computer system memory or random access memory, such as Dynamic Random Access Memory (DRAM), Double Data Rate Random Access Memory (DDR RAM), Static Random Access Memory (SRAM), Extended Data Output Random Access Memory (EDO RAM), Rambus Random Access Memory (Rambus RAM), etc.; non-volatile memory, such as flash memory or magnetic media (such as a hard disk or optical storage); and registers or other similar types of memory elements, etc.
  • the storage medium may further include other types of memory or a combination thereof.
  • the storage medium may be located in a first computer system in which the program is executed, or may be located in a different second computer system connected to the first computer system through a network such as the Internet.
  • the second computer system may provide program instructions to the first computer, and the first computer is configured to execute the program instructions.
  • the term "storage medium" may include two or more storage media that may reside in different locations, such as in different computer systems connected through a network.
  • the storage medium may store program instructions (for example, embodied as a computer program) executable by one or more processors.
  • the computer-executable instructions of the storage medium provided in the embodiments of the present application are not limited to the file access tracking operations described above, and can also perform related operations in the file access tracking method provided by any embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • the terminal 500 may include: a memory 501, a processor 502, and a computer program stored on the memory 501 and executable on the processor 502. When the processor 502 executes the computer program, the file access tracking method described in the embodiments of the present application is implemented.
  • the terminal provided in the embodiment of the present application may, based on a preset virtual machine implemented in the kernel space, insert a preset program code before a function to be called corresponding to a preset file access event in the kernel space, and use the preset program code to obtain and store file access information for user space to read, which can reduce the interaction between kernel space and user space, reduce the burden of file access tracking on the system, and improve system stability.
  • FIG. 6 is a schematic structural diagram of another terminal provided by an embodiment of the present application.
  • the terminal may include: a housing (not shown in the figure), a memory 601, a central processing unit (CPU) 602 (also referred to as a processor), a circuit board (not shown in the figure), and a power supply circuit (not shown in the figure).
  • the circuit board is disposed in a space surrounded by the housing; the CPU 602 and the memory 601 are disposed on the circuit board; and the power supply circuit is configured to supply power to each circuit or device of the terminal.
  • the memory 601 is configured to store executable preset program code; the CPU 602 runs a computer program corresponding to the executable preset program code by reading the executable preset program code stored in the memory 601, so as to implement the following steps:
  • the file access information is stored in a storage format corresponding to the preset virtual machine for user space reading.
  • the terminal further includes: a peripheral interface 603, a radio frequency (RF) circuit 605, an audio circuit 606, a speaker 611, a power management chip 608, an input/output (I/O) subsystem 609, a touch screen 612, other input/control devices 610, and an external port 604. These components communicate through one or more communication buses or signal lines 607.
  • the illustrated terminal 600 is only an example of the terminal, and the terminal 600 may have more or fewer components than those shown in the figure, may combine two or more components, or may have a different component configuration.
  • the one or more components shown in the figures may be implemented in hardware, software, or a combination of hardware and software including one or more signal processing and / or application specific integrated circuits.
  • the terminal for file access tracking provided in this embodiment is described in detail below.
  • the terminal uses a mobile phone as an example.
  • the memory 601 can be accessed by the CPU 602, the peripheral interface 603, and the like.
  • the memory 601 can include high-speed random access memory, and can also include non-volatile memory, such as one or more disk storage devices, flash memory devices, or other volatile solid-state storage devices.
  • Peripheral interface 603, which can connect the input and output peripherals of the device to the CPU 602 and the memory 601.
  • the I / O subsystem 609 which can connect input / output peripherals on the device, such as touch screen 612 and other input / control devices 610, to peripheral interface 603.
  • the I / O subsystem 609 may include a display controller 6091 and one or more input controllers 6092 for controlling other input / control devices 610.
  • one or more input controllers 6092 receive electrical signals from or send electrical signals to other input / control devices 610.
  • other input/control devices 610 may include physical buttons (press buttons, rocker buttons, etc.), dials, slide switches, joysticks, and click wheels.
  • the input controller 6092 can be connected to any of the following: a keyboard, an infrared port, a USB interface, and a pointing device such as a mouse.
  • a touch screen 612 which is an input interface and an output interface between a user terminal and a user, and displays a visual output to the user.
  • the visual output may include graphics, text, icons, videos, and the like.
  • the display controller 6091 in the I / O subsystem 609 receives electrical signals from the touch screen 612 or sends electrical signals to the touch screen 612.
  • the touch screen 612 detects a contact on the touch screen, and the display controller 6091 converts the detected contact into interaction with a user interface object displayed on the touch screen 612, that is, realizes human-computer interaction.
  • the user interface object displayed on the touch screen 612 may be an icon for running a game, an icon for connecting to a corresponding network, etc.
  • the device may further include a light mouse, which is a touch-sensitive surface that does not display a visible output, or an extension of the touch-sensitive surface formed by a touch screen.
  • the RF circuit 605 is configured to establish communication between the mobile phone and the wireless network (that is, the network side), and realize data reception and transmission of the mobile phone and the wireless network. For example, send and receive text messages, e-mail, and so on.
  • the RF circuit 605 receives and sends RF signals.
  • the RF signals are also referred to as electromagnetic signals.
  • the RF circuit 605 converts electrical signals into electromagnetic signals or converts electromagnetic signals into electrical signals, and communicates with the communication network and other devices through the electromagnetic signals.
  • RF circuit 605 may include known circuits for performing these functions, including but not limited to antenna systems, RF transceivers, one or more amplifiers, tuners, one or more oscillators, digital signal processors, codec (COder-DECoder, CODEC) chipset, Subscriber Identity Module (SIM), and so on.
  • the audio circuit 606 is configured to receive audio data from the peripheral interface 603, convert the audio data into an electrical signal, and send the electrical signal to the speaker 611.
  • the speaker 611 is configured to restore a voice signal received by the mobile phone from the wireless network through the RF circuit 605 to a sound and play the sound to a user.
  • the power management chip 608 is configured to provide power and power management for the hardware connected to the CPU 602, the I / O subsystem 609, and the peripheral interface 603.
  • the file access tracking device, storage medium, and terminal provided in the foregoing embodiments can execute the file access tracking method provided by any embodiment of the present application, and have corresponding function modules and effects for executing the method.
  • a file access tracking method provided in any embodiment of the present application.

Abstract

A file access tracking method, device, storage medium and terminal. The method comprises: when a preset file access event is triggered, determining whether a preset program code written on the basis of a preset virtual machine is present before a function to be called corresponding to the preset file access event in a kernel space (110); in response to the determination result that a preset program code written on the basis of a preset virtual machine is present, by means of the preset program code, acquiring file access information corresponding to the function to be called (120); storing the file access information by using a storage format corresponding to the preset virtual machine for user space reading (130).

Description

File access tracking method, device, storage medium and terminal
This application claims priority to Chinese patent application No. 201811126366.8, filed with the Chinese Patent Office on September 26, 2018, the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of the present application relate to the technical field of terminals, for example, to a file access tracking method, device, storage medium, and terminal.
Background
Much data and information is stored in the form of files. In a terminal's operating system, one or more types of files are accessed frequently, and how to optimize file access is an important part of system optimization.
When studying how to optimize file access, it is usually necessary to track file accesses in order to learn the relevant information at the time a file is accessed. However, file tracking solutions in the related art are still imperfect and need improvement.
Summary of the Invention
The embodiments of the present application provide a file access tracking method, device, storage medium, and terminal, which can optimize the file access tracking scheme.
In an embodiment, the present application provides a file access tracking method, including:
when a preset file access event is triggered, determining whether a preset program code written based on a preset virtual machine exists in the kernel space before the function to be called corresponding to the preset file access event;
in response to a determination result that the preset program code written based on the preset virtual machine exists, obtaining, through the preset program code, the file access information corresponding to the function to be called; and
storing the file access information in a storage format corresponding to the preset virtual machine for user space to read.
In an embodiment, the present application provides a file access tracking device, including:
a judging module configured to, when a preset file access event is triggered, judge whether a preset program code written based on a preset virtual machine exists in the kernel space before the function to be called corresponding to the preset file access event;
an access information acquisition module configured to, in response to a judgment result that the preset program code written based on the preset virtual machine exists, acquire, through the preset program code, the file access information corresponding to the function to be called; and
an access information storage module configured to store the file access information in a storage format corresponding to the preset virtual machine for user space to read.
In an embodiment, the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the file access tracking method described in the embodiments of the present application.
In an embodiment, the present application provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the file access tracking method described in the embodiments of the present application.
Brief description of the drawings
FIG. 1 is a schematic flowchart of a file access tracking method according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of another file access tracking method according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of yet another file access tracking method according to an embodiment of the present application;
FIG. 4 is a structural block diagram of a file access tracking device according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a terminal according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of another terminal according to an embodiment of the present application.
Detailed description
The technical solutions of the present application are further described below with reference to the accompanying drawings and specific embodiments. The specific embodiments described here are used to explain the present application, not to limit it. For ease of description, the drawings show only the parts related to the present application rather than the entire structure.
Some exemplary embodiments herein are described as processes or methods depicted as flowcharts. Although a flowchart describes multiple steps as sequential processing, many of the steps can be performed in parallel, concurrently, or simultaneously. In addition, the order of one or more steps may be rearranged. The processing may be terminated when its operations are completed, but may also have additional steps not included in the drawings. The processing may correspond to a method, function, procedure, subroutine, subprogram, and so on.
FIG. 1 is a schematic flowchart of a file access tracking method according to an embodiment of the present application. The method may be performed by a file access tracking device, which may be implemented in software and/or hardware and may generally be integrated in a terminal. As shown in FIG. 1, the method includes:
Step 110: When a preset file access event is triggered, determine whether a preset program code written based on a preset virtual machine exists in the kernel space before the function to be called corresponding to the preset file access event.
For example, the terminal in the embodiments of the present application may include devices on which an operating system is installed, such as mobile phones, tablet computers, notebook computers, computers, and smart home appliances.
The embodiments of the present application do not limit the type of operating system, which may include, for example, the Android operating system, the Windows operating system, and Apple's iOS operating system. For ease of description, the following takes the Android operating system as an example. In a terminal's Android operating system, one or more types of files are accessed very frequently, and how to optimize file access is an important part of system optimization. In the related art, files are generally classified by existing features provided by the file system, such as file name or type (for example, text or picture). However, this classification is crude and is inaccurate in many scenarios. For example, two different text files may be used in different ways, where the usage may include the access order (reading sequentially from the beginning of a file to its end is one access order, and reading sequentially from the middle of a file to its end is another). If file access can be tracked in real time, files can be classified more accurately. For real-time tracking of file access, solutions in the related art generally place a heavy burden on the system and tend to affect system stability, and therefore need improvement.
For example, file access tracking can usually be achieved by tracing kernel information about file access, generally using a Linux kernel-level tracing framework such as ftrace. Originally, ftrace was a function tracer that could only record the kernel's function call flow. It has since developed into a framework that lets developers add more kinds of tracing functions as plug-ins, helping them understand the runtime behavior of the Linux kernel for debugging or performance analysis; it can therefore be used to trace kernel information about file access. However, when a tracing framework such as ftrace is used for file access tracking, the collected information is written into a ring buffer, and the information cannot be filtered or aggregated during collection. Only information in the kernel's standard format can be obtained, so the amount of collected information is large while the storage space of the ring buffer is limited. To prevent information loss, a user space program must continuously read the contents of the ring buffer and then filter, compute, or aggregate the information in user space. This produces a large amount of interaction between kernel space and user space, and the user space program consumes a large amount of processor (for example, central processing unit (CPU)) resources. In addition, collecting information other than the kernel's standard-format information requires inserting corresponding code into the kernel. Inserting code into the kernel on top of a framework such as ftrace often causes exceptions such as kernel crashes, which reduces system stability, lengthens the development cycle, and makes integration into actual mass-production systems difficult.
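The ring buffer pressure described above can be illustrated with a toy model. This is a simplified sketch, not a model of the real ftrace buffer: a `collections.deque` with `maxlen` stands in for a fixed-size buffer that drops its oldest entries when full, and `RING_CAPACITY`, `trace_event`, and `drain` are hypothetical names.

```python
from collections import deque

RING_CAPACITY = 4  # deliberately tiny so that the overflow is visible

# A deque with maxlen silently discards the oldest entry when it is full.
ring = deque(maxlen=RING_CAPACITY)

def trace_event(seq):
    """Producer side: write one trace record into the ring buffer."""
    ring.append({"seq": seq, "event": "read"})

def drain():
    """Consumer side: read and clear everything currently buffered."""
    items = list(ring)
    ring.clear()
    return items

# A reader that drains only once at the end loses the older records.
for seq in range(10):
    trace_event(seq)
slow_reader = [r["seq"] for r in drain()]

# A reader that drains whenever the buffer fills sees every record,
# at the cost of many more kernel/user interactions.
seen = []
drain_calls = 0
for seq in range(10):
    trace_event(seq)
    if len(ring) == RING_CAPACITY:
        seen.extend(r["seq"] for r in drain())
        drain_calls += 1
seen.extend(r["seq"] for r in drain())
drain_calls += 1
```

In this model the only way to avoid loss is to read more often, which is the trade-off the paragraph attributes to ring-buffer-based tracing.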
In the embodiments of the present application, file access tracking may be implemented based on a preset virtual machine implemented in kernel space. For example, the preset file access event may include at least one of file read (read), file write (write), file synchronization (fsync), and file data synchronization (fdatasync). Here, read reads a file; write writes a file; fsync synchronizes all modified file data in memory to the storage device and, in addition to the file's modified content (dirty pages), also synchronizes the file's descriptive information (metadata, including size, access time and modification time (st_atime and st_mtime), and so on); fdatasync flushes data to disk and is similar to fsync, but synchronizes metadata only when necessary and can therefore save one input/output (I/O) write operation. Other file access events may also be included. The types or names of file access events may differ between operating systems, and those skilled in the art can choose accordingly based on the operating system actually used.
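For orientation, the four events named above correspond to ordinary calls that an application makes. The sketch below issues each of them from Python through the `os` module on a temporary file; it only shows the application side that would trigger the events and performs no tracing. `os.fdatasync` is not available on every platform, so the call is guarded.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    written = os.write(fd, b"hello")   # write event
    os.fsync(fd)                       # fsync event: flush data and metadata
    if hasattr(os, "fdatasync"):
        os.fdatasync(fd)               # fdatasync event: flush data, metadata only if needed
    os.lseek(fd, 0, os.SEEK_SET)       # move back to the start of the file
    data = os.read(fd, 5)              # read event
finally:
    os.close(fd)
    os.remove(path)
```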
For many operating systems, the kernel is based on Linux and the bottom layer of the system is the Linux kernel. The system is divided into kernel space and user space, and the division method or result may differ between operating systems. User space generally refers to the memory area in which user processes reside: applications run in user space, and user process data is stored there. Kernel space is the memory area occupied by the operating system: the operating system and drivers run in kernel space, and operating system data is stored there. In this way, user data and system data are isolated, ensuring system stability. In general, user space and kernel space interact through system calls. The system calls can be understood as the set of all system calls provided by the operating system implementation, that is, a program interface or application programming interface (API) between applications and the system. The function of the operating system is to manage hardware resources and to provide application developers with a good environment so that applications are more compatible. To this end, the kernel provides a series of kernel functions with predetermined functionality, which are exposed to users through a set of interfaces called system calls. A system call passes the application's request to the kernel, calls the corresponding kernel function to complete the required processing, and returns the result to the application.
In the embodiments of the present application, when an application accesses a file, it accesses kernel space through a system call, that is, by calling the corresponding system call interface. Therefore, whether a preset file access event is triggered can be determined by whether the system call interface corresponding to that event is called; if it is called, the preset file access event can be considered triggered.
For example, after the corresponding system call interface is called, a corresponding function in kernel space is also called to implement the file access; this function may be referred to as the function to be called. Taking read, write, fsync, and fdatasync above as examples, each preset file access event corresponds to a function to be called, which implements reading a file, writing a file, file synchronization, or file data synchronization. The specific implementation of these functions may differ between operating systems and is not limited in the embodiments of the present application.
In the embodiments of the present application, a preset program code may be inserted in advance before the function to be called in kernel space, written in the manner used for program code of the preset virtual machine; the preset program code is used to obtain file access information. In this way, when a preset file access event is triggered, it can be determined whether a preset program code written based on the preset virtual machine exists, and thus whether to execute the preset program code first or to directly execute the code of the function to be called. Inserting the preset program code based on the preset virtual machine helps ensure system stability.
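The check-then-run logic of step 110 can be modeled in a few lines. This is a user-space simulation under stated assumptions, not kernel code: `attached_programs` is a hypothetical registry of inserted programs, `read_stub` stands in for the function to be called, and `dispatch` plays the role of the point where the system decides whether an inserted program exists.

```python
attached_programs = {}  # hypothetical registry: event name -> inserted program
collected = []          # records gathered by the inserted program

def read_stub(file_id, offset, length):
    """Stand-in for the function to be called; pretends to read `length` bytes."""
    return length

def attach(event, program):
    """Insert a program so that it runs before the function for `event`."""
    attached_programs[event] = program

def dispatch(event, func, *args):
    """Run the inserted program first if one exists, then the function itself."""
    program = attached_programs.get(event)
    if program is not None:
        program(event, *args)
    return func(*args)

def preset_program(event, file_id, offset, length):
    """Inserted program: record the access without altering the call."""
    collected.append(
        {"event": event, "file": file_id, "offset": offset, "length": length}
    )

# No program attached yet: the function runs directly.
first = dispatch("read", read_stub, (2049, 131), 0, 4096)

# After attaching, the program runs first and the function still runs.
attach("read", preset_program)
second = dispatch("read", read_stub, (2049, 131), 4096, 4096)
```

The function's return value is the same in both cases; only the side record differs, which matches the idea that tracing should not change the traced operation.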
Step 120: In response to a determination result that the preset program code written based on the preset virtual machine exists, obtain, through the preset program code, the file access information corresponding to the function to be called.
For example, when it is determined that the preset program code exists, the preset program code may be executed first and the function to be called executed afterwards. The preset program code obtains the information related to file access when the function to be called is executed; in the embodiments of the present application this is referred to as file access information. The file access information may include details that reflect the file access process, such as which file is accessed, where the file is stored, the attributes of the file, and the specific access method. Optionally, the file access information includes offset information, where the offset information indicates the access position in the file. The access position may include, for example, the position where the access starts and the position where it ends. When a file is accessed, not all of its content is necessarily accessed; only part of it may be. The offset information therefore shows which parts of the file were accessed and whether the access was sequential, in reverse order, or jumping. In addition, the file access information may further include at least one of a file name, a file path, and a file size.
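How offset information distinguishes access patterns can be shown with a small classifier. This is an illustrative sketch, not part of the described method: the function name `classify_access`, the `(offset, length)` record shape, and the exact definitions of "sequential", "reverse", and "jump" are assumptions chosen for the example.

```python
def classify_access(records):
    """Classify a time-ordered list of (offset, length) accesses to one file.

    sequential: each access starts where the previous one ended
    reverse:    each access ends where the previous one started
    jump:       anything else
    """
    if len(records) < 2:
        return "single"
    forward = True
    backward = True
    for (prev_off, prev_len), (off, length) in zip(records, records[1:]):
        if off != prev_off + prev_len:
            forward = False
        if off + length != prev_off:
            backward = False
    if forward:
        return "sequential"
    if backward:
        return "reverse"
    return "jump"

from_start = classify_access([(0, 4096), (4096, 4096), (8192, 4096)])
from_middle = classify_access([(8192, 4096), (12288, 4096)])
reversed_read = classify_access([(8192, 4096), (4096, 4096), (0, 4096)])
scattered = classify_access([(0, 10), (100, 10), (20, 10)])
```

Both `from_start` and `from_middle` are sequential; telling them apart, as in the start-versus-middle example given earlier, only requires also looking at the first offset.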
Step 130: Store the file access information in the storage format corresponding to the preset virtual machine, for reading by user space.
In the embodiments of the present application, the obtained file access information is stored in the storage format corresponding to the preset virtual machine rather than in the kernel's standard format, which saves storage space. User space does not need to read file access information continuously; instead, it can read the required file access information in one pass, either periodically or on demand, which effectively reduces the number of interactions between kernel space and user space. Moreover, because the storage format of the preset virtual machine is used, less information is read and less is transferred, so the volume of data exchanged also decreases. This lowers the load that file access tracking places on the system and improves system stability.
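The effect of accumulating records in a table and reading them in one pass can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the class `AggregatingTracker`, its counters, and the sample device and inode numbers are hypothetical, and the model runs entirely in Python rather than in a kernel.

```python
from collections import defaultdict
from typing import Dict, Tuple

FileKey = Tuple[int, int]  # (device number, inode number), assumed layout


class AggregatingTracker:
    """User-space model of a kernel-side table that accumulates access counters."""

    def __init__(self) -> None:
        self.table: Dict[FileKey, Dict[str, int]] = defaultdict(
            lambda: {"reads": 0, "bytes": 0}
        )
        self.reads_by_user_space = 0  # how many times user space fetched data

    def on_read(self, key: FileKey, nbytes: int) -> None:
        # Runs on every tracked access; nothing is sent to user space here.
        entry = self.table[key]
        entry["reads"] += 1
        entry["bytes"] += nbytes

    def snapshot(self) -> Dict[FileKey, Dict[str, int]]:
        # A single interaction returns everything gathered so far.
        self.reads_by_user_space += 1
        return {key: dict(entry) for key, entry in self.table.items()}


# 1000 tracked accesses spread over three files, fetched once.
tracker = AggregatingTracker()
for i in range(1000):
    tracker.on_read((259, 100 + i % 3), 4096)
result = tracker.snapshot()
```

In this model the thousand accesses produce three table entries and one fetch, whereas forwarding every event individually would require a thousand transfers.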
In the file access tracking method provided in the embodiments of the present application, when a preset file access event is triggered, if it is determined that preset program code written based on a preset virtual machine exists before the function to be called that corresponds to the preset file access event in kernel space, the file access information corresponding to the function to be called is obtained through that preset program code and is stored in the storage format corresponding to the preset virtual machine, for reading by user space. With this technical solution, relying on a preset virtual machine implemented in kernel space, preset program code can be inserted before the function to be called that corresponds to the preset file access event in kernel space, and this code obtains and stores the file access information for user space to read. This reduces interaction between kernel space and user space, lowers the load that file access tracking places on the system, and improves system stability.
In some embodiments, determining whether preset program code written based on a preset virtual machine exists before the function to be called that corresponds to the preset file access event in kernel space includes: determining whether preset program code written based on the preset virtual machine exists in the virtual file system layer of kernel space, at the starting position where the function to be called corresponding to the preset file access event is invoked. The advantage of this arrangement is that existing code in kernel space is not damaged. The virtual file system layer can be understood as a file abstraction layer, and the functions to be called that implement file access generally reside in this layer. A program based on the preset virtual machine can be written at the starting position where the function to be called is invoked, to obtain file access information. This makes the timing and location of the check for preset program code more definite and ensures that the preset program code can be executed successfully.
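The check at the entry of the function to be called can be modeled as a lookup in a registry of attached probes. The sketch below is a user-space analogy only: `entry_probes`, `attach_probe`, `invoke`, `fake_vfs_read`, and `read_probe` are hypothetical names, and a real kernel performs this dispatch through its own probing mechanism rather than a Python dictionary.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical registry: function name -> probe attached at its entry point.
entry_probes: Dict[str, Callable[..., None]] = {}


def attach_probe(func_name: str, probe: Callable[..., None]) -> None:
    """Register preset program code to run before the named function."""
    entry_probes[func_name] = probe


def invoke(func_name: str, func: Callable[..., Any], *args: Any) -> Any:
    """Run the attached probe first if one exists, then the function itself."""
    probe = entry_probes.get(func_name)
    if probe is not None:
        probe(*args)      # preset program code collects file access information
    return func(*args)    # the function to be called runs unmodified


def fake_vfs_read(path: str, count: int, pos: int) -> int:
    """Stand-in for the real read function; pretends every byte was read."""
    return count


trace: List[Tuple[str, int, int]] = []


def read_probe(path: str, count: int, pos: int) -> None:
    """Record which file was accessed and the byte range requested."""
    trace.append((path, pos, pos + count))
```

Because the probe sits in a separate registry, the body of `fake_vfs_read` never changes, which mirrors the point that existing kernel code is left intact.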
In some embodiments, obtaining the file access information corresponding to the function to be called through the preset program code includes: obtaining, through the preset program code, the function parameter content corresponding to the function to be called and/or the kernel data structure content corresponding to the function to be called; and determining the file access information corresponding to the function to be called according to the function parameter content and/or the kernel data structure content. The advantage of this arrangement is that the file access information can be obtained successfully and accurately. When the function to be called is executed, some file access information is present in the function parameters, such as the file path, file size, or offset information; other information may be present in the corresponding kernel data structures, such as the file name. The form in which the information is present is not limited. After the function parameter content or the kernel data structure content has been obtained, a conversion may be needed to arrive at the file access information that is finally required.
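A minimal sketch of this derivation step is given below, assuming simplified stand-ins for the kernel structures. The classes `KernelInode` and `KernelFile`, their fields, and the function `build_access_info` are hypothetical and only loosely modeled on what a file object and its inode carry; they are not the kernel's actual definitions.

```python
from dataclasses import dataclass


@dataclass
class KernelInode:
    """Simplified stand-in for an inode structure (assumed fields)."""
    dev: int    # device number of the file system holding the file
    ino: int    # inode number within that device
    size: int   # file size in bytes


@dataclass
class KernelFile:
    """Simplified stand-in for an open-file structure (assumed fields)."""
    name: str
    path: str
    inode: KernelInode


@dataclass
class FileAccessInfo:
    name: str
    path: str
    size: int
    start: int  # offset where the access begins
    end: int    # offset where the access ends


def build_access_info(file: KernelFile, count: int, pos: int) -> FileAccessInfo:
    """Combine function parameters (count, pos) with data-structure content."""
    # The conversion step: clamp the requested range to the actual file size.
    end = max(pos, min(pos + count, file.inode.size))
    return FileAccessInfo(file.name, file.path, file.inode.size, pos, end)
```

Here `count` and `pos` play the role of function parameter content, while the name, path, and size come from the data-structure stand-ins.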
In some embodiments, the preset virtual machine includes the extended Berkeley Packet Filter (eBPF), and the storage format corresponding to the preset virtual machine includes a hash table. eBPF is a virtual machine implemented in the kernel. It was originally designed for filtering network packets, and it can now insert and execute virtual machine code at any location in the kernel. The inserted virtual machine code generally undergoes extensive checking beforehand to ensure that it does not affect system stability. The storage formats specified in eBPF include the hash table. A hash table is a data structure that is accessed directly by key: it maps a key to a position in the table in order to access a record, which speeds up lookup. The advantage of this arrangement is that system stability is further ensured and interaction between kernel space and user space is reduced.
In some embodiments, storing the file access information in the storage format corresponding to the preset virtual machine includes: storing the file access information with the file identifier corresponding to the file access information as the key of the hash table, where the file identifier includes the device number to which the file belongs and the index node (inode) of the file. The advantage of this arrangement is that the file access information can be stored concisely and accurately and is easy to query. The device number can be understood as the label of a hardware or software partition in the terminal, such as the label of the data partition, the system partition, or the memory card partition. The inode can be used to identify a file. Files with the same inode may exist under two different device numbers; therefore, in the embodiments of the present application, the device number and the inode number can be combined and stored in the hash table as the key.
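The reason for combining the two numbers can be shown with a short sketch in which a Python dictionary stands in for the hash table. The function `record_access` and the sample device and inode numbers are assumptions for illustration.

```python
from typing import Dict, List, Tuple

FileKey = Tuple[int, int]  # (device number, inode number)


def record_access(
    table: Dict[FileKey, List[Tuple[int, int]]],
    dev: int,
    ino: int,
    start: int,
    end: int,
) -> None:
    """Append one accessed byte range under the file's composite key."""
    table.setdefault((dev, ino), []).append((start, end))


table: Dict[FileKey, List[Tuple[int, int]]] = {}
# Two different files that happen to share inode number 42 on different devices.
record_access(table, dev=259, ino=42, start=0, end=4096)
record_access(table, dev=179, ino=42, start=0, end=512)
# A second access to the first file lands in the same entry.
record_access(table, dev=259, ino=42, start=4096, end=8192)
```

Keying on the inode number alone would merge the records of the two files; the composite key keeps them apart.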
In some embodiments, before the file access information is stored in the storage format corresponding to the preset virtual machine, the method further includes: obtaining a preset hash table, where the preset hash table is passed from user space to kernel space and stores filter condition information; and filtering the file access information according to the filter condition information in the preset hash table. In an embodiment, storing the file access information in the storage format corresponding to the preset virtual machine includes: storing the filtered file access information in the storage format corresponding to the preset virtual machine. The filtering may be forward filtering or reverse filtering. The advantage of this arrangement is that the file access information can be selectively filtered before it is stored, which further reduces the amount stored. In addition, because the preset hash table is passed from user space to kernel space, users can configure it themselves, for example by selecting an application or the files under a path that they care about as the filter condition information, thereby instructing kernel space to pick out and store the file access information for those files. For example, the application name or path may serve as the key and the filtering mode (such as keep or filter out) as the stored value; conversely, the filtering mode may serve as the key and the application name or path as the stored value. Optionally, some embodiments further include: receiving a filter condition setting operation through user space and generating the preset hash table according to the filter condition setting operation; and passing the preset hash table from user space to kernel space.
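One possible reading of the filter table, with a path as the key and the filtering mode as the stored value, is sketched below. The mode names `KEEP` and `DROP`, the helper `_under`, and the rule that a drop entry takes precedence over a keep entry are all assumptions; the application leaves the exact semantics of forward and reverse filtering open.

```python
from typing import Dict

KEEP = "keep"  # assumed forward filtering: store only matching accesses
DROP = "drop"  # assumed reverse filtering: discard matching accesses


def _under(path: str, prefix: str) -> bool:
    """True if path equals prefix or lies beneath it as a directory."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def should_store(path: str, filter_table: Dict[str, str]) -> bool:
    """Decide whether an access to `path` passes the filter table."""
    # Any matching DROP entry rejects the access outright.
    if any(mode == DROP and _under(path, p) for p, mode in filter_table.items()):
        return False
    # If KEEP entries exist, only paths under one of them are stored.
    keep_prefixes = [p for p, mode in filter_table.items() if mode == KEEP]
    if keep_prefixes:
        return any(_under(path, p) for p in keep_prefixes)
    # With no KEEP entries, everything not dropped is stored.
    return True
```

Matching on whole path components, rather than on raw string prefixes, avoids treating a path such as /process/x as if it were located under /proc.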
In some embodiments, after the file access information is stored in the storage format corresponding to the preset virtual machine for reading by user space, the method further includes: upon receiving a preset read request from user space, querying the hash table that stores the file access information according to the preset read request, and returning the query result to user space. The advantage of this arrangement is that user space can send preset read requests to kernel space at any time or periodically, as it sees fit, and kernel space performs the query; user space does not have to read the entire hash table of file access information and search it itself. This further reduces the volume of data exchanged between user space and kernel space and lowers the load that file access tracking places on the system. Optionally, after the query result has been returned to user space, further operations such as analysis or statistics may be performed on it in user space. For example, the files that are read most often can be identified and examined closely as hot files. As another example, the access order across multiple files can be analyzed, such as which types of files are accessed sequentially from beginning to end and which types are accessed randomly, so that the user's file access habits can be analyzed.
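The division of work between the query and the later analysis could look like the following sketch. The request parameter `min_reads`, the functions `query_table` and `hottest_files`, and the layout of the table entries are hypothetical; the application does not define the format of the preset read request.

```python
from typing import Dict, List, Tuple

FileKey = Tuple[int, int]           # (device number, inode number)
Entry = Dict[str, int]              # assumed layout: {"reads": ..., "bytes": ...}


def query_table(table: Dict[FileKey, Entry], min_reads: int) -> Dict[FileKey, Entry]:
    """Kernel-side stand-in: return only the entries that match the request."""
    return {key: entry for key, entry in table.items() if entry["reads"] >= min_reads}


def hottest_files(result: Dict[FileKey, Entry], top_n: int) -> List[FileKey]:
    """User-space analysis: rank the returned files by read count."""
    ranked = sorted(result, key=lambda key: result[key]["reads"], reverse=True)
    return ranked[:top_n]


stored = {
    (259, 10): {"reads": 50, "bytes": 204800},
    (259, 11): {"reads": 2, "bytes": 8192},
    (179, 10): {"reads": 7, "bytes": 28672},
}
answer = query_table(stored, min_reads=5)   # only matching entries are returned
hot = hottest_files(answer, top_n=1)
```

Only the entries that satisfy the request are handed back, so the rarely read file never leaves the table.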
FIG. 2 is a schematic flowchart of another file access tracking method according to an embodiment of the present application. Taking eBPF as the preset virtual machine, the method includes the following steps:
Step 201: Detect that a preset file access event is triggered.
For example, whether a preset file access event is triggered may be determined according to whether the system call interface corresponding to the preset file access event is called; if it is called, the preset file access event may be considered triggered. The preset file access events may include read, write, fsync, and fdatasync.
Step 202: Determine whether preset program code written based on eBPF exists in the virtual file system layer of kernel space, at the starting position where the function to be called corresponding to the preset file access event is invoked. If so, perform step 203; otherwise, perform step 207.
For example, assuming that the current preset file access event is read, the corresponding function to be called may be the vfs_read function or the like, whose prototype may be ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos).
Step 203: Obtain, through the preset program code, the function parameter content corresponding to the function to be called and the kernel data structure content corresponding to the function to be called.
For example, the preset program code may be executed first and the function to be called afterwards. Some file access information is present in the function parameters, and other information may be present in the corresponding kernel data structures, so the preset program code can be used to obtain the function parameter content and the kernel data structure content involved in executing the function to be called.
Step 204: Determine the file access information corresponding to the function to be called according to the function parameter content and the kernel data structure content.
The file access information may include the file name, path, size, and offset information, where the offset information indicates the access position within the file.
For example, as can be seen from the vfs_read function above, the parameter buf points to a memory address in user space, from which the file path in the file access information can be obtained. By obtaining the function parameter content and the kernel data structure content, the file access information described above can be obtained in full. After the function parameter content or the kernel data structure content has been obtained, a conversion may be needed to arrive at the file access information that is finally required. For example, some kernel data structures contain a file descriptor or a number used to represent the file name; since the file descriptor or number corresponds one-to-one to a file name, it can be converted into the required file name.
Step 205: Store the file access information with the file identifier corresponding to the file access information as the key of the hash table.
The file identifier includes the device number to which the file belongs and the inode of the file. The device number and the inode can be combined into one continuous string to form the key of the hash table.
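When the two numbers are joined into one string, fixed field widths keep the result unambiguous. The sketch below assumes a 32-bit device number and a 64-bit inode number encoded as zero-padded hexadecimal; these widths and the function names `make_key` and `parse_key` are illustrative choices, not something the application specifies.

```python
from typing import Tuple

DEV_HEX_DIGITS = 8    # assumed 32-bit device number
INO_HEX_DIGITS = 16   # assumed 64-bit inode number


def make_key(dev: int, ino: int) -> str:
    """Join device and inode numbers into one fixed-width hexadecimal string."""
    if not 0 <= dev < 2**32 or not 0 <= ino < 2**64:
        raise ValueError("device or inode number out of range")
    return f"{dev:0{DEV_HEX_DIGITS}x}{ino:0{INO_HEX_DIGITS}x}"


def parse_key(key: str) -> Tuple[int, int]:
    """Split a key produced by make_key back into (device, inode)."""
    if len(key) != DEV_HEX_DIGITS + INO_HEX_DIGITS:
        raise ValueError("unexpected key length")
    return int(key[:DEV_HEX_DIGITS], 16), int(key[DEV_HEX_DIGITS:], 16)
```

Plain decimal concatenation would not be safe: device 1 with inode 23 and device 12 with inode 3 both yield "123", whereas the fixed-width form gives each pair a distinct 24-character key.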
Step 206: When kernel space receives a preset read request from user space, query the hash table that stores the file access information according to the preset read request, and return the query result to user space.
Step 207: Execute the function to be called.
For example, if no preset program code exists at the starting position where the function to be called is invoked, the current file access event is not of interest: this file access does not need to be tracked, the corresponding file access information does not need to be obtained, and the function to be called can be executed directly.
The file access tracking method provided in the embodiments of the present application uses the eBPF framework to implement real-time file access statistics in the kernel. An eBPF program inserted at the starting position where the function to be called is invoked obtains the file access information and stores it in a hash table. When a preset read request from user space is received, kernel space queries the hash table according to the preset read request and returns the query result to user space. Practice has shown that the amount of data per transfer can be reduced to a few tens of KB. The solution provided in the embodiments of the present application can therefore effectively reduce interaction between kernel space and user space, lower the load that file access tracking places on the system, and improve system stability.
FIG. 3 is a schematic flowchart of yet another file access tracking method according to an embodiment of the present application. The method includes:
Step 301: Receive a filter condition setting operation through user space, and generate a first hash table according to the filter condition setting operation.
For example, a file tracking settings interface may be presented to the user on the terminal, and the user may enter a filter condition setting operation through this interface, such as selecting an application or storage path of interest as the target application or target storage path. User space generates the first hash table according to the filter condition setting operation entered by the user; the table instructs kernel space to track, in real time, access to files in the target application or files under the target storage path.
Step 302: Pass the first hash table from user space to kernel space.
Step 303: Detect that a preset file access event is triggered.
Step 304: Detect preset program code written based on eBPF in the virtual file system layer of kernel space, at the starting position where the function to be called corresponding to the preset file access event is invoked.
Step 305: Obtain, through the preset program code, the function parameter content corresponding to the function to be called and the kernel data structure content corresponding to the function to be called.
Step 306: Determine the file access information corresponding to the function to be called according to the function parameter content and the kernel data structure content.
The file access information may include the file name, path, size, and offset information, where the offset information indicates the access position within the file.
Step 307: Filter the file access information according to the filter condition information in the first hash table.
For example, it may be determined whether the file corresponding to the current file access information belongs to the target application (or is located under the target path). If so, the file access needs to be tracked, and the current file access information is retained. If not, the file access does not need to be tracked, and the current file access information can be ignored, that is, filtered out.
Step 308: Store the filtered file access information with the file identifier corresponding to the filtered file access information as the key of a hash table, to obtain a second hash table.
The file identifier includes the device number to which the file belongs and the inode of the file. The device number and the inode can be combined into one continuous string to form the key of the second hash table.
Step 309: When kernel space receives a preset read request from user space, query the second hash table according to the preset read request, and return the query result to user space.
With the file access tracking method provided in the embodiments of the present application, the user can specify in advance which files require access tracking. After kernel space obtains the file access information through the eBPF program, it filters the information according to the user's settings and then stores it in a hash table. This not only further reduces the amount stored but also makes file access tracking more targeted and personalized. When user space needs to read the file access information, interaction between kernel space and user space is further reduced. The result is lightweight file tracking that is well suited to deployment in mass-production systems.
FIG. 4 is a structural block diagram of a file access tracking device according to an embodiment of the present application. The device may be implemented in software and/or hardware, is generally integrated in a terminal, and can track file access by executing the file access tracking method. As shown in FIG. 4, the device includes:
a determining module 401, configured to determine, when a preset file access event is triggered, whether preset program code written based on a preset virtual machine exists before the function to be called that corresponds to the preset file access event in kernel space;
an access information obtaining module 402, configured to obtain, through the preset program code, the file access information corresponding to the function to be called when the determination result of the determining module is that the preset program code exists;
an access information storage module 403, configured to store the file access information in the storage format corresponding to the preset virtual machine, for reading by user space.
In the file access tracking device provided in the embodiments of the present application, when a preset file access event is triggered, if it is determined that preset program code written based on a preset virtual machine exists before the function to be called that corresponds to the preset file access event in kernel space, the file access information corresponding to the function to be called is obtained through that preset program code and is stored in the storage format corresponding to the preset virtual machine, for reading by user space. With this technical solution, relying on a preset virtual machine implemented in kernel space, preset program code can be inserted before the function to be called that corresponds to the preset file access event in kernel space, and this code obtains and stores the file access information for user space to read. This reduces interaction between kernel space and user space, lowers the load that file access tracking places on the system, and improves system stability.
In an embodiment, the determining module 401 is configured to:
determine whether preset program code written based on the preset virtual machine exists in the virtual file system layer of kernel space, at the starting position where the function to be called corresponding to the preset file access event is invoked.
Optionally, the file access information includes offset information, where the offset information indicates the access position within the file.
In an embodiment, the access information obtaining module 402 is configured to:
obtain, through the preset program code, the function parameter content corresponding to the function to be called and/or the kernel data structure content corresponding to the function to be called;
determine the file access information corresponding to the function to be called according to the function parameter content and/or the kernel data structure content.
Optionally, the preset virtual machine includes the extended Berkeley Packet Filter (eBPF), and the storage format corresponding to the preset virtual machine includes a hash table.
In an embodiment, the access information storage module 403 is configured to:
store the file access information with the file identifier corresponding to the file access information as the key of the hash table, where the file identifier includes the device number to which the file belongs and the inode of the file.
Optionally, the device further includes:
a preset hash table obtaining module, configured to obtain a preset hash table before the file access information is stored in the storage format corresponding to the preset virtual machine, where the preset hash table is passed from user space to kernel space and stores filter condition information;
a filtering module, configured to filter the file access information according to the filter condition information in the preset hash table.
In an embodiment, the access information storage module 403 is configured to:
store the filtered file access information in the storage format corresponding to the preset virtual machine.
Optionally, the device further includes:
a query result feedback module, configured to, after the file access information is stored in the storage format corresponding to the preset virtual machine for reading by user space, and when kernel space receives a preset read request from user space, query the hash table that stores the file access information according to the preset read request and return the query result to user space.
Optionally, the device may further include:
a preset hash table generating module, configured to receive a filter condition setting operation through user space and generate a preset hash table according to the filter condition setting operation;
a preset hash table passing module, configured to pass the preset hash table from user space to kernel space.
An embodiment of the present application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform a file access tracking method, the method including:
when a preset file access event is triggered, determining whether preset program code written based on a preset virtual machine exists before the function to be called that corresponds to the preset file access event in kernel space;
if the preset program code written based on the preset virtual machine exists, obtaining, through the preset program code, the file access information corresponding to the function to be called;
storing the file access information in the storage format corresponding to the preset virtual machine, for reading by user space.
Storage medium: any one or more types of memory devices or storage devices. The term "storage medium" is intended to include: installation media, such as Compact Disc Read-Only Memory (CD-ROM), floppy disks, or tape devices; computer system memory or random access memory, such as Dynamic Random Access Memory (DRAM), Double Data Rate Random Access Memory (DDR RAM), Static Random Access Memory (SRAM), Extended Data Output Random Access Memory (EDO RAM), and Rambus Random Access Memory (Rambus RAM); non-volatile memory, such as flash memory and magnetic media (for example, a hard disk or optical storage); and registers or other similar types of memory elements. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or in a different, second computer system that is connected to the first computer system through a network such as the Internet. The second computer system may provide program instructions to the first computer system, which is configured to execute them. The term "storage medium" may include two or more storage media that reside in different locations, for example in different computer systems connected through a network. The storage medium may store program instructions (for example, embodied as a computer program) executable by one or more processors.
Of course, in the storage medium containing computer-executable instructions provided by the embodiments of this application, the computer-executable instructions are not limited to the file access tracking operations described above; they may also perform related operations in the file access tracking method provided by any embodiment of this application.
An embodiment of this application provides a terminal in which the file access tracking device provided by the embodiments of this application may be integrated. FIG. 5 is a schematic structural diagram of a terminal provided by an embodiment of this application. The terminal 500 may include a memory 501, a processor 502, and a computer program stored in the memory 501 and executable on the processor 502. When the processor 502 executes the computer program, it implements the file access tracking method described in the embodiments of this application.
The terminal provided by the embodiments of this application can, based on a preset virtual machine implemented in kernel space, insert preset program code before the function to be called that corresponds to a preset file access event in kernel space. The preset program code obtains the file access information and stores it for user space to read. This reduces interaction between kernel space and user space, lowers the load that file access tracking places on the system, and improves system stability.
FIG. 6 is a schematic structural diagram of another terminal provided by an embodiment of this application. The terminal may include a housing (not shown), a memory 601, a central processing unit (CPU) 602 (also called the processor), a circuit board (not shown), and a power supply circuit (not shown). The circuit board is placed inside the space enclosed by the housing; the CPU 602 and the memory 601 are mounted on the circuit board; the power supply circuit is configured to supply power to each circuit or component of the terminal; the memory 601 is configured to store executable preset program code; and the CPU 602 reads the executable preset program code stored in the memory 601 and runs the corresponding computer program to implement the following steps:
when a preset file access event is triggered, determining whether preset program code written based on a preset virtual machine exists before the function to be called that corresponds to the preset file access event in kernel space;
if such preset program code exists, obtaining, through the preset program code, the file access information corresponding to the function to be called; and
storing the file access information in the storage format corresponding to the preset virtual machine, so that user space can read it.
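The three steps above can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the patented kernel implementation: the names `attached`, `record_access`, `on_file_event`, and the stand-in `vfs_read` are hypothetical, and a Python dictionary stands in for the virtual machine's storage format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

FileKey = Tuple[int, int]  # (device number, inode number), assumed identifier


@dataclass
class FileAccess:
    dev: int
    inode: int
    offset: int
    size: int


# Stand-in for the storage that user space would later read.
access_map: Dict[FileKey, List[Tuple[int, int]]] = {}

# Hypothetical registry of preset program code placed in front of a function.
attached: Dict[str, Callable[[FileAccess], None]] = {}


def record_access(info: FileAccess) -> None:
    """Plays the role of the preset program code: capture and store the access."""
    access_map.setdefault((info.dev, info.inode), []).append((info.offset, info.size))


def vfs_read(info: FileAccess) -> int:
    """Hypothetical function to be called; it only pretends to read `size` bytes."""
    return info.size


def on_file_event(name: str, func: Callable[[FileAccess], int], info: FileAccess) -> int:
    probe = attached.get(name)   # step 1: is preset code present before the function?
    if probe is not None:
        probe(info)              # steps 2 and 3: obtain and store the access information
    return func(info)            # the original function then runs as usual


attached["vfs_read"] = record_access
on_file_event("vfs_read", vfs_read, FileAccess(dev=8, inode=1234, offset=0, size=4096))
on_file_event("vfs_read", vfs_read, FileAccess(dev=8, inode=1234, offset=4096, size=4096))
```

When no program is attached, `on_file_event` simply calls the function, which mirrors the case where the check in the first step finds no preset program code.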
The terminal further includes a peripheral interface 603, a radio frequency (RF) circuit 605, an audio circuit 606, a speaker 611, a power management chip 608, an input/output (I/O) subsystem 609, other input/control devices 610, a touch screen 612, and an external port 604. These components communicate through one or more communication buses or signal lines 607.
It should be understood that the illustrated terminal 600 is only one example of a terminal. The terminal 600 may have more or fewer components than shown in the figure, may combine two or more components, or may have a different component configuration. The components shown in the figure may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application-specific integrated circuits.
The terminal for file access tracking provided in this embodiment is described in detail below, taking a mobile phone as an example.
Memory 601: the memory 601 can be accessed by the CPU 602, the peripheral interface 603, and other components. The memory 601 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.
Peripheral interface 603: the peripheral interface 603 can connect the input and output peripherals of the device to the CPU 602 and the memory 601.
I/O subsystem 609: the I/O subsystem 609 can connect input and output peripherals on the device, such as the touch screen 612 and the other input/control devices 610, to the peripheral interface 603. The I/O subsystem 609 may include a display controller 6091 and one or more input controllers 6092 for controlling the other input/control devices 610. The one or more input controllers 6092 receive electrical signals from, or send electrical signals to, the other input/control devices 610, which may include physical buttons (push buttons, rocker buttons, and the like), a dial pad, a slide switch, a joystick, and a click wheel. Note that an input controller 6092 may be connected to any of the following: a keyboard, an infrared port, a USB interface, or a pointing device such as a mouse.
Touch screen 612: the touch screen 612 is the input and output interface between the user terminal and the user. It displays visual output to the user, which may include graphics, text, icons, video, and the like.
The display controller 6091 in the I/O subsystem 609 receives electrical signals from, or sends electrical signals to, the touch screen 612. The touch screen 612 detects contact on the screen, and the display controller 6091 converts the detected contact into interaction with a user interface object displayed on the touch screen 612, thereby realizing human-computer interaction. The user interface object displayed on the touch screen 612 may be, for example, an icon for running a game or an icon for connecting to a corresponding network. In one embodiment, the device may further include an optical mouse, which is a touch-sensitive surface that does not display visual output, or an extension of the touch-sensitive surface formed by the touch screen.
RF circuit 605: configured to establish communication between the mobile phone and the wireless network (that is, the network side) and to send and receive data between them, for example text messages and e-mail. In one embodiment, the RF circuit 605 receives and sends RF signals, also called electromagnetic signals. The RF circuit 605 converts electrical signals into electromagnetic signals or electromagnetic signals into electrical signals, and communicates with communication networks and other devices through the electromagnetic signals. The RF circuit 605 may include known circuits for performing these functions, including but not limited to an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a coder-decoder (CODEC) chipset, a subscriber identity module (SIM), and so on.
Audio circuit 606: configured to receive audio data from the peripheral interface 603, convert the audio data into an electrical signal, and send the electrical signal to the speaker 611.
Speaker 611: configured to restore the voice signal that the mobile phone receives from the wireless network through the RF circuit 605 into sound and play that sound to the user.
Power management chip 608: configured to supply power to, and manage power for, the hardware connected to the CPU 602, the I/O subsystem 609, and the peripheral interface 603.
The file access tracking device, storage medium, and terminal provided in the above embodiments can execute the file access tracking method provided by any embodiment of this application, and have the functional modules and effects corresponding to that method. For technical details not described exhaustively in the above embodiments, refer to the file access tracking method provided by any embodiment of this application.
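As one illustration of the method's storage and filtering steps, the sketch below keeps access records in a table keyed by a file identifier (device number and inode) and consults a filter table supplied from user space before storing. It is a plain Python model, not kernel code: the `AccessEvent` fields, the use of an application UID as the filter condition, and the function name `store_filtered` are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple

FileKey = Tuple[int, int]  # (device number, inode number)


class AccessEvent(NamedTuple):
    dev: int
    inode: int
    offset: int
    length: int
    app_uid: int  # assumed way to tell which application made the access


def store_filtered(
    events: Iterable[AccessEvent],
    filter_table: Dict[int, bool],
    access_table: Dict[FileKey, List[Tuple[int, int]]],
) -> Dict[FileKey, List[Tuple[int, int]]]:
    """Keep only events whose app_uid appears in filter_table, keyed by (dev, inode)."""
    for ev in events:
        if ev.app_uid not in filter_table:
            continue  # filtered out: the access is not from a target application
        access_table.setdefault((ev.dev, ev.inode), []).append((ev.offset, ev.length))
    return access_table


# Hypothetical filter table passed down from user space: track only UID 10086.
filter_table = {10086: True}
events = [
    AccessEvent(dev=8, inode=100, offset=0, length=4096, app_uid=10086),
    AccessEvent(dev=8, inode=200, offset=0, length=512, app_uid=10001),
    AccessEvent(dev=8, inode=100, offset=4096, length=4096, app_uid=10086),
]
table = store_filtered(events, filter_table, {})
```

Filtering before storage keeps the table small, which matters because user space reads the whole table later.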

Claims (20)

  1. A file access tracking method, comprising:
    when a preset file access event is triggered, determining whether preset program code written based on a preset virtual machine exists before the function to be called that corresponds to the preset file access event in kernel space;
    in response to a determination that the preset program code written based on the preset virtual machine exists, obtaining, through the preset program code, file access information corresponding to the function to be called; and
    storing the file access information in a storage format corresponding to the preset virtual machine, for user space to read.
  2. The method according to claim 1, wherein determining whether preset program code written based on a preset virtual machine exists before the function to be called that corresponds to the preset file access event in kernel space comprises:
    determining whether the preset program code written based on the preset virtual machine exists at the starting position of the call to the function to be called that corresponds to the preset file access event, in the virtual file system layer in kernel space.
  3. The method according to claim 1, wherein the file access information comprises offset information of a file, the offset information indicating the access position within the file.
  4. The method according to claim 1, wherein obtaining, through the preset program code, the file access information corresponding to the function to be called comprises:
    obtaining, through the preset program code, at least one of function parameter content corresponding to the function to be called and kernel data structure content corresponding to the function to be called; and
    determining the file access information corresponding to the function to be called according to at least one of the function parameter content and the kernel data structure content.
  5. The method according to claim 1, wherein the preset virtual machine comprises the extended Berkeley Packet Filter (eBPF), and the storage format corresponding to the preset virtual machine comprises a hash table.
  6. The method according to claim 5, wherein storing the file access information in the storage format corresponding to the preset virtual machine comprises:
    storing the file access information using a file identifier corresponding to the file access information as the key of the hash table, wherein the file identifier comprises the device number of the device to which the file belongs and the index node (inode) of the file.
  7. The method according to claim 5, further comprising, before storing the file access information in the storage format corresponding to the preset virtual machine:
    obtaining a preset hash table, wherein the preset hash table is passed from the user space to the kernel space and stores filter condition information; and
    filtering the file access information according to the filter condition information in the preset hash table.
  8. The method according to claim 7, wherein storing the file access information in the storage format corresponding to the preset virtual machine comprises:
    storing the filtered file access information in the storage format corresponding to the preset virtual machine.
  9. The method according to any one of claims 5-8, further comprising, after storing the file access information in the storage format corresponding to the preset virtual machine for user space to read:
    when the kernel space receives a preset read request from the user space, querying the hash table that stores the file access information according to the preset read request, and returning the query result to the user space.
  10. The method according to claim 3, wherein the file access information further comprises at least one of the following: a file name, a file path, and a file size.
  11. The method according to claim 1, wherein the preset file access event comprises at least one of the following: file read, file write, file synchronization, and file data synchronization.
  12. The method according to claim 1, wherein the preset file access event being triggered comprises:
    in response to the system call interface corresponding to the preset file access event being called, determining that the preset file access event is triggered.
  13. The method according to claim 1, wherein obtaining, through the preset program code, the file access information corresponding to the function to be called comprises:
    executing the function to be called after executing the preset program code, wherein the preset program code is used to obtain the file access information at the time the function to be called is executed.
  14. The method according to claim 7, wherein obtaining the preset hash table comprises:
    receiving a filter condition setting operation through the user space, and generating the preset hash table according to the filter condition setting operation.
  15. The method according to claim 9, further comprising, after returning the query result to the user space:
    performing an analysis or statistical operation on the query result in the user space.
  16. The method according to claim 15, wherein the analysis or statistical operation comprises: obtaining files whose read count exceeds a set threshold, and obtaining the access order of multiple files, the file types accessed sequentially, the file types accessed randomly, and user access habits.
  17. The method according to claim 7, wherein filtering the file access information according to the filter condition information in the preset hash table comprises:
    determining whether the file corresponding to the file access information belongs to a target application;
    in response to a determination that the file corresponding to the file access information belongs to the target application, retaining the file access information; and
    in response to a determination that the file corresponding to the file access information does not belong to the target application, filtering out the file access information.
  18. A file access tracking device, comprising:
    a determining module, configured to determine, when a preset file access event is triggered, whether preset program code written based on a preset virtual machine exists before the function to be called that corresponds to the preset file access event in kernel space;
    an access information obtaining module, configured to obtain, through the preset program code, file access information corresponding to the function to be called, in response to a determination that the preset program code written based on the preset virtual machine exists; and
    an access information storage module, configured to store the file access information in a storage format corresponding to the preset virtual machine, for user space to read.
  19. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-17.
  20. A terminal, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method according to any one of claims 1-17 when executing the computer program.
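The user-space analysis mentioned in the claims (files read more than a threshold number of times, sequential versus random access) could look roughly like the following. This is a hedged sketch: the table layout (a dictionary from `(device, inode)` to a list of `(offset, length)` records) and the helper names `hot_files` and `is_sequential` are assumptions, and "sequential" is defined here simply as each access starting where the previous one ended.

```python
from typing import Dict, List, Tuple

FileKey = Tuple[int, int]  # (device number, inode number)
Access = Tuple[int, int]   # (offset, length)


def hot_files(table: Dict[FileKey, List[Access]], threshold: int) -> List[FileKey]:
    """Return the files whose number of recorded accesses exceeds the threshold."""
    return sorted(key for key, accesses in table.items() if len(accesses) > threshold)


def is_sequential(accesses: List[Access]) -> bool:
    """True if every access begins exactly where the previous one ended."""
    return all(nxt[0] == cur[0] + cur[1] for cur, nxt in zip(accesses, accesses[1:]))


# Hypothetical query result returned from kernel space to user space.
query_result: Dict[FileKey, List[Access]] = {
    (8, 100): [(0, 4096), (4096, 4096), (8192, 4096)],
    (8, 200): [(8192, 512), (0, 512)],
}

frequently_read = hot_files(query_result, threshold=2)
patterns = {key: is_sequential(acc) for key, acc in query_result.items()}
```

Doing this work in user space keeps the in-kernel program limited to capturing and storing records.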
PCT/CN2019/093511 2018-09-26 2019-06-28 File access tracking method, device, storage medium and terminal WO2020062980A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811126366.8A CN110955631B (en) 2018-09-26 2018-09-26 File access tracking method and device, storage medium and terminal
CN201811126366.8 2018-09-26

Publications (1)

Publication Number Publication Date
WO2020062980A1 true WO2020062980A1 (en) 2020-04-02

Family

ID=69952742

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/093511 WO2020062980A1 (en) 2018-09-26 2019-06-28 File access tracking method, device, storage medium and terminal

Country Status (2)

Country Link
CN (1) CN110955631B (en)
WO (1) WO2020062980A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448690B (en) * 2021-08-27 2022-02-01 阿里云计算有限公司 Monitoring method and device
CN114398318B (en) * 2022-03-25 2022-07-08 广东统信软件有限公司 File operation method of user space file system and user space file system
CN115202990B (en) * 2022-09-09 2022-12-06 天津市天河计算机技术有限公司 Method, device, equipment and storage medium for acquiring IO performance data
CN115758420B (en) * 2022-11-29 2023-06-09 北京天融信网络安全技术有限公司 File access control method, device, equipment and medium
CN117215901B (en) * 2023-11-09 2024-03-08 华南师范大学 Programming exercise evaluation method, system, equipment and medium based on dynamic tracking
CN117312099B (en) * 2023-11-28 2024-04-05 麒麟软件有限公司 File system event monitoring method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101944043A (en) * 2010-09-27 2011-01-12 公安部第三研究所 File access method of Linux virtual machine disk under Windows platform
CN106909437A (en) * 2015-12-23 2017-06-30 华为技术有限公司 The guard method of virtual machine kernel and device
US9804952B1 (en) * 2016-11-07 2017-10-31 Red Hat, Inc. Application debugging in a restricted container environment
CN107450964A (en) * 2017-08-10 2017-12-08 西安电子科技大学 It is a kind of to be used to finding that virtual machine is examined oneself whether there is the method for leak in system
CN107958152A (en) * 2017-12-04 2018-04-24 山东中创软件商用中间件股份有限公司 Tamper resistant method, device and equipment based on Virtual File System

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8464254B1 (en) * 2009-12-31 2013-06-11 Symantec Corporation Tracking storage operations of virtual machines
CN103617039B (en) * 2013-11-28 2017-02-01 北京华胜天成科技股份有限公司 Method and device for accessing user space file system
CN103984536B (en) * 2014-02-14 2017-07-14 中国科学院计算技术研究所 I/O request number systems and its method in a kind of cloud computing platform
CN105447203B (en) * 2015-12-31 2019-03-26 杭州华为数字技术有限公司 A kind of access method of shared file, system and relevant device
CN106970821B (en) * 2016-01-12 2021-02-02 阿里巴巴集团控股有限公司 Method and device for processing I/O request under KVM virtualization

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113542245A (en) * 2021-07-02 2021-10-22 广州华多网络科技有限公司 Data flow monitoring method and device, computer equipment and storage medium
CN113542245B (en) * 2021-07-02 2023-04-25 广州华多网络科技有限公司 Data traffic monitoring method, device, computer equipment and storage medium
CN116743903A (en) * 2022-09-09 2023-09-12 荣耀终端有限公司 Chip identification method and electronic equipment
CN116886445A (en) * 2023-09-05 2023-10-13 苏州浪潮智能科技有限公司 Processing method and device of filtering result, storage medium and electronic equipment
CN116886445B (en) * 2023-09-05 2024-01-19 苏州浪潮智能科技有限公司 Processing method and device of filtering result, storage medium and electronic equipment
CN117056030A (en) * 2023-10-10 2023-11-14 苏州元脑智能科技有限公司 Method and device for determining escape of container
CN117056030B (en) * 2023-10-10 2024-02-09 苏州元脑智能科技有限公司 Method and device for determining escape of container

Also Published As

Publication number Publication date
CN110955631B (en) 2023-01-03
CN110955631A (en) 2020-04-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19864290

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19864290

Country of ref document: EP

Kind code of ref document: A1