WO2017007511A1 - Data management using index change events - Google Patents

Data management using index change events

Info

Publication number
WO2017007511A1
Authority
WO
WIPO (PCT)
Prior art keywords
metadata
file
metadata database
database
index
Prior art date
Application number
PCT/US2016/013108
Other languages
French (fr)
Inventor
Kannan Ramesh K.
Rajkumar Kannan
Annmary Justine K
Jothivelavan SIVASHANMUGAM
Original Assignee
Hewlett-Packard Development Company, L.P.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hewlett-Packard Development Company, L.P.
Publication of WO2017007511A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/13: File access structures, e.g. distributed indices

Definitions

  • File systems may store volumes of data on storage devices.
  • File systems may store and organize data files to facilitate the process of locating the data files.
  • File system instructions may be used to manage data blocks that are stored on a storage device, such as a hard disk drive.
  • the file system may organize the data blocks into data files and directories.
  • the file system also keeps track of which data blocks belong to which file, which data blocks are not being used, when the file was created or modified, access permissions, ownership of the data file, and so on.
  • the data used by the file system to maintain such information is called metadata, and may be stored on the same storage device as the data files.
  • Fig. 1 is a block diagram illustrating an example data management system
  • Fig. 2 is a block diagram illustrating additional components of the example data management system of Fig. 1
  • Fig. 3 is a block diagram illustrating an example database update and event generation using the data management system of Figs. 1 and 2
  • Fig. 4 is a block diagram illustrating an example index management using the data management system of Fig. 2
  • Fig. 5 is an example block diagram illustrating query processing using the data management system of Fig. 2
  • Fig. 6 depicts an example flow chart of processes for data management
  • Fig. 7 illustrates a block diagram of an example computing device for data management.
  • the file system also keeps track of information such as a file name, a length of contents of a file, and a location of a file, information about unused data blocks, when the file was created or modified, access permissions, ownership of the data file, and so on.
  • the present application discloses techniques to provide a database architecture in which a kernel may directly write metadata updates to the metadata database, while the index management is achieved at a user space.
  • an operating system can divide system memory of a computing device into two distinct regions: kernel space and user space.
  • the kernel space may be reserved for running an operating system kernel (e.g., core processes of the operating system), kernel extensions, and device drivers and user space may be for application programs and some drivers executed by a user.
  • the user space may include code which runs outside the operating system's kernel.
  • User space may refer to the various programs and libraries that the operating system uses to interact with the kernel, such as software that performs input/output (I/O), manipulates file system objects, runs applications, and so on.
  • the file system running in the kernel may fulfill the file object I/O request, obtain a location in the metadata database, write metadata of the file object into the obtained location in the metadata database, and also write an index change event associated with the list of changed attributes to an event log file.
  • an index manager running in the user space may read the index change event from the event log file and update corresponding index files in an indices database associated with the user space.
  • the index change events may be applied in a batch. Every batch update may create a copy (i.e., a new index file) of the existing index file and make the changes in the new index file.
  • the new index file may not be available to a query processor until the update is complete.
  • the client applications may send an I/O request regarding the file objects through the query processor running in the user space.
  • the query processor may provide a standard interface to the clients.
  • the query processor, upon receiving any I/O request from the clients, may retrieve a last updated index file associated with the file object from the indices database, and may read the metadata associated with the query from the metadata database using the retrieved last updated index file. This may allow the read and write processes to run in parallel without having to acquire any locks against each other.
  • Fig. 1 is a block diagram illustrating an example data management system 100.
  • the system 100 includes a metadata database 106 and a kernel 102 communicatively connected to the metadata database 106.
  • the metadata database 106 includes a plurality of files and metadata for the plurality of files.
  • the metadata database 106 includes metadata for the plurality of files created by at least one application.
  • the kernel 102 includes a file system 104.
  • the file system 104 includes a metadata database manager 108 to manage the plurality of files in the metadata database 106.
  • the file system 104 includes a metadata writer 110 communicatively coupled to the metadata database manager 108.
  • Example file systems can include a network attached storage (NAS) file system.
  • the metadata writer 110 receives an input/output (I/O) operation on a file object in the metadata database 106 from a client. Further, the metadata writer 110 obtains a location in the metadata database 106 from the metadata database manager 108 upon receiving the I/O operation.
  • the metadata database manager 108 may determine whether free slots (e.g., metadata database free slots 112) are available in the metadata database 106. In one example, the metadata database manager 108 may create a file in the metadata database 106 and provide the location from the created file when no free slot is available in the metadata database 106. In another example, the metadata database manager 108 may provide the location from a free slot when at least one free slot is available in the metadata database 106.
  • Furthermore, the metadata writer 110 writes metadata of the file object into the obtained location in the metadata database 106.
  • the metadata writer 110 writes an index change event associated with the file object to an event log file 114 upon writing the metadata of the file object into the obtained location. Further, the metadata writer 110 may store the metadata location details in an inode (e.g., on-disk metadata structure representing the file object). The index change event can be read from the event log file 114 by an index manager running in a user space to update corresponding index files in an indices database associated with the user space. This is explained in detail with respect to Fig. 2.
  • Fig. 2 is a block diagram 200 illustrating additional components of the example data management system 100 of Fig. 1. As shown in Fig.
  • the system 100 includes a user space 202 operatively connected to the metadata database 106 and the kernel 102.
  • the system 100 includes an indices database 206 connected to the user space 202.
  • the user space 202 includes an index manager 204 operatively coupled to the metadata database manager 108, the indices database 206 and the event log file 114.
  • the user space 202 includes a query processor 208 operatively coupled to the metadata database 106, the indices database 206 and the index manager 204.
  • the metadata database 106 may store the data in multiple smaller files instead of a single large file.
  • the metadata writer 110 receives an I/O operation on a file object in the metadata database 106.
  • the I/O operation may include creation of a new file, modification of an existing file and/or deletion of an existing file.
  • the metadata writer 110 obtains a location in the metadata database 106 from the metadata database manager 108.
  • the metadata database manager 108 may manage multiple database files and handle the fragmentation caused by creating new rows for every update on the database files.
  • the metadata database manager 108 may create a new database file if the size of an existing database file reaches a predetermined threshold.
  • the de-fragmentation may be realized by reusing space consumed by old rows in the metadata database 106.
  • the metadata database manager 108 may first look at the available metadata database free slots 112. For example, the metadata database free slots 112 may be stored in a system’s memory. If the metadata database free slots 112 can provide the space, then the metadata database manager 108 may obtain the new location from the available metadata database free slots 112 and provide the new location to the metadata writer 110.
  • the metadata database manager 108 may create a new file in the metadata database 106 and provide the new location from the new file to the metadata writer 110.
  • the metadata writer 110 writes metadata of the file object into the obtained location in the metadata database 106.
  • the metadata writer 110 may store the metadata location details in an inode. The inode may be used to generate an index change event with the old location.
  • the metadata writer 110 writes the index change event associated with the file object to the event log file 114.
  • the index change event includes information such as an old location and a new location of the metadata associated with the file object and/or old values and new values of the metadata attributes associated with the file object that are changed due to the I/O operation.
  • the old location of the metadata may be retrieved from the inode.
  • the metadata writer 110 may be responsible for writing the metadata into the metadata database 106 and generating an event to update the indices database 206.
  • the metadata database update and event generation are explained in detail with reference to Fig. 3.
  • the index manager 204 running in the user space 202 reads the index change event from the event log file 114 once the metadata writer 110 has generated the event. Further, the index manager 204 updates corresponding index files in the indices database 206 associated with the user space 202.
  • the index files may represent at least a subset of the files in the metadata database 106.
  • the index manager 204 may process the index change events in a batch. The size of the batch is configurable.
  • the index manager 204 may create a copy of the existing index file and apply the update to the copy. Once the batch update is completed, the new index file may be made available for querying.
  • the index manager 204 may keep track of old rows while processing the batch of index change events. After the batch of index change events is completely processed, the list of old row locations may be given to the metadata database manager 108 for reuse as free slots when there are no readers for old index files.
  • the index manager 204 may also take care of cleaning up the old index files if there are no readers for the index files. If there are readers for the old index files, the index manager 204 may mark the old index files for deletion.
  • the last reader (e.g., the query processor 208) may take on the task of providing the old row locations to the metadata database manager 108 for reuse as free slots and may delete the old index files marked for deletion upon processing of the request.
  • the index management is explained in detail with reference to Fig. 4.
  • the index manager 204 may send a list of free slots to the metadata database manager 108 upon updating the indices database 206. Further, the metadata database manager 108 may add the list of free slots received from the index manager 204 to the in-memory metadata database free slots 112.
  • the metadata database manager 108 may swap the metadata database free slots 112 in local memory to a disk when the number of metadata database free slots 112 exceeds a predetermined maximum threshold value, and swap free slots from the disk to the local memory when the number of the metadata database free slots 112 falls below a predetermined minimum threshold value.
  • the query processor 208 may receive a query regarding the file object in the metadata database 106. Further, the query processor 208 retrieves a last updated index file associated with the file object from the indices database 206. The query processor 208 may act as an interface for clients 210 to query the metadata.
  • the search query may be received from an application program which runs on an operating system via an input interface.
  • the query processor 208 may provide a standard structured query language (SQL) interface for the clients 210 to write simple SQL statements to query the metadata.
  • the query processor 208 may search for words in that content by searching through the indices database 206 to see if a particular word exists in any of the index files which have been indexed.
  • the query processor 208 reads the metadata associated with the query from the metadata database 106 using the retrieved last updated index file.
  • Fig. 3 is a block diagram 300 illustrating an example database update and event generation using the data management system 100 of Figs. 1 and 2.
  • the I/O operation details are queued (e.g., 306) to be written to the metadata database 106.
  • the queue 306 may be processed when the queue size reaches a maximum threshold or a timeout occurs.
  • the metadata writer 110 reads the I/O operations from the queue 306.
  • For every I/O operation in the queue, the metadata writer 110 obtains a free slot from the metadata database manager 108.
  • the metadata database manager 108 performs the following:
  • the metadata database manager 108 may get the free slot from the metadata database free slots list 112 (i.e., existing unused slots in database files).
  • the metadata database manager 108 may trigger a swap from the disk 302 to memory and get the free slot from the metadata database free slots list 112. This is an asynchronous task and hence may not block the metadata writer 110.
  • the metadata database manager 108 may get a new slot from an end of the file in the metadata database 106. For example, if the metadata database 106 has 3 db files (e.g., file 1, file 2 and file N as shown in Fig. 3), the metadata database manager 108 tries to get a slot to append to file N. If file N reaches maximum size, then the metadata database manager 108 creates another file N+1 and provides the new slot from the file N+1.
  • Further, the metadata writer 110 writes the new row into the free slot (i.e., existing free slot or new slot) provided by the metadata database manager 108. For example, the metadata may be written as the new row in the metadata database 106 and the old row is reused for any future updates.
  • the metadata writer 110 updates the new offset (i.e., slot location) in the inode 304.
  • the inode 304 is operatively connected to the file system 104.
  • the metadata writer 110 may update the inode 304 with the location where the metadata of the file object is stored.
  • the inode 304 may maintain information associated with at least one of metadata of the plurality of files and/or a location of the metadata of the plurality of files, and so on.
  • the metadata writer 110 may write the index change event associated with the file object to the event log file 114 with the following details:
  • old offset (e.g., old location)
  • new offset (e.g., new location)
  • the old offset will be empty for a file object create and the new offset will be empty for a file object delete.
  • the old location of the metadata may be retrieved from the inode.
  • each event log file 114 may contain a predetermined maximum number of events.
  • the log file may be made visible to the index manager 204 upon reaching the maximum number of events or when a timeout occurs. The process of index management is explained in detail with reference to Fig. 4.
  • Fig.4 is a block diagram 400 illustrating an example index management using the data management system of Fig. 2.
  • the index manager 204 may read the index change events through index change event log files 114.
  • the index manager 204 may process the events in a batch.
  • the size of the batch is configurable.
  • Example methods for indexing may include B-tree, generalized search tree (GiST), and so on.
  • the index manager 204 creates a copy (i.e., new version Vx+1 402B) of the index file set (Vx 402A). Further, for every event in the batch, the index manager 204 updates the index file set number and applies the update to the copy Vx+1 402B to reflect the new change as follows:
  • i. Identify the existing index entry. The indexed field and the old offset are used to locate the index entry uniquely in the index file. This may be used if the indexed field is allowed to have duplicate values. For example, two different file objects can have the same mtime. If mtime is designated for indexing, then combining mtime and the old offset will uniquely identify the index entry.
  • ii. If the indexed field value is not changed, then just update the new database offset in the existing row.
  • iii. If the indexed field value is changed, then remove the old entry and insert the new entry at the appropriate location in the index file.
  • the index manager 204 may make the new index files (Vx+1 402B) available for querying. After the batch is completely processed, the index manager 204 provides the list of old row locations to the metadata database manager 108 for reuse.
  • the index manager 204 adds old slot locations to a local list of the metadata database free slots, sends the local list of metadata database free slots to the metadata database manager 108 (e.g., through I/O control (ioctl) or remote procedure call (RPC)), and removes old index files upon sending the local list of free slots to the metadata database manager 108.
  • the index manager 204 stores the free slot list in a disk and marks the old index files (Vx 402A) for cleanup.
  • Fig. 5 is a block diagram 500 illustrating an example query processing using the data management system of Fig. 2.
  • the query processor 208 provides a standard SQL interface through a query engine 502 (e.g., PostgreSQL foreign tables).
  • the query engine 502 receives the query, prepares a query plan and sends an atomic query to a foreign data wrapper (FDW) 504.
  • the FDW 504 may obtain the recently updated index file set provided by the index manager 204 to locate the relevant data in the metadata database 106.
  • the query processor 208 provides a user interface of the query engine 502 to allow a user to search through the metadata and also displays the search results which are obtained by the query engine 502 in the user interface.
  • the system 100 of Figs. 1-5 shows an example database management system, and it should be understood that other configurations may be employed to practice the techniques of the present application.
  • File systems may store volumes of data in a metadata database.
  • the metadata database can be used on many different kinds of storage devices, such as hard disk drives. Some file systems may be used on local data storage devices, while others may provide file access via a network protocol.
  • the network file system may act as a client for a remote file access protocol, providing access to files on a server.
  • system 100 may include an NAS file system, which is a file system stored in a computer data storage server connected to a computer network providing data access to a group of clients.
  • storage device may be a single storage device or a plurality of storage devices distributed across a plurality of computing devices.
  • each of the file system 104, index manager 204 and query processor 208 is shown as a single component, but it should be understood that these modules may be a plurality of modules distributed across a plurality of computing devices.
  • the components of system 100 may be implemented in hardware, machine-readable instructions or a combination thereof.
  • file system 104 may be implemented in hardware, machine-readable instructions or a combination thereof.
  • the functionality of the components of system 100 may be implemented using technology related to personal computers (PCs), server computers, tablet computers, mobile computers and the like.
  • Figs. 1-5 show a system 100 to provide database management.
  • the system 100 may include computer-readable storage medium comprising (e.g., encoded with) instructions executable by a processor to implement functionalities described herein in relation to Figs. 1-5.
  • the functionalities described herein in relation to instructions to implement functions of file system 104, index manager 204 and query processor 208 and any additional instructions described herein in relation to storage medium may be implemented as engines or modules comprising any combination of hardware and programming to implement the functionalities of the modules or engines, as described below.
  • the functions of file system 104, index manager 204 and query processor 208 may be implemented by a computing device which may be a server, blade enclosure, desktop computer, laptop (or notebook) computer, workstation, tablet computer, mobile phone, smart device, or any other processing device or equipment including a processing resource.
  • a processor may include, for example, one processor or multiple processors included in a single computing device or distributed across multiple computing devices.
  • Fig. 6 depicts an example flow chart 600 of a process for data management. It should be understood the process depicted in Fig. 6 represents generalized illustrations, and that other processes may be added or existing processes may be removed, modified, or rearranged without departing from the scope and spirit of the present application. In addition, it should be understood that the processes may represent instructions stored on a computer-readable storage medium that, when executed, may cause a processor to respond, to perform actions, to change states, and/or to make decisions.
  • an input/output (I/O) operation is received on a file object in a metadata database by a kernel.
  • the I/O operation is received on the file object by a file system running in the kernel.
  • the file system may be used to manage data that is stored in the metadata database.
  • a location in the metadata database is obtained by the kernel upon receiving the I/O operation.
  • a check is made to determine whether at least one free slot is available in the metadata database by the kernel. If the at least one free slot is not available in the metadata database, a file is created in the metadata database and the location is obtained from the created file. If the at least one free slot is available in the metadata database, the location is obtained from the at least one free slot.
  • metadata of the file object is written into the obtained location in the metadata database by the kernel. In one example, the metadata of the file object is written to a new row in the obtained location and old rows associated with the file object can be reused for any future updates.
  • an index change event associated with the file object is written to an event log file by the kernel upon writing the metadata of the file object into the obtained location.
  • the index change event includes any combination of information including an old location and a new location of the metadata associated with the file object and old value and new value of metadata attributes associated with the file object that is changed due to the I/O operation.
  • the kernel may raise an event to update the indices database associated with a user space. The index change event is read from the event log file and corresponding index files are updated in an indices database by an index manager running in the user space. The user space is operatively connected to the metadata database and the kernel.
  • the event log file is read by the index manager when either the event log file reaches a maximum number of index change events or a timeout of the event log file occurs.
  • a query regarding the file object in the metadata database is received by a query processor running in the user space.
  • a last updated index file associated with the file object is retrieved from the indices database by the query processor.
  • the metadata associated with the query is read from the metadata database using the retrieved last updated index file by the query processor.
  • the process 600 of Fig. 6 shows an example process and it should be understood that other configurations can be employed to practice the techniques of the present application. For example, process 600 may communicate with a plurality of computing devices and the like.
  • the example methods and systems described through Figs. 1-6 may improve freshness of the metadata database as the update processing is simplified manyfold. For example, if the database has a billion rows, a single update may take a few seconds to reach the database and become available for query. The improved freshness enables anti-virus, disaster recovery, data integrity check, and tiering applications to utilize the above example methods and systems instead of walking through the file system, and also gives these applications the ability to define policies in the database which can be applied on a subset of files.
  • Further, the example methods and systems described through Figs. 1-6 may reduce the resource consumption exponentially due to the following:
  • Fig. 7 illustrates a block diagram 700 of an example computing device 702 for data management.
  • the computing device 702 includes a processor 704 and a machine-readable storage medium 706 communicatively coupled through a system bus.
  • the processor 704 may be any type of central processing unit (CPU), microprocessor, or processing logic that interprets and executes machine-readable instructions stored in the machine-readable storage medium 706.
  • the machine-readable storage medium 706 may be a random access memory (RAM) or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by the processor 704.
  • the machine-readable storage medium 706 may be synchronous DRAM (SDRAM), double data rate (DDR), rambus DRAM (RDRAM), rambus RAM, etc., or storage memory media such as a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, and the like.
  • the machine-readable storage medium 706 may be a non-transitory machine-readable medium. In an example, the machine-readable storage medium 706 may be remote but accessible to the computing device 702.
  • the machine-readable storage medium 706 may store instructions 708-714.
  • instructions 708-714 may be executed by the processor 704 to provide a mechanism for data management by a file system.
  • Instructions 708 may be executed by the processor 704 to receive an input/output (I/O) operation on a file object in a metadata database by a kernel.
  • Instructions 710 may be executed by processor 704 to obtain a location in the metadata database by the kernel upon receiving the I/O operation.
  • Instructions 712 may be executed by processor 704 to write metadata of the file object into the obtained location in the metadata database by the kernel.
  • Instructions 714 may be executed by processor 704 to write an index change event associated with the file object to an event log file by the kernel upon writing the metadata of the file object into the obtained location.
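The write path covered by instructions 708-714 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation described in the application: the names `MetadataStore`, `IndexChangeEvent`, `free_slots` and the `inode` dictionary are hypothetical, and in-memory lists stand in for the database files and the event log file.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IndexChangeEvent:
    """Hypothetical event record: offsets plus changed attribute values."""
    old_offset: Optional[int]  # None for a file object create
    new_offset: Optional[int]  # None for a file object delete
    changed: dict              # attribute -> (old value, new value)


@dataclass
class MetadataStore:
    """Hypothetical stand-in for the metadata database and event log."""
    rows: list = field(default_factory=list)        # metadata rows
    free_slots: list = field(default_factory=list)  # reusable row locations
    event_log: list = field(default_factory=list)   # index change events

    def obtain_location(self) -> int:
        # Use a free slot when one is available; otherwise append a new row.
        if self.free_slots:
            return self.free_slots.pop()
        self.rows.append(None)
        return len(self.rows) - 1

    def write(self, inode: dict, metadata: dict) -> IndexChangeEvent:
        # The old location comes from the inode (absent on create).
        old_offset = inode.get("offset")
        old_meta = self.rows[old_offset] if old_offset is not None else {}
        # Every update goes to a new row; the old row is left in place
        # until it is later handed back as a free slot.
        new_offset = self.obtain_location()
        self.rows[new_offset] = dict(metadata)
        inode["offset"] = new_offset
        changed = {
            key: (old_meta.get(key), value)
            for key, value in metadata.items()
            if old_meta.get(key) != value
        }
        # The event is logged only after the metadata has been written.
        event = IndexChangeEvent(old_offset, new_offset, changed)
        self.event_log.append(event)
        return event
```

In this sketch an update never overwrites the row it replaces, so a reader holding the old offset still sees consistent metadata. The old offset only becomes reusable once something appends it to `free_slots`, which in the text is the job of the index manager.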

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

In one example, an input/output (I/O) operation on a file object in a metadata database is received by a kernel. A location in the metadata database is obtained by the kernel upon receiving the I/O operation. Metadata of the file object is written into the obtained location in the metadata database by the kernel. An index change event associated with the file object is written to an event log file by the kernel upon writing the metadata of the file object into the obtained location.

Description

DATA MANAGEMENT USING INDEX CHANGE EVENTS BACKGROUND
[0001] File systems may store volumes of data on storage devices. File systems may store and organize data files to facilitate the process of locating the data files. File system instructions may be used to manage data blocks that are stored on a storage device, such as a hard disk drive. The file system may organize the data blocks into data files and directories. The file system also keeps track of which data blocks belong to which file, which data blocks are not being used, when the file was created or modified, access permissions, ownership of the data file, and so on. The data used by the file system to maintain such information is called metadata, and may be stored on the same storage device as the data files. BRIEF DESCRIPTION OF THE DRAWINGS
[0002] Examples are described in the following detailed description and in reference to the drawings, in which: [0003] Fig. 1 is a block diagram illustrating an example data management system; [0004] Fig. 2 is a block diagram illustrating additional components of the example data management system of Fig.1; [0005] Fig. 3 is a block diagram illustrating an example database update and event generation using the data management system of Figs.1 and 2; [0006] Fig.4 is a block diagram illustrating an example index management using the data management system of Fig.2; [0007] Fig. 5 is an example block diagram illustrating query processing using the data management system of Fig.2; [0008] Fig. 6 depicts an example flow chart of processes for data management; [0009] Fig. 7 illustrates a block diagram of an example computing device for data management. DETAILED DESCRIPTION
[00010] File systems may store volumes of data on storage devices. File systems may store and organize data files to facilitate the process of locating the data files. The file system also keeps track of information such as a file name, a length of contents of a file, a location of a file, information about unused data blocks, when the file was created or modified, access permissions, ownership of the data file, and so on. The data used by the file system to maintain such information is called metadata, and may be stored on the same storage device as the data files.

[00011] With petabytes of data being added to enterprise storage devices, data management with respect to querying, locating, and providing data mining capabilities through a centralized metadata view may become challenging. Some existing methods may provide a module in the file system to capture file system events and execute queries against a database. In this case, the file system and the database may have no knowledge of each other, and hence integrating the file system and the database may be inefficient and resource intensive.

[00012] The present application discloses techniques to provide a database architecture in which a kernel may directly write metadata updates to the metadata database, while the index management is performed in a user space. For example, an operating system can divide system memory of a computing device into two distinct regions: kernel space and user space. The kernel space may be reserved for running an operating system kernel (e.g., core processes of the operating system), kernel extensions, and device drivers, and the user space may be for application programs and some drivers executed by a user. In other words, the user space may include code which runs outside the operating system's kernel.
User space may refer to the various programs and libraries that the operating system uses to interact with the kernel, including instructions that perform input/output (I/O) and manipulate file system objects, applications, and so on.

[00013] In one example, when the file system running in the kernel receives any I/O request on a file object in a metadata database, the file system may fulfill the file object I/O request, obtain a location in the metadata database, write metadata of the file object into the obtained location in the metadata database, and also write an index change event associated with the list of changed attributes to an event log file. Further, an index manager running in the user space may read the index change event from the event log file and update the corresponding index files in an indices database associated with the user space. In one example, the index change events may be applied in a batch. Every batch update may create a copy (i.e., a new index file) of the existing index file and make the changes in the new index file. The new index file may not be available to a query processor until the update is complete. Further, the client applications may send an I/O request regarding the file objects through the query processor running in the user space. The query processor may provide a standard interface to the clients. The query processor, upon receiving any I/O request from the clients, may retrieve a last updated index file associated with the file object from the indices database, and may read the metadata associated with the query from the metadata database using the retrieved last updated index file. This may allow the read and write processes to run in parallel without having to acquire any locks against each other.

[00014] The terms “database” and “metadata database” are used interchangeably throughout the document. Further, the term “file object” refers to files, memory cache, and/or data in the metadata database.
Further, the terms “in-memory” and “local memory” are used interchangeably throughout the document and may refer to a system memory (e.g., random access memory).

[00015] Fig. 1 is a block diagram illustrating an example data management system 100. The system 100 includes a metadata database 106 and a kernel 102 communicatively connected to the metadata database 106. The metadata database 106 includes a plurality of files and metadata for the plurality of files. In one example, the metadata database 106 includes metadata for the plurality of files created by at least one application. The kernel 102 includes a file system 104. The file system 104 includes a metadata database manager 108 to manage the plurality of files in the metadata database 106. Furthermore, the file system 104 includes a metadata writer 110 communicatively coupled to the metadata database manager 108. An example file system is a network attached storage (NAS) file system.

[00016] In one example, the metadata writer 110 receives an input/output (I/O) operation on a file object in the metadata database 106 from a client. Further, the metadata writer 110 obtains a location in the metadata database 106 from the metadata database manager 108 upon receiving the I/O operation. In one example, when the metadata writer 110 asks for a location to write a new row, the metadata database manager 108 may determine whether free slots (e.g., metadata database free slots 112) are available in the metadata database 106. In one example, the metadata database manager 108 may create a file in the metadata database 106 and provide the location from the created file when no free slot is available in the metadata database 106. In another example, the metadata database manager 108 may provide the location from at least one free slot when the at least one free slot is available in the metadata database 106.
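The location lookup of paragraph [00016] can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the claimed implementation: the names `MetadataDatabaseManager`, `Location`, `obtain_location`, and `add_free_slots`, as well as the per-file capacity `MAX_ROWS_PER_FILE`, are hypothetical and chosen only for this sketch. It assumes that a location is a (file number, row) pair and that a new database file is created once the most recent file is full.

```python
from dataclasses import dataclass, field

# Hypothetical capacity of one database file (an assumption of this sketch).
MAX_ROWS_PER_FILE = 4


@dataclass(frozen=True)
class Location:
    file_no: int  # which database file holds the row
    row: int      # row position inside that file


@dataclass
class MetadataDatabaseManager:
    free_slots: list = field(default_factory=list)    # reusable old rows
    rows_in_file: list = field(default_factory=list)  # rows used per file

    def obtain_location(self) -> Location:
        # Case 1: at least one free slot is available, so reuse it.
        if self.free_slots:
            return self.free_slots.pop()
        # Case 2: no free slot; create a file if none exists or the
        # most recent file is full, then hand out its next row.
        if not self.rows_in_file or self.rows_in_file[-1] >= MAX_ROWS_PER_FILE:
            self.rows_in_file.append(0)
        file_no = len(self.rows_in_file) - 1
        row = self.rows_in_file[file_no]
        self.rows_in_file[file_no] += 1
        return Location(file_no, row)

    def add_free_slots(self, slots) -> None:
        # Old row locations become reusable once they are handed back.
        self.free_slots.extend(slots)
```

In this sketch the free-slot list is consulted first so that space left behind by old rows is reused before any database file grows.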
[00017] Furthermore, the metadata writer 110 writes metadata of the file object into the obtained location in the metadata database 106. Also, the metadata writer 110 writes an index change event associated with the file object to an event log file 114 upon writing the metadata of the file object into the obtained location. Further, the metadata writer 110 may store the metadata location details in an inode (e.g., an on-disk metadata structure representing the file object). The index change event can be read from the event log file 114 by an index manager running in a user space to update corresponding index files in an indices database associated with the user space. This is explained in detail with respect to Fig. 2.

[00018] Referring now to Fig. 2, which is a block diagram 200 illustrating additional components of the example data management system 100 of Fig. 1. As shown in Fig. 2, the system 100 includes a user space 202 operatively connected to the metadata database 106 and the kernel 102. The system 100 includes an indices database 206 connected to the user space 202. The user space 202 includes an index manager 204 operatively coupled to the metadata database manager 108, the indices database 206 and the event log file 114. Further, the user space 202 includes a query processor 208 operatively coupled to the metadata database 106, the indices database 206 and the index manager 204.

[00019] In one example, the metadata database 106 may store the data in multiple smaller files instead of a single large file. Any update or insert to the file object in the metadata database 106 may be written as a new row in the metadata database 106, and the old row associated with the file object can be reused for any future updates.

[00020] In operation, the metadata writer 110 receives an I/O operation on a file object in the metadata database 106.
For example, the I/O operation may include creation of a new file, modification of an existing file and/or deletion of an existing file. Further, the metadata writer 110 obtains a location in the metadata database 106 from the metadata database manager 108. The metadata database manager 108 may manage multiple database files and handle the fragmentation caused by creating new rows for every update on the database files.

[00021] For example, the metadata database manager 108 may create a new database file if the size of the existing database file reaches a predetermined threshold. The de-fragmentation may be realized by reusing space consumed by old rows in the metadata database 106. For example, when the metadata writer 110 asks for a new location to write a new row, the metadata database manager 108 may first look at the available metadata database free slots 112. For example, the metadata database free slots 112 may be stored in a system's memory. If the metadata database free slots 112 can provide the space, then the metadata database manager 108 may obtain the new location from the available metadata database free slots 112 and provide the new location to the metadata writer 110. If the metadata database free slots 112 cannot provide the space, then the metadata database manager 108 may create a new file in the metadata database 106 and provide the new location from the new file to the metadata writer 110.

[00022] Furthermore, the metadata writer 110 writes metadata of the file object into the obtained location in the metadata database 106. Further, the metadata writer 110 may store the metadata location details in an inode. The inode may be used to generate the index change event with the old location. Also, the metadata writer 110 writes the index change event associated with the file object to the event log file 114.
For example, the index change event includes information such as an old location and a new location of the metadata associated with the file object and/or an old value and a new value of the metadata attributes associated with the file object that are changed due to the I/O operation. For example, the old location of the metadata may be retrieved from the inode. In one example, the metadata writer 110 may be responsible for writing the metadata into the metadata database 106 and generating an event to update the indices database 206. The metadata database update and event generation are explained in detail with respect to Fig. 3.

[00023] The index manager 204 running in the user space 202 reads the index change event from the event log file 114 once the metadata writer 110 has generated the event. Further, the index manager 204 updates corresponding index files in the indices database 206 associated with the user space 202. The index files may represent at least a subset of the files in the metadata database 106.

[00024] In one example, the index manager 204 may process the index change events in a batch. The size of the batch is configurable. For every batch of index updates, the index manager 204 may create a copy of the existing index file and apply the update to the copy. Once the batch update is completed, the new index file may be made available for querying. The index manager 204 may keep track of old rows while processing the batch of index change events. After the batch of index change events is completely processed, the list of old row locations may be given to the metadata database manager 108 for reuse as free slots when there are no readers for old index files. The index manager 204 may also take care of cleaning up the old index files if there are no readers for the index files. If there are readers for the old index files, the index manager 204 may mark the old index files for deletion.
In this case, the last reader (e.g., the query processor 208) may take on the task of providing the old row locations to the metadata database manager 108 for reuse as free slots and may delete the old index files marked for deletion upon processing of the request. The index management is explained in detail with respect to Fig. 4.

[00025] Also, the index manager 204 may send a list of free slots to the metadata database manager 108 upon updating the indices database 206. Further, the metadata database manager 108 may add the list of free slots received from the index manager 204 to the in-memory metadata database free slots 112. In one example, the metadata database manager 108 may swap the metadata database free slots 112 in local memory to a disk when the number of metadata database free slots 112 exceeds a predetermined maximum threshold value, and swap free slots from the disk to the local memory when the number of the metadata database free slots 112 falls below a predetermined minimum threshold value.

[00026] The query processor 208 may receive a query regarding the file object in the metadata database 106. Further, the query processor 208 retrieves a last updated index file associated with the file object from the indices database 206. The query processor 208 may act as an interface for clients 210 to query the metadata. The query may be received, via an input interface, from an application program that runs on an operating system. For example, the query processor 208 may provide a standard structured query language (SQL) interface for the clients 210 to write simple SQL statements to query the metadata. The query processor 208 may search through the indices database 206 to determine whether a particular word exists in any of the index files. Furthermore, the query processor 208 reads the metadata associated with the query from the metadata database 106 using the retrieved last updated index file.
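The batch index update of paragraph [00024] and the read path of paragraph [00026] can be sketched together as follows. This is a hedged Python illustration only: the names `IndexChangeEvent`, `IndicesDatabase`, `apply_batch`, and `query_range` are hypothetical, the index is assumed to be a sorted in-memory list of (indexed value, offset) pairs for a single indexed attribute, and the metadata database is assumed to be a mapping from offset to row. An update is modeled as removing the old entry and inserting the new one, which keeps the list sorted even when the indexed value is unchanged.

```python
import bisect
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexChangeEvent:
    old_offset: Optional[int]  # None when the file object is created
    new_offset: Optional[int]  # None when the file object is deleted
    old_value: Optional[int]   # indexed attribute value before the I/O
    new_value: Optional[int]   # indexed attribute value after the I/O


class IndicesDatabase:
    def __init__(self):
        # Each version is a sorted list of (indexed value, offset) pairs.
        self.versions = [[]]

    def latest(self):
        # Readers pick up the last fully updated version.
        return self.versions[-1]

    def apply_batch(self, events):
        # Work on a copy so that readers of the old version are not disturbed.
        new_index = list(self.latest())
        freed = []  # old row offsets that may become free slots
        for ev in events:
            if ev.old_offset is not None:
                # (value, old offset) identifies the entry even when
                # several file objects share the same indexed value.
                new_index.remove((ev.old_value, ev.old_offset))
                freed.append(ev.old_offset)
            if ev.new_offset is not None:
                bisect.insort(new_index, (ev.new_value, ev.new_offset))
        # Publish the new version only after the whole batch is applied.
        self.versions.append(new_index)
        return freed


def query_range(index, rows, low, high):
    # Offsets are assumed non-negative, so (low, -1) sorts before any
    # real entry whose indexed value equals low.
    start = bisect.bisect_left(index, (low, -1))
    result = []
    for value, offset in index[start:]:
        if value > high:
            break
        result.append(rows[offset])
    return result
```

A reader that obtained an index version before a batch was published keeps resolving offsets against the old rows, which is why, in this sketch, the offsets returned by `apply_batch` should be handed back as free slots only after the last reader of the old version has finished.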
The query processing is explained in detail with respect to Fig. 5.

[00027] Referring now to Fig. 3, which is a block diagram 300 illustrating an example database update and event generation using the data management system 100 of Figs. 1 and 2. When any file object I/O operation which changes the metadata of the file is received, the I/O operation details are queued (e.g., in a queue 306) to be written to the metadata database 106. The queue 306 may be processed when the queue size reaches a maximum threshold or a timeout occurs. The metadata writer 110 reads the I/O operations from the queue 306.

[00028] For every I/O operation in the queue, the metadata writer 110 obtains a free slot from the metadata database manager 108. In one example, the metadata database manager 108 performs the following:
1. If the metadata database free slots 112 are available, the metadata database manager 108 may get the free slot from the metadata database free slots list 112 (i.e., existing unused slots in database files).
2. If some of the metadata database free slots 112 have been swapped to a disk 302 and the number of in-memory metadata database free slots 112 falls to a lower threshold, the metadata database manager 108 may trigger a swap from the disk 302 to memory and get the free slot from the metadata database free slots list 112. The swap is an asynchronous task and hence may not block the metadata writer 110.
3. If there are no free slots, the metadata database manager 108 may get a new slot from the end of the last file in the metadata database 106. For example, if the metadata database 106 has three database files (e.g., file 1, file 2 and file N as shown in Fig. 3), the metadata database manager 108 tries to get a slot by appending to file N. If file N has reached its maximum size, then the metadata database manager 108 creates another file N+1 and provides the new slot from the file N+1.

[00029] Further, the metadata writer 110 writes the new row into the free slot (i.e., the existing free slot or the new slot) provided by the metadata database manager 108. For example, the metadata may be written as the new row in the metadata database 106, and the old row may be reused for future updates. Furthermore, the metadata writer 110 updates the new offset (i.e., slot location) in the inode 304. As shown in Fig. 3, the inode 304 is operatively connected to the file system 104. The metadata writer 110 may update the inode 304 with the location where the metadata of the file object is stored. For example, the inode 304 may maintain information associated with at least one of metadata of the plurality of files and/or a location of the metadata of the plurality of files, and so on.

[00030] Also, the metadata writer 110 may write the index change event associated with the file object to the event log file 114 with the following details:
1. Old offset (e.g., old location) and new offset (e.g., new location) of metadata. For example, old offset will be empty for file object create and new offset will be empty for file object delete. For example, the old location of the metadata may be retrieved from the inode.
2. Old value and new value for every metadata attribute that is changed by the file object I/O operation and that is configured for indexing (e.g., file change time (ctime), file modify time (mtime), unique identifier (uid), and so on).

[00031] In one example, each event log file 114 may contain a predetermined maximum number of events. The log file may be made visible to the index manager 204 when the maximum number of events is reached or a timeout occurs. The process of index management is explained in detail with respect to Fig. 4.

[00032] Referring now to Fig. 4, which is a block diagram 400 illustrating an example index management using the data management system of Fig. 2. The index manager 204 may read the index change events through the index change event log files 114. The index manager 204 may process the events in a batch. The size of the batch can be configurable. Example methods for indexing may include a B-tree, a generalized search tree (GiST), and so on.

[00033] For every batch, the index manager 204 creates a copy (i.e., new version Vx+1 402B) of the index file set (Vx 402A). Further, for every event in the batch, the index manager 204 updates the index file set number and applies the update to the copy Vx+1 402B to reflect the new change as follows:
1. Index update for file object create:
i. Insert indexed field value and the database offset at appropriate location in the index file.
2. Index update for file object update:
i. Identify the existing index entry. The indexed field and the old offset are used to locate the index entry uniquely in the index file. This may be needed if the indexed field is allowed to have duplicate values. For example, two different file objects can have the same mtime. If mtime is selected for indexing, then combining mtime and the old offset will uniquely identify the index entry.
ii. If the indexed field value is not changed, then just update the new database offset in the existing entry.
iii. If the indexed field value is changed, then remove the old entry and insert the new entry at the appropriate location in the index file.
iv. Add the old offset to the free slot list (i.e., a local list for every batch).
3. Index update for file object delete:
i. Identify the existing index entry.
ii. Delete the index entry.
iii. Add the old offset to the free slot list (i.e., a local list for every batch).

[00034] Furthermore, the index manager 204 may make the new index files (Vx+1 402B) available for querying. After the batch is completely processed, the index manager 204 provides the list of old row locations to the metadata database manager 108 for reuse. In one example, if there are no readers accessing the old index file version (Vx 402A), the index manager 204 adds the old slot locations to a local list of the metadata database free slots, sends the local list of metadata database free slots to the metadata database manager 108 (e.g., through I/O control (ioctl) or remote procedure call (RPC)), and removes the old index files upon sending the local list of free slots to the metadata database manager 108.

[00035] In another example, if there are any existing readers for the old index file version (Vx 402A), the index manager 204 stores the free slot list in a disk and marks the old index files (Vx 402A) for cleanup. In this case, the last reader of the old index file version (Vx 402A) may ensure that the old index files are removed and that the free slot list in the disk is sent to the metadata database manager 108. In one example, the metadata database manager 108 adds the list of metadata database free slots provided by the index manager 204 to the in-memory metadata database free slots 112. If the in-memory database free slots 112 exceed a maximum threshold, the metadata database manager 108 may swap the excess entries to the disk 302. The new index files (Vx+1 402B) can be available for querying as explained with respect to Fig. 5.

[00036] Fig. 5 is a block diagram 500 illustrating an example query processing using the data management system of Fig. 2. In one example, the query processor 208 provides a standard SQL interface through a query engine 502 (e.g., PostgreSQL foreign tables).
When the clients execute a query (e.g., in PostgreSQL), the query engine 502 receives the query, prepares a query plan, and sends an atomic query to a foreign data wrapper (FDW) 504. The FDW 504 may obtain the recently updated index file set provided by the index manager 204 to locate the relevant data in the metadata database 106. Further, the query processor 208 provides a user interface of the query engine 502 to allow a user to search through the metadata, and also displays the search results obtained by the query engine 502 in the user interface.

[00037] The system 100 of Figs. 1-5 shows an example data management system, and it should be understood that other configurations may be employed to practice the techniques of the present application. File systems may store volumes of data in a metadata database. The metadata database can be used on many different kinds of storage devices, such as hard disk drives. Some file systems may be used on local data storage devices, while others may provide file access via a network protocol. The network file system may act as a client for a remote file access protocol, providing access to files on a server. For example, system 100 may include a NAS file system, which is a file system stored in a computer data storage server connected to a computer network providing data access to a group of clients. In one example, the storage device may be a single storage device or a plurality of storage devices distributed across a plurality of computing devices.

[00038] In one example, each of the file system 104, index manager 204 and query processor 208 is shown as a single component, but it should be understood that these modules may be a plurality of modules distributed across a plurality of computing devices. The components of system 100 may be implemented in hardware, machine-readable instructions or a combination thereof.
In one example, file system 104 may be implemented in hardware, machine-readable instructions or a combination thereof. In another example, the functionality of the components of system 100 may be implemented using technology related to personal computers (PCs), server computers, tablet computers, mobile computers and the like.

[00039] Figs. 1-5 show a system 100 to provide database management. The system 100 may include a computer-readable storage medium comprising (e.g., encoded with) instructions executable by a processor to implement functionalities described herein in relation to Figs. 1-5. In some examples, the functionalities described herein in relation to instructions to implement functions of file system 104, index manager 204 and query processor 208, and any additional instructions described herein in relation to the storage medium, may be implemented as engines or modules comprising any combination of hardware and programming to implement the functionalities of the modules or engines, as described below. The functions of file system 104, index manager 204 and query processor 208 may be implemented by a computing device which may be a server, blade enclosure, desktop computer, laptop (or notebook) computer, workstation, tablet computer, mobile phone, smart device, or any other processing device or equipment including a processing resource. In examples described herein, a processor may include, for example, one processor or multiple processors included in a single computing device or distributed across multiple computing devices.

[00040] Fig. 6 depicts an example flow chart 600 of a process for data management. It should be understood that the process depicted in Fig. 6 represents a generalized illustration, and that other processes may be added or existing processes may be removed, modified, or rearranged without departing from the scope and spirit of the present application.
In addition, it should be understood that the processes may represent instructions stored on a computer-readable storage medium that, when executed, may cause a processor to respond, to perform actions, to change states, and/or to make decisions. Alternatively, the processes may represent functions and/or actions performed by functionally equivalent circuits like analog circuits, digital signal processing circuits, application specific integrated circuits (ASICs), or other hardware components associated with the system. Furthermore, the flow charts are not intended to limit the implementation of the present application, but rather the flow charts illustrate functional information to design/fabricate circuits, generate machine-readable instructions, or use a combination of hardware and machine-readable instructions to perform the illustrated processes.

[00041] At block 602, an input/output (I/O) operation is received on a file object in a metadata database by a kernel. In one example, the I/O operation is received on the file object by a file system running in the kernel. The file system may be used to manage data that is stored in the metadata database.

[00042] At block 604, a location in the metadata database is obtained by the kernel upon receiving the I/O operation. To obtain a location in the metadata database, a check is made by the kernel to determine whether at least one free slot is available in the metadata database. If the at least one free slot is not available in the metadata database, a file is created in the metadata database and the location is obtained from the created file. If the at least one free slot is available in the metadata database, the location is obtained from the at least one free slot.

[00043] At block 606, metadata of the file object is written into the obtained location in the metadata database by the kernel.
In one example, the metadata of the file object is written to a new row in the obtained location, and old rows associated with the file object can be reused for any future updates.

[00044] At block 608, an index change event associated with the file object is written to an event log file by the kernel upon writing the metadata of the file object into the obtained location. For example, the index change event includes any combination of information including an old location and a new location of the metadata associated with the file object and an old value and a new value of metadata attributes associated with the file object that are changed due to the I/O operation.

[00045] In one example, the kernel may raise an event to update the indices database associated with a user space. The index change event is read from the event log file and corresponding index files are updated in an indices database by an index manager running in the user space. The user space is operatively connected to the metadata database and the kernel. In one example, the event log file is read by the index manager when either a maximum number of index change events is reached in the event log file or a timeout of the event log file occurs.

[00046] Further, a query regarding the file object in the metadata database is received by a query processor running in the user space. A last updated index file associated with the file object is retrieved from the indices database by the query processor. The metadata associated with the query is read from the metadata database using the retrieved last updated index file by the query processor.

[00047] The process 600 of Fig. 6 shows an example process and it should be understood that other configurations can be employed to practice the techniques of the present application. For example, process 600 may communicate with a plurality of computing devices and the like.

[00048] The example methods and systems described through Figs. 1-6 may improve freshness of the metadata database, as the update processing is greatly simplified. For example, if the database has a billion rows, a single update may take a few seconds to reach the database and become available for query. The improved freshness enables anti-virus, disaster recovery, data integrity check, and tiering applications to utilize the above example methods and systems instead of walking through the file system, and also provides the ability for these applications to define policies in the database which can be applied on a subset of files.

[00049] Further, the example methods and systems described through Figs. 1-6 may substantially reduce resource consumption due to the following:
1. Since the file object updates are written directly to the metadata database, various stages of pipeline processing are eliminated.
2. Only one copy of the metadata database is used, which reduces the database footprint.
3. Metadata database updates may not involve any sort or merge processing.

[00050] The example methods and systems described through Figs. 1-6 may dynamically add the database index, thereby enabling the index to be configured based on the use case. The example methods and systems described through Figs. 1-6 may improve the file system performance and provide rich parallel processing capability where a reader can read a row while the writer process writes a new update for the same row. The example methods and systems described through Figs. 1-6 may facilitate automatic custom metadata tagging of file objects to provide rich and unified storage solutions.

[00051] Fig. 7 illustrates a block diagram 700 of an example computing device 702 for data management. The computing device 702 includes a processor 704 and a machine-readable storage medium 706 communicatively coupled through a system bus. The processor 704 may be any type of central processing unit (CPU), microprocessor, or processing logic that interprets and executes machine-readable instructions stored in the machine-readable storage medium 706. The machine-readable storage medium 706 may be a random access memory (RAM) or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by the processor 704. For example, the machine-readable storage medium 706 may be synchronous DRAM (SDRAM), double data rate (DDR), Rambus DRAM (RDRAM), Rambus RAM, etc., or storage memory media such as a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, and the like. In an example, the machine-readable storage medium 706 may be a non-transitory machine-readable medium. In an example, the machine-readable storage medium 706 may be remote but accessible to the computing device 702.

[00052] The machine-readable storage medium 706 may store instructions 708-714.
In an example, instructions 708-714 may be executed by the processor 704 to provide a mechanism for data management by a file system. Instructions 708 may be executed by the processor 704 to receive an input/output (I/O) operation on a file object in a metadata database by a kernel. Instructions 710 may be executed by processor 704 to obtain a location in the metadata database by the kernel upon receiving the I/O operation. Instructions 712 may be executed by processor 704 to write metadata of the file object into the obtained location in the metadata database by the kernel. Instructions 714 may be executed by processor 704 to write an index change event associated with the file object to an event log file by the kernel upon writing the metadata of the file object into the obtained location.

[00053] It may be noted that the above-described examples of the present solution are for the purpose of illustration only. Although the solution has been described in conjunction with a specific embodiment thereof, numerous modifications may be possible without materially departing from the teachings and advantages of the subject matter described herein. Other substitutions, modifications and changes may be made without departing from the spirit of the present solution. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.

[00054] The terms “include,” “have,” and variations thereof, as used herein, have the same meaning as the term “comprise” or appropriate variation thereof. Furthermore, the term “based on”, as used herein, means “based at least in part on.” Thus, a feature that is described as based on some stimulus can be based on the stimulus or a combination of stimuli including the stimulus.
[00055] The present description has been shown and described with reference to the foregoing examples. It is understood, however, that other forms, details, and examples can be made without departing from the spirit and scope of the present subject matter that is defined in the following claims.
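By way of illustration only, the write path attributed to instructions 708-714 (receive an I/O operation, obtain a location, write the metadata, then write an index change event) may be sketched in Python as follows. This is a minimal in-memory sketch and not the described kernel implementation: the class names `MetadataDatabase` and `MetadataWriter`, the method names, the use of Python lists in place of database files and the event log file, and the choice to write each update to a fresh location are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataDatabase:
    """Hypothetical in-memory stand-in for the metadata database."""
    slots: list = field(default_factory=list)       # location -> metadata dict (or None)
    free_slots: list = field(default_factory=list)  # locations that may be reused

    def obtain_location(self) -> int:
        # Provide a location from a free slot when one is available.
        if self.free_slots:
            return self.free_slots.pop()
        # Otherwise grow the database; this stands in for creating a new
        # file in the metadata database and providing a location from it.
        self.slots.append(None)
        return len(self.slots) - 1


@dataclass
class MetadataWriter:
    """Hypothetical metadata writer; all names here are illustrative."""
    db: MetadataDatabase
    event_log: list = field(default_factory=list)    # stands in for the event log file
    inode_table: dict = field(default_factory=dict)  # file id -> current metadata location

    def handle_io(self, file_id: str, metadata: dict) -> int:
        # Remember where the previous metadata lived, if anywhere.
        old_location = self.inode_table.get(file_id)
        old_value = self.db.slots[old_location] if old_location is not None else None

        # Obtain a location, then write the metadata into it.
        new_location = self.db.obtain_location()
        self.db.slots[new_location] = dict(metadata)
        self.inode_table[file_id] = new_location

        # Only after the metadata write, record the index change event with
        # old/new location and old/new attribute values.
        self.event_log.append({
            "file_id": file_id,
            "old_location": old_location,
            "new_location": new_location,
            "old_value": old_value,
            "new_value": dict(metadata),
        })
        return new_location
```

In this sketch the old location is not freed by the writer; it is only reported in the event, on the assumption that a separate component returns freed locations to `free_slots` after the indices have been updated.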

Claims

WHAT IS CLAIMED IS:
1. A data management system comprising:
a metadata database to store a plurality of files; and
a kernel operatively connected to the metadata database, the kernel comprising:
a file system, the file system comprising:
a metadata database manager to manage the plurality of files in the metadata database; and
a metadata writer coupled to the metadata database manager, wherein the metadata writer to receive an input/output (I/O) operation on a file object in the metadata database, obtain a location in the metadata database from the metadata database manager upon receiving the I/O operation, write metadata of the file object into the obtained location in the metadata database, and write an index change event associated with the file object to an event log file upon writing the metadata of the file object into the obtained location.
2. The data management system of claim 1, further comprising:
a user space operatively connected to the metadata database and the kernel, the user space comprising:
an index manager to read the index change event from the event log file and update corresponding index files in an indices database associated with the user space.
3. The data management system of claim 2, wherein the user space further comprises:
a query processor operatively coupled to the metadata database, the indices database, and the index manager, wherein the query processor to receive a query regarding the file object in the metadata database, retrieve a last updated index file associated with the file object from the indices database, and read the metadata associated with the query from the metadata database using the retrieved last updated index file.
4. The data management system of claim 2, wherein the index manager to process index change events in a batch, wherein a size of the batch is variable.
5. The data management system of claim 2, wherein the index manager to send a list of free slots to the metadata database manager, wherein the metadata database manager to add the list of free slots to free slots in the metadata database, and wherein the metadata database manager to swap free slots in local memory to a disk when a number of free slots exceeds a predetermined maximum threshold value, and swap free slots from the disk to the local memory when the number of the free slots falls below a predetermined minimum threshold value.
6. The data management system of claim 1, wherein the metadata database manager to determine whether at least one free slot is available in the metadata database upon receiving a request for the location from the metadata writer, create a file in the metadata database and provides the location from the created file when the at least one free slot is not available in the metadata database, and provide the location from the at least one free slot when the at least one free slot is available in the metadata database.
7. The data management system of claim 1, further comprising:
an inode operatively connected to the file system, wherein the metadata writer to update the inode with the location where the metadata of the file object is stored, wherein the inode to maintain information associated with at least one of metadata of the plurality of files and a location of the metadata of the plurality of files.
8. The data management system of claim 1, wherein the index change event includes information selected from the group consisting of an old location and a new location of the metadata associated with the file object and old value and new value of metadata attributes associated with the file object that is changed due to the I/O operation.
9. A method for data management, comprising:
receiving an input/output (I/O) operation on a file object in a metadata database by a kernel;
obtaining a location in the metadata database by the kernel upon receiving the I/O operation;
writing metadata of the file object into the obtained location in the metadata database by the kernel; and
writing an index change event associated with the file object to an event log file by the kernel upon writing the metadata of the file object into the obtained location.
10. The method of claim 9, further comprising:
reading the index change event from the event log file and updating corresponding index files in an indices database by an index manager running in a user space, wherein the user space is operatively connected to the metadata database and the kernel, and wherein the event log file is read by the index manager upon one of reaching a maximum number of index change events in the event log file and occurring a time out of the event log file.
11. The method of claim 10, further comprising:
receiving a query regarding the file object in the metadata database by a query processor running in the user space;
retrieving a last updated index file associated with the file object from the indices database by the query processor; and
reading the metadata associated with the query from the metadata database using the retrieved last updated index file by the query processor.
12. The method of claim 9, wherein obtaining the location in the metadata database by the kernel comprises:
determining whether at least one free slot is available in the metadata database by the kernel;
creating a file in the metadata database and obtaining the location from the created file when the at least one free slot is not available in the metadata database; and
obtaining the location from the at least one free slot when the at least one free slot is available in the metadata database.
13. A non-transitory computer-readable medium having computer executable instructions stored thereon for data replication management, the instructions are executable by a processor to:
receive an input/output (I/O) operation on a file object in a metadata database by a kernel;
obtain a location in the metadata database by the kernel upon receiving the I/O operation;
write metadata of the file object into the obtained location in the metadata database by the kernel; and
write an index change event associated with the file object to an event log file by the kernel upon writing the metadata of the file object into the obtained location.
14. The non-transitory computer-readable medium of claim 13, further comprising instructions that if executed cause a processor to read the index change event from the event log file and update corresponding index files in an indices database by an index manager running in a user space, wherein the user space is operatively connected to the metadata database and the kernel, and wherein the event log file is read by the index manager upon one of reaching a maximum number of index change events in the event log file and occurring a time out of the event log file.
15. The non-transitory computer-readable medium of claim 14, further comprising instructions that if executed cause a processor to:
receive a query regarding the file object in the metadata database by a query processor running in the user space;
retrieve a last updated index file associated with the file object from the indices database by the query processor; and
read the metadata associated with the query from the metadata database using the retrieved last updated index file by the query processor.
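By way of illustration only, the user-space behavior recited in the claims above (reading the event log upon reaching a maximum number of index change events or upon a time out, updating the indices in a batch, collecting free slots, and answering a query through the index) may be sketched in Python as follows. This is a minimal in-memory sketch under stated assumptions: the class name `IndexManager`, its methods, the dictionary layout of an index change event, and the nested-dictionary index are hypothetical and are not taken from the description; attribute values are assumed to be hashable.

```python
import time
from collections import defaultdict


class IndexManager:
    """Hypothetical user-space index manager; all names are illustrative.

    Each event is assumed to be a dict with the keys file_id, old_location,
    new_location, old_value and new_value (the last two being attribute dicts
    or None).
    """

    def __init__(self, event_log, max_events=3, timeout_s=1.0):
        self.event_log = event_log        # shared list standing in for the event log file
        self.max_events = max_events      # batch trigger: event count
        self.timeout_s = timeout_s        # batch trigger: elapsed time
        self.last_drain = time.monotonic()
        self.index = defaultdict(dict)    # attribute -> value -> {file_id: location}
        self.freed_slots = []             # old locations to hand back as free slots

    def should_drain(self, now=None):
        now = time.monotonic() if now is None else now
        full = len(self.event_log) >= self.max_events
        timed_out = bool(self.event_log) and (now - self.last_drain >= self.timeout_s)
        return full or timed_out

    def drain(self, now=None):
        # Take whatever is pending as one batch, so the batch size varies.
        batch = list(self.event_log)
        self.event_log.clear()
        for ev in batch:
            # Remove index entries for the old attribute values.
            for attr, val in (ev["old_value"] or {}).items():
                self.index[attr].get(val, {}).pop(ev["file_id"], None)
            # Add index entries pointing at the new metadata location.
            for attr, val in ev["new_value"].items():
                self.index[attr].setdefault(val, {})[ev["file_id"]] = ev["new_location"]
            if ev["old_location"] is not None:
                self.freed_slots.append(ev["old_location"])
        self.last_drain = time.monotonic() if now is None else now
        return len(batch)

    def query(self, attr, val):
        # Return {file_id: location}; a caller would then read the metadata
        # at those locations from the metadata database.
        return dict(self.index[attr].get(val, {}))
```

A caller might poll `should_drain()` and call `drain()` when it returns true; because the whole pending log is taken at once, a drain triggered by the time out processes a smaller batch than one triggered by the event count.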

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN3446CH2015 2015-07-06
IN3446/CHE/2015 2015-07-06

Publications (1)

Publication Number Publication Date
WO2017007511A1 (en) 2017-01-12

Family

ID=57685911

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/013108 WO2017007511A1 (en) 2015-07-06 2016-01-12 Data management using index change events

Country Status (1)

Country Link
WO (1) WO2017007511A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114238241A (en) * 2022-02-26 2022-03-25 杭州字节方舟科技有限公司 Metadata processing method and computer system for financial data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8055613B1 (en) * 2008-04-29 2011-11-08 Netapp, Inc. Method and apparatus for efficiently detecting and logging file system changes
US8131691B1 (en) * 2002-12-30 2012-03-06 Symantec Operating Corporation System and method for updating a search engine index based on which files are identified in a file change log
US8484257B2 (en) * 2003-11-26 2013-07-09 Symantec Operating Corporation System and method for generating extensible file system metadata
US20140136802A1 (en) * 2012-11-09 2014-05-15 International Business Machines Corporation Accessing data in a storage system
US8874517B2 (en) * 2007-01-31 2014-10-28 Hewlett-Packard Development Company, L.P. Summarizing file system operations with a file system journal

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16821759; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 16821759; Country of ref document: EP; Kind code of ref document: A1)