CN116226038A - File system acceleration method and system based on LevelDB - Google Patents

File system acceleration method and system based on LevelDB

Info

Publication number
CN116226038A
CN116226038A CN202211524055.3A CN202211524055A CN116226038A CN 116226038 A CN116226038 A CN 116226038A CN 202211524055 A CN202211524055 A CN 202211524055A CN 116226038 A CN116226038 A CN 116226038A
Authority
CN
China
Prior art keywords
file
file system
leveldb
level
small
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211524055.3A
Other languages
Chinese (zh)
Inventor
赵泽钧
袁苇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Newland Communication Science Technologies Co ltd
Original Assignee
Fujian Newland Communication Science Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Newland Communication Science Technologies Co ltd
Priority to CN202211524055.3A
Publication of CN116226038A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/137 Hash-based
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)

Abstract

The invention provides a file system acceleration method and system based on LevelDB, belonging to the technical field of software engineering. The method comprises the following steps: step S10, creating a LevelDB table in user space; step S20, requesting an inode number for each file to be stored from a global counter; step S30, setting a capacity threshold and, based on the capacity threshold and the inode number, storing each file in either the kernel-space local file system or the LevelDB table; step S40, storing the update log of the LevelDB table in the local file system; and step S50, quickly accessing small files through the LevelDB table stored in the local file system. The advantage of the invention is that small-file access performance is greatly improved.

Description

File system acceleration method and system based on LevelDB
Technical Field
The invention relates to the technical field of software engineering, in particular to a file system acceleration method and system based on a LevelDB.
Background
In current file systems, directory entries are stored in a linear array within a single file and are simply associated with inode numbers. File systems such as ext4 use a hash table to speed up directory lookups, while XFS, ZFS, and others use B-trees to index directories. In addition, LFS, ZFS, and others use logging techniques to batch metadata changes and write them sequentially, which keeps the metadata needed to access a file close together.
With the advent of the big data era, large numbers of small files are being stored in file systems. Conventional file systems, however, are designed for high-bandwidth transfer of large files, and they often perform poorly when accessing many small files because of limited cache coverage.
Therefore, how to provide a LevelDB-based file system acceleration method and system that improves small-file access performance has become a technical problem in urgent need of a solution.
Disclosure of Invention
The invention aims to solve the technical problem of providing a file system acceleration method and a file system acceleration system based on a LevelDB, which can improve the access performance of small files.
In a first aspect, the present invention provides a file system acceleration method based on LevelDB, comprising the following steps:
step S10, creating a LevelDB table in user space;
step S20, requesting an inode number for each file to be stored from a global counter;
step S30, setting a capacity threshold and, based on the capacity threshold and the inode number, storing each file in either the kernel-space local file system or the LevelDB table;
step S40, storing the update log of the LevelDB table in the local file system;
step S50, quickly accessing small files through the LevelDB table stored in the local file system.
Further, in step S10, the LevelDB table adopts a key-value structure. Each key is 128 bits long: the first 64 bits are the inode number of the parent directory and the last 64 bits are the hash value of the small file's name. The value of each row includes at least the small file's name, inode number, access permissions, file size, timestamp, and the data carried by the small file.
Further, in the step S20, the inode number is 64 bits long.
Further, the step S30 specifically includes:
setting a capacity threshold, classifying files as large files or small files based on the capacity threshold, storing large files in the kernel-space local file system based on their inode numbers, and storing small files in the LevelDB table based on their inode numbers.
Further, the step S40 specifically includes:
storing the LevelDB table in the local file system, and synchronizing the update log of the LevelDB table to the local file system at a preset synchronization period, using LevelDB's built-in write-ahead log function.
In a second aspect, the present invention provides a file system acceleration system based on LevelDB, comprising the following modules:
a LevelDB table creation module, configured to create a LevelDB table in user space;
an inode number application module, configured to request an inode number for each file to be stored from a global counter;
a file classification storage module, configured to set a capacity threshold and, based on the capacity threshold and the inode number, store each file in either the kernel-space local file system or the LevelDB table;
a LevelDB table storage module, configured to store the LevelDB table and its update log in the local file system;
a small file access module, configured to quickly access small files through the LevelDB table stored in the local file system.
Further, in the LevelDB table creation module, the LevelDB table adopts a key-value structure. Each key is 128 bits long: the first 64 bits are the inode number of the parent directory and the last 64 bits are the hash value of the small file's name. The value of each row includes at least the small file's name, inode number, access permissions, file size, timestamp, and the data carried by the small file.
Further, in the inode number application module, the inode number is 64 bits long.
Further, the file classification storage module is specifically configured to:
setting a capacity threshold, classifying files as large files or small files based on the capacity threshold, storing large files in the kernel-space local file system based on their inode numbers, and storing small files in the LevelDB table based on their inode numbers.
Further, the LevelDB table storage module is specifically configured to:
storing the LevelDB table in the local file system, and synchronizing the update log of the LevelDB table to the local file system at a preset synchronization period, using LevelDB's built-in write-ahead log function.
The invention has the advantages that:
By creating a LevelDB table, storing small files whose size is below the set capacity threshold in the LevelDB table based on their inode numbers, keeping the directory entries of the small files in the same table, and storing the LevelDB table in the local file system, all small files are concentrated in a single key-value LevelDB table. This solves the addressing problem that arises when many different small files are accessed frequently: the data of any small file can be located quickly through one LevelDB table, which greatly improves small-file access performance.
Drawings
The invention will be further described with reference to examples of embodiments with reference to the accompanying drawings.
FIG. 1 is a flow chart of a LevelDB-based file system acceleration method of the present invention.
FIG. 2 is a schematic structural diagram of a LevelDB-based file system acceleration system of the present invention.
FIG. 3 is a schematic diagram of the architecture of the present invention.
Detailed Description
The overall idea of the technical solution in the embodiments of the present application is as follows: small files are stored in a LevelDB table based on their inode numbers, and the LevelDB table is then stored in the local file system. In other words, all small files are concentrated in a single key-value LevelDB table, which solves the addressing problem that arises when many different small files are accessed frequently and thereby improves small-file access performance.
Referring to FIG. 1 to FIG. 3, a preferred embodiment of the LevelDB-based file system acceleration method of the present invention includes the following steps:
step S10, creating a LevelDB table in user space; storing small files in the LevelDB table improves small-file access performance, and the approach is compatible with most modern POSIX-compliant local file systems on Linux;
step S20, requesting an inode number for each file to be stored from a global counter; the global counter is incremented whenever a new file or a new directory is created;
step S30, setting a capacity threshold and, based on the capacity threshold and the inode number, storing each file in either the kernel-space local file system or the LevelDB table;
step S40, storing the update log of the LevelDB table in the local file system;
step S50, quickly accessing small files through the LevelDB table stored in the local file system.
The invention can be used in three modes: first, as a Linux kernel module embedded beneath the VFS, which gives the best performance; second, as a standalone application-layer service separated from the kernel, with other applications interacting with the local file system through the Linux FUSE library; third, as a library running directly inside the application, which is the most convenient to deploy and run.
In step S10, the LevelDB table adopts a key-value structure. Each key is 128 bits long: the first 64 bits are the inode number of the parent directory and the last 64 bits are the hash value of the small file's name. The value of each row includes at least the small file's name, inode number, access permissions, file size, timestamp, and the data carried by the small file.
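The 128-bit key layout can be sketched in Python as follows. This is only an illustration under stated assumptions: the hash function (the first 8 bytes of SHA-1) and the big-endian packing are not specified in the source, and the names `name_hash64`, `make_key`, and `split_key` are hypothetical.

```python
import hashlib
import struct
from typing import Tuple


def name_hash64(name: str) -> int:
    """Hypothetical 64-bit file-name hash: the first 8 bytes of SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def make_key(parent_inode: int, name: str) -> bytes:
    """Pack <parent inode, name hash> into a 16-byte (128-bit) key."""
    # Big-endian packing (an assumption) makes byte-wise key order follow
    # the numeric order of the parent inode number.
    return struct.pack(">QQ", parent_inode, name_hash64(name))


def split_key(key: bytes) -> Tuple[int, int]:
    """Recover (parent inode, name hash) from a 16-byte key."""
    return struct.unpack(">QQ", key)


key = make_key(1, "foo")
parent, hashed_name = split_key(key)
```

With this packing, all entries of one directory share the same 8-byte key prefix, so they are adjacent in any store that keeps keys sorted.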
The first 64 bits of the keys of all directory entries under the same directory are identical. When readdir() is called, the inode number of the target directory is obtained first, and then all entries whose keys begin with that inode number are scanned sequentially in the LevelDB table. To resolve the path of a single file, the lookup starts from the root directory: the current inode number is repeatedly combined with the hash value of the next path component to look up the table, until the row representing the file is found.
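The readdir() range scan and the component-by-component path lookup can be sketched as below. The `Table` class is a hypothetical in-memory stand-in for an ordered key-value store, not the LevelDB API. The SHA-1-based hash, the root inode number 0 (taken from the example table in this description), and the absence of hash-collision handling are all simplifying assumptions.

```python
import bisect
import hashlib
import struct

ROOT_INODE = 0  # assumption: the root directory has inode number 0


def make_key(parent, name):
    """<parent inode, 64-bit name hash> packed as 16 big-endian bytes."""
    h = int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest()[:8], "big")
    return struct.pack(">QQ", parent, h)


class Table:
    """In-memory stand-in for an ordered key-value store."""

    def __init__(self):
        self._keys = []  # keys kept in sorted order
        self._rows = {}  # key -> (inode, name)

    def put(self, parent, name, inode):
        key = make_key(parent, name)
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = (inode, name)

    def get(self, parent, name):
        return self._rows.get(make_key(parent, name))

    def readdir(self, dir_inode):
        """Scan every key whose first 64 bits equal dir_inode."""
        prefix = struct.pack(">Q", dir_inode)
        start = bisect.bisect_left(self._keys, prefix)
        names = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # left the directory's key range
            names.append(self._rows[key][1])
        return names

    def resolve(self, path):
        """Walk from the root with one table lookup per path component."""
        inode = ROOT_INODE
        for part in filter(None, path.split("/")):
            row = self.get(inode, part)
            if row is None:
                raise FileNotFoundError(path)
            inode = row[0]
        return inode


table = Table()
table.put(0, "home", 1)
table.put(1, "foo", 2)
table.put(1, "bar", 3)
table.put(2, "book", 5)
inode_of_book = table.resolve("/home/foo/book")
```

Entries come back from `readdir` in hash order rather than name order, because the second half of the key is a hash of the name.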
When hard links exist, two or more rows would share the same inode number, file attributes, and data, so special handling is required. For such duplicate entries, only one row is kept to store the attributes and data; the first 64 bits of that row's key are the file's own inode number and the last 64 bits are null. The other rows carry only a hard-link tag in their value portion.
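The hard-link indirection can be illustrated with a plain dictionary standing in for the table. The rows mirror the inode-4 entries of the example table in this description. The tuple keys, the `"hardlink"` tag value, the field names, and the placeholder hashes `"h4"` and `"h6"` are hypothetical.

```python
HARDLINK = "hardlink"  # hypothetical tag marking a directory entry as a hard link

# (first 64 bits, last 64 bits or None) -> row value
table = {
    (2, "h4"): {"inode": 4, "name": "apple", "type": HARDLINK},
    (3, "h6"): {"inode": 4, "name": "pear", "type": HARDLINK},
    # The single shared row: key is <own inode, null>.
    (4, None): {"inode": 4, "stat": {"size": 8192}, "data": "large file pointer"},
}


def lookup(table, parent, name_hash):
    """Return the row holding attributes and data, following a hard-link tag."""
    row = table[(parent, name_hash)]
    if row.get("type") == HARDLINK:
        return table[(row["inode"], None)]
    return row


# Both names resolve to the same row, so an attribute update is visible
# through either link.
lookup(table, 2, "h4")["stat"]["size"] = 9000
size_via_pear = lookup(table, 3, "h6")["stat"]["size"]
```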
LevelDB guarantees atomicity for a simple write to a single row, but not for a read-modify-write operation. Because all file attributes are stored in LevelDB as row values, read-modify-write operations are frequent, so the implementation uses lightweight fine-grained locks to ensure both safety and performance.
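One way to make read-modify-write safe is lock striping, sketched below. The source only says "lightweight fine-grained lock", so the striping scheme, the stripe count, and the `LockedTable` class are assumptions, and a dictionary again stands in for the store.

```python
import threading


class LockedTable:
    """Dictionary-backed table with one lock per stripe of keys."""

    def __init__(self, stripes=64):
        self._rows = {}
        self._locks = [threading.Lock() for _ in range(stripes)]

    def _lock_for(self, key):
        # Keys that hash to different stripes never contend with each other.
        return self._locks[hash(key) % len(self._locks)]

    def put(self, key, value):
        with self._lock_for(key):
            self._rows[key] = value

    def get(self, key):
        with self._lock_for(key):
            return self._rows[key]

    def update(self, key, fn):
        """Read the row, apply fn, and write the result back under one lock."""
        with self._lock_for(key):
            self._rows[key] = fn(self._rows[key])
            return self._rows[key]


table = LockedTable()
table.put(b"row", {"size": 0})


def grow():
    for _ in range(1000):
        table.update(b"row", lambda row: {**row, "size": row["size"] + 1})


threads = [threading.Thread(target=grow) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each update holds the stripe lock across the read and the write, no increment is lost even when several threads modify the same row.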
An example of the LevelDB table is as follows:
Key          Value
<0, h1>      1, "home", struct stat
<1, h2>      2, "foo", struct stat
<1, h3>      3, "bar", struct stat
<2, h4>      4, "apple", hard link
<2, h5>      5, "book", struct stat, inline small file (<4 KB)
<3, h6>      4, "pear", hard link
<4, null>    4, struct stat, large file pointer (>4 KB)
LevelDB is a key-value database built on an LSM tree (Log-Structured Merge Tree) that provides APIs such as GET, PUT, DELETE, and SCAN. The basic principle of LevelDB and the LSM tree is to manage, in a log-structured manner, multiple large sorted data arrays on the local disk (the SSTable technique). When data is added or updated, the new data is first written to a buffer in memory; when the buffer grows beyond a threshold, its contents are sorted and written back to disk as an SSTable. When data is queried, the system searches a list of SSTables and returns the most recent value. To reduce the number of SSTables that must be searched, LevelDB maintains an in-memory index table that records the key range of each SSTable and uses Bloom filters to avoid unnecessary lookups. To speed up queries and remove deleted data, SSTables are also periodically merge-sorted, a process known as "compaction".
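The buffer, flush, lookup, and merge behaviour described above can be illustrated with a toy LSM tree. This is a teaching sketch, not LevelDB's implementation: sorted runs are kept in memory instead of on disk, there are no levels or Bloom filters, and the names `TinyLSM` and `TOMBSTONE` are hypothetical.

```python
import bisect

TOMBSTONE = object()  # marker written in place of a deleted value


class TinyLSM:
    """Toy LSM tree: a write buffer plus immutable sorted runs."""

    def __init__(self, buffer_limit=4):
        self.buffer_limit = buffer_limit
        self.memtable = {}
        self.runs = []  # newest first; each run is (sorted_keys, values)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.buffer_limit:
            self.flush()

    def delete(self, key):
        self.put(key, TOMBSTONE)

    def flush(self):
        """Sort the buffer and store it as a new run (an SSTable stand-in)."""
        if not self.memtable:
            return
        keys = sorted(self.memtable)
        self.runs.insert(0, (keys, [self.memtable[k] for k in keys]))
        self.memtable = {}

    @staticmethod
    def _visible(value):
        return None if value is TOMBSTONE else value

    def get(self, key):
        if key in self.memtable:
            return self._visible(self.memtable[key])
        for keys, values in self.runs:  # newest run first
            if not keys[0] <= key <= keys[-1]:
                continue  # key-range check, the role of the in-memory index
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return self._visible(values[i])
        return None

    def compact(self):
        """Merge all runs, keeping the newest values and dropping deletions."""
        merged = {}
        for keys, values in reversed(self.runs):  # oldest first
            merged.update(zip(keys, values))
        live = sorted(k for k, v in merged.items() if v is not TOMBSTONE)
        self.runs = [(live, [merged[k] for k in live])] if live else []


db = TinyLSM(buffer_limit=2)
db.put("a", 1)
db.put("b", 2)   # buffer full: flushed as the first run
db.put("a", 3)
db.delete("b")   # buffer full again: flushed as a newer run
```

A lookup consults the newest run first, so the update to `"a"` and the deletion of `"b"` mask the older run until `compact()` physically discards the stale entries.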
In step S20, the inode number is 64 bits long. Because the 64-bit number space is extremely large, no reclamation mechanism is needed for the time being to recycle the inode numbers of deleted files.
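A global counter of this kind, together with a rough estimate of how long a 64-bit space lasts, can be sketched as follows. The starting value of 1 (leaving 0 for the root directory, as in the example table), the lock, and the creation rate used in the estimate are assumptions for illustration.

```python
import threading

MAX_INODE = 2**64 - 1  # largest value that fits in 64 bits


class InodeAllocator:
    """Monotonic global counter; numbers are never reused."""

    def __init__(self, start=1):
        self._next = start
        self._lock = threading.Lock()

    def allocate(self):
        with self._lock:
            if self._next > MAX_INODE:
                raise OverflowError("64-bit inode space exhausted")
            inode = self._next
            self._next += 1
            return inode


def years_to_exhaust(creates_per_second):
    """Years needed to use up all 2**64 numbers at a constant creation rate."""
    seconds_per_year = 365.25 * 24 * 3600
    return 2**64 / creates_per_second / seconds_per_year


allocator = InodeAllocator()
first = allocator.allocate()
second = allocator.allocate()
estimate = years_to_exhaust(1_000_000)
```

Even at a hypothetical one million file creations per second, the estimate comes to more than half a million years, which is why recycling numbers can be deferred.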
The step S30 specifically includes:
setting a capacity threshold, classifying files as large files or small files based on the capacity threshold, storing large files in the kernel-space local file system based on their inode numbers, and storing small files in the LevelDB table based on their inode numbers; the local file system is preferably ext4, and the capacity threshold is preferably 4 KB.
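The size-based routing in step S30 can be sketched as below. Two dictionaries stand in for the LevelDB table and the local file system. Treating a file of exactly 4 KB as large is an assumption, since the source only says small files are smaller than the threshold; the function and variable names are hypothetical.

```python
CAPACITY_THRESHOLD = 4 * 1024  # 4 KB, the preferred threshold


def store_file(inode, data, small_table, large_store, threshold=CAPACITY_THRESHOLD):
    """Route a file to the small-file table or the large-file store by size."""
    if len(data) < threshold:
        small_table[inode] = data  # stand-in for a row in the LevelDB table
        return "leveldb"
    large_store[inode] = data      # stand-in for a file on the local file system
    return "local_fs"


small_table, large_store = {}, {}
where_small = store_file(5, b"x" * 100, small_table, large_store)
where_large = store_file(4, b"x" * 8192, small_table, large_store)
```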
Large files are stored in a two-level directory tree: the large file with inode number I is stored under the path "/LargeStore/J/I", where J = I/10000.
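The path computation can be written as a one-line helper. Reading J = I/10000 as integer division is an assumption (it is the only reading that yields a directory name), and the function name is hypothetical.

```python
def large_file_path(inode, root="/LargeStore", fanout=10000):
    """Return the two-level path for a large file with the given inode number."""
    return f"{root}/{inode // fanout}/{inode}"


path = large_file_path(123456)
```

With a fan-out of 10000, each second-level directory holds at most 10000 large files, which keeps individual directories on the local file system small.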
The step S40 specifically includes:
storing the LevelDB table in the local file system, and synchronizing the update log of the LevelDB table to the local file system at a preset synchronization period, using LevelDB's built-in write-ahead log function.
LevelDB can synchronize the update log to the local file system in either synchronous or asynchronous mode. To achieve a consistency guarantee similar to the ordered mode of ext4, LevelDB is forced to commit the update log to the local file system synchronously at a preset synchronization period; the synchronization period is preferably 5 seconds.
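The periodic forced commit can be illustrated with a small append-only log that calls fsync once the synchronization period has elapsed. This is a generic sketch using only the Python standard library, not LevelDB's write-ahead log. The `PeriodicSyncLog` class, its record format, and the injectable clock are hypothetical.

```python
import os
import time


class PeriodicSyncLog:
    """Append-only log that is forced to disk at most once per sync period."""

    def __init__(self, path, sync_period=5.0, clock=time.monotonic):
        self._file = open(path, "ab")
        self._period = sync_period
        self._clock = clock
        self._last_sync = clock()
        self.sync_count = 0

    def append(self, record):
        """Buffer one record; force a sync if the period has elapsed."""
        self._file.write(record + b"\n")
        if self._clock() - self._last_sync >= self._period:
            self.sync()

    def sync(self):
        """Flush user-space buffers, then ask the OS to write to stable storage."""
        self._file.flush()
        os.fsync(self._file.fileno())
        self._last_sync = self._clock()
        self.sync_count += 1

    def close(self):
        self.sync()
        self._file.close()
```

Records appended between two syncs may be lost on a crash, so a shorter period narrows the window of possible loss at the cost of more fsync calls.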
A preferred embodiment of the LevelDB-based file system acceleration system of the present invention includes the following modules:
a LevelDB table creation module, configured to create a LevelDB table in user space; storing small files in the LevelDB table improves small-file access performance, and the approach is compatible with most modern POSIX-compliant local file systems on Linux;
an inode number application module, configured to request an inode number for each file to be stored from a global counter; the global counter is incremented whenever a new file or a new directory is created;
a file classification storage module, configured to set a capacity threshold and, based on the capacity threshold and the inode number, store each file in either the kernel-space local file system or the LevelDB table;
a LevelDB table storage module, configured to store the LevelDB table and its update log in the local file system;
a small file access module, configured to quickly access small files through the LevelDB table stored in the local file system.
The invention can be used in three modes: first, as a Linux kernel module embedded beneath the VFS, which gives the best performance; second, as a standalone application-layer service separated from the kernel, with other applications interacting with the local file system through the Linux FUSE library; third, as a library running directly inside the application, which is the most convenient to deploy and run.
In the LevelDB table creation module, the LevelDB table adopts a key-value structure. Each key is 128 bits long: the first 64 bits are the inode number of the parent directory and the last 64 bits are the hash value of the small file's name. The value of each row includes at least the small file's name, inode number, access permissions, file size, timestamp, and the data carried by the small file.
The first 64 bits of the keys of all directory entries under the same directory are identical. When readdir() is called, the inode number of the target directory is obtained first, and then all entries whose keys begin with that inode number are scanned sequentially in the LevelDB table. To resolve the path of a single file, the lookup starts from the root directory: the current inode number is repeatedly combined with the hash value of the next path component to look up the table, until the row representing the file is found.
When hard links exist, two or more rows would share the same inode number, file attributes, and data, so special handling is required. For such duplicate entries, only one row is kept to store the attributes and data; the first 64 bits of that row's key are the file's own inode number and the last 64 bits are null. The other rows carry only a hard-link tag in their value portion.
LevelDB guarantees atomicity for a simple write to a single row, but not for a read-modify-write operation. Because all file attributes are stored in LevelDB as row values, read-modify-write operations are frequent, so the implementation uses lightweight fine-grained locks to ensure both safety and performance.
An example of the LevelDB table is as follows:
[Figures BDA0003974476950000071 and BDA0003974476950000081: example LevelDB table, rendered as images in the original publication]
LevelDB is a key-value database built on an LSM tree (Log-Structured Merge Tree) that provides APIs such as GET, PUT, DELETE, and SCAN. The basic principle of LevelDB and the LSM tree is to manage, in a log-structured manner, multiple large sorted data arrays on the local disk (the SSTable technique). When data is added or updated, the new data is first written to a buffer in memory; when the buffer grows beyond a threshold, its contents are sorted and written back to disk as an SSTable. When data is queried, the system searches a list of SSTables and returns the most recent value. To reduce the number of SSTables that must be searched, LevelDB maintains an in-memory index table that records the key range of each SSTable and uses Bloom filters to avoid unnecessary lookups. To speed up queries and remove deleted data, SSTables are also periodically merge-sorted, a process known as "compaction".
In the inode number application module, the inode number is 64 bits long. Because the 64-bit number space is extremely large, no reclamation mechanism is needed for the time being to recycle the inode numbers of deleted files.
The file classification storage module is specifically configured to:
setting a capacity threshold, classifying files as large files or small files based on the capacity threshold, storing large files in the kernel-space local file system based on their inode numbers, and storing small files in the LevelDB table based on their inode numbers; the local file system is preferably ext4, and the capacity threshold is preferably 4 KB.
Large files are stored in a two-level directory tree: the large file with inode number I is stored under the path "/LargeStore/J/I", where J = I/10000.
The LevelDB table storage module is specifically configured to:
storing the LevelDB table in the local file system, and synchronizing the update log of the LevelDB table to the local file system at a preset synchronization period, using LevelDB's built-in write-ahead log function.
LevelDB can synchronize the update log to the local file system in either synchronous or asynchronous mode. To achieve a consistency guarantee similar to the ordered mode of ext4, LevelDB is forced to commit the update log to the local file system synchronously at a preset synchronization period; the synchronization period is preferably 5 seconds.
In summary, the invention has the advantages that:
By creating a LevelDB table, storing small files whose size is below the set capacity threshold in the LevelDB table based on their inode numbers, keeping the directory entries of the small files in the same table, and storing the LevelDB table in the local file system, all small files are concentrated in a single key-value LevelDB table. This solves the addressing problem that arises when many different small files are accessed frequently: the data of any small file can be located quickly through one LevelDB table, which greatly improves small-file access performance.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that the specific embodiments described are illustrative only and not intended to limit the scope of the invention, and that equivalent modifications and variations of the invention in light of the spirit of the invention will be covered by the claims of the present invention.

Claims (10)

1. A file system acceleration method based on LevelDB, characterized in that the method comprises the following steps:
step S10, creating a LevelDB table in user space;
step S20, requesting an inode number for each file to be stored from a global counter;
step S30, setting a capacity threshold and, based on the capacity threshold and the inode number, storing each file in either the kernel-space local file system or the LevelDB table;
step S40, storing the update log of the LevelDB table in the local file system;
step S50, quickly accessing small files through the LevelDB table stored in the local file system.
2. The file system acceleration method based on LevelDB as claimed in claim 1, characterized in that: in step S10, the LevelDB table adopts a key-value structure, the key is 128 bits long, the first 64 bits of the key are the inode number of the parent directory, the last 64 bits of the key are the hash value of the file name of the small file, and the value of each row includes at least the file name of the small file, the inode number, the access permissions, the file size, the timestamp, and the data carried by the small file.
3. The file system acceleration method based on LevelDB as claimed in claim 1, characterized in that: in step S20, the inode number is 64 bits long.
4. The file system acceleration method based on LevelDB as claimed in claim 1, characterized in that: step S30 specifically includes:
setting a capacity threshold, classifying files as large files or small files based on the capacity threshold, storing large files in the kernel-space local file system based on their inode numbers, and storing small files in the LevelDB table based on their inode numbers.
5. The file system acceleration method based on LevelDB as claimed in claim 1, characterized in that: step S40 specifically includes:
storing the LevelDB table in the local file system, and synchronizing the update log of the LevelDB table to the local file system at a preset synchronization period, using LevelDB's built-in write-ahead log function.
6. A file system acceleration system based on LevelDB, characterized in that the system comprises the following modules:
a LevelDB table creation module, configured to create a LevelDB table in user space;
an inode number application module, configured to request an inode number for each file to be stored from a global counter;
a file classification storage module, configured to set a capacity threshold and, based on the capacity threshold and the inode number, store each file in either the kernel-space local file system or the LevelDB table;
a LevelDB table storage module, configured to store the LevelDB table and its update log in the local file system;
a small file access module, configured to quickly access small files through the LevelDB table stored in the local file system.
7. The file system acceleration system based on LevelDB as claimed in claim 6, characterized in that: in the LevelDB table creation module, the LevelDB table adopts a key-value structure, the key is 128 bits long, the first 64 bits of the key are the inode number of the parent directory, the last 64 bits of the key are the hash value of the file name of the small file, and the value of each row includes at least the file name of the small file, the inode number, the access permissions, the file size, the timestamp, and the data carried by the small file.
8. The file system acceleration system based on LevelDB as claimed in claim 6, characterized in that: in the inode number application module, the inode number is 64 bits long.
9. The file system acceleration system based on LevelDB as claimed in claim 6, characterized in that: the file classification storage module is specifically configured to:
setting a capacity threshold, classifying files as large files or small files based on the capacity threshold, storing large files in the kernel-space local file system based on their inode numbers, and storing small files in the LevelDB table based on their inode numbers.
10. The file system acceleration system based on LevelDB as claimed in claim 6, characterized in that: the LevelDB table storage module is specifically configured to:
storing the LevelDB table in the local file system, and synchronizing the update log of the LevelDB table to the local file system at a preset synchronization period, using LevelDB's built-in write-ahead log function.
CN202211524055.3A 2022-12-01 2022-12-01 File system acceleration method and system based on LevelDB Pending CN116226038A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211524055.3A CN116226038A (en) 2022-12-01 2022-12-01 File system acceleration method and system based on LevelDB

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211524055.3A CN116226038A (en) 2022-12-01 2022-12-01 File system acceleration method and system based on LevelDB

Publications (1)

Publication Number Publication Date
CN116226038A (en) 2023-06-06

Family

ID=86570335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211524055.3A Pending CN116226038A (en) 2022-12-01 2022-12-01 File system acceleration method and system based on LevelDB

Country Status (1)

Country Link
CN (1) CN116226038A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination