CN112380171B - YAFFS file system OOB identification method, terminal device and storage medium

Info

Publication number
CN112380171B
CN112380171B CN202011382793.XA CN202011382793A CN112380171B CN 112380171 B CN112380171 B CN 112380171B CN 202011382793 A CN202011382793 A CN 202011382793A CN 112380171 B CN112380171 B CN 112380171B
Authority
CN
China
Prior art keywords
oob
field
value
page
byte
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011382793.XA
Other languages
Chinese (zh)
Other versions
CN112380171A (en)
Inventor
黄庆发
沈长达
黄志炜
金俊利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Meiya Pico Information Co Ltd
Original Assignee
Xiamen Meiya Pico Information Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Meiya Pico Information Co Ltd
Priority to CN202011382793.XA
Publication of CN112380171A
Application granted
Publication of CN112380171B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1748 De-duplication implemented within the file system, e.g. based on file segments
    • G06F16/1752 De-duplication implemented within the file system, e.g. based on file segments based on file chunks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems

Abstract

The invention relates to an OOB identification method for the YAFFS file system, a terminal device and a storage medium. The method comprises the following steps. S1: identify the page size of the YAFFS image file and determine the value of the field ByteCount from the page size. S2: traverse in sequence the OOB area of each page of each erase block contained in the image file; scan each OOB area byte by byte from its starting position and compute, for each field of the OOB area, a position variable set that records the number of times each byte position is the starting position of that field; determine the position of each field from its position variable set and the characteristics of the field. S3: based on the determined positions, adjust each field to the position at which YAFFS can recognize it, and identify the YAFFS image file based on the adjusted OOB areas. Through a series of algorithmic rules, the method identifies key fields of the OOB management data such as BlockSequence, ObjectID, ChunkID, ByteCount and ECC, and thereby restores the field order of the OOB management data.

Description

YAFFS file system OOB identification method, terminal device and storage medium
Technical Field
The invention relates to the field of file management, in particular to an OOB identification method of a YAFFS file system, terminal equipment and a storage medium.
Background
With the advent of the Internet of Things era, research on embedded smart devices has surged. To minimize the development cost of smart devices, most manufacturers use NAND flash or NOR flash storage media with small capacity rather than SSDs or HDDs. YAFFS (Yet Another Flash File System), an embedded file system designed for flash memory, implements file management well and addresses the wear-leveling problem caused by the limited number of flash erase cycles. YAFFS achieves this mainly by introducing OOB management data in a specific format. However, different flash vendors often store the OOB management data in different formats. As a result, after the memory chip of a device has been imaged with an imaging tool, the YAFFS file system cannot be loaded and its files cannot be browsed, because the format of the original OOB data is unknown. At present there is no mature technique for identifying the OOB management data of the YAFFS file system, the related literature is scarce, and the needs of security and forensics for embedded Internet of Things devices are not met. It is therefore necessary to study techniques for identifying the OOB management data of the YAFFS file system; such techniques are of great significance for forensics on embedded Internet of Things devices.
Disclosure of Invention
In order to solve the above problem, the present invention provides an OOB identification method for YAFFS file system, a terminal device and a storage medium.
The specific scheme is as follows:
A YAFFS file system OOB identification method comprises the following steps:
S1: identifying the page size of the YAFFS image file, and determining the value of the field ByteCount from the page size;
S2: traversing in sequence the OOB area of each page of each erase block contained in the image file, scanning each OOB area byte by byte from its starting position, and computing, for each field of the OOB area, a position variable set that records the number of times each byte position is the starting position of that field; determining the position of each field from its position variable set and the characteristics of the field;
S3: based on the determined positions, adjusting each field to the position at which YAFFS can recognize it, and identifying the YAFFS image file based on the adjusted OOB areas.
Further, the process of determining the location of the field ByteCount includes:
constructing, for each erase block, a position variable set $bc = \{bc_0, bc_1, bc_2, \ldots, bc_N\}$ for the field ByteCount, with every element initialized to 0, where $bc_N$ is the number of times the N-th byte of the OOB area is the starting position of the field ByteCount;
traversing the OOB area of each page and scanning it byte by byte from its starting position; when the j-th byte is reached, checking whether the 4 consecutive bytes starting at the j-th byte hold the value of ByteCount, and if so, incrementing $bc_j$ by 1;
and determining the position of the field ByteCount in each erase block from the subscript of the largest element of that block's position variable set bc.
Further, the process of determining the location of the field BlockSequence includes:
traversing the OOB area of each page in each erase block and scanning it byte by byte from its starting position; when the j-th byte is reached, checking whether the 4 consecutive bytes starting at the j-th byte satisfy the following three conditions simultaneously, and if so, determining that the j-th byte of the OOB area is the starting position of the field BlockSequence;
letting $V_j$ denote the value of the 4 consecutive bytes starting at the j-th byte of the OOB area, the conditions to be satisfied are:
(1) $V_j$ of the page equals $V_j$ of the first page of the erase block that contains it, and neither equals the value of the field ByteCount;
(2) there are more than 5 consecutive erase blocks in each of which $V_j$ is equal in all OOB areas of the block;
(3) the values $V_j$ in the OOB areas of 5 consecutive different erase blocks are not equal to one another.
Further, the determination of the location of the fields ECC1 and ECC2 includes:
constructing, for each erase block, a position variable set $ecc1 = \{ecc1_0, ecc1_1, ecc1_2, \ldots, ecc1_M\}$ for the field ECC1 and a position variable set $ecc2 = \{ecc2_0, ecc2_1, ecc2_2, \ldots, ecc2_M\}$ for the field ECC2, with every element of ecc1 and ecc2 initialized to 0, where $ecc1_M$ and $ecc2_M$ are the numbers of times the M-th byte of the OOB area is the starting position of the field ECC1 and of the field ECC2, respectively;
traversing each page; for the data area of a page, performing the ECC calculation used by YAFFS on the first 1024 bytes and on the last 1024 bytes of the data area to obtain the values of the fields ECC1 and ECC2 for that page; for the OOB area of the page, scanning byte by byte from its starting position and, when the j-th byte is reached, checking whether the 12 consecutive bytes starting at the j-th byte hold the value of ECC1 or ECC2 for that page; if they hold the value of ECC1, incrementing $ecc1_j$ by 1; if they hold the value of ECC2, incrementing $ecc2_j$ by 1;
determining the positions of the fields ECC1 and ECC2 in each erase block from the subscripts of the largest elements of that block's position variable sets ecc1 and ecc2.
Further, the process of determining the location of the ChunkID field includes:
constructing, for each erase block, a position variable set $ch = \{ch_0, ch_1, ch_2, \ldots, ch_N\}$ for the field ChunkID, with every element initialized to 0, where $ch_N$ is the number of times the N-th byte of the OOB area is the starting position of the field ChunkID;
traversing each page contained in each erase block and scanning the OOB area of each page byte by byte from its starting position; when the j-th byte is reached, checking whether the 4-byte values starting at the j-th byte in the OOB areas of two consecutive pages are both different from 0x00000000 and 0xFFFFFFFF and the value in the later page minus the value in the earlier page is 1; if so, incrementing $ch_j$ by 1;
selecting the positions of the two largest elements of the position variable set ch of each erase block as the two suspicious positions of ChunkID in that erase block;
determining for each suspicious position whether the following holds: among the pages, in all erase blocks, whose ChunkID value at the suspicious position is 0, more than half have a data area whose first 4 bytes are a YAFFS file type id (0x00000001, 0x00000002, 0x00000003 or 0x00000004); the suspicious position satisfying this condition is taken as the final position of ChunkID.
Further, the determination of the location of the field ObjectID includes:
constructing, for each erase block, a position variable set $ob = \{ob_0, ob_1, ob_2, \ldots, ob_N\}$ for the field ObjectID, with every element initialized to 0, where $ob_N$ is the number of times the N-th byte of the OOB area is the starting position of the field ObjectID;
traversing the OOB area of each page contained in each erase block and scanning it byte by byte from its starting position; when the j-th byte is reached, checking whether the 4-byte values starting at the j-th byte in the OOB areas of two consecutive pages are different from 0x00000000, 0xFFFFFFFF and the value of the field ByteCount; if so, incrementing $ob_j$ by 1;
selecting the positions of the four largest elements of the position variable set ob of each erase block as the four suspicious positions of ObjectID in that erase block;
for each suspicious position, judging whether the value of ObjectID at that position is not equal to any of 0x80FFFFFF, 0x40FFFFFF, 0xFFFFFF80, 0xFFFFFF40 or the value of the field BlockSequence, and if so, taking the suspicious position satisfying these conditions as the final position of the field ObjectID; if none of the four suspicious positions satisfies them, performing a further judgment;
the further judgment comprises:
selecting the positions of the two largest elements of the position variable set ch corresponding to the field ChunkID of each erase block as two suspicious positions of the ObjectID in that erase block;
judging, for each suspicious position, whether condition one, condition two and condition three are satisfied simultaneously, and if so, taking that suspicious position as the final position of the field ObjectID; if neither position satisfies them, removing the two largest elements from the position variable set ch corresponding to the field ChunkID, and then again selecting the positions of the two largest remaining elements of ch as the two suspicious positions of the ObjectID in each erase block for judgment;
wherein condition one, condition two and condition three are respectively as follows:
condition one: the value of ObjectID at the suspicious position is not equal to the determined value of any other field;
condition two: when the value of ChunkID is 0, the file size of the data area is larger than 0x00001000;
condition three: traverse each OOB area in the erase block; whenever the value at the suspicious position of an OOB area equals the ObjectID value corresponding to the suspicious position, add the ChunkID value of that OOB area to a set; after the elements of the set are sorted by size and de-duplicated, at least three elements have consecutive values.
Further, the determination process of the location of the field PageStatus includes:
constructing, for each erase block, a position variable set $pgs = \{pgs_0, pgs_1, pgs_2, \ldots, pgs_L\}$ for the field PageStatus, with every element initialized to 0, where $pgs_L$ is the number of times the L-th byte of the OOB area is the starting position of the field PageStatus;
traversing the OOB area of each page and scanning it byte by byte from its starting position; when the j-th byte is reached, checking whether its value is 0x40, 0xFF or 0x00, and if so, incrementing $pgs_j$ by 1; when the value of the j-th byte is not 0x40, 0xFF or 0x00, the j-th byte is no longer examined in the OOB areas of the following pages;
and determining the position of PageStatus in each erase block from the subscript of the largest element of that block's position variable set pgs.
Further, the process of determining the location of the field BlockStatus includes:
constructing, for each erase block, a position variable set $bks = \{bks_0, bks_1, bks_2, \ldots, bks_L\}$ for the field BlockStatus, with every element initialized to 0, where $bks_L$ is the number of times the L-th byte of the OOB area is the starting position of the field BlockStatus;
traversing the OOB area of each page and scanning it byte by byte from its starting position; when the j-th byte is reached, checking whether its value is 0x80, 0xFF or 0x00, and if so, incrementing $bks_j$ by 1; when the value of the j-th byte is not 0x80, 0xFF or 0x00, the j-th byte is no longer examined in the OOB areas of the following pages;
and determining the position of BlockStatus in each erase block from the subscript of the largest element of that block's position variable set bks.
A YAFFS file system OOB identification terminal device, comprising a processor, a memory and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method described above in the embodiments of the present invention when executing the computer program.
A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the method as described above for an embodiment of the invention.
With the above technical scheme, key fields of the OOB management data such as BlockSequence, ObjectID, ChunkID, ByteCount and ECC are identified through a series of algorithmic rules, so that the field order of the OOB management data is restored. The method provides a solution for identifying out-of-order OOB data in the YAFFS file system, fills a gap in this area, and is of great significance for electronic data forensics.
Drawings
Fig. 1 is a schematic diagram illustrating a Flash storage structure according to a first embodiment of the present invention.
Fig. 2 is a schematic diagram illustrating a composition structure of YAFFS data pages according to an embodiment of the invention.
Fig. 3 is a schematic diagram illustrating an out-of-order OOB management area according to an embodiment of the present invention.
Fig. 4 is a schematic diagram illustrating OOB identification principles according to an embodiment of the present invention.
Fig. 5 is a flowchart illustrating a first embodiment of the present invention.
Fig. 6 is an OOB metadata distribution diagram of link files and small and medium-sized files according to an embodiment of the invention.
Fig. 7 is a schematic diagram illustrating a decompressed file list according to an embodiment of the present invention.
Detailed Description
To further illustrate the various embodiments, the present invention provides the accompanying figures. The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the embodiments. Those skilled in the art will appreciate still other possible embodiments and advantages of the present invention with reference to these figures.
The invention will now be further described with reference to the accompanying drawings and detailed description.
The first embodiment is as follows:
As shown in Fig. 1, a flash memory space is composed of logical units such as erase blocks and data pages; typically an erase block contains 128 or 256 data pages. A data page is the smallest programmable unit and an erase block is the smallest erasable unit. Each data page is in turn divided into a data area (DataArea) and a spare area (also called the metadata area, SpareArea or OOB). The spare area mainly stores management metadata: the logical address of the data page, the error checking and correction (ECC) code, the page status, bad block information, and so on.
The page size parameters of most flash memory chips are shown in Table 1. The first row is the small-block structure: the data area is the size of one sector and the spare area immediately follows it. The second row is the large-block structure: the sizes of its data area and spare area are integer multiples of the corresponding sizes in the small-block structure, and the spare areas are stored together after the data area, so the address of the spare area in a physical flash image can be determined from the physical page size.
TABLE 1
Page size (bytes) | Spare area size (bytes) | Data area size (bytes)
512+16            | 16                      | 512
2048+64           | 64                      | 2048
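As a rough illustration of the layout in Table 1, the sketch below splits a raw image into data areas and spare areas. It assumes that the image is a plain dump in which every data area is immediately followed by its spare area; the function name `iter_pages` is hypothetical and not part of the patent.

```python
from typing import Iterator, Tuple


def iter_pages(image: bytes, data_size: int, oob_size: int) -> Iterator[Tuple[int, bytes, bytes]]:
    """Yield (page_index, data_area, spare_area) for every complete physical page."""
    page_size = data_size + oob_size
    for index in range(len(image) // page_size):
        start = index * page_size
        data = image[start:start + data_size]
        oob = image[start + data_size:start + page_size]
        yield index, data, oob


# Two synthetic 2048+64 pages: data filled with 0xAA, spare area with 0x55.
image = (b"\xAA" * 2048 + b"\x55" * 64) * 2
pages = list(iter_pages(image, 2048, 64))
```

A trailing partial page, if any, is ignored by the integer division.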
YAFFS is a log-structured file system designed for flash memory. There are currently two versions, YAFFS1 and YAFFS2; one of the main differences is that YAFFS1 targets small-block flash memory, while YAFFS2 better supports large-capacity flash chips. The storage structure of each page in flash is shown in Fig. 2: every page has extra space for storing additional information, and YAFFS uses this space to store file-system-related content. Taking YAFFS1 as an example, each 512-byte data area is followed by 16 bytes of OOB management data; the meanings of the fields are detailed in Tables 2 and 3.
TABLE 2
[Image in the original publication: Figure GDA0003591790330000091]
TABLE 3
[Images in the original publication: Figures GDA0003591790330000092 and GDA0003591790330000101]
According to the design of the YAFFS file system, each file has an ObjectID that uniquely identifies it; the data chunks of a file are associated with the file information through the ObjectID in the OOB, and their order within the file is given by the ChunkID in the OOB. However, device manufacturers, out of concern for data security, often change the order in which the OOB fields are stored in the flash OOB area when porting the YAFFS source code, as shown in Fig. 3. Therefore, as shown in Fig. 4, identifying and restoring the order of the OOB fields from the characteristics of the YAFFS OOB data is the key to parsing a YAFFS image. Then, following the tree structure of the YAFFS file system, the data chunks are placed into the nodes of the Tnode tree of the file with the corresponding ObjectID, forming a file system directory structure that can be accessed normally.
Based on the foregoing principle, an embodiment of the present invention provides an OOB identification method for the YAFFS file system. As shown in Fig. 5, the method includes the following steps:
S1: Identify the page size of the YAFFS image file, and determine the value of the field ByteCount from the page size.
In this embodiment, the page size is identified as follows. Starting from the beginning of the image, read chunks of 512+16 or 2048+64 bytes in sequence, and discard chunks whose last 16 or 64 bytes are all 0xFF. Compute the ECC value of the first 256 or 1024 bytes of the data area of each remaining chunk, and check whether, for 5 consecutive pages, the computed ECC value can be found in the metadata area of the page; if so, the page size is identified as 512+16 or 2048+64, respectively.
The computed value is searched for in the metadata area (OOB area) of a page as follows: scan the metadata area byte by byte from its starting position; when the j-th byte is reached, check whether the 3 or 12 consecutive bytes starting at the j-th byte equal the ECC value; if so, the value is considered found in the metadata area of that page.
In this embodiment, if the identified page size is 2048+64, the value of the corresponding field ByteCount is 2048 (0x00000800).
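The page-size test of S1 can be sketched as follows. This is only an illustration under stated assumptions: the ECC routine is passed in as a parameter (`ecc_fn`, a hypothetical name) because the actual YAFFS ECC algorithm is not reproduced here, and `toy_ecc` is a placeholder checksum used solely to build a synthetic image. Blank pages are skipped without resetting the count of consecutive matches, which is one possible reading of the filtering step.

```python
from typing import Callable, Optional, Tuple

# (data size, spare size, bytes fed to the ECC, ECC length), taken from the text above.
CANDIDATE_LAYOUTS = [(512, 16, 256, 3), (2048, 64, 1024, 12)]


def identify_page_layout(
    image: bytes,
    ecc_fn: Callable[[bytes, int], bytes],
    required_run: int = 5,
) -> Optional[Tuple[int, int]]:
    """Return (data_size, oob_size) of the first layout whose ECC value is found
    in the spare area of `required_run` consecutive non-blank pages."""
    for data_size, oob_size, ecc_input, ecc_len in CANDIDATE_LAYOUTS:
        page_size = data_size + oob_size
        run = 0
        for index in range(len(image) // page_size):
            start = index * page_size
            data = image[start:start + data_size]
            oob = image[start + data_size:start + page_size]
            if oob == b"\xFF" * oob_size:
                continue  # blank spare area: filtered out
            # The `in` operator on bytes checks every byte offset of the spare area.
            if ecc_fn(data[:ecc_input], ecc_len) in oob:
                run += 1
                if run >= required_run:
                    return data_size, oob_size
            else:
                run = 0
    return None


def toy_ecc(chunk: bytes, length: int) -> bytes:
    """Placeholder checksum for the demo only; NOT the YAFFS ECC algorithm."""
    return bytes((sum(chunk) + k) % 256 for k in range(length))


# Six synthetic 2048+64 pages with the toy ECC stored at spare offset 20.
image = b""
for page in range(6):
    data = bytes([page + 1]) * 2048
    oob = b"\xEE" * 20 + toy_ecc(data[:1024], 12) + b"\xEE" * 32
    image += data + oob

layout = identify_page_layout(image, toy_ecc)
```

With a real image, `ecc_fn` would have to be replaced by the ECC routine actually used by the device.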
S2: Traverse in sequence the OOB area of each page of each erase block contained in the image file; scan each OOB area byte by byte from its starting position and compute, for each field of the OOB area, a position variable set that records the number of times each byte position is the starting position of that field; determine the position of each field from its position variable set and the characteristics of the field.
Step S2 specifically includes the following processes:
S201: Construct, for each erase block, a position variable set $bc = \{bc_0, bc_1, bc_2, \ldots, bc_N\}$ for the field ByteCount, with every element initialized to 0, where $bc_N$ is the number of times the N-th byte of the OOB area is the starting position of the field ByteCount.
Traverse the OOB area of each page and scan it byte by byte from its starting position; when the j-th byte is reached, check whether the 4 consecutive bytes starting at the j-th byte hold the value of ByteCount, and if so, increment $bc_j$ by 1.
Determine the position of the field ByteCount in each erase block from the subscript of the largest element of that block's position variable set bc.
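The voting scheme of S201 can be sketched as below for a single erase block. The little-endian byte order is an assumption (the text does not state the byte order), and the function names are hypothetical.

```python
import struct
from typing import List, Sequence


def vote_bytecount(oobs: Sequence[bytes], byte_count: int) -> List[int]:
    """Build the position variable set bc for the spare areas of one erase block."""
    target = struct.pack("<I", byte_count)  # assumption: little-endian field
    bc = [0] * (len(oobs[0]) - 3)           # offsets at which 4 bytes still fit
    for oob in oobs:
        for j in range(len(bc)):
            if oob[j:j + 4] == target:
                bc[j] += 1
    return bc


def locate_bytecount(oobs: Sequence[bytes], byte_count: int) -> int:
    """Return the offset with the most votes (the first one on a tie)."""
    bc = vote_bytecount(oobs, byte_count)
    return max(range(len(bc)), key=bc.__getitem__)


# Synthetic erase block: 4 spare areas of 64 bytes, ByteCount (2048) at offset 8.
oob = b"\xFF" * 8 + struct.pack("<I", 2048) + b"\xFF" * 52
block = [oob] * 4
bc = vote_bytecount(block, 2048)
position = locate_bytecount(block, 2048)
```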
S202: Traverse the OOB area of each page in each erase block and scan it byte by byte from its starting position; when the j-th byte is reached, check whether the 4 consecutive bytes starting at the j-th byte satisfy the following three conditions simultaneously, and if so, determine that the j-th byte of the OOB area is the starting position of the field BlockSequence.
Let $V_j$ denote the value of the 4 consecutive bytes starting at the j-th byte of the OOB area; the following conditions must be satisfied:
(1) $V_j$ of the page equals $V_j$ of the first page of the erase block that contains it, and neither equals the value of the field ByteCount.
(2) There are more than 5 consecutive erase blocks in each of which $V_j$ is equal in all OOB areas of the block.
(3) The values $V_j$ in the OOB areas of 5 consecutive different erase blocks are not equal to one another.
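A minimal sketch of the three conditions of S202 follows. It makes several assumptions that are not in the text: little-endian fields, a run length of exactly 5 blocks for conditions (2) and (3), and a reading of condition (3) as "pairwise different within the run". The synthetic spare-area layout, including the per-page `tag` field, is invented for the demonstration, and all names are hypothetical.

```python
import struct
from typing import List, Optional, Sequence


def shared_value(block: Sequence[bytes], j: int, byte_count: int) -> Optional[bytes]:
    """Conditions (1) and (2) for one erase block: return the 4 bytes at offset j
    if every page matches the block's first page and the value is not ByteCount."""
    first = block[0][j:j + 4]
    if first == struct.pack("<I", byte_count):  # assumption: little-endian
        return None
    return first if all(oob[j:j + 4] == first for oob in block) else None


def blocksequence_candidates(blocks: Sequence[Sequence[bytes]], byte_count: int, run: int = 5) -> List[int]:
    """Return every offset j that passes the three conditions on some run of blocks."""
    hits = []
    for j in range(len(blocks[0][0]) - 3):
        values = [shared_value(block, j, byte_count) for block in blocks]
        for start in range(len(values) - run + 1):
            window = values[start:start + run]
            # Condition (3), read here as "pairwise different across the run".
            if None not in window and len(set(window)) == run:
                hits.append(j)
                break
    return hits


def make_oob(page: int, sequence: int) -> bytes:
    tag = bytes([page + 1]) * 4  # made-up field that changes on every page
    return tag + struct.pack("<I", sequence) + struct.pack("<I", page + 1) + b"\xFF" * 4


# Five erase blocks of four pages; the block sequence number sits at offset 4.
blocks = [[make_oob(p, 0x1001 + b) for p in range(4)] for b in range(5)]
candidates = blocksequence_candidates(blocks, byte_count=2048)
```

The sketch returns all offsets that pass rather than a single one, because on real data an offset overlapping the field can also pass when the neighbouring bytes happen to be constant.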
S203: Construct, for each erase block, a position variable set $ecc1 = \{ecc1_0, ecc1_1, ecc1_2, \ldots, ecc1_M\}$ for the field ECC1 and a position variable set $ecc2 = \{ecc2_0, ecc2_1, ecc2_2, \ldots, ecc2_M\}$ for the field ECC2, with every element of ecc1 and ecc2 initialized to 0, where $ecc1_M$ and $ecc2_M$ are the numbers of times the M-th byte of the OOB area is the starting position of the field ECC1 and of the field ECC2, respectively.
Traverse each page. For the data area of a page, perform the ECC calculation used by YAFFS on the first 1024 bytes and on the last 1024 bytes of the data area to obtain the values of the fields ECC1 and ECC2 for that page. For the OOB area of the page, scan byte by byte from its starting position; when the j-th byte is reached, check whether the 12 consecutive bytes starting at the j-th byte hold the value of ECC1 or ECC2 for that page; if they hold the value of ECC1, increment $ecc1_j$ by 1; if they hold the value of ECC2, increment $ecc2_j$ by 1.
Determine the positions of the fields ECC1 and ECC2 in each erase block from the subscripts of the largest elements of that block's position variable sets ecc1 and ecc2.
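The ECC voting of S203 might look like the sketch below for a 2048+64 layout. The ECC routine is again a parameter, and `toy_ecc` is a stand-in that only serves to build test data; it is not the YAFFS ECC algorithm. All names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Page = Tuple[bytes, bytes]  # (data area, spare area)


def vote_ecc(pages: Sequence[Page], ecc_fn: Callable[[bytes], bytes], ecc_len: int = 12) -> Tuple[List[int], List[int]]:
    """Build the position variable sets ecc1 and ecc2 for one erase block."""
    slots = len(pages[0][1]) - ecc_len + 1
    ecc1_votes, ecc2_votes = [0] * slots, [0] * slots
    for data, oob in pages:
        ecc1 = ecc_fn(data[:1024])    # ECC of the first 1024 bytes
        ecc2 = ecc_fn(data[-1024:])   # ECC of the last 1024 bytes
        for j in range(slots):
            window = oob[j:j + ecc_len]
            if window == ecc1:
                ecc1_votes[j] += 1
            if window == ecc2:
                ecc2_votes[j] += 1
    return ecc1_votes, ecc2_votes


def argmax(votes: Sequence[int]) -> int:
    return max(range(len(votes)), key=votes.__getitem__)


def toy_ecc(chunk: bytes) -> bytes:
    """Placeholder 12-byte checksum for the demo only; NOT the YAFFS ECC."""
    return bytes((chunk[0] + 17 * k) % 256 for k in range(12))


# Four synthetic pages: ECC1 stored at spare offset 16, ECC2 at offset 40.
pages = []
for p in range(4):
    data = bytes([p + 1]) * 1024 + bytes([p + 101]) * 1024
    oob = b"\xFF" * 16 + toy_ecc(data[:1024]) + b"\xFF" * 12 + toy_ecc(data[-1024:]) + b"\xFF" * 12
    pages.append((data, oob))

ecc1_votes, ecc2_votes = vote_ecc(pages, toy_ecc)
positions = (argmax(ecc1_votes), argmax(ecc2_votes))
```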
S204: Construct, for each erase block, a position variable set $ch = \{ch_0, ch_1, ch_2, \ldots, ch_N\}$ for the field ChunkID, with every element initialized to 0, where $ch_N$ is the number of times the N-th byte of the OOB area is the starting position of the field ChunkID.
Traverse each page contained in each erase block and scan the OOB area of each page byte by byte from its starting position; when the j-th byte is reached, check whether the 4-byte values starting at the j-th byte in the OOB areas of two consecutive pages are both different from 0x00000000 and 0xFFFFFFFF and the value in the later page minus the value in the earlier page is 1; if so, increment $ch_j$ by 1.
Select the positions of the two largest elements of the position variable set ch of each erase block as the two suspicious positions of ChunkID in that erase block.
For each suspicious position, determine whether the following holds: among the pages, in all erase blocks, whose ChunkID value at the suspicious position is 0, more than half have a data area whose first 4 bytes are a YAFFS file type id (0x00000001, 0x00000002, 0x00000003 or 0x00000004). The suspicious position satisfying this condition is taken as the final position of ChunkID.
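The two stages of S204 (voting on consecutive values, then confirming with the file type id of pages whose ChunkID is 0) can be sketched as follows. Little-endian fields are an assumption, the sketch handles a single erase block, and the names and the synthetic layout are hypothetical.

```python
import struct
from typing import List, Optional, Sequence, Tuple

Page = Tuple[bytes, bytes]     # (data area, spare area)
HEADER_TYPES = {1, 2, 3, 4}    # file type ids listed in the text


def u32(buf: bytes, offset: int) -> int:
    return struct.unpack_from("<I", buf, offset)[0]  # assumption: little-endian


def vote_chunkid(block: Sequence[Page]) -> List[int]:
    """Build the position variable set ch from pairs of consecutive pages."""
    ch = [0] * (len(block[0][1]) - 3)
    for (_, prev), (_, cur) in zip(block, block[1:]):
        for j in range(len(ch)):
            a, b = u32(prev, j), u32(cur, j)
            if a not in (0, 0xFFFFFFFF) and b not in (0, 0xFFFFFFFF) and b - a == 1:
                ch[j] += 1
    return ch


def top_offsets(votes: Sequence[int], count: int) -> List[int]:
    """Offsets with the most votes, best first (ties keep ascending offset order)."""
    return sorted(range(len(votes)), key=votes.__getitem__, reverse=True)[:count]


def confirm_chunkid(pages: Sequence[Page], suspects: Sequence[int]) -> Optional[int]:
    for j in suspects:
        headers = [data for data, oob in pages if u32(oob, j) == 0]
        typed = sum(u32(data, 0) in HEADER_TYPES for data in headers)
        if headers and 2 * typed > len(headers):
            return j
    return None


def make_page(chunk_id: int) -> Page:
    # Chunk 0 carries a header whose first 4 bytes are the file type id 1.
    data = struct.pack("<I", 1) + b"\x00" * 2044 if chunk_id == 0 else b"\xAB" * 2048
    oob = b"\xFF" * 4 + struct.pack("<I", chunk_id) + b"\xFF" * 8
    return data, oob


block = [make_page(c) for c in range(6)]
ch = vote_chunkid(block)
suspects = top_offsets(ch, 2)
position = confirm_chunkid(block, suspects)
```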
S205: Construct, for each erase block, a position variable set $ob = \{ob_0, ob_1, ob_2, \ldots, ob_N\}$ for the field ObjectID, with every element initialized to 0, where $ob_N$ is the number of times the N-th byte of the OOB area is the starting position of the field ObjectID.
Traverse the OOB area of each page contained in each erase block and scan it byte by byte from its starting position; when the j-th byte is reached, check whether the 4-byte values starting at the j-th byte in the OOB areas of two consecutive pages are different from 0x00000000, 0xFFFFFFFF and the value of the field ByteCount; if so, increment $ob_j$ by 1.
Select the positions of the four largest elements of the position variable set ob of each erase block as the four suspicious positions of ObjectID in that erase block.
Because PageStatus and BlockStatus, mixed with other spare fields, may produce values that are equal across many consecutive pages, judge for each suspicious position whether the value of ObjectID at that position is not equal to any of 0x80FFFFFF, 0x40FFFFFF, 0xFFFFFF80, 0xFFFFFF40 or the value of the field BlockSequence; if so, take the suspicious position satisfying these conditions as the final position of the field ObjectID and proceed to S207. If none of the four suspicious positions satisfies them, the image may contain many link files and small and medium-sized files, which makes the ObjectID exhibit a certain continuity, as shown in Fig. 6; proceed to S206 for a further judgment.
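The candidate selection and filtering of S205 can be sketched as below. The sketch assumes little-endian fields and reads the voting rule as "the 4-byte value repeats on two consecutive pages and is not one of the excluded values", which is an interpretation suggested by the remark about repeated values rather than something the text states outright. The spare-area layout, including the per-page `tag` field, is invented for the demonstration.

```python
import struct
from typing import List, Optional, Sequence

STATUS_LIKE = {0x80FFFFFF, 0x40FFFFFF, 0xFFFFFF80, 0xFFFFFF40}


def u32(buf: bytes, offset: int) -> int:
    return struct.unpack_from("<I", buf, offset)[0]  # assumption: little-endian


def objectid_suspects(block: Sequence[bytes], byte_count: int, count: int = 4) -> List[int]:
    """Vote per offset and return the `count` offsets with the most votes."""
    excluded = {0x00000000, 0xFFFFFFFF, byte_count}
    ob = [0] * (len(block[0]) - 3)
    for prev, cur in zip(block, block[1:]):
        for j in range(len(ob)):
            value = u32(cur, j)
            # Assumed reading: the value repeats on consecutive pages.
            if value == u32(prev, j) and value not in excluded:
                ob[j] += 1
    return sorted(range(len(ob)), key=ob.__getitem__, reverse=True)[:count]


def pick_objectid(block: Sequence[bytes], suspects: Sequence[int], block_sequence: int) -> Optional[int]:
    """Return the first suspect whose values are neither status-like nor BlockSequence."""
    for j in suspects:
        values = {u32(oob, j) for oob in block}
        if values.isdisjoint(STATUS_LIKE) and block_sequence not in values:
            return j
    return None


def make_oob(page: int) -> bytes:
    tag = bytes([page + 1]) * 4  # made-up field that changes on every page
    return tag + struct.pack("<I", 0x101) + struct.pack("<I", page + 1) + b"\x80\xFF\xFF\xFF"


# Five pages of one file (ObjectID 0x101 at offset 4, a status-like word at offset 12).
block = [make_oob(p) for p in range(5)]
suspects = objectid_suspects(block, byte_count=2048)
position = pick_objectid(block, suspects, block_sequence=0x1001)
```

Offset 12 also collects votes here but is rejected by the filter, as `pick_objectid(block, [12, 4], 0x1001)` shows by returning 4.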
S206: and selecting positions corresponding to the first two elements with the largest median value in the elements of the position variable set ch corresponding to the field ChunkID of each erasure data block as two suspicious positions of the object ID in each erasure data block.
Judging whether a condition I, a condition II and a condition III are simultaneously met or not for each suspicious position, and if so, taking the met suspicious position as the final position of the field ObjectID; if the values of the two elements in the position variable set ch are not met, the first two elements with the largest values in the position variable set ch corresponding to the field ChunkID are removed, and then the positions corresponding to the first two elements with the largest values in the ch are selected again to serve as the two suspicious positions of the ObjectID in each erasure data block for judgment.
Condition one, condition two, and condition three are as follows:
Condition one: the value of ObjectID at the suspicious position is not equal to the values of the other fields already determined, such as ChunkID, BlockSequence, ECC, and ByteCount.
Condition two: when the value of ChunkID is 0, the file size in the data area is greater than 0x00001000.
Condition three: each OOB area in the erasure data block is traversed; whenever the value at the suspicious position in an OOB area equals the value of the ObjectID corresponding to the suspicious position, the value of ChunkID in that OOB area is added to a set; after the elements of the set are sorted by size and de-duplicated, the values of at least three elements are consecutive.
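Condition three can be sketched as follows. This is only an illustrative reading of the text: the function names, the little-endian byte order, and the representation of a block as a list of OOB byte strings are assumptions.

```python
from typing import Iterable, Sequence


def read_u32(buf: bytes, pos: int) -> int:
    """Read 4 bytes at pos as an unsigned integer (little-endian is an assumption)."""
    return int.from_bytes(buf[pos:pos + 4], "little")


def has_consecutive_run(values: Iterable[int], run: int = 3) -> bool:
    """True if the sorted, de-duplicated values contain `run` consecutive integers."""
    ordered = sorted(set(values))
    streak = 1
    for prev, cur in zip(ordered, ordered[1:]):
        streak = streak + 1 if cur == prev + 1 else 1
        if streak >= run:
            return True
    return run <= 1 and bool(ordered)


def condition_three(oobs: Sequence[bytes], obj_pos: int,
                    chunk_pos: int, object_id: int) -> bool:
    """Collect the ChunkIDs of all pages carrying object_id and test for a run of three."""
    chunk_ids = [read_u32(oob, chunk_pos)
                 for oob in oobs
                 if read_u32(oob, obj_pos) == object_id]
    return has_consecutive_run(chunk_ids, 3)
```

The intuition is that a genuine ObjectID groups the chunks of one file, so the ChunkIDs gathered under it tend to form a consecutive sequence.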
S207: A position variable set pgs = {pgs_0, pgs_1, pgs_2, ..., pgs_L} corresponding to the field PageStatus is constructed for each erasure data block, and each element of pgs is initially set to 0, where pgs_L represents the number of times the L-th byte in the OOB area is the starting position of the field PageStatus.
Each OOB area in each page is traversed, querying byte by byte from the starting position of the OOB area. When the j-th byte is queried, it is judged whether its value is 0x40, 0xFF, or 0x00; if so, pgs_j is incremented by 1. If the value of the j-th byte is not 0x40, 0xFF, or 0x00, the j-th byte is no longer judged in the OOB areas of subsequent pages.
The position of the PageStatus in each erasure data block is determined according to the subscript of the largest element in the PageStatus position variable set pgs of that erasure data block.
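The voting with early elimination described in S207 can be sketched as below. The function name vote_status_position and the `allowed` parameter are hypothetical; passing {0x80, 0xFF, 0x00} instead would give the analogous BlockStatus vote. This is a sketch under those assumptions, not a definitive implementation.

```python
from typing import List, Optional, Sequence, Set, Tuple

PAGE_STATUS_VALUES = {0x40, 0xFF, 0x00}


def vote_status_position(oobs: Sequence[bytes],
                         allowed: Set[int]) -> Tuple[Optional[int], List[int]]:
    """Count, per byte offset, how many pages hold an allowed status value.

    An offset that fails once is dropped for all later pages of the block.
    Returns (offset with the highest count, list of counts).
    """
    if not oobs:
        return None, []
    length = min(len(oob) for oob in oobs)
    counts = [0] * length
    alive = [True] * length
    for oob in oobs:
        for j in range(length):
            if not alive[j]:
                continue
            if oob[j] in allowed:
                counts[j] += 1
            else:
                alive[j] = False  # never judged again in later pages
    if length == 0:
        return None, counts
    best = max(range(length), key=counts.__getitem__)
    return best, counts
```

Spare bytes that are always 0xFF collect the same count as a real status byte, so ties are possible; this sketch simply returns the lowest such offset, and in practice offsets already assigned to other fields would presumably be excluded first.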
S208: A position variable set bks = {bks_0, bks_1, bks_2, ..., bks_L} corresponding to the field BlockStatus is constructed for each erasure data block, and each element of bks is initially set to 0, where bks_L represents the number of times the L-th byte in the OOB area is the starting position of the field BlockStatus;
each OOB area in each page is traversed, querying byte by byte from the starting position. When the j-th byte is queried, it is judged whether its value is 0x80, 0xFF, or 0x00; if so, bks_j is incremented by 1. If the value of the j-th byte is not 0x80, 0xFF, or 0x00, the j-th byte is no longer judged in the OOB areas of subsequent pages;
the position of the BlockStatus in each erasure data block is determined according to the subscript of the largest element in the BlockStatus position variable set bks of that erasure data block.
S3: According to the position of each field, each field is moved to the position at which YAFFS can recognize it, and the YAFFS image file is identified according to the adjusted OOB area.
Using the ObjectID and ChunkID in the OOB area metadata, the YAFFS contents are organized into the corresponding directory tree; the result is shown in FIG. 7.
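The adjustment in S3 amounts to copying each field from its detected offset to the offset a parser expects. The Python sketch below illustrates this. The function name rearrange_oob and the target layout used in the usage example are purely hypothetical placeholders and do not represent the actual YAFFS OOB layout, which is not given here.

```python
from typing import Dict, Tuple


def rearrange_oob(oob: bytes,
                  detected: Dict[str, int],
                  target: Dict[str, Tuple[int, int]],
                  size: int) -> bytes:
    """Copy each field from its detected offset to its target (offset, width).

    Bytes not covered by any target field are left as 0xFF (erased flash).
    """
    out = bytearray(b"\xff" * size)
    for name, (dst, width) in target.items():
        src = detected[name]
        out[dst:dst + width] = oob[src:src + width]
    return bytes(out)


# Usage with a made-up target layout (hypothetical offsets, for illustration only).
detected_positions = {"ChunkID": 0, "ObjectID": 8}
hypothetical_target = {"ObjectID": (0, 4), "ChunkID": (4, 4)}
```

Once every OOB area has been rewritten in this way, a standard YAFFS parser can read the image as if the fields had been stored in the expected order.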
The method identifies key fields such as BlockSequence, ObjectID, ChunkID, ByteCount, and ECC in the OOB management data through a series of algorithmic rules, thereby restoring the field order of the OOB management data. It provides a solution for identifying out-of-order OOB in the YAFFS file system, fills a gap in this area of technology, and is of great significance for electronic data forensics.
Embodiment two:
The present invention further provides a YAFFS file system OOB identification terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method of embodiment one of the present invention.
Further, as an executable solution, the YAFFS file system OOB identification terminal device may be a desktop computer, a notebook, a palm computer, a cloud server, and other computing devices. The YAFFS file system OOB identification terminal device may include, but is not limited to, a processor, a memory. It will be understood by those skilled in the art that the above-mentioned configuration of the YAFFS file system OOB identification terminal device is only an example of the YAFFS file system OOB identification terminal device, and does not constitute a limitation of the YAFFS file system OOB identification terminal device, and may include more or less components than the above-mentioned one, or combine some components, or different components, for example, the YAFFS file system OOB identification terminal device may further include an input-output device, a network access device, a bus, and the like, which is not limited by the embodiments of the present invention.
Further, as an executable solution, the Processor may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a discrete Gate or transistor logic device, a discrete hardware component, and the like. The general purpose processor may be a microprocessor or the processor may be any conventional processor or the like, the processor being the control center for the YAFFS file system OOB identification terminal device, with various interfaces and lines connecting the various parts of the entire YAFFS file system OOB identification terminal device.
The memory may be configured to store the computer programs and/or modules, and the processor implements the various functions of the YAFFS file system OOB identification terminal device by running or executing the computer programs and/or modules stored in the memory and invoking data stored in the memory. The memory may mainly comprise a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required by at least one function, and the data storage area may store data created according to the use of the terminal device, and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
The invention also provides a computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the above-mentioned method of an embodiment of the invention.
The modules/units integrated in the YAFFS file system OOB identification terminal device, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments described above are implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), a software distribution medium, and the like.
While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (9)

1. A YAFFS file system OOB identification method, characterized by comprising the following steps:
S1: identifying the page size of the YAFFS image file, and determining the value of the field ByteCount according to the page size;
S2: sequentially traversing the OOB area in each page of each erasure data block contained in the image file, searching byte by byte from the starting position of the OOB area, and calculating, for each of the OOB area fields ByteCount, ECC1, ECC2, ChunkID, ObjectID, PageStatus, and BlockStatus, a position variable set consisting of the number of times each position is the starting position of that field; determining the positions of the fields ByteCount, ECC1, ECC2, ChunkID, ObjectID, PageStatus, and BlockStatus according to the position variable sets and the characteristics of those fields; and determining the position of the field BlockSequence, wherein the process of determining the position of the field BlockSequence comprises:
traversing the OOB area in each page of each erasure data block, searching byte by byte from the starting position, and, when the j-th byte is searched, judging whether the value in the 4 consecutive bytes starting from the j-th byte simultaneously meets the following three conditions; if so, judging that the j-th byte in the OOB area is the starting position of the field BlockSequence;
letting the value in the 4 consecutive bytes starting from the j-th byte in the OOB area be v_j, the following conditions must be satisfied:
(1) the v_j of the page is equal to the v_j of the first page of the erasure data block to which the page belongs, and neither is equal to the value of the field ByteCount;
(2) there are more than 5 consecutive erasure data blocks that satisfy: the v_j in all OOB areas in the same erasure data block are equal;
(3) the v_j in the OOB areas of 5 consecutive different erasure data blocks are not equal to one another;
S3: according to the position of each field, adjusting each field to its identifiable position in YAFFS, and identifying the YAFFS image file according to the adjusted OOB areas.
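By way of illustration only, the three BlockSequence conditions recited in claim 1 can be sketched in Python as follows. The function names, the little-endian byte order, the representation of the image as a list of blocks (each a list of OOB byte strings), and the reading of "more than 5 consecutive blocks" as a window of 6 are assumptions of this sketch.

```python
from typing import List, Optional, Sequence


def read_u32(buf: bytes, pos: int) -> int:
    """Read 4 bytes at pos as an unsigned integer (little-endian is an assumption)."""
    return int.from_bytes(buf[pos:pos + 4], "little")


def block_value(block: Sequence[bytes], j: int, byte_count: int) -> Optional[int]:
    """Return the v_j shared by every page of a block, or None if the pages
    disagree or the shared value equals ByteCount (conditions 1 and 2)."""
    values = {read_u32(oob, j) for oob in block}
    if len(values) != 1:
        return None
    value = values.pop()
    return None if value == byte_count else value


def is_block_sequence_offset(blocks: Sequence[Sequence[bytes]], j: int,
                             byte_count: int, run: int = 6,
                             distinct: int = 5) -> bool:
    """Check whether offset j behaves like BlockSequence across the image."""
    per_block: List[Optional[int]] = [block_value(b, j, byte_count) for b in blocks]
    for start in range(len(per_block) - run + 1):
        window = per_block[start:start + run]
        if any(v is None for v in window):
            continue  # condition 2 needs an unbroken run of uniform blocks
        for s in range(run - distinct + 1):
            # condition 3: 5 consecutive blocks with mutually different values
            if len(set(window[s:s + distinct])) == distinct:
                return True
    return False
```

A field that is constant within a block but changes from block to block is characteristic of a sequence number written when the block is allocated, which is why these three tests together single out BlockSequence.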
2. The YAFFS file system OOB identification method of claim 1, wherein the process of determining the position of the field ByteCount comprises:
constructing a position variable set bc = {bc_0, bc_1, bc_2, ..., bc_N} corresponding to the field ByteCount of each erasure data block, and initially setting each element of the set to 0, where bc_N represents the number of times the N-th byte in the OOB area is the starting position of the field ByteCount;
traversing each OOB area in each page, querying byte by byte from the starting position of each OOB area, and, when the j-th byte is queried, judging whether the value in the 4 consecutive bytes starting from the j-th byte is the ByteCount value; if so, incrementing bc_j by 1;
determining the position of the field ByteCount in each erasure data block according to the subscript of the largest element in the ByteCount position variable set bc of that erasure data block.
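The ByteCount vote of claim 2 can be sketched as below. The function name vote_byte_count and the little-endian encoding of the 4-byte value are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def vote_byte_count(oobs: Sequence[bytes],
                    byte_count: int) -> Tuple[Optional[int], List[int]]:
    """Count, per offset j, the pages whose 4 bytes at j equal the ByteCount value.

    Returns (offset with the highest count, list of counts).
    """
    if not oobs:
        return None, []
    needle = byte_count.to_bytes(4, "little")  # byte order is an assumption
    n = max(min(len(oob) for oob in oobs) - 3, 0)
    counts = [0] * n
    for oob in oobs:
        for j in range(n):
            if oob[j:j + 4] == needle:
                counts[j] += 1
    if n == 0:
        return None, counts
    best = max(range(n), key=counts.__getitem__)
    return best, counts
```

Because ByteCount is derived from the page size in S1, nearly every full data page stores the same value, so the correct offset accumulates far more votes than any accidental match.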
3. The YAFFS file system OOB identification method of claim 1, wherein the process of determining the positions of the fields ECC1 and ECC2 comprises:
constructing, for each erasure data block, a position variable set ecc1 = {ecc1_0, ecc1_1, ecc1_2, ..., ecc1_M} corresponding to the field ECC1 and a position variable set ecc2 = {ecc2_0, ecc2_1, ecc2_2, ..., ecc2_M} corresponding to the field ECC2, and initially setting each element of ecc1 and ecc2 to 0, where ecc1_M and ecc2_M represent the number of times the M-th byte in the OOB area is the starting position of ECC1 or ECC2, respectively;
traversing each page; for the data area in a page, performing the ECC check corresponding to YAFFS on the first 1024 bytes and the last 1024 bytes of the data area respectively, to obtain the values of the fields ECC1 and ECC2 corresponding to the page; for the OOB area in a page, querying byte by byte from the starting position of the OOB area, and, when the j-th byte is queried, judging whether the value in the 12 consecutive bytes starting from the j-th byte is the value of the field ECC1 or ECC2 corresponding to the page; if it is the value of ECC1, incrementing ecc1_j by 1; if it is the value of ECC2, incrementing ecc2_j by 1;
determining the positions of the fields ECC1 and ECC2 in each erasure data block according to the subscripts of the largest elements in the position variable sets ecc1 and ecc2 of that erasure data block.
4. The YAFFS file system OOB identification method of claim 1, wherein the process of determining the position of the field ChunkID comprises:
constructing a position variable set ch = {ch_0, ch_1, ch_2, ..., ch_N} corresponding to the field ChunkID of each erasure data block, and initially setting each element of ch to 0, where ch_N represents the number of times the N-th byte in the OOB area is the starting position of the field ChunkID;
traversing each page contained in each erasure data block, querying byte by byte from the starting position of the OOB area of each page, and, when the j-th byte is queried, judging whether the values in the 4 consecutive bytes starting from the j-th byte in the OOB areas of two consecutive pages are both not equal to 0x00000000 and 0xFFFFFFFF and the value in the later page minus the value in the earlier page is 1; if so, incrementing ch_j by 1;
selecting the positions corresponding to the two largest elements of the position variable set ch corresponding to the field ChunkID of each erasure data block as two suspicious positions of the ChunkID in that erasure data block;
judging, for each suspicious position, whether the following is satisfied: among all pages in all erasure data blocks whose ChunkID value at the suspicious position is 0, more than half have a data area whose first 4 bytes are a YAFFS file type identifier: 0x00000001, 0x00000002, 0x00000003, or 0x00000004; the suspicious position satisfying this condition is taken as the final position of the ChunkID.
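The two stages of claim 4, the increment-by-one vote and the header check on pages whose ChunkID is 0, can be illustrated with the following Python sketch. The function names, the little-endian byte order, and the representation of a page as a (data area, OOB) pair are assumptions of the sketch.

```python
from typing import List, Sequence, Tuple

SKIP = {0x00000000, 0xFFFFFFFF}
HEADER_TYPES = {0x00000001, 0x00000002, 0x00000003, 0x00000004}


def read_u32(buf: bytes, pos: int) -> int:
    """Read 4 bytes at pos as an unsigned integer (little-endian is an assumption)."""
    return int.from_bytes(buf[pos:pos + 4], "little")


def vote_chunk_id(oobs: Sequence[bytes]) -> List[int]:
    """Count, per offset j, consecutive page pairs whose values at j increase by 1."""
    n = max(min(len(oob) for oob in oobs) - 3, 0) if oobs else 0
    counts = [0] * n
    for prev, cur in zip(oobs, oobs[1:]):
        for j in range(n):
            a, b = read_u32(prev, j), read_u32(cur, j)
            if a not in SKIP and b not in SKIP and b - a == 1:
                counts[j] += 1
    return counts


def header_check(pages: Sequence[Tuple[bytes, bytes]], pos: int) -> bool:
    """True if more than half of the pages whose candidate ChunkID is 0
    begin their data area with a file type identifier."""
    zero_chunk = [data for data, oob in pages if read_u32(oob, pos) == 0]
    hits = sum(1 for data in zero_chunk if read_u32(data, 0) in HEADER_TYPES)
    return bool(zero_chunk) and hits * 2 > len(zero_chunk)
```

The vote exploits the fact that successive pages of one file carry successive chunk numbers, while the header check confirms a candidate by testing that chunk 0 pages look like object headers.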
5. The YAFFS file system OOB identification method of claim 1, wherein the process of determining the position of the field ObjectID comprises:
constructing a position variable set ob = {ob_0, ob_1, ob_2, ..., ob_N} corresponding to the field ObjectID of each erasure data block, and initially setting each element of ob to 0, where ob_N represents the number of times the N-th byte in the OOB area is the starting position of the field ObjectID;
traversing the OOB area of each page contained in each erasure data block, querying byte by byte from the starting position, and, when the j-th byte is queried, judging whether the values in the 4 consecutive bytes starting from the j-th byte in the OOB areas of two consecutive pages are not equal to 0x00000000, 0xFFFFFFFF, or the value of the field ByteCount; if so, incrementing ob_j by 1;
selecting the positions corresponding to the four largest elements of the position variable set ob corresponding to the field ObjectID of each erasure data block as four suspicious positions of the ObjectID in that erasure data block;
judging, for each suspicious position, whether the value of the ObjectID at the suspicious position is not equal to 0x80FFFFFF, 0x40FFFFFF, 0xFFFFFF80, 0xFFFFFF40, or the value of the field BlockSequence; if so, taking the suspicious position satisfying these conditions as the final position of the field ObjectID; if none of the four suspicious positions satisfies them, performing a further judgment;
the further judgment comprises:
selecting the positions corresponding to the two largest elements of the position variable set ch corresponding to the field ChunkID of each erasure data block as two suspicious positions of the ObjectID in that erasure data block;
judging, for each suspicious position, whether condition one, condition two, and condition three are all met; if so, taking that suspicious position as the final position of the field ObjectID; if neither position meets them, removing the two largest elements from the position variable set ch corresponding to the field ChunkID, and then selecting the positions corresponding to the two largest remaining elements of ch as the two suspicious positions of the ObjectID in each erasure data block and judging again;
wherein condition one, condition two, and condition three are as follows:
condition one: the value of ObjectID at the suspicious position is not equal to the values of the other fields already determined;
condition two: when the value of ChunkID is 0, the file size in the data area is greater than 0x00001000;
condition three: each OOB area in the erasure data block is traversed; whenever the value at the suspicious position in an OOB area equals the value of the ObjectID corresponding to the suspicious position, the value of ChunkID in that OOB area is added to a set; after the elements of the set are sorted by size and de-duplicated, the values of at least three elements are consecutive.
6. The YAFFS file system OOB identification method of claim 1, wherein the process of determining the position of the field PageStatus comprises:
constructing a position variable set pgs = {pgs_0, pgs_1, pgs_2, ..., pgs_L} corresponding to the field PageStatus of each erasure data block, and initially setting each element of pgs to 0, where pgs_L represents the number of times the L-th byte in the OOB area is the starting position of the field PageStatus;
traversing each OOB area in each page, querying byte by byte from the starting position of each OOB area, and, when the j-th byte is queried, judging whether its value is 0x40, 0xFF, or 0x00; if so, incrementing pgs_j by 1; if the value of the j-th byte is not 0x40, 0xFF, or 0x00, the j-th byte is no longer judged in the OOB areas of subsequent pages;
determining the position of the PageStatus in each erasure data block according to the subscript of the largest element in the PageStatus position variable set pgs of that erasure data block.
7. The YAFFS file system OOB identification method of claim 1, wherein the process of determining the position of the field BlockStatus comprises:
constructing a position variable set bks = {bks_0, bks_1, bks_2, ..., bks_L} corresponding to the field BlockStatus of each erasure data block, and initially setting each element of bks to 0, where bks_L represents the number of times the L-th byte in the OOB area is the starting position of the field BlockStatus;
traversing each OOB area in each page, querying byte by byte from the starting position, and, when the j-th byte is queried, judging whether its value is 0x80, 0xFF, or 0x00; if so, incrementing bks_j by 1; if the value of the j-th byte is not 0x80, 0xFF, or 0x00, the j-th byte is no longer judged in the OOB areas of subsequent pages;
determining the position of the BlockStatus in each erasure data block according to the subscript of the largest element in the BlockStatus position variable set bks of that erasure data block.
8. A YAFFS file system OOB identification terminal device, characterized by comprising a processor, a memory, and a computer program stored in the memory and running on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
9. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
CN202011382793.XA 2020-12-01 2020-12-01 YAFFS file system OOB identification method, terminal device and storage medium Active CN112380171B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011382793.XA CN112380171B (en) 2020-12-01 2020-12-01 YAFFS file system OOB identification method, terminal device and storage medium

Publications (2)

Publication Number Publication Date
CN112380171A (en) 2021-02-19
CN112380171B (en) 2022-07-15

Family

ID=74589177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011382793.XA Active CN112380171B (en) 2020-12-01 2020-12-01 YAFFS file system OOB identification method, terminal device and storage medium

Country Status (1)

Country Link
CN (1) CN112380171B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113571121A (en) * 2021-07-26 2021-10-29 杭州国芯科技股份有限公司 ECC code storage method of NAND Flash of embedded device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101529395A (en) * 2006-08-31 2009-09-09 夏普株式会社 File system
CN102136296A (en) * 2011-02-21 2011-07-27 北京理工大学 Method for identifying metadata format of NANDFlash memory chip
CN109086004A (en) * 2018-07-19 2018-12-25 江苏华存电子科技有限公司 The recognition methods of block type in a kind of flash memory

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7991942B2 (en) * 2007-05-09 2011-08-02 Stmicroelectronics S.R.L. Memory block compaction method, circuit, and system in storage devices based on flash memories

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on a Hash-based recovery algorithm for each version of YAFFS2 files; Li Yameng et al.; 《技术研究》 (Technical Research); 2016-05-31 (No. 5); pp. 51-57 *

Also Published As

Publication number Publication date
CN112380171A (en) 2021-02-19

Similar Documents

Publication Publication Date Title
US8423519B2 (en) Data reduction indexing
US10678654B2 (en) Systems and methods for data backup using data binning and deduplication
CN112506814B (en) Memory, control method thereof and memory system
US10372687B1 (en) Speeding de-duplication using a temporal digest cache
CN108733306B (en) File merging method and device
CN111125033B (en) Space recycling method and system based on full flash memory array
US20010051954A1 (en) Data updating apparatus that performs quick restoration processing
CN104021161A (en) Cluster storage method and device
CN110569147B (en) Deleted file recovery method based on index, terminal device and storage medium
KR102509913B1 (en) Method and apparatus for maximized dedupable memory
CN109496292A (en) A kind of disk management method, disk management device and electronic equipment
CN111209257B (en) File system fragmentation method and device
CN100399294C (en) Method and apparatus for effective data management of files
CN112380171B (en) YAFFS file system OOB identification method, terminal device and storage medium
CN112379835B (en) OOB area data extraction method, terminal device and storage medium
CN103530322A (en) Method and device for processing data
CN111124939A (en) Data compression method and system based on full flash memory array
CN111984651A (en) Column type storage method, device and equipment based on persistent memory
CN107748705B (en) Method for recovering system EVT log fragments, terminal equipment and storage medium
CN116339613A (en) Storage device, method of operating the same, and method of operating storage system including the same
US11372565B2 (en) Facilitating data reduction using weighted similarity digest
WO2020238750A1 (en) Data processing method and apparatus, electronic device, and computer storage medium
CN114115734A (en) Data deduplication method, device, equipment and storage medium
WO2023141987A1 (en) File reading method and apparatus
CN112527745B (en) Embedded file system multi-partition analysis method, terminal device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant