CN102622302A - Recognition method for fragment data type - Google Patents


Info

Publication number
CN102622302A
Authority
CN
China
Prior art keywords
data
fragment data
type
fragment
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011100311238A
Other languages
Chinese (zh)
Other versions
CN102622302B (en)
Inventor
汤燕彬
杨泽明
刘宝旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of High Energy Physics of CAS
Original Assignee
Institute of High Energy Physics of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of High Energy Physics of CAS
Priority to CN201110031123.8A
Publication of CN102622302A
Application granted
Publication of CN102622302B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Complex Calculations (AREA)

Abstract

The invention provides a method for recognizing the type of fragment data, comprising the following steps: first, extract the byte frequency distribution F(x) of the fragment data x under test; then calculate the similarity Tx between the byte frequency distribution of x and that of a sample S, and judge whether Tx falls within the similarity range of a fragment data type Ti among the known data types T. If it does, x is judged to belong to the type represented by Ti; if Tx falls within the range of none of the known data types T, the type of x is judged to be unrecognizable. The method recognizes the type of fragment data and provides a basis for subsequent fragment data reassembly, so that files with meaningful content can be restored from fragment data, providing technical support for forensic evidence collection.

Description

Method for recognizing fragment data types
Technical field
The present invention relates to a method for recognizing the type of fragment data on the hard disk of a computer or on other removable storage media, or in a memory image, and in particular to a method for recognizing fragment data types based on byte frequency distribution.
Background
A disk cluster (or block) consists of one or more sectors. The sector is the smallest physical storage unit of a disk, while the cluster is the smallest unit allocated by the operating system. A cluster generally comprises several sectors, for example 2, 4, 8, 16, 32 or 64. Each cluster can be occupied by only one file: even if the file contains only a few bytes, two or more files are never allowed to share one cluster, since this would corrupt the data. The sector is thus a physical unit and the cluster a logical one; the operating system can change the cluster size, and grouping sectors into clusters simplifies system management.
When storing data on disk, a file system works in units of clusters or blocks and may scatter a file across different locations on the disk. In the prior art, the parts of a file scattered across different locations are called file fragments. File fragments degrade system performance and slow operation, so they are conventionally handled by disk defragmentation: the defragmenter analyzes the fragments on the hard disk, then moves and merges them so that each file occupies its own contiguous storage area, which improves the utilization of disk space and the speed at which files are read.
Besides these conventional file fragments, a disk holds another kind of data: data residing in unallocated clusters or blocks. Such data usually arises after the disk has been in use for some time, through repeated copying, creation and deletion of files. For example, after a file is deleted, part of its actual content remains stored in the space it occupied. This kind of data is incomplete and easily overwritten. Take file deletion as an example: once a file is deleted, the space that stored it is marked as unallocated, and the file system may write new content to this area when it reuses unallocated space. Until then, the unallocated space still holds part of the content of the deleted file; when new content is written there, the existing data is overwritten by the new data.
Although such data is usually incomplete and easily overwritten, fairly complete content can be obtained once it has been extracted and reconstructed, and this content can then be used as electronic evidence.
For clarity, the present invention defines the data held in unallocated clusters or blocks of a disk as fragment data. Every file type has a corresponding fragment data type, and recognizing the type of fragment data is the basis of file reassembly or file recovery. The present invention therefore takes the 512-byte sector as its unit and defines the fragment data type as the data type represented by a 512-byte unit of fragment data.
As the above analysis shows, fragment data plays an important role in forming electronic evidence, and recognizing the fragment data type improves the recognition rate of subsequent file reassembly and reduces the amount of computation required. However, the prior art currently offers no technique for analyzing and using such fragment data and recognizing its type.
Summary of the invention
To address the above problem, the present invention provides a method for recognizing the type of fragment data, which lays the foundation for subsequent fragment data reassembly.
To solve the above technical problem, the present invention provides the following technical solution:
A method for recognizing a fragment data type, comprising the following steps:
Step 1: extract the byte frequency distribution F(x) of the fragment data x under test, where $F(x) = \{f_0, f_1, \ldots, f_i, \ldots, f_{255}\}$ and $f_i$ is the number of times byte value i occurs in the fragment data, taken in units of one sector;
Step 2: calculate the similarity $T_x$ between the byte frequency distributions of the fragment data x under test and a sample S using formula (1):
$$T(A, S) = \frac{A \cdot S}{\|A\|^2 + \|S\|^2 - A \cdot S} \qquad \text{Formula (1)}$$
where A = F(x) is the byte frequency distribution of the sector holding the fragment data x under test, S is the byte frequency distribution of the sample data, $A \cdot S = \sum_{i=1}^{n} A_i S_i$, $\|A\|^2 = \sum_{i=1}^{n} A_i A_i$, and n = 256;
Step 3: judge whether the similarity $T_x$ falls within the similarity range of a fragment data type $T_i$ among the known data types T. If it does, x is judged to belong to the type represented by $T_i$; if $T_x$ falls within the range of none of the known data types in T, the type of x is judged to be unrecognizable;
where $T = \{T_1, T_2, \ldots, T_i, \ldots, T_m\}$ denotes that T comprises m fragment data types in total, and $T_i$ denotes the i-th fragment data type, i = 1, ..., m.
Further, the method also comprises step 4:
Step 4: when the similarity $T_x$ between the byte frequency distributions of the fragment data x under test and a sample S falls within the similarity range of a known data type $T_i$, further judge whether a structural feature $\delta_x$ is present in x. If so, determine whether $\delta_x \in T_j$ holds; if it holds and i = j, x is judged to belong to the type represented by $T_i$;
where $\delta_x$ is a structural feature of a certain file type, and $T_j$ is the structural-feature set of a data type j that is not known beforehand.
Further, the method also comprises step 5:
Step 5: when the similarity $T_x$ in step 3 falls within the similarity range of a known data type $T_i$ but is below a preset range, or when i ≠ j in step 4, judge whether the number of other fragments in the data block containing x whose similarity falls within the range of $T_i$ reaches a predetermined number. If it does, x is judged to belong to the type represented by $T_i$; otherwise x is judged to be unrecognizable.
In addition, the following steps precede step 1 of the above method:
Step A: extract the sample model S, and determine the similarity between fragment data of each file type and the sample model S;
Step B: extract the structural features δ of each file type, where $\delta = \{\delta_1, \delta_2, \ldots, \delta_i, \ldots, \delta_m\}$ denotes the structural features of the m file types.
The fragment data of the present invention includes fragment data on all kinds of disks and fragment data in memory.
The method provided by the present invention recognizes the type of fragment data and lays the foundation for subsequent fragment data reassembly, so that files with meaningful content can be recovered from fragment data, providing technical support for forensic evidence collection.
The technical solution of the invention is described in detail below with reference to the drawings and specific embodiments.
Brief description of the drawings
Fig. 1 is a flowchart of the fragment data type recognition method of the invention;
Fig. 2 is a flowchart of a specific embodiment of the fragment data type recognition method of the invention;
Fig. 3 is a detailed flowchart of step S15 in Fig. 2;
Fig. 4 is a detailed flowchart of step S16 in Fig. 2;
Fig. 5 is a flowchart of fragment data forensic work.
Detailed description of the embodiments
Fig. 1 shows the flowchart of the fragment data type recognition method of the invention.
Step S1: before fragment data type recognition begins, preparatory work is required: the byte frequency distribution samples of the data regions of the various file types, together with their specific structural features, must be obtained. If these samples and structural features already exist, this step can be skipped and recognition can begin directly at step S2. If not, they must be extracted in this step through extensive work such as collection, comparison, analysis and summarization, to provide the basis for type recognition in the following steps.
Step S2: for the fragment data to be recognized, extract its byte frequency distribution.
Step S3: build the recognition model using the Tanimoto coefficient, and calculate the similarity between the byte frequency distribution of the fragment data under test and that of a sample.
Step S4: compare the calculated similarity with the similarity between the byte frequency distribution of fragment data of a known type and the same sample, and judge whether the calculated similarity falls within the range of the latter. If it does, the fragment data under test is confirmed to be of the same type as the fragment data of the known type; if not, the fragment data under test is confirmed to be unrecognizable.
The basis for this judgment must be obtained in advance: the similarity between the byte frequency distribution of fragment data of a given known type and a given sample must be a known range. Only then can it be judged whether the calculated similarity falls within that range; if it does, the fragment data under test belongs to that type.
In addition, the invention proposes two kinds of optimization parameters to assist fragment data type recognition. The first is to search the fragment data for specific structural features of the relevant data type. The second is to take the correlation between fragments into account, that is, the type of the fragment under test is related to the types of adjacent fragments. These two methods improve the accuracy of fragment data recognition and ensure that the raw data is not modified during recognition, which preserves the authenticity and reliability of the data.
Fig. 2 is a flowchart of a specific embodiment of the fragment data type recognition method of the invention, comprising the following steps: 1) preprocessing; 2) building the recognition model; 3) preliminary determination of the type of the fragment data under test; 4) introducing the relevant structural features of the fragment data under test as optimization parameter 1; 5) introducing the distance-based correlation between fragments as optimization parameter 2. The optimization parameters improve the accuracy of fragment data type recognition. Each step is described in detail below:
Step S11: preprocessing. The preprocessing stage includes extracting byte frequency distribution samples of the data regions of the various file types and building the sample model S, where $S = \{S_1, S_2, \ldots, S_i, \ldots, S_m\}$; S denotes the set of sample models and $S_i$ is one of its elements. In other words, the sample models are abstracted mathematically and denoted S.
It also includes extracting the specific structural features δ of the file types, where $\delta = \{\delta_1, \delta_2, \ldots, \delta_i, \ldots, \delta_m\}$ denotes the set of file structural features.
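The description does not state how a sample model is derived from the collected data. As one possible reading, the sketch below averages the byte counts of several training fragments of one file type; the function name `build_sample` and the averaging rule are assumptions made for illustration, not part of the disclosed method.

```python
def build_sample(fragments: list[bytes]) -> list[float]:
    """Average the per-fragment byte counts of same-type training fragments.

    Assumption: the sample model is the element-wise mean of the byte
    frequency distributions; the source does not specify this rule.
    """
    if not fragments:
        raise ValueError("at least one training fragment is required")
    totals = [0] * 256
    for fragment in fragments:
        for value in fragment:      # iterating bytes yields ints 0..255
            totals[value] += 1
    return [total / len(fragments) for total in totals]


# Two tiny stand-in fragments (real ones would be 512 bytes each).
sample = build_sample([b"\x00\x00", b"\x00\x01"])
```

Other aggregation rules (for example a median, or keeping several samples per type) would fit the description equally well.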
The byte frequency distribution is the frequency distribution of the raw data counted byte by byte, independently of the operating system level. In the function F(x), $f_i$ denotes the number of times byte value i (the decimal value of a byte, 0 to 255) occurs in the fragment data, taken in units of one sector. The function F(x) captures byte-frequency characteristics that differ according to the nature of each data type. The advantage of this feature is that it disregards the superficial attributes assigned by the operating system, such as file type, file extension and special file identifiers, and relies on the content of the fragment data itself, so it reflects the true characteristics of the fragment data.
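A minimal sketch of extracting F(x) from one 512-byte fragment is given here for illustration. The function name `byte_frequency` and the synthetic test fragment are assumptions; only the definition of the counts comes from the description.

```python
from collections import Counter

SECTOR_SIZE = 512  # fragment size used throughout the description


def byte_frequency(fragment: bytes) -> list[int]:
    """Return F(x) = [f_0, ..., f_255], the count of each byte value."""
    counts = Counter(fragment)  # iterating bytes yields ints 0..255
    return [counts.get(value, 0) for value in range(256)]


# Synthetic fragment: four JPEG-like header bytes padded with zeros.
fragment = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + bytes(SECTOR_SIZE - 4)
freq = byte_frequency(fragment)
```

The counts always sum to the fragment length, which is a cheap sanity check when processing a disk image sector by sector.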
The file-type-specific structural features δ are the distinctive sequences of consecutive binary data that mark each file type. These structural features occur not only at the start of a file but may also occur in the middle or at the end. They must be obtained by analyzing large amounts of data, either automatically by machine using suitable algorithms or by manual analysis.
Different file types have different structural features. Taking the JPEG file type as an example, JPEG files mainly contain the binary data markers shown in Table 1 below.
Table 1
Code    Meaning
FFD8    Start of Image (SOI)
FFE0    APP0 marker
FFDB    Define Quantization Table (DQT)
FFC4    Define Huffman Table (DHT)
FFC0    Start of Frame (SOF0)
FFDA    Start of Scan (SOS)
FFD9    End of Image (EOI)
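As an illustration of how the markers in Table 1 might be located inside a fragment, the following sketch scans a byte string for each two-byte code. The names `JPEG_MARKERS` and `find_markers` and the sample bytes are assumptions introduced for this example.

```python
# Marker codes taken from Table 1; the dictionary layout is illustrative.
JPEG_MARKERS = {
    bytes.fromhex("FFD8"): "SOI",
    bytes.fromhex("FFE0"): "APP0",
    bytes.fromhex("FFDB"): "DQT",
    bytes.fromhex("FFC4"): "DHT",
    bytes.fromhex("FFC0"): "SOF0",
    bytes.fromhex("FFDA"): "SOS",
    bytes.fromhex("FFD9"): "EOI",
}


def find_markers(fragment: bytes) -> list[tuple[int, str]]:
    """Return (offset, marker name) for every Table 1 marker in the fragment."""
    hits = []
    for code, name in JPEG_MARKERS.items():
        start = fragment.find(code)
        while start != -1:
            hits.append((start, name))
            start = fragment.find(code, start + 1)
    return sorted(hits)


# Hypothetical fragment: a JFIF-style header, padding, then a DQT marker.
sample = bytes.fromhex("FFD8FFE000104A464946") + bytes(20) + bytes.fromhex("FFDB")
hits = find_markers(sample)
```

Because the codes are only two bytes long, they can also occur by chance in unrelated data, so a hit is best treated as supporting evidence rather than proof of type.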
Step S12: extract the byte frequency distribution F(x) of the fragment data x under test (x denotes the identifier of the fragment), where $F(x) = \{f_0, f_1, \ldots, f_i, \ldots, f_{255}\}$.
Step S13: using the Tanimoto coefficient, i.e. formula (1), calculate the similarity $T_x$ between the byte frequency distributions of the sample S and the test data F(x).
The Tanimoto coefficient measures the similarity of document data and reduces to the Jaccard coefficient for binary attributes. The invention proposes a fragment data recognition model based on byte frequency distribution. The model takes 512 bytes of fragment data as the smallest test unit, computes the byte frequency distribution F(x) within each 512-byte test unit, and obtains the similarity $T_x$ between the byte frequency distributions of the sample S and the test fragment F(x) through the Tanimoto coefficient:
$$T(A, S) = \frac{A \cdot S}{\|A\|^2 + \|S\|^2 - A \cdot S} \qquad \text{Formula (1)}$$
where A = F(x) is the byte frequency distribution of the sector holding the test fragment x, a one-dimensional vector with 256 elements; S is the byte frequency distribution of the sample data; and
$$A \cdot S = \sum_{i=1}^{n} A_i S_i, \qquad \|A\|^2 = \sum_{i=1}^{n} A_i A_i, \qquad n = 256.$$
The value of T therefore lies in [0, 1]. When T = 0, the similarity between A and S is lowest; as T approaches 1, the similarity is highest. As T goes from 0 to 1, the similarity between A and S goes from low to high.
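The stated bounds follow from the fact that byte counts are non-negative. A short derivation, written out as a check on formula (1):

```latex
\begin{align*}
A_i \ge 0,\ S_i \ge 0 &\;\Rightarrow\; A \cdot S \ge 0, \\
\|A - S\|^2 = \|A\|^2 + \|S\|^2 - 2\,A \cdot S \ge 0
  &\;\Rightarrow\; \|A\|^2 + \|S\|^2 - A \cdot S \ge A \cdot S, \\
&\;\Rightarrow\; 0 \le T(A, S) = \frac{A \cdot S}{\|A\|^2 + \|S\|^2 - A \cdot S} \le 1 .
\end{align*}
```

T = 1 holds exactly when A = S, and T = 0 holds when A and S share no byte value, so that A · S = 0. For a non-empty fragment, $\|A\|^2 > 0$, so the denominator is never zero.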
The similarity can be calculated by computer: for example, a calculation program can be written that takes S and A through an input interface and automatically calculates the similarity $T_x$ between the byte frequency distributions of the sample S and the test data F(x).
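Such a calculation program could look like the following sketch of formula (1). The function name `tanimoto` and the convention of returning 0.0 for two all-zero vectors are assumptions; the arithmetic follows the formula as given.

```python
def tanimoto(a: list[float], s: list[float]) -> float:
    """Tanimoto coefficient T(A, S) = A.S / (|A|^2 + |S|^2 - A.S)."""
    if len(a) != len(s):
        raise ValueError("A and S must have the same length")
    dot = sum(x * y for x, y in zip(a, s))
    norm_a = sum(x * x for x in a)
    norm_s = sum(y * y for y in s)
    denominator = norm_a + norm_s - dot
    if denominator == 0:
        return 0.0  # assumed convention: both vectors are all zeros
    return dot / denominator


identical = tanimoto([1, 2, 3], [1, 2, 3])   # same distribution
disjoint = tanimoto([1, 0], [0, 1])          # no byte value in common
partial = tanimoto([2, 0], [1, 0])           # 2 / (4 + 1 - 2)
```

In practice A and S would both be 256-element vectors; short vectors are used here only to keep the arithmetic easy to follow by hand.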
Step S14: after calculating the similarity $T_x$ between the byte frequency distributions of the sample S and the test data F(x), make a preliminary determination of whether $T_x$ falls within the similarity range between the byte frequency distribution of fragment data of a known type and the same sample.
In the invention, the data types T derived from the similarity between the various types of fragment data and the sample are stored in advance, i.e. $T = \{T_1, T_2, \ldots, T_i, \ldots, T_m\}$, denoting m fragment data types in total. Here $T_i$ denotes the i-th data type and is represented by two parameters, $T_{i1}$ and $T_{i2}$. $T_{i1}$ represents the similarity of the i-th data type: it is a range within 0 to 1, and each data type has its own valid similarity range. Taking the JPEG file type as an example, the valid similarity range calculated with the Tanimoto coefficient is [0.55, 1], i.e. between 0.55 and 1. $T_{i2}$ represents the set of data structure features of the i-th data type; for example, the data structure feature set of the JPEG file type can be the content of Table 1 above.
Based on the prestored data types T, make a preliminary determination of whether the similarity $T_x$ of the fragment data x under test falls within the range $T_{i1}$. If it does, x can be considered to belong to the i-th class of fragment data. If it does not, x is considered not to belong to the i-th class, and it is necessary to go on to judge whether $T_x$ falls within the range of $T_{i+1}$, i.e. the similarity range of another known type. If $T_x$ falls within the similarity range of none of the known types, the m prestored types are considered unable to identify the fragment data under test.
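The sequential range check of step S14 can be sketched as follows. The JPEG range [0.55, 1] comes from the description; the second entry, the dictionary `KNOWN_RANGES` and the function `classify` are hypothetical and exist only to make the example runnable.

```python
# Similarity ranges T_i1 per known type. Only the JPEG range is taken from
# the description; "text_hypothetical" is an invented placeholder.
KNOWN_RANGES = {
    "jpeg": (0.55, 1.0),
    "text_hypothetical": (0.10, 0.30),
}


def classify(similarity: float, ranges: dict[str, tuple[float, float]]):
    """Return the first known type whose range contains T_x, else None."""
    for type_name, (low, high) in ranges.items():
        if low <= similarity <= high:
            return type_name
    return None  # T_x lies in no known range: type cannot be recognized


in_jpeg = classify(0.56, KNOWN_RANGES)
in_other = classify(0.20, KNOWN_RANGES)
unknown = classify(0.40, KNOWN_RANGES)
```

If the ranges of two types overlap, this sketch returns whichever is checked first, which is one reason the description goes on to refine the preliminary result.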
Step S15: introduce the relevant structural feature $\delta_x$ of the fragment data under test as optimization parameter 1, as shown in Fig. 3. That is, when the similarity $T_x$ between the byte frequency distributions of the fragment data x and a sample S falls within the range of a known similarity $T_i$, further judge whether $\delta_x$ is present in x. If so, go on to judge whether $\delta_x \in T_j$ holds, where $T_j$ denotes the data structure feature set of another type j that is not known beforehand. If $\delta_x \in T_j$ holds, go on to judge whether i equals j. If i = j, the data structure feature set represented by $T_j$ is identical to $T_{i2}$, and x can be judged to belong to the type represented by $T_i$. If i and j are not equal, continue with step S16. If $\delta_x \in T_j$ does not hold, or no $\delta_x$ is present in x, this case is not analyzed further, and the result of step S14 is taken as the overall result.
In step S15, when the similarity $T_x$ between the byte frequency distributions of the fragment data x under test and a sample S falls within the similarity range of a known data type $T_i$, it is further confirmed whether the structural feature $\delta_x$ of x also belongs to the structural feature set of this known data type $T_i$, so that the type of x is determined more accurately.
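The three outcomes of step S15 can be sketched as a small decision function. The names `SIGNATURES` and `check_structure`, the outcome labels, and the PNG entry (its well-known four-byte signature prefix) are assumptions added for illustration; the JPEG codes come from Table 1.

```python
# Structural-feature sets T_i2 per type. The PNG entry is an illustrative
# addition that does not appear in the source.
SIGNATURES = {
    "jpeg": [b"\xff\xd8", b"\xff\xe0", b"\xff\xdb", b"\xff\xc4",
             b"\xff\xc0", b"\xff\xda", b"\xff\xd9"],
    "png": [b"\x89PNG"],
}


def check_structure(candidate: str, fragment: bytes, signatures: dict) -> str:
    """Refine the preliminary type `candidate` (type i from step S14).

    Returns "confirmed" when a feature of type i is found (i = j),
    "neighbors" when only features of other types are found (i != j,
    continue with step S16), and "keep" when no feature is found.
    """
    matched = {
        type_name
        for type_name, patterns in signatures.items()
        if any(pattern in fragment for pattern in patterns)
    }
    if not matched:
        return "keep"
    if candidate in matched:
        return "confirmed"
    return "neighbors"


confirmed = check_structure("jpeg", b"\x00\xff\xdb\x00", SIGNATURES)
conflict = check_structure("jpeg", b"\x89PNG\r\n", SIGNATURES)
no_feature = check_structure("jpeg", bytes(16), SIGNATURES)
```

The source does not say what to do when features of both the candidate type and another type are present; this sketch assumes that a match on the candidate type is sufficient confirmation.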
Step S16: introduce the distance-based correlation between fragments as optimization parameter 2. Because fragments of the same file have an 80% probability of being distributed within 32 data blocks, fragment data is not randomly distributed on the disk; there is a certain correlation between fragments, namely that a run of consecutive fragments belongs to the same file.
Step S16 applies in two cases: when i ≠ j in step S15, or when, in step S14, the similarity $T_x$ of the fragment data x under test falls within the range $T_{i1}$ but is low. For example, for the JPEG file type the valid similarity range is [0.55, 1]; if $T_x$ is 0.56, x is only marginally similar to the JPEG file type. As shown in Fig. 4, judge whether the sequence number of the fragment currently being tested is the last one in its data block. If not, increment the sequence number by 1 and judge whether the similarity of the fragment with that sequence number is within the range of $T_i$. Repeat this comparison in a loop until all other fragments in the data block containing x have been compared, then count the fragments whose similarity is within the range of $T_i$. If the proportion of fragments whose similarity is within the range of $T_i$ is greater than 80%, x is judged to belong to the type represented by $T_i$; otherwise x is judged to be unrecognizable.
In other words, step S16 determines what proportion of the fragments in the data block fall within the range of $T_i$; if, for example, it is greater than 80%, x can be considered with high confidence to belong to the type represented by $T_i$.
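The neighbor count of step S16 can be sketched as follows, assuming the similarities of the other fragments in the data block have already been computed. The function name `neighbors_agree`, its parameters and the sample values are illustrative; the strict "greater than 80%" comparison follows the wording of the description.

```python
def neighbors_agree(neighbor_similarities: list[float],
                    similarity_range: tuple[float, float],
                    threshold: float = 0.8) -> bool:
    """Return True if more than `threshold` of the other fragments in the
    data block have a similarity inside the range of type T_i."""
    if not neighbor_similarities:
        return False  # assumed convention: no neighbors, no confirmation
    low, high = similarity_range
    inside = sum(1 for value in neighbor_similarities if low <= value <= high)
    return inside / len(neighbor_similarities) > threshold


JPEG_RANGE = (0.55, 1.0)  # range given in the description

# A 32-fragment block leaves 31 neighbors of the fragment under test.
mostly_jpeg = neighbors_agree([0.7] * 26 + [0.3] * 5, JPEG_RANGE)   # 26/31
mixed_block = neighbors_agree([0.7] * 24 + [0.3] * 7, JPEG_RANGE)   # 24/31
```

With 26 of 31 neighbors inside the range (about 84%) the fragment is accepted, while 24 of 31 (about 77%) leaves it unrecognized.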
The above embodiment can effectively determine the type of fragment data. In addition, memory is allocated by pages (4 KB), which is an integer multiple of 512 bytes, so the fragment data described in the invention can also refer to data in memory.
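Since a 4 KB page is exactly eight 512-byte units, a memory image can be cut into fragments in the same way as a disk image. A minimal sketch, in which the function name `split_fragments` is an assumption:

```python
FRAGMENT_SIZE = 512   # sector-sized unit used by the method
PAGE_SIZE = 4096      # 4 KB memory page


def split_fragments(data: bytes, size: int = FRAGMENT_SIZE) -> list[bytes]:
    """Cut a buffer into consecutive fragments of `size` bytes.

    The last fragment is shorter if len(data) is not a multiple of size.
    """
    return [data[offset:offset + size] for offset in range(0, len(data), size)]


page = bytes(PAGE_SIZE)            # stand-in for one page of a memory image
fragments = split_fragments(page)  # eight 512-byte fragments
```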
The fragment data type recognition method of the invention can provide electronic evidence information for forensic evidence collection. On the one hand, it recognizes the type of fragment data and further improves the recognition rate; on the other hand, it ensures the reliability of the data during type recognition and its consistency with the raw data, preparing the ground for subsequent fragment data reassembly.
Fig. 5 is a flowchart of the overall fragment data forensic process. The preparation stage and the fragment data extraction stage are preliminary work for the invention; they are not described in detail here and can use existing general methods. After the fragment data has been extracted, it is analyzed: contiguous file data blocks are discarded, the fragment data type recognition of the invention is performed, and the fragment data is then reassembled. The fragment evidence is then presented and submitted to court, that is, conclusions are drawn from the fragment data obtained.
The fragment data type recognition of the invention provides the basis for the next step in electronic forensics, the reassembly of fragment data. Moreover, because the fragment data remains consistent with the raw data throughout acquisition, recognition and reassembly, the reliability and authenticity of the electronic evidence obtained are fundamentally guaranteed.

Claims (8)

1. A method for recognizing a fragment data type, characterized in that it comprises the following steps:
Step 1: extract the byte frequency distribution F(x) of the fragment data x under test, where $F(x) = \{f_0, f_1, \ldots, f_i, \ldots, f_{255}\}$ and $f_i$ is the number of times byte value i occurs in the fragment data, taken in units of one sector;
Step 2: calculate the similarity $T_x$ between the byte frequency distributions of the fragment data x under test and a sample S using formula (1):
$$T(A, S) = \frac{A \cdot S}{\|A\|^2 + \|S\|^2 - A \cdot S} \qquad \text{Formula (1)}$$
where A = F(x) is the byte frequency distribution of the sector holding the fragment data x under test, and S is the byte frequency distribution of the sample data;
$$A \cdot S = \sum_{i=1}^{n} A_i S_i, \qquad \|A\|^2 = \sum_{i=1}^{n} A_i A_i,$$
n = 256;
Step 3: judge whether the similarity $T_x$ falls within the similarity range of a fragment data type $T_i$ among the known data types T. If it does, x is judged to belong to the type represented by $T_i$; if $T_x$ falls within the range of none of the known data types in T, the type of x is judged to be unrecognizable;
where $T = \{T_1, T_2, \ldots, T_i, \ldots, T_m\}$ denotes that T comprises m fragment data types in total, and $T_i$ denotes the i-th fragment data type, i = 1, ..., m.
2. The method for recognizing a fragment data type according to claim 1, characterized in that it further comprises step 4:
Step 4: when the similarity $T_x$ between the byte frequency distributions of the fragment data x under test and a sample S falls within the similarity range of a known data type $T_i$, further judge whether a structural feature $\delta_x$ is present in x. If so, determine whether $\delta_x \in T_j$ holds; if it holds and i = j, x is judged to belong to the type represented by $T_i$;
where $\delta_x$ is a structural feature of a certain file type, and $T_j$ is the structural-feature set of a data type j that is not known beforehand.
3. The method for recognizing a fragment data type according to claim 1 or 2, characterized in that it further comprises step 5:
Step 5: when the similarity $T_x$ in step 3 falls within the similarity range of a known data type $T_i$ but is below a preset range, or when i ≠ j in step 4, judge whether the number of other fragments in the data block containing x whose similarity falls within the range of $T_i$ reaches a predetermined number. If it does, x is judged to belong to the type represented by $T_i$; otherwise x is judged to be unrecognizable.
4. The method for recognizing a fragment data type according to claim 1, characterized in that the following step precedes step 1:
Step A: extract the sample model S, and determine the similarity between fragment data of each file type and the sample model S.
5. The method for recognizing a fragment data type according to claim [...], characterized in that the following step precedes step 1:
Step B: extract the structural features δ of each file type, where $\delta = \{\delta_1, \delta_2, \ldots, \delta_i, \ldots, \delta_m\}$ denotes the structural features of the m file types.
6. The method for recognizing a fragment data type according to claim 1, characterized in that the fragment data includes fragment data on all kinds of disks and fragment data in memory.
7. The method for recognizing a fragment data type according to claim 3, characterized in that the data block containing the fragment data under test comprises $2^5$ to $2^8$ blocks.
8. The method for recognizing a fragment data type according to claim 3, characterized in that the predetermined number is at least 80% of the number of blocks in the data block containing the fragment data under test.
CN201110031123.8A 2011-01-26 2011-01-26 Recognition method for fragment data type Expired - Fee Related CN102622302B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110031123.8A CN102622302B (en) 2011-01-26 2011-01-26 Recognition method for fragment data type

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110031123.8A CN102622302B (en) 2011-01-26 2011-01-26 Recognition method for fragment data type

Publications (2)

Publication Number Publication Date
CN102622302A true CN102622302A (en) 2012-08-01
CN102622302B CN102622302B (en) 2014-10-29

Family

ID=46562229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110031123.8A Expired - Fee Related CN102622302B (en) 2011-01-26 2011-01-26 Recognition method for fragment data type

Country Status (1)

Country Link
CN (1) CN102622302B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090299999A1 (en) * 2009-03-20 2009-12-03 Loui Alexander C Semantic event detection using cross-domain knowledge
CN101604364A (en) * 2009-07-10 2009-12-16 珠海金山软件股份有限公司 Computer rogue program categorizing system and sorting technique based on file instruction sequence
CN101923618A (en) * 2010-08-19 2010-12-22 中国航天科技集团公司第七一○研究所 Hidden Markov model based method for detecting assembler instruction level vulnerability


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
VASSIL ROUSSEV, ET AL: "File Fragment Classification - The Case for Specialized Approaches", 2009 FOURTH INTERNATIONAL IEEE WORKSHOP ON SYSTEMATIC APPROACHES TO DIGITAL FORENSIC ENGINEERING, 31 December 2009 (2009-12-31) *
钟秀玉 et al.: "Design of Lost File Search Based on Recursion and Multithreading", Computer Technology and Development, vol. 20, no. 9, 30 September 2010 (2010-09-30) *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718335A (en) * 2016-01-27 2016-06-29 成都驭奔科技有限公司 Method for extracting single file based on features
CN107729558A (en) * 2017-11-08 2018-02-23 郑州云海信息技术有限公司 Method, system, device and computer-readable storage medium for file system defragmentation
CN108319518A (en) * 2017-12-08 2018-07-24 中国电子科技集团公司电子科学研究院 File fragment classification method and device based on recurrent neural network
CN108319518B (en) * 2017-12-08 2023-04-07 中国电子科技集团公司电子科学研究院 File fragment classification method and device based on recurrent neural network
CN109828866A (en) * 2019-01-26 2019-05-31 郑州汉江电子技术有限公司 XFS file fragment recovery method and device
CN111309267A (en) * 2020-02-26 2020-06-19 Oppo广东移动通信有限公司 Storage space allocation method and device, storage equipment and storage medium
CN111309267B (en) * 2020-02-26 2023-10-03 Oppo广东移动通信有限公司 Storage space allocation method and device, storage equipment and storage medium

Also Published As

Publication number Publication date
CN102622302B (en) 2014-10-29

Similar Documents

Publication Publication Date Title
CN102016789B (en) Data processing apparatus and method of processing data
CN110008254B (en) Transformer equipment ledger verification processing method
CN102622302B (en) Recognition method for fragment data type
JP5708107B2 (en) Duplicate file detection device
CN104021132A (en) Method and system for verification of consistency of backup data of host database and backup database
CN102682024B (en) Method for recombining incomplete JPEG file fragments
CN109117440B (en) Metadata information acquisition method, system and computer readable storage medium
CN103812877B (en) Data compression method based on Bigtable distributed memory system
US8468134B1 (en) System and method for measuring consistency within a distributed storage system
US20110225164A1 (en) Granular and workload driven index defragmentation
WO2013105505A1 (en) Index scanning apparatus and index scanning method
CN111913925B (en) Data processing method and system in distributed storage system
CN109597757B (en) Method for measuring similarity between software networks based on multidimensional time series entropy
CN112597345B (en) Automatic acquisition and matching method for laboratory data
CN104991741B (en) Context-adaptive power grid big data storage method based on key-value model
CN104636401A (en) Data rollback method and device for SCADA system
CN110147353B (en) MongoDB data migration monitoring method and device based on log analysis
CN106383897A (en) Database capacity calculation method and apparatus
Chatzigeorgakidis et al. Local pair and bundle discovery over co-evolving time series
CN109656929A (en) Method and device for carving multiple relational database files
CN109344163B (en) Data verification method and device and computer readable medium
CN105260465A (en) Graph data processing service method and apparatus
CN108108467B (en) Data deleting method and device
CN106776704B (en) Statistical information collection method and device
CN106126375B (en) Hash-based method for recovering each version of a YAFFS2 file

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20141029
