CN103984608B - Content-based image file carving method - Google Patents

Content-based image file carving method

Info

Publication number
CN103984608B
Authority
CN
China
Prior art keywords
data block
data
blocks
carving
jpeg image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410229438.7A
Other languages
Chinese (zh)
Other versions
CN103984608A (en)
Inventor
孔祥维
张博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology
Priority to CN201410229438.7A
Publication of CN103984608A
Application granted
Publication of CN103984608B
Expired - Fee Related
Anticipated expiration

Abstract

The content-based image file carving method of the present invention belongs to the field of information security and computer application technology. It relates to a JPEG image file recovery method, specifically a method for recovering JPEG image files that have been deleted from a data storage device when the related file system metadata has been destroyed. The carving method first uses the relevant interface provided by the operating system to obtain the ordered set of all data blocks of the storage device to be carved; it then performs preprocessing to obtain the set of data blocks to be carved; finally, carving is performed on that set. The invention is applicable to data storage devices including NAND flash devices, is not affected by the presence or absence of a file system, and has high accuracy. It works on data storage devices with non-contiguous storage, where traditional algorithms fail, and can help practitioners in criminal investigation and data recovery to recover important JPEG image data.

Description

Content-based image file carving method
Technical field
The invention belongs to the field of information security and computer application technology and relates to a JPEG image file recovery method, in particular a method for recovering JPEG image files that have been deleted from a data storage device when the related file system metadata has been destroyed.
Background art
In criminal investigation and data recovery, recovering original files from a data storage device is an important step; this process is called carving. Here, a data storage device is an external memory that stores binary data in a computer system. The most widely used data storage device in modern computers is the hard disk. On a hard disk, the operating system tends to store a file's data blocks contiguously, so after the file system is damaged or a file is deleted by mistake, the original file data can be recovered from contiguous logical addresses on the disk. JPEG is currently the most widely used image compression and storage format in the world. Because files are stored almost contiguously on hard disks, existing JPEG carving methods can easily locate the start and end of a file using the JPEG start-of-image marker (SOI) and end-of-image marker (EOI), exclude abnormal data blocks between the two positions by some method, and concatenate the remaining data blocks in order to obtain the carved JPEG image file.
With the development of technology, storage devices based on NAND flash memory have advanced rapidly. NAND flash is a kind of electrically erasable programmable read-only memory that can be erased and rewritten repeatedly. Compared with hard disks, it offers higher read and write speeds, lower power consumption, less noise, and better shock resistance, and it is expected to become the mainstream storage device. Owing to the physical characteristics of NAND flash devices, files on them are stored not contiguously but in fragments, which prevents existing carving methods from working properly. To date, there has been very little research on file carving methods for such devices.
Summary of the invention
The technical problem to be solved by the invention is how to carve JPEG image files on data storage devices, including NAND flash devices. With this method, the original JPEG image files can still be carved from storage devices with either contiguous or non-contiguous storage, even if the file system is damaged or the files have been deleted. The invention meets the requirements of data recovery and computer forensics work by carving the JPEG images on a user's computer.
The technical solution adopted by the invention is a content-based JPEG file carving method. The method first obtains, through the operating system, all data blocks of the storage device to be carved; it then screens them with a preprocessing method to obtain the set of data blocks to be carved; finally, the carving process yields the carved JPEG image files. The specific steps of the method are as follows:
1. Obtain all data of the storage device to be carved
Use the relevant interface provided by the operating system to obtain all data of the storage device to be carved. Here, the storage device means the entire storage device, or a single partition, in which carving is to be performed. The data are presented as data blocks whose size and order match those of the clusters on the original storage device. All data blocks form a set, called the original data block set.
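As a minimal sketch of this step, the following Python splits a raw partition image into an ordered list of cluster-sized blocks. The function names and the default cluster size of 4096 bytes are assumptions made for illustration; the actual cluster size depends on the device and file system.

```python
from typing import List


def split_into_blocks(image: bytes, cluster_size: int = 4096) -> List[bytes]:
    """Split a raw image into an ordered list of cluster-sized blocks.

    cluster_size is an assumed value; the last block may be shorter.
    """
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    return [image[off:off + cluster_size]
            for off in range(0, len(image), cluster_size)]


def read_blocks(path: str, cluster_size: int = 4096) -> List[bytes]:
    """Read a partition image file (hypothetical path) and split it."""
    with open(path, "rb") as f:
        return split_into_blocks(f.read(), cluster_size)
```

The list index of each block plays the role of its position in the original data block set, so the original order is preserved.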
2. Preprocessing
Preprocessing is needed to reduce the size of the original data block set obtained in the previous step. According to the JPEG standard, a JPEG image file is divided by markers and marker segments. The original data block set is divided into a JPEG header data block set and a non-JPEG-header data block set. The classification criterion is: if a data block and several data blocks adjacent after it can, in their original order, form an uninterrupted binary data stream that conforms to the JPEG standard and runs from the SOI marker to the SOS marker segment, then every data block containing a byte of this stream belongs to the JPEG header data block set; otherwise, the data block belongs to the non-JPEG-header data block set.
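The classification criterion can be sketched in Python as follows. This is an illustrative simplification, not the patented implementation: it assumes a header starts at the first byte of a block, ignores optional 0xFF fill bytes before markers, and caps the search at a hypothetical `max_header_blocks` adjacent blocks.

```python
import struct
from typing import List, Optional, Set

SOI = b"\xff\xd8"   # start-of-image marker
SOS = 0xDA          # second byte of the start-of-scan marker


def header_end(stream: bytes) -> Optional[int]:
    """Return the offset just past the SOS marker segment, or None if the
    stream does not (yet) hold a well-formed SOI..SOS header."""
    if not stream.startswith(SOI):
        return None
    pos = 2
    while pos + 4 <= len(stream):
        if stream[pos] != 0xFF:
            return None
        marker = stream[pos + 1]
        # Markers without a length field are not expected inside the header.
        if marker in (0x00, 0x01, 0xFF, 0xD8, 0xD9) or 0xD0 <= marker <= 0xD7:
            return None
        (length,) = struct.unpack(">H", stream[pos + 2:pos + 4])
        end = pos + 2 + length
        if length < 2 or end > len(stream):
            return None
        if marker == SOS:
            return end
        pos = end
    return None


def classify_header_blocks(blocks: List[bytes],
                           max_header_blocks: int = 64) -> Set[int]:
    """Return the indices of blocks that belong to a JPEG header."""
    header: Set[int] = set()
    for i, block in enumerate(blocks):
        if not block.startswith(SOI):
            continue
        stream = b""
        for j in range(i, min(i + max_header_blocks, len(blocks))):
            stream += blocks[j]
            end = header_end(stream)
            if end is not None:
                offset = 0
                for k in range(i, j + 1):
                    if offset < end:      # block k holds at least one header byte
                        header.add(k)
                    offset += len(blocks[k])
                break
    return header
```

All indices not returned by `classify_header_blocks` form the non-JPEG-header set.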
Then, exclude from the non-JPEG-header data block set every data block that satisfies either of the following conditions:
(a) the data stream contains a marker reserved by the JPEG standard: 0xFFF0~0xFFFD or 0xFF02~0xFFBF;
(b) it contains no EOI marker and its entropy is below a certain threshold, where the entropy of a data block is defined by formula (1):
$$E = -\sum_{i=0}^{255} p_i \log_2 p_i \qquad (1)$$
where $p_i$ is the frequency with which bytes of value $i$ occur in the data block.
Finally, the two sets are merged; the resulting set is called the set of data blocks to be carved.
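The two exclusion conditions can be sketched as below. The function names are illustrative. The threshold default of 5.0 is the value used in the embodiment described later in this document, and the first reserved range is taken as 0xFFF0 to 0xFFFD, which is an assumption based on the reserved JPGn markers of the JPEG standard.

```python
import math
from collections import Counter

EOI = b"\xff\xd9"  # end-of-image marker


def block_entropy(block: bytes) -> float:
    """Shannon entropy of the byte values in a block, in bits (formula (1))."""
    if not block:
        return 0.0
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in Counter(block).values())


def has_reserved_marker(block: bytes) -> bool:
    """True if the block contains 0xFF followed by a reserved marker code."""
    for i in range(len(block) - 1):
        if block[i] == 0xFF:
            code = block[i + 1]
            if 0x02 <= code <= 0xBF or 0xF0 <= code <= 0xFD:
                return True
    return False


def keep_block(block: bytes, threshold: float = 5.0) -> bool:
    """Return False for blocks that condition (a) or (b) would exclude."""
    if has_reserved_marker(block):                                 # condition (a)
        return False
    if EOI not in block and block_entropy(block) < threshold:      # condition (b)
        return False
    return True
```

Entropy-coded JPEG data look close to random, so a low-entropy block without an EOI marker is unlikely to be part of an image body; that is the intuition behind condition (b).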
3. Carving
Carving is performed on the set of data blocks to be carved. In the YCbCr color space, the distance between two pixels $P_1(Y_1, Cb_1, Cr_1)$ and $P_2(Y_2, Cb_2, Cr_2)$ is defined by formula (2):
$$\mathrm{Distance}(P_1, P_2) = (Y_1 - Y_2)^2 + (Cb_1 - Cb_2)^2 + (Cr_1 - Cr_2)^2 \qquad (2)$$
According to the JPEG standard, JPEG image data are encoded in units of minimum coded units (MCUs). For the $i$-th data block of a JPEG image file, its forward matching distance is defined by formula (3):
$$\mathrm{ForwardDistance}(i) = \frac{1}{n} \sum_{j=1}^{n} \mathrm{Distance}(P_{i,j}, P'_{i,j}) \qquad (3)$$
where $n$ is the number of pixels summed over and $P_{i,j}$ is the $j$-th pixel that satisfies either of the following conditions:
(1) the data of the MCU containing this pixel come entirely from the $i$-th data block, and the pixel is adjacent to some pixel $P'_{i,j}$ that was obtained before the $i$-th data block was decoded;
(2) the data of the MCU containing this pixel come from both the $i$-th and the $(i-1)$-th data blocks, and the pixel is adjacent to some already obtained pixel $P'_{i,j}$.
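Formulas (2) and (3) can be sketched in Python as follows. The pixel distance is written as the sum of squared differences, as the formula appears in this text (no square root); a Euclidean root would be an equally plausible reading. Selecting the boundary pixel pairs requires a JPEG decoder and is not shown; the `pairs` argument and the infinite result for an empty input are assumptions.

```python
from typing import Iterable, Tuple

Pixel = Tuple[float, float, float]  # (Y, Cb, Cr)


def pixel_distance(p1: Pixel, p2: Pixel) -> float:
    """Formula (2): sum of squared component differences in YCbCr."""
    return sum((a - b) ** 2 for a, b in zip(p1, p2))


def forward_distance(pairs: Iterable[Tuple[Pixel, Pixel]]) -> float:
    """Formula (3): mean distance over pairs (P_ij, P'_ij), where P_ij is a
    newly decoded boundary pixel and P'_ij its previously decoded neighbor."""
    pairs = list(pairs)
    if not pairs:
        return float("inf")  # assumption: no comparable pixels means no match
    return sum(pixel_distance(p, q) for p, q in pairs) / len(pairs)
```

A small value means that the newly decoded pixels continue smoothly from the pixels already decoded, which is what one expects from the correct next block.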
The carving algorithm proceeds as follows:
Step 1: Using the SOI marker, find a data block in the set of data blocks to be carved. If no such data block can be found, the algorithm terminates; otherwise go to Step 2.
Step 2: Remove this data block from the set of data blocks to be carved. Use a JPEG decoder to decode this data block and the data blocks that immediately follow it until one complete row of MCUs has been decoded. If this cannot be done, abandon carving of the current image and go to Step 1; otherwise continue decoding to the end of the current data block and go to Step 3.
Step 3: Save the current working state of the decoder, i.e. the decoding state and the associated memory regions; this is called the breakpoint. Set k = 1 and go to Step 4.
Step 4: Restore the decoder's working state to the breakpoint. Take the k-th data block from the set of data blocks to be carved, splice it on at the breakpoint, and continue decoding to the end of this data block. Then compute and save the forward matching distance of this data block. If decoding raises a decoder exception, the forward matching distance of this data block is taken to be infinite. Go to Step 5. In addition, if the data contained in the data block fail to yield one complete MCU, abandon carving of the current image and go to Step 1.
Step 5: If k equals the number of elements in the set of data blocks to be carved, i.e. the traversal of the set is complete, find the minimum among all saved forward matching distances, take the corresponding data block as the next correct data block, splice it on after the breakpoint, decode to the end of this data block, and go to Step 6; otherwise set k = k + 1 and go to Step 4.
Step 6: If the termination condition for carving has been reached, output all data blocks obtained so far, in order, as a carved JPEG image file and go to Step 1; otherwise go to Step 3.
Finally, all the output files obtained are the carved JPEG images.
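The control flow of Steps 1 to 6 can be sketched as a greedy search. This is a simplified illustration, not the patented implementation: the JPEG decoder and its breakpoint are abstracted into three caller-supplied callables (`is_start`, `forward_distance`, `is_complete`, all hypothetical names), the initial MCU-row decoding of Step 2 is omitted, and removing each chosen block from the pool is an assumption.

```python
import math
from typing import Callable, List, Optional

Chain = List[bytes]


def carve_one(pool: List[bytes], start: int,
              forward_distance: Callable[[Chain, bytes], float],
              is_complete: Callable[[Chain], bool]) -> Optional[Chain]:
    """Grow one image from pool[start]; return its blocks or None on failure."""
    chain = [pool.pop(start)]                 # Step 2: remove the start block
    while not is_complete(chain):             # Step 6: termination check
        if not pool:
            return None
        # Steps 3-5: score every remaining block from the same breakpoint
        scores = [forward_distance(chain, cand) for cand in pool]
        best = min(range(len(pool)), key=scores.__getitem__)
        if math.isinf(scores[best]):          # every candidate failed to decode
            return None
        chain.append(pool.pop(best))          # splice the best match
    return chain


def carve_all(pool: List[bytes],
              is_start: Callable[[bytes], bool],
              forward_distance: Callable[[Chain, bytes], float],
              is_complete: Callable[[Chain], bool]) -> List[bytes]:
    """Step 1 loop: carve images until no start block remains in the pool."""
    images: List[bytes] = []
    while True:
        start = next((i for i, b in enumerate(pool) if is_start(b)), None)
        if start is None:
            return images
        chain = carve_one(pool, start, forward_distance, is_complete)
        if chain is not None:
            images.append(b"".join(chain))
```

In a real carver, `forward_distance` would restore the decoder to the breakpoint, decode the candidate block, and evaluate formula (3), returning infinity when the decoder raises an exception. Each selection scans the whole pool, so the cost grows roughly with the square of the number of blocks.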
The beneficial effect of the invention is that, based on the content characteristics of JPEG images, a content-based JPEG file carving method is provided. When searching for each data block, the method traverses the entire set and selects the data block that best matches the image decoded so far as the correct data block. It can carve deleted JPEG images on both traditional hard disks and newer NAND flash devices with high accuracy.
Brief description of the drawings
Fig. 1 is a flow chart of the carving process of the invention.
Fig. 2 is a schematic diagram of computing the forward matching distance of a data block. The figure on the right is a JPEG image; the 9 grid cells numbered 1 to 9 are MCUs.
Detailed description of the embodiments
The specific embodiments of the invention are described in detail below with reference to the technical solution and the drawings.
Fig. 1 is a flow chart of the carving process of the invention, and Fig. 2 is a schematic diagram of computing the forward matching distance of a data block. In Fig. 2, the right-hand figure represents a JPEG image made up of an upper and a lower data block; the 9 grid cells represent 9 MCUs. The boundary between the two data blocks falls inside MCU No. 5, i.e. decoding MCU No. 5 requires data from both the upper and the lower data block. The left-hand figure is an enlargement of the right-hand figure, in which each small cell inside an MCU represents one pixel. According to formula (3), the pixels used to compute the forward matching distance of the lower data block are those marked in dark color in the figure.
The experimental device is a solid-state drive containing 1390 JPEG images, all of whose data have been deleted.
First, all data blocks of the solid-state drive are obtained. The drive is attached to a Linux system, and the dd command is used to obtain an image of the specified partition, which contains all the data of that partition. All clusters are extracted from the image to give the original data block set. This set is ordered: the data blocks appear in the same order as on the original storage device.
Second, the original data block set is preprocessed. According to the JPEG standard, the data blocks of the whole set are divided into a JPEG header data block set and a non-JPEG-header data block set. The classification criterion is: if a data block and several data blocks adjacent after it can, in their original order, form an uninterrupted binary data stream that conforms to the JPEG standard and runs from the SOI marker to the SOS marker segment, then every data block containing a byte of this stream belongs to the JPEG header data block set; otherwise, the data block belongs to the non-JPEG-header data block set. From the non-JPEG-header data block set, exclude every data block that satisfies either of the following conditions:
(a) the data stream contains a marker reserved by the JPEG standard: 0xFFF0~0xFFFD or 0xFF02~0xFFBF;
(b) it contains no EOI marker and its entropy, calculated according to formula (1), is less than 5.0.
The JPEG header data block set and the non-JPEG-header data block set are then merged to obtain the reduced set of data blocks to be carved.
Finally, the set of data blocks to be carved is processed according to the following steps:
Step 1: Using the SOI marker, find a data block in the set of data blocks to be carved. If no such data block can be found, the algorithm terminates; otherwise go to Step 2.
Step 2: Remove this data block from the set of data blocks to be carved. Use a JPEG decoder to decode this data block and the data blocks that immediately follow it until one complete row of MCUs has been decoded. If this cannot be done, abandon carving of the current image and go to Step 1; otherwise continue decoding to the end of the current data block and go to Step 3.
Step 3: Save the current working state of the decoder, i.e. the decoding state and the associated memory regions; this is called the breakpoint. Set k = 1 and go to Step 4.
Step 4: Restore the decoder's working state to the breakpoint. Take the k-th data block from the set of data blocks to be carved, splice it on at the breakpoint, and continue decoding to the end of this data block. Then compute and save the forward matching distance of this data block. If decoding raises a decoder exception, the forward matching distance of this data block is taken to be infinite. Go to Step 5. In addition, if the data contained in the data block fail to yield one complete MCU, abandon carving of the current image and go to Step 1.
Step 5: If k equals the number of elements in the set of data blocks to be carved, i.e. the traversal of the set is complete, find the minimum among all saved forward matching distances, take the corresponding data block as the next correct data block, splice it on after the breakpoint, decode to the end of this data block, and go to Step 6; otherwise set k = k + 1 and go to Step 4.
Step 6: If the termination condition for carving has been reached, output all data blocks obtained so far, in order, as a carved JPEG image file and go to Step 1; otherwise go to Step 3.
The output files obtained are the carved JPEG images. The carving results are shown in Table 1, where accuracy means, for the carving of a given image, the ratio of the number of correctly recovered data blocks to the number of original data blocks.
Table 1. Carving results

Accuracy     Number of images    Proportion of images
< 10%        21                  1.51%
10%~50%      95                  6.83%
50%~90%      73                  5.25%
90%~100%     45                  3.24%
100%         1156                83.2%
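As a quick consistency check on Table 1, the image counts can be summed and converted to proportions. The counts below are copied from the table; the variable and function names are illustrative.

```python
# Image counts per accuracy band, copied from Table 1.
counts = {
    "< 10%": 21,
    "10%~50%": 95,
    "50%~90%": 73,
    "90%~100%": 45,
    "100%": 1156,
}

total = sum(counts.values())  # should equal the 1390 images on the test drive
proportions = {band: round(100 * n / total, 2) for band, n in counts.items()}


def accuracy(correct_blocks: int, original_blocks: int) -> float:
    """Per-image accuracy: correctly recovered blocks / original blocks."""
    return correct_blocks / original_blocks


for band, pct in proportions.items():
    print(f"{band:>9}: {pct}%")
```

The counts add up to the 1390 images stated for the experiment, and the computed proportions agree with the table after rounding.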

Claims (1)

1. A content-based image file carving method, characterized in that: the relevant interface provided by the operating system is first used to obtain the ordered set of all data blocks of the storage device to be carved; preprocessing is then performed to obtain the set of data blocks to be carved; finally, carving is performed on the set of data blocks to be carved;
the preprocessing comprises the following steps:
classification: the original data block set is divided into a JPEG header data block set and a non-JPEG-header data block set; the classification criterion is: if a data block and several data blocks adjacent after it can, in their original order, form an uninterrupted binary data stream that conforms to the JPEG standard and runs from the SOI marker to the SOS marker segment, then every data block containing a byte of this stream belongs to the JPEG header data block set; otherwise, the data block belongs to the non-JPEG-header data block set;
then, every data block that satisfies either of the following conditions is excluded from the non-JPEG-header data block set:
(a) the data stream contains a marker reserved by the JPEG standard: 0xFFF0~0xFFFD or 0xFF02~0xFFBF;
(b) it contains no EOI marker and its entropy is less than the threshold 5.0, where the entropy of a data block is defined by formula (1):
$$E = -\sum_{i=0}^{255} p_i \log_2 p_i \qquad (1)$$
where $p_i$ is the frequency with which bytes of value $i$ occur in the data block; finally, the two sets are merged, and the resulting set is called the set of data blocks to be carved;
the process of carving in the set of data blocks to be carved is as follows:
in the YCbCr color space, the distance between two pixels $P_1(Y_1, Cb_1, Cr_1)$ and $P_2(Y_2, Cb_2, Cr_2)$ is defined by formula (2):
$$\mathrm{Distance}(P_1, P_2) = (Y_1 - Y_2)^2 + (Cb_1 - Cb_2)^2 + (Cr_1 - Cr_2)^2 \qquad (2)$$
for the $i$-th data block of a JPEG image file, its forward matching distance is defined by formula (3):
$$\mathrm{ForwardDistance}(i) = \frac{1}{n} \sum_{j=1}^{n} \mathrm{Distance}(P_{i,j}, P'_{i,j}) \qquad (3)$$
where $n$ is the number of pixels summed over and $P_{i,j}$ is the $j$-th pixel that satisfies either of the following conditions:
(a) the data of the MCU containing this pixel come entirely from the $i$-th data block, and the pixel is adjacent to some pixel $P'_{i,j}$ that was obtained before the $i$-th data block was decoded;
(b) the data of the MCU containing this pixel come from both the $i$-th and the $(i-1)$-th data blocks, and the pixel is adjacent to some already obtained pixel $P'_{i,j}$;
Step 1: using the SOI marker, find a data block in the set of data blocks to be carved; if no such data block can be found, the algorithm terminates; otherwise go to Step 2;
Step 2: remove this data block from the set of data blocks to be carved; use a JPEG decoder to decode this data block and the data blocks that immediately follow it until one complete row of MCUs has been decoded; if this cannot be done, abandon carving of the current image and go to Step 1; otherwise continue decoding to the end of the current data block and go to Step 3;
Step 3: save the current working state of the decoder, i.e. the decoding state and the associated memory regions, which is called the breakpoint; set k = 1; go to Step 4;
Step 4: restore the decoder's working state to the breakpoint; take the k-th data block from the set of data blocks to be carved, splice it on at the breakpoint, and continue decoding to the end of this data block; then compute and save the forward matching distance of this data block; if decoding raises a decoder exception, the forward matching distance of this data block is taken to be infinite; go to Step 5; in addition, if the data contained in the data block fail to yield one complete MCU, abandon carving of the current image and go to Step 1;
Step 5: if k equals the number of elements in the set of data blocks to be carved, i.e. the traversal of the set is complete, find the minimum among all saved forward matching distances, take the corresponding data block as the next correct data block, splice it on after the breakpoint, decode to the end of this data block, and go to Step 6; otherwise set k = k + 1 and go to Step 4;
Step 6: if the termination condition for carving has been reached, output all data blocks obtained so far, in order, as a carved JPEG image file and go to Step 1; otherwise go to Step 3;
finally, all the output files obtained are the carved JPEG images.
CN201410229438.7A 2014-05-27 2014-05-27 Content-based image file carving method Expired - Fee Related CN103984608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410229438.7A CN103984608B (en) 2014-05-27 2014-05-27 Content-based image file carving method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410229438.7A CN103984608B (en) 2014-05-27 2014-05-27 Content-based image file carving method

Publications (2)

Publication Number Publication Date
CN103984608A (en) 2014-08-13
CN103984608B (en) 2017-01-04

Family

ID=51276598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410229438.7A Expired - Fee Related CN103984608B (en) 2014-05-27 2014-05-27 Content-based image file carving method

Country Status (1)

Country Link
CN (1) CN103984608B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI775869B (en) * 2017-06-29 2022-09-01 佳能企業股份有限公司 Image capture apparatus and image processing method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102053881A (en) * 2011-01-07 2011-05-11 杭州电子科技大学 Zip file carving recovery method based on contents
CN102053880A (en) * 2011-01-07 2011-05-11 杭州电子科技大学 Rar file carving recovery method based on contents

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
《JPEG文件雕复技术的设计与研究》;董书乐;《中国优秀硕士论文电子期刊网》;20120215;30-46 *
《头部缺失的JPEG文件恢复方法研究》;黄立;《中国优秀硕士论文电子期刊网》;20130615;15-27 *

Also Published As

Publication number Publication date
CN103984608A (en) 2014-08-13

Similar Documents

Publication Publication Date Title
CN103268461A (en) Intranet-extranet physical isolation data exchange method based on QR (quick response) code
US8019171B2 (en) Vision-based compression
CN103379333B (en) The decoding method and its corresponding device of decoding method, video sequence code stream
CN101968796B (en) Method for segmenting bidirectionally and concurrently executed file level variable-length data
CN104504109A (en) Image search method and device
CN102682024A (en) Method for recombining incomplete JPEG file fragmentation
CN104519323A (en) Personnel and vehicle target classification system and method
CN105487942A (en) Backup and remote copy method based on data deduplication
CN102595141A (en) Fractal image compression method based on combination of quad tree and neighborhood searching
CN103838645B (en) Remote difference synthesis backup method based on Hash
US20090278844A1 (en) Method and apparatus for encoding/decoding 3d mesh information including stitching information
CN105068885A (en) JPG fragmented file recovery and reconstruction method
JP2013045364A5 (en)
Zhang et al. Multi-view multi-label active learning for image classification
CN114743630A (en) Medical report generation method based on cross-modal contrast learning
CN103984608B (en) Content-based image file carving method
CN104965835A (en) Method and apparatus for reading and writing files of a distributed file system
Sari et al. A review of graph theoretic and weightage techniques in file carving
CN109918545B (en) Method and device for extracting sensor data
CN105677797B (en) A kind of fragment recombination method based on data similarity in JPEG picture file
CN104394415A (en) Method for distributed decoding of video big data
CN105184185B (en) For detaching storage and the key disks of restoring data and its detaching and restoring data method
CN105164665A (en) Creation of a hierarchical dictionary
CN104978352B (en) The method and client of information processing
CN102890818A (en) Method and device for reassembling joint photographic experts group (JPG) picture fragments based on thumbnail

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170104

Termination date: 20200527