CN104750675B - Method for identifying encrypted files of unknown format - Google Patents

Method for identifying encrypted files of unknown format

Info

Publication number
CN104750675B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510151456.2A
Other languages
Chinese (zh)
Other versions
CN104750675A (en)
Inventor
王继志
杨光
陈丽娟
杨英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Computer Science Center
Original Assignee
Shandong Computer Science Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Computer Science Center
Priority to CN201510151456.2A
Publication of CN104750675A
Application granted
Publication of CN104750675B
Legal status: Active (current)
Anticipated expiration

Landscapes

  • Storage Device Security (AREA)

Abstract

The invention discloses a method for identifying encrypted files of unknown format, comprising the following steps. S1: determine the file of arbitrary format on which encryption identification is to be performed, and designate it as the target file. S2: extract the data in the target file. S3: judge the data extracted in step S2; if the data are judged to be plaintext, output the result that the file is an unencrypted file, and if the data are judged to be encrypted, output the result that the file is an encrypted file. By extracting the data from the target file and judging whether the extracted data are encrypted, the invention can automatically determine whether a file of arbitrary format is encrypted without knowing its file format. It identifies encrypted files effectively and efficiently, and avoids the time and labor cost of manual judgment.

Description

Method for identifying encrypted files of unknown format
Technical field
The present invention relates to the technical field of file identification, and specifically to a method for identifying encrypted files of unknown format.
Background
In computer forensics, suspects often encrypt important criminal evidence before storing it and also change the file format. When investigators obtain a disk on which a suspect has stored such evidence, they need to quickly find the encrypted files among a massive number of files and then crack them with password-cracking methods in order to obtain the evidence.
However, automatically determining whether a file of arbitrary format has been encrypted is not easy. Computer forensics currently uses two approaches. In the first, an investigator judges manually, for example by opening a Word file by hand: if a password is required, the file is encrypted; otherwise it opens directly. The second targets a specific file type. For example, if a Word file is encrypted, an encryption flag in its file header is set to 1, so a program can check whether the flag is 1 and thus determine automatically whether the Word file is encrypted. The first approach is clearly inefficient and laborious, and cannot be used to check massive numbers of files one by one. The second works only for specific file formats: if an attacker deliberately changes the file format, the check is easily fooled and encrypted files can no longer be identified reliably.
Existing methods therefore struggle to determine automatically whether the many files of varied formats encountered in computer forensics are encrypted. A technique is urgently needed that can judge whether a file is encrypted without knowing its file format.
Summary of the invention
To address the above shortcomings, the invention provides a method for identifying encrypted files of unknown format. It can automatically determine whether a file of arbitrary format is encrypted without knowing the file format, identifies encrypted files effectively and efficiently, and avoids the time and labor cost of manual judgment. The invention also provides a data extraction method for files of unknown format and a data encryption determination method.
The technical solution adopted by the invention to solve this technical problem is a method for identifying encrypted files of unknown format, characterized in that it comprises the following steps:
S1: Determine the file of arbitrary format on which encryption identification is to be performed, and designate it as the target file;
S2: Extract the data in the target file;
S3: Judge the data extracted in step S2; if the data are judged to be plaintext, output the result that the file is an unencrypted file; if the data are judged to be encrypted, output the result that the file is an encrypted file.
Extracting the data in the target file comprises the following steps:
S21: Open the target file in binary mode;
S22: Read the content of the target file as a binary byte stream and store the content read in a buffer, until the entire content of the target file has been read;
S23: Close the target file.
Judging the extracted data comprises the following steps:
S31: Calculate the size of the target file's byte stream in the buffer, in bytes, denoted size; then denote the content of the byte stream, from the 1st byte to the size-th byte, as b_1, b_2, ..., b_size;
S32: Convert b_1, b_2, ..., b_size to unsigned integers;
S33: Calculate the mean μ according to the following formula:
$$\mu = \frac{1}{\mathit{size}} \sum_{n=1}^{\mathit{size}} b_n$$
S34: Calculate E, the covariance between adjacent bytes, according to the following formula:
$$E = \frac{1}{\mathit{size}-1} \sum_{n=1}^{\mathit{size}-1} (b_n - \mu)(b_{n+1} - \mu)$$
S35: Calculate the standard deviation σ according to the following formula:
$$\sigma = \sqrt{\frac{1}{\mathit{size}} \sum_{n=1}^{\mathit{size}} (b_n - \mu)^2}$$
S36: Calculate the correlation coefficient R according to the following formula:
$$R = \frac{E}{\sigma^2}$$
S37: Compare R with a preset threshold f; if R < f, judge the target file to be an encrypted file; otherwise, judge the target file to be an unencrypted file.
The threshold f is the correlation between the bytes of a file after encryption.
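As a small worked illustration of steps S33 to S36, using the formulas given in claim 1, consider a hypothetical four-byte stream b = (1, 2, 3, 4), so size = 4. The byte values and the threshold f = 0.1 used at the end are assumptions chosen only to keep the arithmetic short; the patent does not state a value for f.

```latex
\begin{aligned}
\mu    &= \tfrac{1}{4}(1 + 2 + 3 + 4) = 2.5 \\
E      &= \tfrac{1}{3}\big[(-1.5)(-0.5) + (-0.5)(0.5) + (0.5)(1.5)\big]
        = \tfrac{1}{3}(0.75 - 0.25 + 0.75) \approx 0.4167 \\
\sigma &= \sqrt{\tfrac{1}{4}(2.25 + 0.25 + 0.25 + 2.25)} = \sqrt{1.25} \approx 1.118 \\
R      &= \frac{E}{\sigma^2} = \frac{0.4167}{1.25} \approx 0.333
\end{aligned}
```

With the assumed threshold f = 0.1, R is not less than f, so step S37 would judge this steadily increasing stream to be unencrypted.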
The invention also provides a data extraction method for a target file, characterized in that it comprises the following steps:
S21: Open the target file in binary mode;
S22: Read the content of the target file as a binary byte stream and store the content read in a buffer, until the entire content of the target file has been read;
S23: Close the target file.
The target file is the file of arbitrary format designated as requiring encryption identification.
The invention also provides a data encryption determination method for a target file, characterized in that it comprises a process of extracting the data of a file of unknown format and a process of judging the extracted data.
The process of extracting the data of the file of unknown format comprises the following steps:
S21: Open the target file in binary mode;
S22: Read the content of the target file as a binary byte stream and store the content read in a buffer, until the entire content of the target file has been read;
S23: Close the target file.
The process of judging the extracted data comprises the following steps:
S31: Calculate the size of the target file's byte stream in the buffer, in bytes, denoted size; then denote the content of the byte stream, from the 1st byte to the size-th byte, as b_1, b_2, ..., b_size;
S32: Convert b_1, b_2, ..., b_size to unsigned integers;
S33: Calculate the mean μ according to the following formula:
$$\mu = \frac{1}{\mathit{size}} \sum_{n=1}^{\mathit{size}} b_n$$
S34: Calculate E, the covariance between adjacent bytes, according to the following formula:
$$E = \frac{1}{\mathit{size}-1} \sum_{n=1}^{\mathit{size}-1} (b_n - \mu)(b_{n+1} - \mu)$$
S35: Calculate the standard deviation σ according to the following formula:
$$\sigma = \sqrt{\frac{1}{\mathit{size}} \sum_{n=1}^{\mathit{size}} (b_n - \mu)^2}$$
S36: Calculate the correlation coefficient R according to the following formula:
$$R = \frac{E}{\sigma^2}$$
S37: Compare R with a preset threshold f; if R < f, judge the target file to be an encrypted file; otherwise, judge the target file to be an unencrypted file. The threshold f is the correlation between the bytes of a file after encryption.
The target file is the file of arbitrary format designated as requiring encryption identification.
The beneficial effects of the invention are as follows. By extracting the data from the target file and judging whether the extracted data are encrypted, the invention can automatically determine whether a file of arbitrary format is encrypted without knowing its file format. It identifies encrypted files effectively and efficiently, and avoids the time and labor cost of manual judgment.
The invention does not need to know the file format in advance and does not use format information during the determination. It can therefore automatically determine whether any file is encrypted, which facilitates computer forensics by investigators and improves the case-handling efficiency of public security organs.
Brief description of the drawings
The invention is described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of the method of the invention for identifying encrypted files of unknown format;
Fig. 2 is a flowchart of the method of the invention for extracting data from the target file;
Fig. 3 is a flowchart of the method of the invention for determining whether the target file's data are encrypted.
Detailed description
To clearly illustrate the technical features of this solution, the invention is described in detail below by way of a specific embodiment and with reference to the accompanying drawings. The following disclosure provides many different embodiments or examples for implementing different structures of the invention. To simplify the disclosure, the components and arrangements of specific examples are described below. In addition, the invention may repeat reference numerals and/or letters in different examples. This repetition is for simplicity and clarity and does not in itself indicate a relationship between the various embodiments and/or arrangements discussed. Note that the components illustrated in the drawings are not necessarily drawn to scale. Descriptions of well-known components, processing techniques and processes are omitted to avoid unnecessarily limiting the invention.
The main idea of the invention is to treat the unknown file as a byte stream and analyze the autocorrelation between its bytes. An encrypted file exhibits good randomness, whereas the bytes of an unencrypted file are correlated because the content is meaningful. The degree of autocorrelation can therefore serve as the criterion for whether a file is encrypted. The method of the invention for identifying encrypted files of unknown format uses the data extraction method and the data encryption determination method for the target file.
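A brief sketch of why a near-zero correlation is expected for encrypted data, using the quantities E, σ and R defined in claim 1. It rests on an idealizing assumption that the patent does not state: the bytes of a well-encrypted file behave like independent, identically distributed random values with a common mean m.

```latex
\mathbb{E}\big[(b_n - m)(b_{n+1} - m)\big]
  = \mathbb{E}[b_n - m]\;\mathbb{E}[b_{n+1} - m] = 0
\quad\Longrightarrow\quad
E \approx 0, \qquad R = \frac{E}{\sigma^2} \approx 0 .
```

Under this assumption, R for an encrypted file stays close to zero for large files, while meaningful content, whose adjacent bytes tend to vary together, yields a larger R. For a finite file the computed R only fluctuates around zero rather than equalling it, which is presumably why the comparison is made against a preset threshold f rather than against zero itself.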
As shown in Fig. 1, the method of the invention for identifying encrypted files of unknown format comprises the following steps:
S1: Determine the file of arbitrary format on which encryption identification is to be performed, and designate it as the target file;
S2: Extract the data in the target file using the data extraction method for the target file;
S3: Judge the data extracted in step S2 using the data encryption determination method; if the data are judged to be plaintext, output the result that the file is an unencrypted file; if the data are judged to be encrypted, output the result that the file is an encrypted file.
As shown in Fig. 2, the data extraction method of the invention for a target file comprises the following steps:
S21: Open the target file in binary mode;
S22: Read the content of the target file as a binary byte stream and store the content read in a buffer, until the entire content of the target file has been read;
S23: Close the target file.
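The extraction steps S21 to S23 can be sketched in Python as follows. This is a minimal illustration, not the patentees' implementation: the function name `read_target_file` and the `chunk_size` parameter are hypothetical, and reading in fixed-size chunks is only one possible way to fill the buffer.

```python
def read_target_file(path, chunk_size=65536):
    """Return the whole content of the file at `path` as a bytes object.

    `read_target_file` and `chunk_size` are illustrative names, not taken
    from the patent.
    """
    buffer = bytearray()
    # S21: open the target file in binary mode, so no text decoding or
    # newline translation is applied regardless of the file's real format.
    with open(path, "rb") as target:
        # S22: keep reading into the buffer until the file is exhausted.
        while True:
            chunk = target.read(chunk_size)
            if not chunk:  # an empty read means end of file
                break
            buffer.extend(chunk)
    # S23: leaving the with-block closes the target file.
    return bytes(buffer)
```

Because the file is opened in binary mode, the same function applies to any file regardless of its extension or internal format, which is the property the method relies on.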
As shown in Fig. 3, the data encryption determination method of the invention for a target file comprises the following steps:
S31: Calculate the size of the target file's byte stream in the buffer, in bytes, denoted size; then denote the content of the byte stream, from the 1st byte to the size-th byte, as b_1, b_2, ..., b_size;
S32: Convert b_1, b_2, ..., b_size to unsigned integers;
S33: Calculate the mean μ according to the following formula:
$$\mu = \frac{1}{\mathit{size}} \sum_{n=1}^{\mathit{size}} b_n$$
S34: Calculate E, the covariance between adjacent bytes, according to the following formula:
$$E = \frac{1}{\mathit{size}-1} \sum_{n=1}^{\mathit{size}-1} (b_n - \mu)(b_{n+1} - \mu)$$
S35: Calculate the standard deviation σ according to the following formula:
$$\sigma = \sqrt{\frac{1}{\mathit{size}} \sum_{n=1}^{\mathit{size}} (b_n - \mu)^2}$$
S36: Calculate the correlation coefficient R according to the following formula:
$$R = \frac{E}{\sigma^2}$$
S37: Compare R with a preset threshold f; if R < f, judge the target file to be an encrypted file; otherwise, judge the target file to be an unencrypted file. The threshold f is the correlation between the bytes of a file after encryption.
The target file in the above methods is the file of arbitrary format designated as requiring encryption identification.
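The determination steps S31 to S37 can be sketched in Python as follows, using the formulas given in claim 1. This is a hedged illustration rather than the patentees' code: the function names `lag1_correlation` and `is_encrypted` are hypothetical, the patent gives no numeric value for the threshold f, and the error handling for very short or constant byte streams is an added assumption, since the patent does not say what to do when σ is zero.

```python
import math


def lag1_correlation(data):
    """Compute R = E / sigma^2 for a bytes object (steps S31 to S36)."""
    size = len(data)  # S31: size of the byte stream, in bytes
    if size < 2:
        # Assumption: the formula for E divides by size - 1.
        raise ValueError("at least two bytes are required")
    values = list(data)  # S32: iterating bytes yields unsigned ints 0..255
    mu = sum(values) / size  # S33: mean
    # S34: covariance between each byte and its successor
    e = sum((values[n] - mu) * (values[n + 1] - mu)
            for n in range(size - 1)) / (size - 1)
    # S35: standard deviation
    sigma = math.sqrt(sum((b - mu) ** 2 for b in values) / size)
    if sigma == 0.0:
        # Assumption: a constant byte stream leaves R undefined.
        raise ValueError("constant byte stream: R is undefined")
    return e / sigma ** 2  # S36: correlation coefficient


def is_encrypted(data, f):
    """S37: judge the data encrypted when R is below the preset threshold f."""
    return lag1_correlation(data) < f
```

A caller would pass the buffer produced by the extraction steps together with a threshold chosen in advance, for example `is_encrypted(buffer, f=0.1)`, where 0.1 is only a placeholder value.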
The above is only a preferred embodiment of the invention. Those of ordinary skill in the art may make improvements and modifications without departing from the principles of the invention, and such improvements and modifications are also regarded as falling within the scope of protection of the invention.

Claims (5)

1. A method for identifying encrypted files of unknown format, characterized in that it comprises the following steps:
S1: Determine the file of arbitrary format on which encryption identification is to be performed, and designate it as the target file;
S2: Extract the data in the target file;
S3: Judge the data extracted in step S2; if the data are judged to be plaintext, output the result that the file is an unencrypted file; if the data are judged to be encrypted, output the result that the file is an encrypted file;
Extracting the data in the target file comprises the following steps:
S21: Open the target file in binary mode;
S22: Read the content of the target file as a binary byte stream and store the content read in a buffer, until the entire content of the target file has been read;
S23: Close the target file;
Judging the extracted data comprises the following steps:
S31: Calculate the size of the target file's byte stream in the buffer, in bytes, denoted size; then denote the content of the byte stream, from the 1st byte to the size-th byte, as b_1, b_2, ..., b_size;
S32: Convert b_1, b_2, ..., b_size to unsigned integers;
S33: Calculate the mean μ according to the following formula:
$$\mu = \frac{1}{\mathit{size}} \sum_{n=1}^{\mathit{size}} b_n;$$
S34: Calculate E, the covariance between adjacent bytes, according to the following formula:
$$E = \frac{1}{\mathit{size}-1} \sum_{n=1}^{\mathit{size}-1} (b_n - \mu)(b_{n+1} - \mu);$$
S35: Calculate the standard deviation σ according to the following formula:
$$\sigma = \sqrt{\frac{1}{\mathit{size}} \sum_{n=1}^{\mathit{size}} (b_n - \mu)^2};$$
S36: Calculate the correlation coefficient R according to the following formula:
$$R = \frac{E}{\sigma^2};$$
S37: Compare R with a preset threshold f; if R < f, judge the target file to be an encrypted file; otherwise, judge the target file to be an unencrypted file.
2. The method for identifying encrypted files of unknown format according to claim 1, characterized in that the threshold f is the correlation between the bytes of a file after encryption.
3. A data encryption determination method for a target file, characterized in that it comprises a process of extracting the data of a file of unknown format and a process of judging the extracted data;
The process of judging the extracted data comprises the following steps:
S31: Calculate the size of the target file's byte stream in the buffer, in bytes, denoted size; then denote the content of the byte stream, from the 1st byte to the size-th byte, as b_1, b_2, ..., b_size;
S32: Convert b_1, b_2, ..., b_size to unsigned integers;
S33: Calculate the mean μ according to the following formula:
$$\mu = \frac{1}{\mathit{size}} \sum_{n=1}^{\mathit{size}} b_n;$$
S34: Calculate E, the covariance between adjacent bytes, according to the following formula:
$$E = \frac{1}{\mathit{size}-1} \sum_{n=1}^{\mathit{size}-1} (b_n - \mu)(b_{n+1} - \mu);$$
S35: Calculate the standard deviation σ according to the following formula:
$$\sigma = \sqrt{\frac{1}{\mathit{size}} \sum_{n=1}^{\mathit{size}} (b_n - \mu)^2};$$
S36: Calculate the correlation coefficient R according to the following formula:
$$R = \frac{E}{\sigma^2};$$
S37: Compare R with a preset threshold f; if R < f, judge the target file to be an encrypted file; otherwise, judge the target file to be an unencrypted file. The threshold f is the correlation between the bytes of a file after encryption.
4. The data encryption determination method for a target file according to claim 3, characterized in that the process of extracting the data of the file of unknown format comprises the following steps:
S21: Open the target file in binary mode;
S22: Read the content of the target file as a binary byte stream and store the content read in a buffer, until the entire content of the target file has been read;
S23: Close the target file.
5. The data encryption determination method for a target file according to claim 3 or 4, characterized in that the target file is the file of arbitrary format designated as requiring encryption identification.
CN201510151456.2A 2015-04-01 2015-04-01 Method for identifying encrypted files of unknown format Active CN104750675B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510151456.2A CN104750675B (en) 2015-04-01 2015-04-01 Method for identifying encrypted files of unknown format

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510151456.2A CN104750675B (en) 2015-04-01 2015-04-01 Method for identifying encrypted files of unknown format

Publications (2)

Publication Number Publication Date
CN104750675A CN104750675A (en) 2015-07-01
CN104750675B true CN104750675B (en) 2017-09-26

Family

ID=53590387

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510151456.2A Active CN104750675B (en) 2015-04-01 2015-04-01 Method for identifying encrypted files of unknown format

Country Status (1)

Country Link
CN (1) CN104750675B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100631B (en) * 2020-08-11 2022-09-06 福建天泉教育科技有限公司 Processing method and terminal for judging encryption of PPTX (Power Point X) document

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567670A (en) * 2011-12-28 2012-07-11 南京邮电大学 Filter drive encryption implementing method for file system
CN103034815A (en) * 2011-09-30 2013-04-10 北大方正集团有限公司 Detection method and device for portable document format (PDF) file
CN103500294A (en) * 2013-09-23 2014-01-08 北京荣之联科技股份有限公司 Document encrypting and decrypting method and device
CN104113601A (en) * 2014-07-29 2014-10-22 深圳市中兴移动通信有限公司 File transfer method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050289639A1 (en) * 2004-06-23 2005-12-29 Leung Wai K System and method of securing the management of documentation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103034815A (en) * 2011-09-30 2013-04-10 北大方正集团有限公司 Detection method and device for portable document format (PDF) file
CN102567670A (en) * 2011-12-28 2012-07-11 南京邮电大学 Filter drive encryption implementing method for file system
CN103500294A (en) * 2013-09-23 2014-01-08 北京荣之联科技股份有限公司 Document encrypting and decrypting method and device
CN104113601A (en) * 2014-07-29 2014-10-22 深圳市中兴移动通信有限公司 File transfer method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"A discussion of an encrypted-file technique" (一种加密文件技术的探讨); 王义; 《信息技术与网络服务》; 2003-12-31 (No. 31); p. 37, paragraphs 1-2 *

Also Published As

Publication number Publication date
CN104750675A (en) 2015-07-01

Similar Documents

Publication Publication Date Title
US20150134971A1 (en) Apparatus and method for decrypting encrypted file
Indrayani et al. Increasing the security of mp3 steganography using AES Encryption and MD5 hash function
CN111224946A (en) TLS encrypted malicious traffic detection method and device based on supervised learning
CN102073829B (en) Document encrypting method and document decrypting method on basis of voice print
CN111030941A (en) Decision tree-based HTTPS encrypted flow classification method
CN112217763A (en) Hidden TLS communication flow detection method based on machine learning
CN104506322B (en) A kind of certification of exam inee&#39;s identity data compression encryption method and decryption method
CN107368592B (en) Text feature model modeling method and device for network security report
CN104168435B (en) The method and system that a kind of audio file batch merges and played
CN103942500B (en) Hash ciphertext re-encryption method based on noise and decryption method after re-encryption
CN107665164A (en) Secure data detection method and device
Qiu et al. A new approach to multimedia files carving
CN104750675B (en) A kind of unknown format encrypts the recognition methods of file
Thakar et al. Next generation digital forensic investigation model (NGDFIM)-enhanced, time reducing and comprehensive framework
CN113904861A (en) Encrypted flow security detection method and device
CN112134829B (en) Method and device for generating encrypted traffic feature set
WO2015196642A1 (en) Data encryption method, decryption method and device
CN112615714B (en) Side channel analysis method, device, equipment and storage medium
CN109598489A (en) A kind of method, apparatus and system of the storage of digital wallet mnemonic word
Kumar et al. SIGNIFICANCE of hash value generation in digital forensic: A case study
Zhao et al. Block cipher identification scheme based on Hamming weight distribution
CN112702157A (en) Block cipher system identification method based on improved random forest algorithm
Weerasinghe Secrecy and performance analysis of symmetric key encryption algorithms
CN113141349B (en) HTTPS encrypted flow classification method with self-adaptive fusion of multiple classifiers
CN109255225A (en) Hard disc data security control apparatus based on dual-identity authentication

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant