CN1920877A - Statistic supervision and structure supervision based hidden messages analysis system - Google Patents

Publication number
CN1920877A
Authority
CN
China
Prior art keywords
detection
algorithm
writing
image
latent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2006101131852A
Other languages
Chinese (zh)
Other versions
CN100507943C (en)
Inventor
陈铭
饶华一
史亚维
张茹
钮心忻
杨义先
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CNB2006101131852A (patent CN100507943C)
Publication of CN1920877A
Application granted
Publication of CN100507943C
Legal status: Expired - Fee Related
Anticipated expiration

Abstract

The invention relates to a steganalysis system based on statistical detection and structure detection. The core detection module of the system integrates six statistical detection modules and two structure detection modules. It targets steganographic techniques such as spatial-domain LSB steganography, BPCS steganography, and structure steganography, in order to reliably detect stego files that use common images and audio as carriers, such as BMP images, JPEG images, and WAVE audio. The invention combines the detection results of several detection modules to reach a decision, adapts well to current steganographic methods, overcomes the low average detection rate and high false alarm rate of a single module, and detects multiple carrier types and multiple steganographic methods with high quality and high efficiency. The invention is also extensible: new detection modules can be integrated to upgrade the detection performance of the system.

Description

A steganalysis system based on statistical detection and structure detection
Technical field
The present invention relates to the field of information hiding detection. Specifically, it proposes a steganalysis system based on statistical detection and structure detection. Common steganographic techniques mainly include spatial-domain LSB steganography, frequency-domain steganography, BPCS steganography, and structure steganography. The present invention is directed at the detection of spatial-domain LSB steganography, BPCS steganography, and structure steganography.
Background art
Since 2002, popular steganography software, represented by Camera Shy, has sprung up like bamboo shoots after rain and has been continually updated and extended, and other similar products built on information hiding techniques have also appeared rapidly. Steganography and cryptography are equally important in the field of information security. Steganography is a double-edged sword: if it is not studied in depth and in good time from both the offensive and the defensive side, illegal use of the technology will endanger national information security. To strengthen national and public security, developing efficient detection tools for popular steganography software as early as possible is very significant work for China in discovering and monitoring covert communication channels, and it is both a challenge and an opportunity.
Only with a full understanding of how steganography software is used can its weaknesses be analyzed and the goals of fast detection, blocking, and discovery be reached. The carriers used by current steganography software are overwhelmingly images and audio, so research on detection algorithms should also be based on these two carriers. The hiding algorithms for the two carriers are very similar (for example, both use spatial-domain and frequency-domain algorithms), and so the detection techniques are also very similar.
Image-based hiding algorithms fall into five main classes: LSB-based hiding algorithms, DCT-domain hiding algorithms, palette-based hiding algorithms, hiding algorithms based on visual characteristics, and some special hiding algorithms. Image formats include JPEG, GIF, and BMP. In practice, LSB hiding and DCT-domain hiding are the most widely used.
Audio-based hiding algorithms are mainly divided into time-domain and frequency-domain hiding algorithms. The audio formats currently popular on the network are mainly WAV, MP3, and MIDI. Both the time-domain and the frequency-domain hiding algorithms apply to the WAV format, whereas MP3 and MIDI require special hiding algorithms because of the particularities of their file formats. The audio hiding algorithm most widely used on the network at present is time-domain LSB.
In addition, structure steganography based on the file format is also widely used at present: according to the file format of the image or audio, extra secret information can be inserted into reserved fields of the file header or after the end of the file. Secret information placed in the file format does not affect normal use of the carrier file, and it also defeats detection methods based on the statistical features of the carrier file.
At present, although a considerable number of research institutions and security companies have studied the detection of steganography tools, the results are unsatisfactory. There are three reasons:
1) Steganography is used only by a small number of important, sensitive users.
2) Multimedia formats on the Internet are diverse, and most of the detection algorithms now studied internationally and domestically are general statistical attacks. Although these algorithms are widely applicable, their detection accuracy is not good enough for practical use.
3) Research on steganography tool detection currently focuses mainly on open-source steganography software, or on software released by academic institutions and individual hobbyists, yet almost no users actually transmit secret information with such software. For many popular steganography programs, analysis is so difficult that few people can propose an accurate detection method, yet these are exactly the programs that users are likely to use frequently as secret communication tools.
Summary of the invention
The technical problem to be solved by the present invention is to provide a steganalysis system based on statistical detection and structure detection that achieves high-quality detection of stego files hidden with mainstream steganographic techniques in common image and audio carriers. The system has a high correct detection rate and wide applicability, and its detection efficiency is high enough to meet the requirements of high-speed network transmission.
To achieve this, the invention provides a steganalysis system based on statistical detection and structure detection. The core idea of the system is to integrate six different statistical detection modules and two different structure detection modules and, at detection time, to apply a fusion technique: an image or audio file is examined by several modules, and the combined result decides whether hidden information is present. There are two modes of operation:
The first mode detects a small number of suspect files. It only displays the detection results and does not generate a detection report. The steps are as follows:
1) Browse and select a batch of files to be detected. The selectable file types are BMP images, JPEG images, and WAVE audio.
2) Select one or more statistical detection modules from the RS, SPA, DIH, BPCS, chi-square, and LSM detection modules, and set the threshold. The threshold is a number between 0 and 1; the default is 0.05.
3) Select one or more structure detection modules from the appended-data detection module and the format detection module.
4) The detection results are displayed on the right side of the interface as the stego file list, which contains the names of the detected suspect files and the amount of hidden information estimated by the selected detection modules. The scan status area shows the progress of detection, including the current scan path, the current file name, the number of files scanned, the number of stego files detected, and the scan speed.
The second mode scans a folder that contains a batch of suspect files. It displays the detection results and can also generate a detection report at a specified path. The steps are as follows:
1) Browse and select the scan path, that is, the folder that holds the files to be detected. All files of the specified formats under this path will be examined.
2) Browse and select the storage path, that is, where the scan report document will be generated.
3) Select the file types to be detected. The selectable file types are BMP images, JPEG images, and WAVE audio.
4) Select the statistical detection modules to use from the RS, SPA, DIH, BPCS, chi-square, and LSM detection modules, and set the threshold. The threshold lies in the range (0, 1); the default is 0.05.
5) Select the structure detection modules to use from the appended-data detection module and the format detection module.
6) Once the scan options are set, start the scan. The specified detection modules are run on the files of the specified types in the directory to be detected. The scan can be paused or stopped while it is running.
7) When the scan ends, the detection results are displayed on the right side of the interface as the stego file list, which contains the names of the detected suspect files and the amount of hidden information estimated by the selected detection modules. The scan status area shows the progress of detection, including the current scan path, the current file name, the number of files scanned, the number of suspect files detected, and the scan speed. A detection report is generated at the same time. It contains the scan time, the path, the file types examined, the files scanned, and the steganography detection results for the suspect files.
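The folder-scan mode can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `scan_folder`, the extension table `SUPPORTED`, and the idea that each detection module is a callable returning an estimated embedding amount are all hypothetical and are not taken from the patent.

```python
import os

# Hypothetical mapping from the three carrier types named in the text
# to file extensions; the exact extensions are an assumption.
SUPPORTED = {"bmp": (".bmp",), "jpeg": (".jpg", ".jpeg"), "wave": (".wav",)}


def scan_folder(scan_path, file_types, detectors, threshold=0.05):
    """Walk scan_path and run every selected detector on each matching file.

    detectors maps a module name to a hypothetical callable(path) -> float
    (the estimated amount of hidden information).
    Returns (suspects, scanned), where suspects is a list of
    (path, {module name: estimate}) for files flagged by any module.
    """
    wanted = tuple(ext for t in file_types for ext in SUPPORTED[t])
    suspects, scanned = [], 0
    for folder, _dirs, names in os.walk(scan_path):
        for name in sorted(names):
            if not name.lower().endswith(wanted):
                continue  # not one of the selected file types
            scanned += 1
            path = os.path.join(folder, name)
            estimates = {mod: fn(path) for mod, fn in detectors.items()}
            # A file is listed as suspect when any module exceeds the threshold.
            if any(value > threshold for value in estimates.values()):
                suspects.append((path, estimates))
    return suspects, scanned
```

The returned pair corresponds loosely to the stego file list and the scanned-file counter shown in the interface; pause and stop control and report writing are left out of the sketch.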
The statistical and structure detection modules above are characterized in that the first and second modes of operation use the same detection modules and framework. The core of the system is six mutually independent statistical detection modules with different detection principles and two structure detection modules with different targets, so that files hidden with mainstream steganography tools in common media formats can be detected reliably. Using several detection modules of the same kind together with the fusion technique effectively improves detection performance and meets the real-time network requirement of processing several media types simultaneously.
The core detection module comprises six statistical detection modules (the RS, SPA, DIH, BPCS, chi-square, and LSM detection modules) and two structure detection modules (appended-data detection and format detection).
The statistical and structure detection modules above are characterized in that, of the eight core detection modules, five target spatial-domain LSB steganography. The first, the RS detection module, is described as follows. Its basic idea is to construct a higher-order statistic of the image, the sum of gray-level differences between several pixels, and to exploit the change in this sum caused by LSB replacement. The relationship between this change and the size of the embedded message is modeled as a quadratic equation, and the size of the embedded message is estimated from a root of the equation.
First the image pixels are divided into groups. For a pixel group $G = (x_1, x_2, \ldots, x_n)$, a discrimination function $f(x_1, x_2, \ldots, x_n) \in \mathbb{R}$ is defined to describe the correlation within the group:
$$f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n-1} |x_{i+1} - x_i|$$
The value of the discrimination function describes the strength of the correlation between pixels: the larger the value, the weaker the correlation, and vice versa. LSB replacement increases the randomness of the LSB plane of the pixel values, so it increases the value of the discrimination function. Three mapping rules are defined on the pixel values in a group:
$F_1$: $0 \leftrightarrow 1,\ 2 \leftrightarrow 3,\ \ldots,\ 254 \leftrightarrow 255$;
$F_{-1}$: $-1 \leftrightarrow 0,\ 1 \leftrightarrow 2,\ \ldots,\ 255 \leftrightarrow 256$;
$F_0$: $0 \leftrightarrow 0,\ 1 \leftrightarrow 1,\ \ldots,\ 255 \leftrightarrow 255$.
Mapping $F_1$ is the exchange of pixel-value pairs that corresponds to LSB replacement, $F_{-1}$ is regarded as the dual of $F_1$, and $F_0$ maps every pixel value to itself. According to the three mapping rules, pixel groups are divided into three types:
[Figure A20061011318500072: definition of the three group types, not reproduced in this text. In standard RS analysis these are regular groups, $f(F(G)) > f(G)$; singular groups, $f(F(G)) < f(G)$; and unusable groups, $f(F(G)) = f(G)$.]
Here $F(G)$ denotes the result of mapping the pixels of $G$ under $F$. An $n$-tuple mask $M = (M_1, M_2, \ldots, M_n)$ is also defined, with $M_i \in \{0, \pm 1\}$ ($0 < i \le n$), and further:
$$F(G) = \left(F_{M_1}(x_1), F_{M_2}(x_2), \ldots, F_{M_n}(x_n)\right)$$
Let $R_M$ and $S_M$ denote the proportions of regular groups and singular groups among all pixel groups under mask $M$. In general, applying the mapping to the pixel values increases the value of the discrimination function, that is, $R_M \gg S_M$, so that $R_M + S_M \le 1$ and $R_{-M} + S_{-M} \le 1$. RS analysis makes an empirical assumption:
$$R_M \cong R_{-M}, \qquad S_M \cong S_{-M}$$
Experiments show that the curves of $R_{-M}$ and $S_{-M}$ as functions of $p$ can be fitted by straight lines, and the curves of $R_M$ and $S_M$ as functions of $p$ can be fitted by quadratic curves. For $S_M$ and $R_M$, $S_M(\tfrac{1}{2}) = R_M(\tfrac{1}{2})$. A quadratic equation can therefore be set up:
$$2(d_1 + d_0)z^2 + (d_{-0} - d_{-1} - d_1 - 3d_0)z + (d_0 - d_{-0}) = 0$$
where $d_0 = R_M(p/2) - S_M(p/2)$, $d_1 = R_M(1 - p/2) - S_M(1 - p/2)$, $d_{-0} = R_{-M}(p/2) - S_{-M}(p/2)$, and $d_{-1} = R_{-M}(1 - p/2) - S_{-M}(1 - p/2)$. Finally, the message length $p$ is estimated from the root $z$ of the equation: $p = \dfrac{z}{z - 0.5}$.
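The grouping and flipping steps of the RS module can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, the labels "R", "S", and "U" follow the regular, singular, and unusable groups of standard RS analysis, and grouping a flat pixel sequence into consecutive non-overlapping groups is an assumption.

```python
def discrimination(group):
    """Sum of absolute differences between neighbouring pixels in the group."""
    return sum(abs(b - a) for a, b in zip(group, group[1:]))


def flip(x, m):
    """Apply F_1 (m = 1), F_-1 (m = -1) or the identity F_0 (m = 0)."""
    if m == 1:
        return x ^ 1              # 0<->1, 2<->3, ..., 254<->255
    if m == -1:
        return ((x + 1) ^ 1) - 1  # -1<->0, 1<->2, ..., 255<->256
    return x


def classify(group, mask):
    """Return 'R', 'S' or 'U' depending on how flipping changes f."""
    flipped = [flip(x, m) for x, m in zip(group, mask)]
    before, after = discrimination(group), discrimination(flipped)
    if after > before:
        return "R"
    if after < before:
        return "S"
    return "U"


def rs_proportions(pixels, mask):
    """Proportions (R_M, S_M) over consecutive groups of len(mask) pixels."""
    n = len(mask)
    groups = [pixels[i:i + n] for i in range(0, len(pixels) - n + 1, n)]
    if not groups:
        raise ValueError("need at least one full pixel group")
    labels = [classify(g, mask) for g in groups]
    return labels.count("R") / len(groups), labels.count("S") / len(groups)
```

Calling `rs_proportions` with the mask and with its negation gives the four proportions from which the differences $d_0$ and $d_{-0}$ are formed; repeating the measurement after flipping all LSBs of the image gives $d_1$ and $d_{-1}$.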
The statistical and structure detection modules above are characterized in that, of the eight core detection modules, five target spatial-domain LSB steganography. The second, the SPA detection module, is described as follows. This module treats all pixel pairs of the image as a set $P$ and divides $P$ into mutually disjoint basic sets $X$, $Y$, and $Z$:
$X = \{(u, v) \mid (u, v) \in P,\ u < v \text{ and } v \text{ even, or } u > v \text{ and } v \text{ odd}\}$;
$Y = \{(u, v) \mid (u, v) \in P,\ u > v \text{ and } v \text{ even, or } u < v \text{ and } v \text{ odd}\}$;
$Z = \{(u, v) \mid (u, v) \in P,\ u = v\}$.
The basic set $Y$ is further divided into two subsets $W$ and $V$:
$W = \{(u, v) \mid (u, v) \in P,\ u = 2k + 1,\ v = 2k \text{ or } u = 2k,\ v = 2k + 1\}$; $V = Y - W$.
The changes that LSB replacement makes to a pixel pair are defined as four modification patterns:
1) $u$ and $v$ both remain unchanged: pattern 00;
2) only $u$ changes: pattern 10;
3) only $v$ changes: pattern 01;
4) $u$ and $v$ both change: pattern 11.
According to the definition of the basic sets, the way the basic sets are composed under these patterns is as follows:
[Figure A20061011318500086, not reproduced in this text]
[Figure A20061011318500091, not reproduced in this text]
For any subset $A$ of $P$, the probability that the pixel values in $A$ are modified by LSB replacement according to pattern $\pi$ is defined as $\rho(\pi, A)$, where $\pi \in \{00, 01, 10, 11\}$. With $p$ denoting the proportion of embedded message, the probabilities of modification are:
$$\rho(00, P) = (1 - p/2)^2; \qquad \rho(01, P) = \rho(10, P) = \frac{p}{2}\left(1 - \frac{p}{2}\right); \qquad \rho(11, P) = (p/2)^2$$
Two assumptions are made on the basis of these definitions:
1) The basic sets $X$ and $Y$ have the same cardinality, that is, $|X| = |Y|$;
2) For any set of pixel pairs $A \in \{X, V, W, Z\}$ and any $\pi \in \{00, 01, 10, 11\}$, $\rho(\pi, A) = \rho(\pi, P)$.
Assumption 1 holds for natural images, because the gradient of the pixel values of a natural image is equally likely to be positive or negative in any direction. Assumption 2 means that the embedded message is scattered randomly over the whole image, so that the distribution of the message is independent of the gray-level distribution of the image. LSB replacement moves pixel pairs among the four basic sets and so changes their cardinalities. As in the RS detection module, this module describes the change in the cardinalities before and after embedding in terms of the probabilities that pixel values are modified. With $X'$, $V'$, $W'$, and $Z'$ denoting the basic sets of the stego image after embedding, the following equations can be set up:
$$|X'| = \left(1 - \frac{p}{2}\right)|X| + \frac{p}{2}|V|; \qquad |V'| = \left(1 - \frac{p}{2}\right)|V| + \frac{p}{2}|X|; \qquad |W'| = \left(1 - p + \frac{p^2}{2}\right)|W| + p\left(1 - \frac{p}{2}\right)|Z|.$$
Here $|X'| + |V'| + |W'| + |Z'| = |P|$. Simplifying with the two basic assumptions gives $0.5\gamma p^2 + (2|X'| - |P|)p + |Y'| - |X'| = 0$, where $\gamma = |W'| + |Z'| = |W| + |Z|$. All parameters in this equation can be obtained from the stego image, and the root of the equation with the smaller absolute value is the estimate of $p$.
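The SPA partition and the final quadratic can be sketched in a few lines of Python. The sketch assumes that the caller supplies the pixel pairs (for example horizontally adjacent pixels, which is an assumption; the text does not say how pairs are formed), and the function names are hypothetical.

```python
import math


def spa_counts(pairs):
    """Count the basic sets X, Y, Z and the subsets W, V of Y for pixel pairs."""
    c = {"P": 0, "X": 0, "Y": 0, "Z": 0, "W": 0, "V": 0}
    for u, v in pairs:
        c["P"] += 1
        if u == v:
            c["Z"] += 1
        elif (u < v) == (v % 2 == 0):
            c["X"] += 1           # u < v with v even, or u > v with v odd
        else:
            c["Y"] += 1           # u > v with v even, or u < v with v odd
            if u // 2 == v // 2:
                c["W"] += 1       # pairs of the form (2k, 2k+1) or (2k+1, 2k)
            else:
                c["V"] += 1
    return c


def spa_estimate(c):
    """Solve 0.5*gamma*p^2 + (2|X'| - |P|)*p + |Y'| - |X'| = 0 for p.

    c holds the counts measured on the image under test. Returns the root
    with the smaller absolute value, or None if there is no real root.
    """
    a = 0.5 * (c["W"] + c["Z"])
    b = 2 * c["X"] - c["P"]
    k = c["Y"] - c["X"]
    if a == 0:
        return None if b == 0 else -k / b
    disc = b * b - 4 * a * k
    if disc < 0:
        return None
    r = math.sqrt(disc)
    return min((-b + r) / (2 * a), (-b - r) / (2 * a), key=abs)
```

For example, counts with $|P| = 200$, $|X'| = 75$, $|Y'| = 84$, and $\gamma = 50$ give $25p^2 - 50p + 9 = 0$, whose smaller root is $p = 0.2$.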
The statistical and structure detection modules above are characterized in that, of the eight core detection modules, five target spatial-domain LSB steganography. The third, the DIH detection module, is described as follows. This module uses the difference between how the difference histograms of a carrier image and of a stego image change in order to find traces of hidden information, and then estimates the size of the hidden information.
The difference histograms of the original carrier image and of the stego image do not differ much in outline, and both are fitted well by a generalized Gaussian model. The difference is that the shape parameter of the generalized Gaussian model tends to increase as the length of the embedded secret message increases. If the LSB plane of the image is set to zero or flipped, the difference histogram of a carrier image changes markedly and its outline no longer keeps the approximately generalized Gaussian shape, while the difference histogram of a stego image hardly changes. This statistical difference is the basis of the DIH detection module.
Let $h_i$ be the difference histogram of the image under test, and let $g_i$ and $f_i$ be the difference histograms after the LSB plane of the image is set to zero and flipped, respectively. Analysis shows that $h_i$, $g_i$, and $f_i$ satisfy the following relations, where the $a_{2i,j}$ are the coefficients that relate $g_{2i}$ to $h_j$: $h_{2i} = f_{2i} = a_{2i,2i}\,g_{2i}$, $h_{2i+1} = a_{2i,2i+1}\,g_{2i} + a_{2i+2,2i+1}\,g_{2i+2}$, $f_{2i+1} = a_{2i,2i-1}\,g_{2i} + a_{2i+2,2i+3}\,g_{2i+2}$.
Combined with the symmetry assumption $a_{0,1} \approx a_{0,-1}$, this gives the recursion:
$$a_{0,1} = \frac{g_0 - h_0}{2g_0}, \qquad a_{2i,2i} = \frac{h_{2i}}{g_{2i}}, \qquad a_{2i,2i-1} = \frac{h_{2i-1} - a_{2i-2,2i-1}\,g_{2i-2}}{g_{2i}}, \qquad a_{2i,2i+1} = 1 - a_{2i,2i} - a_{2i,2i-1}.$$
The following statistics are constructed: $$\alpha_i = \frac{a_{2i+2,2i+1}}{a_{2i,2i+1}}, \qquad \beta_i = \frac{a_{2i+2,2i+3}}{a_{2i,2i-1}}, \qquad \gamma_i = \frac{g_{2i}}{g_{2i+2}}$$
For a natural image, $a_{2i,2i+1}\,g_{2i} \approx a_{2i+2,2i+1}\,g_{2i+2}$. As the length of the embedded message increases, the value of $\alpha_i$ decreases, and when the LSB plane of the image is 100% embedded, that is, when $p = 1$, $\alpha_i$ has decreased to 1. This is the basis of the analysis. Statistical tests show that the relationship between $\alpha_i$ and $p$ can be fitted by a quadratic polynomial, $y = ax^2 + bx + c$. Using the four key points $P_1 = (0, \gamma_i)$, $P_2 = (p, \alpha_i)$, $P_3 = (1, 1)$, and $P_4 = (2 - p, \beta_i)$ gives the relations:
$$c = \gamma_i, \qquad ap^2 + bp + c = \alpha_i, \qquad a(2 - p)^2 + b(2 - p) + c = \beta_i, \qquad a + b + c = 1$$
Let $d_1 = 1 - \gamma_i$, $d_2 = \alpha_i - \gamma_i$, and $d_3 = \beta_i - \gamma_i$. Simplifying with the constraints above gives $2d_1p^2 + (d_3 - 4d_1 - d_2)p + 2d_2 = 0$. The root of this equation with the smaller absolute value is the embedding ratio $p$ to be estimated. If the discriminant of the equation is less than zero, check whether $\alpha_i \approx \beta_i \approx 1$ holds; if it does, the embedding ratio is $p \approx 1$.
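A minimal sketch of the DIH measurements follows. It assumes differences are taken between horizontally adjacent pixels, reads the garbled definitions as $d_2 = \alpha_i - \gamma_i$ and $d_3 = \beta_i - \gamma_i$ (which is what the four key points imply), and uses a hypothetical tolerance `tol` for the test $\alpha_i \approx \beta_i \approx 1$. All function names are illustrative.

```python
import math
from collections import Counter


def difference_histogram(rows):
    """Histogram of differences between horizontally adjacent pixels."""
    hist = Counter()
    for row in rows:
        for a, b in zip(row, row[1:]):
            hist[b - a] += 1
    return hist


def dih_histograms(rows):
    """Return (h, g, f): histograms of the image, LSB-zeroed and LSB-flipped."""
    zeroed = [[x & ~1 for x in row] for row in rows]
    flipped = [[x ^ 1 for x in row] for row in rows]
    return (difference_histogram(rows),
            difference_histogram(zeroed),
            difference_histogram(flipped))


def dih_estimate(alpha, beta, gamma, tol=0.05):
    """Solve 2*d1*p^2 + (d3 - 4*d1 - d2)*p + 2*d2 = 0 for the embedding ratio."""
    d1, d2, d3 = 1 - gamma, alpha - gamma, beta - gamma
    a, b, c = 2 * d1, d3 - 4 * d1 - d2, 2 * d2
    if a == 0:
        return None if b == 0 else -c / b
    disc = b * b - 4 * a * c
    if disc < 0:
        # Per the text: a negative discriminant with alpha ~ beta ~ 1 means p ~ 1.
        near_one = abs(alpha - 1) < tol and abs(beta - 1) < tol
        return 1.0 if near_one else None
    r = math.sqrt(disc)
    return min((-b + r) / (2 * a), (-b - r) / (2 * a), key=abs)
```

As a check on the algebra, the parabola $y = 0.2x^2 - 0.7x + 1.5$ passes through $(0, 1.5)$ and $(1, 1)$; with $p = 0.5$ it gives $\alpha = 1.2$ and $\beta = 0.9$, and `dih_estimate(1.2, 0.9, 1.5)` recovers 0.5.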
The statistical and structure detection modules above are characterized in that, of the eight core detection modules, five target spatial-domain LSB steganography. The fourth, the LSM detection module, is described as follows. This module builds on the finite state machine of the SPA detection module and adds least-squares estimation to improve detection precision. It analyzes the error introduced by the key assumption of the SPA detection module, $\varepsilon_m = |Y_{2m+1}| - |X_{2m+1}|$, and keeps the SPA definitions of the pixel-pair sets $C_m$, $D_m$, $Y_{2m+1}$, and $X_{2m+1}$: $C_m$ is the set of pixel pairs whose two values, after each is shifted right by one bit, differ by $m$; $D_m$ is the set of pixel pairs whose two values differ by $m$ in absolute value; $X_{2m+1}$ is the set of pixel pairs whose two values differ by $2m + 1$ in absolute value and whose larger value is even; $Y_{2m+1}$ is the set of pixel pairs whose two values differ by $2m + 1$ in absolute value and whose larger value is odd.
The derivation of the SPA detection module gives the following two equations:
$$(|C_m| - |C_{m+1}|)\frac{p^2}{4} - (|D'_{2m}| - |D'_{2m+2}| + 2|Y'_{2m+1}| - 2|X'_{2m+1}|)\frac{p}{2} + |Y'_{2m+1}| - |X'_{2m+1}| = \varepsilon_m (1 - p)^2 \quad (m \ge 1);$$
$$(2|C_0| - |C_1|)\frac{p^2}{4} - (2|D'_0| - |D'_2| + 2|Y'_1| - 2|X'_1|)\frac{p}{2} + |Y'_1| - |X'_1| = \varepsilon_0 (1 - p)^2 \quad (m = 0).$$
If each pixel is encoded with $b$ bits, $m$ takes $2^{b-1} - 1$ different values, so $2^{b-1} - 1$ different equations can be constructed. For these $2^{b-1} - 1$ equations, least-squares parameter estimation is used to find the value of $p$ that minimizes the sum of squares of the right-hand sides, and this value of $p$ is taken as the estimate of the embedding rate.
Let $A_m = (|C_m| - |C_{m+1}|)/4$, $B_m = -(|D'_{2m}| - |D'_{2m+2}| + 2|Y'_{2m+1}| - 2|X'_{2m+1}|)/2$, and $E_m = |Y'_{2m+1}| - |X'_{2m+1}|$. The left-hand side of the first equation then becomes $A_m p^2 + B_m p + E_m$, whose square is $A_m^2 p^4 + 2A_m B_m p^3 + (2A_m E_m + B_m^2)p^2 + 2B_m E_m p + E_m^2$. Similarly, the sum of squares $S(p)$ of the left-hand sides of all the equations is $$S(p) = \sum_{m=0}^{j} A_m^2\, p^4 + 2\sum_{m=0}^{j} A_m B_m\, p^3 + \sum_{m=0}^{j} (2A_m E_m + B_m^2)\, p^2 + 2\sum_{m=0}^{j} B_m E_m\, p + \sum_{m=0}^{j} E_m^2.$$ Differentiating $S(p)$ gives:
$$S'(p) = 4\sum_{m=0}^{j} A_m^2\, p^3 + 6\sum_{m=0}^{j} A_m B_m\, p^2 + 2\sum_{m=0}^{j} (2A_m E_m + B_m^2)\, p + 2\sum_{m=0}^{j} B_m E_m$$
The value of $p$ that satisfies $S'(p) = 0$, that is, the value at which $S(p)$ has a minimum, is the estimate of the embedding rate. The equation $S'(p) = 0$ may have three distinct real roots, however, so $S''(p)$ must also be considered: the value of $p$ for which $S''(p) > 0$ is the final estimate of the embedding rate.
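The least-squares step can be illustrated with a short Python sketch. Note the substitution: instead of solving the cubic $S'(p) = 0$ and checking $S''(p) > 0$ as the text describes, the sketch simply evaluates $S(p)$ on a dense grid over $[0, 1]$ and keeps the minimizer, which is an assumption made to stay dependency-free. The function names and the inputs `A`, `B`, `E` (lists of the per-$m$ coefficients) are hypothetical.

```python
def lsm_objective_coeffs(A, B, E):
    """Coefficients (c4, c3, c2, c1, c0) of S(p) = sum_m (A_m p^2 + B_m p + E_m)^2."""
    c4 = sum(a * a for a in A)
    c3 = 2 * sum(a * b for a, b in zip(A, B))
    c2 = sum(2 * a * e + b * b for a, b, e in zip(A, B, E))
    c1 = 2 * sum(b * e for b, e in zip(B, E))
    c0 = sum(e * e for e in E)
    return c4, c3, c2, c1, c0


def lsm_estimate(A, B, E, steps=1000):
    """Grid-search minimizer of S(p) on [0, 1] (a stand-in for solving S'(p) = 0)."""
    c4, c3, c2, c1, c0 = lsm_objective_coeffs(A, B, E)

    def s(p):
        # Horner evaluation of the quartic S(p).
        return (((c4 * p + c3) * p + c2) * p + c1) * p + c0

    return min((i / steps for i in range(steps + 1)), key=s)
```

With two equations whose left-hand sides are $(p - 0.3)(p - 2)$ and $2(p - 0.3)(p - 5)$, both residuals vanish at $p = 0.3$, and `lsm_estimate([1, 2], [-2.3, -10.6], [0.6, 3.0])` lands on that grid point.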
The statistical and structure detection modules above are characterized in that, of the eight core detection modules, five target spatial-domain LSB steganography. The fifth, the chi-square detection module, is described as follows. The statistical analysis method based on pairs of values (PoVs) is the earliest steganalysis method for spatial-domain LSB algorithms. A pair of values consists of the two gray values that are exchanged with each other when the LSB of a pixel is replaced. For an 8-bit gray-level image, the pairs of values are $0 \leftrightarrow 1,\ 2 \leftrightarrow 3,\ \ldots,\ 254 \leftrightarrow 255$. After a message is embedded, the numbers of pixels with the two gray values of each pair tend to become equal. Using this statistical feature, a chi-square statistic $\chi^2_{k-1}$ is defined to test whether the image under test has this property, where
$$\chi^2_{k-1} = \sum_{i=1}^{k} \frac{(n_i - n_i')^2}{n_i'}$$
With $c_{2i}$ denoting the number of pixels whose gray value is $2i$, the quantities in this formula are defined as $n_i = c_{2i}$ and $n_i' = \dfrac{c_{2i} + c_{2i+1}}{2}$. The probability that $n_i = n_i'$ is:
$$p = 1 - \frac{1}{2^{\frac{k-1}{2}}\,\Gamma\!\left(\frac{k-1}{2}\right)} \int_0^{\chi^2_{k-1}} e^{-\frac{x}{2}}\, x^{\frac{k-1}{2} - 1}\, dx$$
Whether the image contains a secret message is decided by computing the $p$ value of the image under test. By applying the chi-square test to the image repeatedly and observing how $p$ changes, the size of the secret message can also be estimated.
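The chi-square test on pairs of values can be sketched with the standard library alone. The sketch is an illustration under assumptions: pairs are indexed from 0 (gray values $2i$ and $2i+1$ for $i = 0, \ldots, 127$), pairs with an expected count of zero are skipped, and the tail probability is computed with a plain series for the incomplete gamma function that is only meant for the small degrees of freedom arising here (at most 127). The function names are hypothetical.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability 1 - P(df/2, x/2) of the chi-square distribution."""
    a, t = df / 2.0, x / 2.0
    if t <= 0:
        return 1.0
    if t > a + 200:
        return 0.0  # tail is negligible for the small df used in this sketch
    # Series for the lower incomplete gamma function:
    # gamma(a, t) = e^-t * t^a * sum_{n>=0} t^n / (a (a+1) ... (a+n))
    term = total = 1.0 / a
    n = 0
    while term > total * 1e-16:
        n += 1
        term *= t / (a + n)
        total += term
    lower = total * math.exp(-t + a * math.log(t) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def pov_chi_square(hist):
    """Return (statistic, p) for a gray-level histogram with an even number of bins.

    p close to 1 means the two counts of each pair of values are nearly equal,
    which is what LSB replacement produces.
    """
    stat, k = 0.0, 0
    for i in range(len(hist) // 2):
        observed = hist[2 * i]
        expected = (hist[2 * i] + hist[2 * i + 1]) / 2.0
        if expected > 0:          # skip empty pairs (an assumption)
            stat += (observed - expected) ** 2 / expected
            k += 1
    if k < 2:
        return stat, None         # not enough categories for a test
    return stat, chi_square_sf(stat, k - 1)
```

Running `pov_chi_square` on histograms of growing prefixes of the image is one way to carry out the repeated test mentioned above and to watch where $p$ drops.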
The statistical and structure detection modules above are characterized in that, of the eight core detection modules, one targets spatial-domain BPCS steganography. It is described as follows. Analysis of the complexity histogram of the small blocks of secret information shows that the complexity of secret-information blocks is concentrated around 50 to 70 and is mostly 0 in the low-complexity and high-complexity ranges, whereas the complexity histogram of an image changes fairly gently in this range. Embedding secret information therefore inevitably causes fluctuations in the histogram.
Using this property, BPCS steganography is applied to the image data under test, and the continuity of the complexity histogram before and after embedding is compared. If the original image does not contain secret information, BPCS embedding must change the continuity noticeably, which shows up as an increase in the maximum difference between adjacent histogram values. Conversely, if the original image already contains secret information, the change is not very obvious. This analysis needs a certain amount of data, so it works well for larger images (typically 512 × 512). A detection module for spatial-domain BPCS steganography was designed according to this property and achieved good detection results.
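The complexity histogram can be sketched as follows. The sketch assumes the usual BPCS border complexity (the number of 0/1 changes between horizontally and vertically adjacent bits) on 8 × 8 bit-plane blocks, for which the maximum is 112; the block size, the function names, and the use of a plain nested list as the image are assumptions, not details given in the text.

```python
def border_complexity(block):
    """Count 0/1 changes between horizontally and vertically adjacent bits."""
    rows, cols = len(block), len(block[0])
    changes = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and block[r][c] != block[r][c + 1]:
                changes += 1
            if r + 1 < rows and block[r][c] != block[r + 1][c]:
                changes += 1
    return changes


def complexity_histogram(image, bit, size=8):
    """Histogram of block complexities for one bit plane of a gray image."""
    hist = {}
    for top in range(0, len(image) - size + 1, size):
        for left in range(0, len(image[0]) - size + 1, size):
            block = [[(image[top + r][left + c] >> bit) & 1
                      for c in range(size)] for r in range(size)]
            k = border_complexity(block)
            hist[k] = hist.get(k, 0) + 1
    return hist


def max_adjacent_jump(hist, max_complexity=112):
    """Largest difference between neighbouring histogram bins."""
    values = [hist.get(k, 0) for k in range(max_complexity + 1)]
    return max(abs(b - a) for a, b in zip(values, values[1:]))
```

Comparing `max_adjacent_jump` before and after a trial BPCS embedding is one way to express the continuity comparison described above; the trial embedding itself is not part of the sketch.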
The statistical and structure detection modules above are characterized in that, of the eight core detection modules, two target structure steganography. The first, a structure detection module for BMP images, is described as follows:
This module targets steganography software that hides data in BMP images by format hiding or by appending. The BMP format contains many redundant positions that can be used to hide information: the reserved fields in the file header, the reserved position of each palette entry in the information header, the gap between the information header and the image data, and the padding that fills each row of image data to a length that is a multiple of 4.
In addition, secret information can be appended after the end of the image file. Inserting secret information into these redundant positions does not affect normal use of the image. By fully analyzing the BMP file format, secret information hidden in the redundant positions and after the end of the image file can be found with high accuracy and extracted; for software without an encryption function, this yields the original hidden information.
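Two of the BMP checks above, the reserved header fields and data appended after the end of the file, can be sketched with the standard 14-byte BMP file header layout (signature, file size, two reserved 16-bit fields, pixel data offset). The function name `inspect_bmp` and the returned dictionary are hypothetical, and the sketch assumes that the size field in the header is trustworthy, which is not guaranteed for every BMP writer.

```python
import struct


def inspect_bmp(data: bytes):
    """Check the BMP file header for non-zero reserved fields and appended bytes."""
    if len(data) < 14 or data[:2] != b"BM":
        raise ValueError("not a BMP file")
    # File header: 2-byte signature, uint32 file size, two uint16 reserved
    # fields, uint32 offset of the pixel data (all little-endian).
    _magic, declared_size, reserved1, reserved2, pixel_offset = struct.unpack(
        "<2sIHHI", data[:14])
    appended = data[declared_size:] if len(data) > declared_size else b""
    return {
        "reserved_nonzero": reserved1 != 0 or reserved2 != 0,
        "appended": appended,        # bytes found after the declared end of file
        "pixel_offset": pixel_offset,
    }
```

A fuller check would also compare `pixel_offset` with the end of the information header and palette, and inspect the row padding; those steps are omitted here.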
The statistical and structure detection modules above are characterized in that, of the eight core detection modules, two target structure steganography. The second, a structure detection module for JPEG images, is described as follows:
This module targets steganography software that hides data in JPEG images by format hiding or by appending. Compared with the BMP format, the JPEG format is more complex, mainly because its data segments are defined in a more complicated way. Steganography software based on JPEG structure steganography currently hides data in two main ways: one inserts the secret information as a comment into the comment segment of the image format, and the other appends it to the file. By combining the characteristic identifiers of the steganography software with the segment markers of the JPEG format, the data segment that holds the secret information can be located and extracted.
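Both hiding places can be looked for with a simple marker walk. The sketch below is deliberately simplified and rests on assumptions: it reads length-prefixed segments up to the first start-of-scan marker, collects comment (COM, `0xFFFE`) segments, and treats the first `FF D9` after the scan header as the end of image. Real files with several scans, fill bytes, or embedded thumbnails may need more care, and matching software-specific identifiers is not shown. The function name `inspect_jpeg` is hypothetical.

```python
import struct


def inspect_jpeg(data: bytes):
    """Collect COM segments and any bytes that follow the end-of-image marker."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    comments = []
    pos = 2
    scan_start = len(data)
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xFE:                       # COM: comment segment
            comments.append(data[pos + 4:pos + 2 + length])
        if marker == 0xDA:                       # SOS: compressed data follows
            scan_start = pos + 2 + length
            break
        pos += 2 + length
    eoi = data.find(b"\xff\xd9", scan_start)     # EOI marker
    trailing = b"" if eoi < 0 else data[eoi + 2:]
    return {"comments": comments, "trailing": trailing}
```

Non-empty `trailing` bytes suggest appended data, and unexpected or oversized entries in `comments` suggest comment-segment hiding.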
The statistical and structure detection modules above are characterized in that the integrated detection modules are combined with a fusion technique. To improve the reliability of detection, the fusion mechanism of the core detection module takes the union of the detection results, as follows:
If several detection modules are selected in the detection settings, each module produces its own detection result. To balance the interference caused by false alarms and missed detections, the fusion-based decision mechanism is: if a detection module's result for the object under test exceeds the set threshold, the object is judged to be a stego object and the module's decision is "true"; otherwise the module's decision is "false". If several modules are selected for detection, the object is judged to be a stego object when the decision of any selected module is "true"; otherwise it is judged to be a non-stego object.
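The union decision can be written in a few lines. In this sketch the function name `fuse` and the dictionary of per-module estimates are hypothetical; the default threshold of 0.05 is the default stated earlier in the text.

```python
def fuse(estimates, threshold=0.05):
    """Union (logical OR) fusion of per-module detection results.

    estimates maps a module name to its detection result for one file.
    Returns (is_stego, verdicts) where verdicts holds each module's decision.
    """
    verdicts = {name: value > threshold for name, value in estimates.items()}
    return any(verdicts.values()), verdicts
```

For example, `fuse({"RS": 0.01, "SPA": 0.12})` flags the file because the SPA result exceeds the threshold even though the RS result does not. Taking the union favours a low miss rate at the cost of more false alarms than requiring agreement between modules would give.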
The statistical and structure detection modules above comprise five detection modules for spatial-domain LSB steganography, one detection module for spatial-domain BPCS steganography, and two detection modules for structure steganography. The present invention uses the union fusion technique to make a joint decision on the detection results of several modules. This removes the defects of a single module, such as narrow applicability, low average detection rate, and high false alarm rate, and achieves high-quality, high-efficiency detection of multiple carrier types and multiple steganographic methods.
Description of drawings
Fig. 1 is a view of the main window interface of the present invention.
Fig. 2 is a view of the file type sub-window of the present invention.
Fig. 3 is a view of the statistical detection sub-window of the present invention.
Fig. 4 is a view of the structural detection sub-window of the present invention.
Fig. 5 is the overall flowchart of the present invention.
Fig. 6 is a flowchart of the RS detection module of the present invention.
Fig. 7 is a flowchart of the SPA detection module of the present invention.
Fig. 8 is a flowchart of the DIH detection module of the present invention.
Fig. 9 is a flowchart of the LSM detection module of the present invention.
Fig. 10 is a flowchart of the chi-square detection module of the present invention.
Fig. 11 is a flowchart of the BPCS detection module of the present invention.
Fig. 12 is a flowchart of the BMP image structure detection module of the present invention.
Fig. 13 is a flowchart of the JPEG image structure detection module of the present invention.
Embodiment
The present invention builds its core detection module from eight detection sub-modules and achieves reliable detection of stego files (BMP images, JPEG images, WAVE audio, etc.) produced by current mainstream steganography methods. In practice, detection is offered in two modes under "Scan Options" in the main interface. The first mode detects one or more files chosen through "Detect File". First, "Detect File" opens a file-open dialog box in which one or more files can be selected for detection; three file formats are selectable: BMP image, JPEG image, and WAV audio. Once the files are selected, the "Statistical Detection" sub-window pops up, in which the statistical detection modules to be used for the scan are chosen. Six modules are available: the RS detection module, the SPA detection module, the DIH detection module, the BPCS detection module, the chi-square detection module, and the LSM detection module. The decision threshold is set at the same time (the default value is 0.05). After these settings are made, selecting "OK" brings up the "Structural Detection" sub-window, in which the structural detection modules to be used for the scan are chosen; two modules are available, format detection and append detection. After the detection modules are set, selecting "OK" runs the selected modules on each file to be scanned, and the detection results are shown under "Scan Status" and "Stego File List".
The second mode scans and detects all files in a folder chosen through "Scan Path". First, selecting "Scan Path" opens a browse-for-folder dialog box in which the folder to be scanned is chosen. Next, the following settings must be made before the scan starts: under "Save Path", choose where the scan log file is stored; in the "File Type" sub-window, set the file formats to be scanned (BMP image, JPEG image, or WAV audio); and in the "Statistical Detection" sub-window or the "Structural Detection" sub-window (either one or both may be configured), choose the detection modules to use, in the same way as in the first mode. Once these options are set, selecting "Start Scan" starts the scan; after it has started, "Pause Scan" or "Stop Scan" can be selected to pause or stop it. Finally, "Scan Status" shows in real time the folder path currently being scanned, the file currently being scanned, the number of files scanned, and the number of files judged to be stego files, while "Stego File List" shows in real time the list of files judged to be stego files together with each detection module's result for each of them.
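The folder-scanning mode can be sketched as a loop that filters files by type, runs the selected detectors on each file, and writes a log named after the scan time. Everything concrete here is an assumption for illustration: the extension table, the recursion into sub-folders via `os.walk`, the report layout, and the `detectors` mapping of hypothetical callables that each return a numeric result for a file path.

```python
import os
import time

# Assumed mapping from the three supported file types to file extensions.
EXTENSIONS = {"bmp": (".bmp",), "jpeg": (".jpg", ".jpeg"), "wav": (".wav",)}

def iter_candidate_files(root, file_types):
    """Yield paths under root whose extension matches the selected file types."""
    suffixes = tuple(s for t in file_types for s in EXTENSIONS[t])
    for dirpath, _dirs, names in os.walk(root):
        for name in sorted(names):
            if name.lower().endswith(suffixes):
                yield os.path.join(dirpath, name)

def scan_folder(root, file_types, detectors, threshold, report_dir):
    """Run every detector on every candidate file and write a scan log.

    detectors: mapping of module name to a callable(path) returning a float
    (hypothetical stand-ins for the RS, SPA, ... modules).
    """
    scanned, suspicious = 0, []
    for path in iter_candidate_files(root, file_types):
        scanned += 1
        results = {name: detect(path) for name, detect in detectors.items()}
        if any(value > threshold for value in results.values()):
            suspicious.append((path, results))
    # The log file is named after the time of the scan run.
    report_path = os.path.join(report_dir, time.strftime("%Y%m%d_%H%M%S") + ".txt")
    with open(report_path, "w", encoding="utf-8") as fh:
        fh.write(f"scan path: {root}\n")
        fh.write(f"file types: {', '.join(file_types)}\n")
        fh.write(f"files scanned: {scanned}\n")
        fh.write(f"suspicious files: {len(suspicious)}\n")
        for path, results in suspicious:
            fh.write(f"{path}\t{results}\n")
    return scanned, suspicious, report_path
```

Pause and stop controls and the real-time status display are left out; in a GUI these would typically be driven from the same loop through a worker thread or callbacks.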
By making a combined decision over the results of multiple detection modules through the joint union fusion mechanism, the present invention is well suited to current mainstream steganography methods, reduces the impact of the high false alarm rate of any single detection module, and achieves reliable detection of mainstream steganography methods in mainstream digital media formats. It is also readily extensible: new detection modules can be integrated to upgrade the detection performance of the system.

Claims (6)

1. A steganalysis system based on statistical detection and structural detection, wherein the core module of the system integrates six statistical detection algorithms and two structural detection algorithms and, by applying fusion, uses multiple algorithms to examine an image or audio file and determine whether the detected object contains hidden information. The system has two modes of operation:
First mode: detection of specified files, with display of the detection results. The specific steps are as follows:
1) browse to and select one or more files to be detected; the selectable file types are BMP image, JPEG image, and WAV audio;
2) select the statistical detection algorithms to use, from among RS detection, SPA detection, DIH detection, BPCS detection, chi-square detection, and LSM detection, and set the threshold; the threshold ranges over [0, 1] and defaults to 0.05;
3) select the structural detection algorithms to use, from among append detection and format detection;
4) the detection results, i.e. the list of suspicious files, are shown in real time in the "Stego File List" pane of the interface, including the name of each detected suspicious file and the amount of hidden information estimated by each selected algorithm. The "Scan Status" pane of the interface shows the current detection status, including the path of the file being examined, the name of the file being examined, the number of files examined, the number of suspicious files detected, and the scanning speed.
Second mode: detection of a specified folder, with display of the detection results and generation of a detection report in a specified path. The specific steps are as follows:
1) browse to and select the folder path to scan; all files of the specified formats under the selected path will be examined;
2) browse to and select the save path; a document file named after the scan run time will be generated under the selected path, and this document is the scan result;
3) click "File Type" and select the file types to detect, from among BMP image, JPEG image, and WAV audio;
4) click "Statistical Detection" and select the statistical detection algorithms to use, from among RS detection, SPA detection, DIH detection, BPCS detection, chi-square detection, and LSM detection, and set the threshold; the threshold ranges over [0, 1] and defaults to 0.05;
5) click "Structural Detection" and select the structural detection algorithms to use, from among append detection and format detection;
6) click "Start Scan" to begin scanning the files of the specified types in the specified path with the selected detection algorithms; clicking "Pause Scan", "Stop Scan", and so on controls the scan;
7) during the scan, the "Stego File List" pane of the interface shows the detection results in real time, including the name of each detected suspicious file and the amount of hidden information estimated by each selected algorithm, and the "Scan Status" pane of the interface shows the current scan status, including the current scan path, the name of the file being scanned, the number of files scanned, the number of suspicious files detected, and the scanning speed. A corresponding detection report is generated at the same time; its content includes the scan time, the path, the file types examined, the files scanned, the suspicious files detected, and the detection results for the suspicious files.
2. The statistical and structural detection method according to claim 1, characterized in that:
The first and second modes of operation use the same internal algorithm modules and framework. The core detection module of the system integrates six mutually independent statistical detection algorithms with different detection principles and two structural detection algorithms with different targets, so that stego files generated by mainstream steganography tools with common media formats as carriers can be reliably detected. Applying fusion to multiple detection algorithms of the same kind effectively improves detection reliability and meets the real-time network requirement of processing multiple media types simultaneously.
The eight integrated detection algorithms are: six statistical detection algorithms, namely RS detection, SPA detection, DIH detection, BPCS detection, chi-square detection, and LSM detection; and two structural detection algorithms, namely append detection and format detection.
3. The statistical and structural detection method according to claim 2, characterized in that the eight core detection algorithms include five detection algorithms for spatial-domain LSB steganography, described as follows:
The detection methods for spatial-domain LSB steganography target steganography software that uses LSB hiding algorithms with BMP images and WAV audio as carriers. They adopt the currently most representative steganalysis algorithms for LSB hiding: the chi-square detection method, the RS algorithm, the SPA algorithm, the DIH algorithm, and the LSM algorithm.
These algorithms were first proposed for LSB hiding in BMP images and can reliably detect LSB embedding at both sequential and random positions. Further study found that LSB hiding in WAV audio has statistical properties similar to those of LSB hiding in BMP images, so using the above detection algorithms to detect LSB hiding in WAV audio is feasible. On this basis, the chi-square detection method, the RS algorithm, the SPA algorithm, and the DIH algorithm were extended to the detection of LSB hiding in WAV audio, with good detection results.
Such steganalysis software can effectively detect steganography software whose hiding algorithm is known to be LSB and whose carriers are BMP images and WAV audio, such as Blindside v0.9, BMP Secrets, S-Tools v4.0, Puffer 4.02, and Eshow.
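Of the five LSB detectors named in this claim, the chi-square method is the simplest to illustrate. The sketch below computes the classic pairs-of-values statistic: LSB replacement tends to equalize the counts of each value pair (2k, 2k+1), so a small statistic is consistent with embedding. It is a generic textbook form rather than the patented module; it assumes 8-bit samples, does not merge sparse categories, and stops short of converting the statistic to a probability, which would require a chi-square distribution function. The name `chi_square_pov` is hypothetical.

```python
from collections import Counter

def chi_square_pov(samples):
    """Return (statistic, degrees_of_freedom) for the pairs-of-values test.

    samples: iterable of 8-bit values (pixel bytes or 8-bit audio samples).
    For each pair (2k, 2k+1) the expected count under full LSB replacement
    is the mean of the two observed counts.
    """
    counts = Counter(samples)
    statistic = 0.0
    categories = 0
    for even in range(0, 256, 2):
        n_even = counts.get(even, 0)
        n_odd = counts.get(even + 1, 0)
        expected = (n_even + n_odd) / 2
        if expected > 0:                       # skip pairs that never occur
            statistic += (n_even - expected) ** 2 / expected
            categories += 1
    return statistic, max(categories - 1, 0)

# Perfectly balanced pairs give a statistic of zero.
balanced = chi_square_pov([0] * 50 + [1] * 50)
# An unbalanced pair (90 vs 10) contributes (90 - 50)^2 / 50 = 32.
unbalanced = chi_square_pov([0] * 90 + [1] * 10 + [2] * 50 + [3] * 50)
```

Running the test over a growing prefix of the samples, rather than over the whole file at once, is the usual way to catch sequential embedding that fills only part of the carrier.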
4. The statistical and structural detection method according to claim 2, characterized in that the eight core detection algorithms include one detection algorithm for spatial-domain BPCS steganography, described as follows:
The BPCS algorithm is a high-capacity steganography algorithm implemented on BMP images. It borrows the bit-replacement idea of the LSB algorithm but embeds information by block replacement. The algorithm divides the different bit planes of the image into pixel blocks of equal size, uses complexity to decide whether a pixel block is usable, and replaces usable pixel blocks with data blocks built from the secret information. This replacement affects the distribution of the image's complexity histogram. The complexity histogram of a normal image is continuous, but the secret information is generally uncorrelated with the image, so embedding destroys the image's correlation; this disrupts the continuity of the complexity histogram, and distinct peaks appear in it. The detection algorithm for spatial-domain BPCS steganography is designed around this characteristic and achieves good detection results.
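The complexity histogram that this claim relies on can be sketched as follows. The claim does not give the block size or the complexity measure, so the sketch assumes the conventional BPCS choices: 8x8 blocks and border complexity (the number of adjacent 0/1 transitions divided by the maximum possible). The conversion to canonical Gray code that BPCS implementations often apply first is omitted, and all function names are hypothetical.

```python
def bit_plane(pixels, bit):
    """Extract one bit plane from a 2-D list of integer pixel values."""
    return [[(p >> bit) & 1 for p in row] for row in pixels]

def block_complexity(block):
    """Border complexity: 0/1 transitions between neighbours over the maximum."""
    size = len(block)
    changes = 0
    for r in range(size):
        for c in range(size):
            if c + 1 < size and block[r][c] != block[r][c + 1]:
                changes += 1                   # horizontal transition
            if r + 1 < size and block[r][c] != block[r + 1][c]:
                changes += 1                   # vertical transition
    return changes / (2 * size * (size - 1))

def complexity_histogram(pixels, bit, block=8, bins=16):
    """Histogram of block complexities for one bit plane (partial blocks ignored)."""
    plane = bit_plane(pixels, bit)
    hist = [0] * bins
    rows, cols = len(plane), len(plane[0])
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            blk = [row[c0:c0 + block] for row in plane[r0:r0 + block]]
            idx = min(int(block_complexity(blk) * bins), bins - 1)
            hist[idx] += 1
    return hist
```

A detector built on this would then look for discontinuities or isolated peaks in the histogram; how the patented module scores them is not stated in this section.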
5. The statistical and structural detection method according to claim 2, characterized in that the eight core detection algorithms include two detection algorithms for structural steganography, described as follows:
The detection methods for structural steganography target BMP images and JPEG images.
1) The detected objects are steganography programs that hide data in BMP images by format-based hiding or by appending, such as AppendX, Camouflage, Steganography, and Masker. The BMP image format contains many redundant positions that can be used to hide information, such as the reserved fields in the file header, the reserved bits of the palette in the information header, the offset data segment between the information header and the image data, and the padding bytes that extend each pixel row of the image data to a length that is a multiple of 4. In addition, secret information can be appended after the end of the image file. Inserting secret information at these redundant positions does not affect normal use of the image. A full analysis of the BMP file format can reveal secret information hidden in the redundant positions and after the end of the image file and can extract it with high accuracy; for software without an encryption function, the original hidden information can then be recovered.
2) The detected objects are steganography programs that hide data in JPEG images by format-based hiding or by appending, such as JpegX, Camouflage, Invisible Secrets, and Masker. The JPEG format is more complex than the BMP format, mainly because its data-segment definitions are more involved. Steganography software based on JPEG structural hiding currently uses two main approaches: one inserts the secret information as a comment into the comment segment of the image format, and the other appends the secret information to the file. By combining the signature identifiers of the steganography software with the data-segment markers of the JPEG format, the data segment carrying the secret information can be located and extracted.
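Several of the BMP redundant positions listed in item 1) can be checked directly from the file headers. The sketch below is a simplified illustration, not the patented detector: it assumes the common 14-byte file header followed by a 40-byte information header, handles only uncompressed images, and does not inspect row padding or palette reserved bytes. The function name `inspect_bmp` and the keys of the returned dictionary are hypothetical.

```python
import struct

def inspect_bmp(data: bytes):
    """Report header reserved fields, the header-to-pixel gap, and trailing bytes."""
    if data[:2] != b"BM" or len(data) < 54:
        raise ValueError("not a BMP with a 40-byte information header")
    # File header: declared size, two reserved words, offset of the pixel data.
    size, res1, res2, off_bits = struct.unpack_from("<IHHI", data, 2)
    # Information header (40-byte layout assumed).
    (hdr_size, width, height, _planes, bpp, compression,
     _img_size, _xppm, _yppm, clr_used, _clr_imp) = struct.unpack_from(
        "<IiiHHIIiiII", data, 14)
    if compression != 0:
        raise ValueError("only uncompressed BMP files are handled in this sketch")
    palette_entries = (clr_used or (1 << bpp)) if bpp <= 8 else 0
    expected_offset = 14 + hdr_size + 4 * palette_entries
    row_bytes = (width * bpp + 31) // 32 * 4   # rows are padded to 4 bytes
    pixel_end = off_bits + row_bytes * abs(height)
    return {
        "reserved_nonzero": (res1, res2) != (0, 0),
        "gap_bytes": data[expected_offset:off_bits],
        "trailing_bytes": data[pixel_end:],
        "declared_size_mismatch": size != len(data),
    }
```

Non-empty `gap_bytes` or `trailing_bytes` mark the offset data segment and the appended region described in item 1); whether their content is actually a hidden payload would still have to be decided with tool-specific signatures.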
6. The statistical and structural detection method according to claim 2, characterized in that fusion is applied across the integrated detection algorithms: to improve detection reliability, the fusion mechanism of the core detection module takes the union of the individual detection results, as described below:
If several detection algorithms are selected in the detection settings, each detection algorithm produces its own detection result. To balance the interference caused by false alarms and missed detections, the fusion-based decision rule is as follows: if a detection algorithm's result for the detected object exceeds the preset threshold, the object is judged to be a stego object and that algorithm's decision is set to "true"; otherwise its decision is set to "false". When several algorithms are selected, the detected object is judged to be a stego object if the decision of any selected algorithm is "true"; otherwise it is judged to be a non-stego object.
CNB2006101131852A 2006-09-19 2006-09-19 Statistic supervision and structure supervision based hidden messages analysis system Expired - Fee Related CN100507943C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006101131852A CN100507943C (en) 2006-09-19 2006-09-19 Statistic supervision and structure supervision based hidden messages analysis system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2006101131852A CN100507943C (en) 2006-09-19 2006-09-19 Statistic supervision and structure supervision based hidden messages analysis system

Publications (2)

Publication Number Publication Date
CN1920877A true CN1920877A (en) 2007-02-28
CN100507943C CN100507943C (en) 2009-07-01

Family

ID=37778596

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006101131852A Expired - Fee Related CN100507943C (en) 2006-09-19 2006-09-19 Statistic supervision and structure supervision based hidden messages analysis system

Country Status (1)

Country Link
CN (1) CN100507943C (en)

Also Published As

Publication number Publication date
CN100507943C (en) 2009-07-01

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090701

Termination date: 20100919