CN114630006B - Secret information extraction method based on consistent most advantageous test - Google Patents
- Publication number
- CN114630006B (CN202210055235.5A)
- Authority
- CN
- China
- Prior art keywords
- steganographic
- key
- sequence
- check matrix
- steganographic key
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N1/00—Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
- H04N1/44—Secrecy systems
- H04N1/4446—Hiding of documents or document information
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/08—Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
- H04L9/0861—Generation of secret information including derivation or calculation of cryptographic keys or passwords
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Editing Of Facsimile Originals (AREA)
Abstract
The invention provides a secret information extraction method based on the uniformly most powerful (UMP) test, the "consistent most advantageous test" of the title. First, the probability distribution of the different bit positions in the sequence extracted by the true steganographic key is studied. Next, the relation between the probability distribution of the sequence extracted by a pseudo (incorrect) steganographic key and that of the stego sequence is established, and, by studying the probability distribution of stego sequences in the spatial and JPEG domains, it is shown that the subsequence extracted by a pseudo steganographic key is distributed differently from the subsequence extracted by the true key. Finally, based on this difference, the correct steganographic key is screened out with the UMP test. Given the probabilities of false rejection and false acceptance, the threshold and sample size required by the hypothesis test are derived. Experimental results show that the method can recover the steganographic key of stego images produced by common mainstream steganographic algorithms, and thereby extract the secret information.
Description
Technical Field
The invention relates to the technical field of digital steganography, and in particular to a secret information extraction method based on the uniformly most powerful (UMP) test.
Background
Digital steganography achieves covert communication over open channels by embedding secret information into multimedia files such as digital images, audio, and video. Such communication conceals the very existence of the secret information and is highly deceptive; in recent years it has become a main means of covert communication and a research hot spot in information security. Digital steganalysis studies how to detect and extract secret information hidden in a digital cover; its final aim is to extract the hidden information and thereby verify the correctness of steganography detection. On the one hand, because communicating parties mostly combine hiding with encryption, an attacker who wants to obtain the content of a covert communication, or to gather evidence of it, must first extract the hidden ciphertext and only then consider decryption; secret information extraction is therefore an unavoidable problem. On the other hand, most current detection methods decide whether secret information is hidden by analyzing whether the cover has been modified, but a modified cover does not necessarily contain sensitive information, so the validity of that decision is open to question. Research on secret information extraction is therefore also necessary.
Currently, image adaptive steganography, that is, adaptive steganography with images as the cover, has become the dominant research direction in steganography. Its embedding process typically consists of two parts, a distortion function and steganographic coding, and may be summarized as "adaptive steganography = distortion function + steganographic coding". The distortion function calculates the distortion of the cover image at each position. Different steganographic algorithms define the distortion function from different angles, but the general principle is that the distortion values are smaller in texture-complex and edge regions and larger in smooth regions. Steganographic coding selects the modification positions according to the calculated distortion, so that the secret information is embedded at minimal distortion cost. Adaptive steganography uses steganographic codes such as matrix coding and wet paper coding. Document 1, "T. Pevný, T. Filler, P. Bas, 'Using High-Dimensional Image Models to Perform Highly Undetectable Steganography,' in: Proceedings of the 12th International Workshop on Information Hiding (IH), Calgary, Canada, 2010, pp. 161-177", applied STC (Syndrome-Trellis Codes) to image adaptive steganography for the first time. Since then, because its performance in minimizing embedding distortion approaches the theoretical optimum, STC has become the coding of choice for adaptive steganographic algorithms, and STC-based adaptive steganography has become both the focus of forward improvement of steganographic algorithms and the main difficulty in their reverse analysis.
Most existing secret information extraction methods work only under specific conditions, and extraction methods for the stego-image-only condition urgently need to be studied. Document 2, "X. Luo, X. Song, X. Li, et al., 'Steganalysis of HUGO steganography based on parameter recognition of Syndrome-Trellis-Codes,' Multimedia Tools and Applications, 2016, vol. 75, no. 21, pp. 13557-13583", proposes a secret information extraction method for spatial-domain steganography under the stego-image-only condition with plaintext embedding. The LSBs (least significant bits) of spatial-domain image pixels behave as random noise, so the bits 0 and 1 occur with the same frequency. That method assumes that the sequence extracted by a pseudo steganographic key has the same probability distribution as the pixel LSBs of the spatial-domain cover image, that is, that 0 and 1 occur with nearly equal frequency. However, most pictures transmitted on the Internet are in JPEG format, and in the embeddable DCT (Discrete Cosine Transform) coefficients of a JPEG image the bits 0 and 1 do not occur with the same frequency.
Disclosure of Invention
In order to provide a secret information extraction method that is suitable for JPEG images under the stego-image-only condition, the invention provides a secret information extraction method based on the uniformly most powerful (UMP) test.
The secret information extraction method based on the UMP test provided by the invention comprises the following steps:
step 1: estimating the length m of the secret information from the pixels or the embeddable DCT coefficients of the stego image;
step 2: given a Type I error rate α and a Type II error rate β, calculating a sample size N and a threshold T; wherein the Type I error rate α is the probability of judging the true steganographic key to be a pseudo steganographic key, and the Type II error rate β is the probability of judging a pseudo steganographic key to be the true steganographic key;
step 3: constructing the UMP test statistic, and performing the statistic calculation in turn for each check matrix in the check matrix space, enumerated exhaustively, wherein the statistic calculation comprises: extracting a sequence from the stego image using the check matrix currently enumerated, sampling one bit at intervals of 7 bits starting from the j-th bit of the first byte of the sequence (that is, taking the j-th bit of every byte) to obtain a subsequence of length N, and calculating the value of the statistic for that subsequence;
step 4: judging whether N > m holds; if so, going to step 7; otherwise, going to step 5;
step 5: for each check matrix, judging whether the value of its statistic passes the threshold T; if so, storing the check matrix in the key candidate set B; otherwise, discarding the check matrix;
step 6: after the statistics of all check matrices have been judged, if |B| = 1, the check matrix in B is the true steganographic key and the extraction succeeds; if |B| = 0, the extraction fails; if |B| > 1, setting the check matrix space to B and going to step 7;
step 7: storing the check matrices whose statistic attains the maximum value in the key candidate set D; if |D| = 1, the check matrix in D is the true steganographic key and the extraction succeeds; if |D| > 1, the extraction fails.
Further, step 2 specifically includes:
step 2.1: calculating the critical values a and b from the given Type I error rate α and Type II error rate β;
step 2.2: calculating the sample size N and the threshold T according to equation (28);
wherein a and b satisfy Φ(a) = α and Φ(b) = 1 − β; μ_0 and σ_0 denote the expectation and variance of r_i when H_0 holds, and μ_1 and σ_1 denote the expectation and variance of r_i when H_1 holds, i = 1, 2, …, n_0; H_0 and H_1 are the hypotheses of the testing problem H_0: D = D_0, H_1: D = D_1, where D is the overall distribution function of the sample, D_0 is the distribution function under the true steganographic key, and D_1 is the distribution function under a pseudo steganographic key; let the sequence extracted by the steganographic key be l; starting from the j-th bit (j = 1, 2, …, 8) of its first byte, one bit is sampled at intervals of 7 bits, n_0 bits in total, giving the j-th subsequence; for the subsequence extracted with the true steganographic key, the i-th bit (i = 1, 2, …, n_0) takes the values 0 and 1 with probabilities χ_0 and χ_1, where χ_0, χ_1 ≠ 0.5; for the subsequence extracted with a pseudo steganographic key, the i-th bit takes the values 0 and 1 with probabilities γ_0 and γ_1, where γ_0, γ_1 ≈ 0.5.
Further, step 3 further includes: constructing the UMP test statistic according to equation (29),
wherein the statistic is computed from the number of occurrences of 0 and the number of occurrences of 1 in the length-N subsequence.
The invention has the beneficial effects that:
While image adaptive steganography protects data privacy, it is also one of the tools that hostile actors and terrorists use to plan and coordinate criminal activities that endanger political security and social stability in China. The secret information extraction method based on the UMP test applies to secret information extraction in both the spatial domain and the JPEG (Joint Photographic Experts Group) domain, and under plaintext embedding it can recover the steganographic keys of stego images produced by common mainstream steganographic algorithms, thereby extracting the secret information. This is of great significance for maintaining the national security of China.
Drawings
FIG. 1 is a flow chart of a secret information extraction method based on the UMP test according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the probability distribution of each bit of plaintext according to an embodiment of the present invention;
FIG. 3 is the 0/1 frequency distribution of the LSBs of cover images according to an embodiment of the present invention;
FIG. 4 is the 0/1 frequency distribution of the LSBs of stego images according to an embodiment of the present invention: (a) at an embedding rate of 0.5 bpp or bpnzac; (b) at 0.4 bpp or bpnzac; (c) at 0.3 bpp or bpnzac; (d) at 0.2 bpp or bpnzac; (e) at 0.1 bpp or bpnzac;
FIG. 5 is a schematic diagram of spatial-domain cover images according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of JPEG-domain cover images according to an embodiment of the present invention;
FIG. 7 is the 0/1 frequency distribution in the subsequences extracted by pseudo steganographic keys according to an embodiment of the present invention;
FIG. 8 shows the statistic values of the sequences extracted by the various steganographic keys according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the present invention. All other embodiments that those skilled in the art can obtain from these embodiments without inventive effort fall within the scope of the invention.
The core idea of the invention is as follows: the correct steganographic key is screened out with the UMP test, based on the difference between the probability distribution of the subsequence extracted from the stego image with the true steganographic key and that of the subsequence extracted with a pseudo steganographic key.
The technical scheme of the invention rests on the premise that the sequences extracted by the true and pseudo steganographic keys have different probability distributions. Because the prior art has neither studied nor stated this premise, it is proved first, before the technical scheme is introduced.
(1) Message distribution characteristics for steganographic key extraction
When the embedded message is plaintext, the sequence extracted by the true steganographic key is a plaintext sequence. The probabilities of 0 and 1 at each bit position of Chinese and English plaintext bytes are therefore studied next.
In a computer system, English characters are stored as single bytes whose highest bit is 0; Chinese characters are stored as two bytes, and the highest bit of each byte is 1 to avoid confusion with English characters. For the lower seven bits of Chinese and English bytes, the probabilities of 0 and 1 are calculated from statistics over a large amount of natural-language text; the results are shown in Table 1 (bit 8 denotes the lowest bit, and the remaining bits are numbered 7, 6, 5, 4, 3, 2, 1 in order).
Table 1 Probabilities of each bit of a plaintext byte being 0 and 1
For plaintext embedding, the information extracted by the correct steganographic key is a plaintext sequence. Starting from the i-th bit of the first byte of the plaintext sequence, one bit is sampled at intervals of 7 bits (that is, the i-th bit of every byte is taken), giving the subsequence L_i, i = 1, 2, …, 8. Table 2 gives the probability of 0 in each subsequence L_i, i = 1, 2, …, 8.
Table 2 Probability of 0 in each subsequence
Clearly, in every L_i, i = 1, 2, …, 8, the probabilities of 0 and 1 are unbalanced; that is, 0 and 1 are not equally likely at the different bit positions of the sequence extracted by the true steganographic key.
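The sampling described above can be sketched in a few lines of Python. This is only an illustration under the bit numbering stated in the text (bit 1 is the highest bit of a byte, bit 8 the lowest); the function names and the sample string are hypothetical and do not come from the patent.

```python
def bit_subsequence(data: bytes, j: int) -> list:
    """Return the j-th bit of every byte (1 = highest bit, 8 = lowest bit).

    Taking one bit and then skipping 7 bits is the same as taking
    the j-th bit of each byte of the extracted sequence.
    """
    if not 1 <= j <= 8:
        raise ValueError("j must be in 1..8")
    shift = 8 - j
    return [(byte >> shift) & 1 for byte in data]


def zero_frequency(bits: list) -> float:
    """Fraction of 0 bits in a bit list."""
    return bits.count(0) / len(bits)


# Hypothetical English plaintext, used only to illustrate the imbalance.
text = "secret message".encode("ascii")
freqs = {j: zero_frequency(bit_subsequence(text, j)) for j in range(1, 9)}
```

For ASCII text the highest bit is always 0, so `freqs[1]` is 1.0, the most extreme case of the imbalance the section describes.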
(2) Message distribution characteristics for pseudo steganographic key extraction
This section studies the probability distribution of 0 and 1 in the sequence extracted by a pseudo steganographic key. First, the relation between the 0/1 probabilities in the sequence extracted by the pseudo steganographic key and those in the stego sequence is proved: the difference between the probabilities of 0 and 1 in the extracted sequence equals the r-th power of the difference between the probabilities of 0 and 1 in the stego sequence. Then the 0/1 probability distributions of spatial-domain and JPEG-domain adaptive steganography stego sequences are studied in turn, which shows that the probabilities of 0 and 1 in the sequence extracted by the pseudo steganographic key are close to balanced.
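(2.1) The following lemma is presented first.
Lemma 1: Let $a_1, a_2, \dots, a_q \in F_2$, where $F_2$ is the binary Galois field. For a fixed $q$, if the $a_i$ are independent with $pr(a_i = 0) = p_0$ and $pr(a_i = 1) = p_1$, $i = 1, 2, \dots, q$, then
$$pr(a_1 \oplus a_2 \oplus \dots \oplus a_q = 0) - pr(a_1 \oplus a_2 \oplus \dots \oplus a_q = 1) = (p_0 - p_1)^q$$
Proof: Over the binary Galois field, $a_1 \oplus \dots \oplus a_q = 1$ if and only if there is an odd number of 1s among $a_1, \dots, a_q$, and $a_1 \oplus \dots \oplus a_q = 0$ if and only if there is an even number of 1s. Then:
$$pr(a_1 \oplus \dots \oplus a_q = 0) = \sum_{k\ \mathrm{even}} \binom{q}{k} p_1^{k} p_0^{q-k}, \qquad pr(a_1 \oplus \dots \oplus a_q = 1) = \sum_{k\ \mathrm{odd}} \binom{q}{k} p_1^{k} p_0^{q-k}$$
Thus
$$pr(a_1 \oplus \dots \oplus a_q = 0) - pr(a_1 \oplus \dots \oplus a_q = 1) = \sum_{k=0}^{q} \binom{q}{k} (-p_1)^{k} p_0^{q-k} = (p_0 - p_1)^q$$
The proof ends.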
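Lemma 1 can be checked numerically by brute force. The sketch below enumerates all q-bit patterns, assuming independent bits with pr(a_i = 0) = p_0, and compares the exact difference with (p_0 − p_1)^q; the function name and the values p_0 = 0.7, q = 5 are illustrative choices, not values from the patent.

```python
from itertools import product


def xor_bias_exact(p0: float, q: int) -> float:
    """pr(XOR of q independent bits = 0) minus pr(XOR = 1), by enumeration."""
    p1 = 1.0 - p0
    pr = [0.0, 0.0]  # pr[0]: XOR equals 0, pr[1]: XOR equals 1
    for bits in product((0, 1), repeat=q):
        weight = 1.0
        for b in bits:
            weight *= p1 if b else p0
        pr[sum(bits) % 2] += weight  # parity of the number of 1s
    return pr[0] - pr[1]


p0, q = 0.7, 5
lhs = xor_bias_exact(p0, q)          # enumerated difference
rhs = (p0 - (1.0 - p0)) ** q         # closed form from Lemma 1
```

The two values agree up to floating-point rounding, which is what the binomial expansion in the proof predicts.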
Based on Lemma 1, the following theorem describes the relationship between the probability distribution of the sequence extracted by a pseudo steganographic key and that of the stego sequence.
Theorem 1: Let the stego sequence be $y = (y_1, y_2, \dots, y_n)$, where $pr(y_i = 0) = p_0$ and $pr(y_i = 1) = p_1$, $i = 1, 2, \dots, n$. Let $s = (s_1, s_2, \dots, s_t)$ be the sequence extracted by the pseudo steganographic key. When the pseudo steganographic key consists of one sub-check matrix, the probability of 0 in the sequence s is:
$$pr(s_j = 0) = \frac{1}{2} + \frac{1}{2}(p_0 - p_1)^{n_1}$$
when the pseudo steganographic key is composed of two different sub-check matrices, the probability of 0 in the sequence s is:
where $n_i$, $i = 1, 2$, denotes the number of 1s in the i-th sub-check matrix.
Proof: When the secret information is long enough, the distribution of the first h−1 bits of the sequence does not affect the overall distribution, where h is the height of the sub-check matrix. The first h−1 bits of the sequence are therefore ignored for ease of discussion. For any j ≥ h, $s_j$ equals the product of the basic row vector and the corresponding part of the stego sequence. Let $v = (v_1, v_2, \dots, v_k)$ be the non-zero part of the j-th row of the steganographic key, let $i_1, \dots, i_r$ be the positions at which $v_i = 1$, and let $y' = (y'_1, y'_2, \dots, y'_k)$ be the corresponding part of the stego sequence. Then
$$s_j = v \cdot y'^{T} = y'_{i_1} \oplus y'_{i_2} \oplus \dots \oplus y'_{i_r}$$
When $pr(y_i = 0) = p_0$ and $pr(y_i = 1) = p_1$, Lemma 1 gives:
$$pr(s_j = 0) - pr(s_j = 1) = (p_0 - p_1)^{r}$$
where r is the number of 1s in the basic row vector. That is, the difference between the probabilities of 0 and 1 in the sequence extracted by the pseudo steganographic key equals the r-th power of the difference between the probabilities of 0 and 1 in the stego sequence.
When the check matrix consists of one sub-check matrix, let the number of 1s in the sub-check matrix be $n_1$, so that $r = n_1$. Since $pr(s_j = 0) + pr(s_j = 1) = 1$, in this case
$$pr(s_j = 0) = \frac{1}{2} + \frac{1}{2}(p_0 - p_1)^{n_1}$$
When the check matrix consists of two different sub-check matrices, let the numbers of 1s in the two sub-check matrices be $n_1$ and $n_2$; in this case
This completes the proof.
(2.2) The frequency of 0 in the sequence extracted by the pseudo steganographic key is examined next. The frequency of occurrence of 0 in the sequence $s = (s_1, s_2, \dots, s_t)$ is:
$$f_0 = \frac{1}{t}\,\bigl|\{\, j : s_j = 0 \,\}\bigr|$$
By Bernoulli's law of large numbers, for any $\varepsilon > 0$:
$$\lim_{t \to \infty} pr\bigl(\,|f_0 - pr(s_j = 0)| < \varepsilon\,\bigr) = 1$$
That is, when the sequence is long enough, the frequency of occurrence of 0 in the sequence stabilizes at the probability calculated in Theorem 1.
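A small simulation illustrates this convergence. It is a simplified sketch, not the patent's experiment: each extracted bit is modelled as the XOR of r independent stego bits with pr(y = 0) = p_0, whereas in a real check matrix neighbouring rows share stego bits. The function name, the seed, and the values p_0 = 0.7, r = 3 are illustrative assumptions.

```python
import random


def simulate_zero_frequency(p0: float, r: int, t: int, seed: int = 0) -> float:
    """Empirical frequency of 0 among t bits, each the XOR of r biased bits."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    zeros = 0
    for _ in range(t):
        s = 0
        for _ in range(r):
            s ^= 0 if rng.random() < p0 else 1
        zeros += (s == 0)
    return zeros / t


p0, r = 0.7, 3
predicted = 0.5 * (1.0 + (2.0 * p0 - 1.0) ** r)   # 1/2 + 1/2 (p0 - p1)^r
observed = simulate_zero_frequency(p0, r, t=200_000)
```

With these illustrative values the predicted probability is about 0.532, and the observed frequency settles close to it as t grows.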
A good check matrix must have all 1s in its first and last rows, so $n_1, n_2 \ge 4$. Since $0 < |p_0 - p_1| < 1$, a sufficiently small $|p_0 - p_1|$ ensures that the frequency of 0 in the sequence extracted by the pseudo steganographic key is close to 0.5. The size of the difference $(p_0 - p_1)$ between the probabilities of 0 and 1 in the stego sequence is therefore studied below for spatial-domain and JPEG-domain adaptive steganography in turn.
(2.3) For spatial-domain adaptive steganography, the cover (stego) sequence consists of the LSBs of the pixels of the cover (stego) image. Since the LSBs of spatial-domain image pixels behave as random noise, the probabilities of 0 and 1 in both the cover sequence and the stego sequence are close to:
pr(y_i = 0) = pr(y_i = 1) = 1/2 (12)
Applying Theorem 1 then gives:
pr(m_i = 0) − pr(m_i = 1) = 0 (13)
where m_i is the i-th bit of the sequence extracted by the pseudo steganographic key; that is, the frequencies of 0 and 1 in the sequence extracted by the pseudo steganographic key are close to balanced.
For JPEG-domain adaptive steganography, the cover (stego) sequence consists of the non-zero alternating current (AC) coefficients among the quantized DCT coefficients (hereinafter, DCT coefficients) of the cover (stego) image. The DCT coefficients of a JPEG image follow a Laplacian distribution. The probabilities of 0 and 1 in the LSBs of the non-zero AC coefficients of the cover image are estimated from the distribution of the cover image DCT coefficients.
Since the cover image DCT coefficients approximately obey a Laplacian distribution with location parameter 0, the probability density function of the cover image DCT coefficients can be expressed as:
Let h(0) denote the frequency with which DCT coefficients of value 0 appear in the cover image. Thus:
DCT coefficients of large absolute value occur only rarely in the cover image. For ease of calculation, therefore, the value range of the cover image DCT coefficients is extended to (−∞, +∞) in the following calculation.
Let α_0 and α_1 denote the probabilities of 0 and 1 in the LSBs of the cover image DCT coefficients, respectively; that is, α_0 is the sum of the probabilities of the even DCT coefficient values in the cover image and α_1 is the sum of the probabilities of the odd DCT coefficient values. Then:
Let β_0 and β_1 denote the probabilities of 0 and 1 in the LSBs of the non-zero DCT coefficients of the cover image, respectively. Then:
Let β′_0 and β′_1 denote the probabilities of 0 and 1 in the LSBs of the non-zero AC coefficients of the cover image, respectively. Then:
β′_0 ≈ β_0, β′_1 ≈ β_1 (20)
If the cover sequence is used to estimate the difference between the probabilities of 0 and 1 in the stego sequence, the difference between the probabilities of 0 and 1 in the sequence extracted by a wrong check matrix is $(\beta'_0 - \beta'_1)^{r}$, where r is the number of 1s in the basic row vector. A good check matrix must have all 1s in its first and last rows, so r ≥ 4. The difference between the probabilities of 0 and 1 in the sequence extracted by the pseudo steganographic key is then small, and the two probabilities are close to equal. Note that Theorem 1 holds for any j ≥ h, so the above derivation also holds for the subsequences extracted by the pseudo steganographic key.
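The size of this difference can be illustrated with a short calculation. The sketch below is an assumption-laden model rather than the patent's own formulas, which are not reproduced in this text: it assumes a zero-mean Laplacian density with scale b, treats each integer coefficient k as the mass of the interval [k − 0.5, k + 0.5], and takes the LSB as the parity of |k|. The scale b = 2.0 and all function names are hypothetical.

```python
import math


def nonzero_lsb_probabilities(b: float, k_max: int = 2000):
    """Return (beta0, beta1): LSB probabilities over non-zero coefficients.

    Assumes a rounded zero-mean Laplacian with scale b; by symmetry only
    positive k are summed, since negative k contribute the same masses.
    """
    def mass(k: int) -> float:
        # Probability that a coefficient rounds to +k (k >= 1).
        return 0.5 * (math.exp(-(k - 0.5) / b) - math.exp(-(k + 0.5) / b))

    even = sum(mass(k) for k in range(2, k_max + 1, 2))
    odd = sum(mass(k) for k in range(1, k_max + 1, 2))
    total = even + odd
    return even / total, odd / total


b = 2.0                                   # hypothetical Laplacian scale
beta0, beta1 = nonzero_lsb_probabilities(b)
closed_form_beta1 = 1.0 / (1.0 + math.exp(-1.0 / b))  # geometric-series sum
r = 4                                     # smallest row weight named in the text
bias = (beta0 - beta1) ** r               # 0/1 difference after extraction
pr_zero = 0.5 * (1.0 + bias)              # probability of 0 in the extracted bits
```

Under this model the non-zero coefficients are visibly unbalanced (odd values dominate because ±1 is the most frequent non-zero value), yet the fourth power of the difference is already well below 0.01, so the extracted bits are close to balanced.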
Section (1) concludes that the probabilities of 0 and 1 are unbalanced at the different bit positions of the subsequences extracted by the true steganographic key; section (2) concludes that the frequencies of 0 and 1 in the subsequences extracted by a pseudo steganographic key are close to balanced. Together these establish that the sequences extracted by the true and pseudo steganographic keys have different probability distributions.
Example 1
On the basis of the premise established above, and as shown in FIG. 1, the embodiment of the invention provides a secret information extraction method based on the UMP test, which comprises the following steps:
S101: Estimate the length m of the secret information from the pixels or the embeddable DCT coefficients of the stego image.
Specifically, the secret information length m may be estimated with the method of the document "J. Kodovský, J. Fridrich, 'Quantitative Steganalysis using rich models,' in: Proceedings of SPIE, Electronic Imaging, Media Watermarking, Security, and Forensics XV, San Francisco, CA, 2013, vol. 8665, pp. 866500", which is not described in detail here.
S102: Given a Type I error rate α and a Type II error rate β, calculate the sample size N and the threshold T, where the Type I error rate α is the probability of judging the true steganographic key to be a pseudo steganographic key, and the Type II error rate β is the probability of judging a pseudo steganographic key to be the true steganographic key.
Specifically, let the sequence extracted by the true steganographic key be l. Starting from the j-th bit (j = 1, 2, …, 8) of its first byte, one bit is sampled at intervals of 7 bits, giving a subsequence $l^j$ of length $n_0$ whose i-th bit is $l_i^j$. Then $l_i^j$ obeys a two-point distribution with probability function:
$$pr(l_i^j = x) = \begin{cases} \chi_0, & x = 0 \\ \chi_1, & x = 1 \end{cases} \qquad (21)$$
where χ_0, χ_1 ≠ 0.5.
The sequence extracted by a pseudo steganographic key is approximately balanced in 0 and 1. For this sequence, correspondingly, starting from the j-th bit (j = 1, 2, …, 8) of the first byte, one bit is sampled at intervals of 7 bits, $n_0$ bits in total, giving a subsequence $\tilde{l}^j$ of length $n_0$ whose i-th bit $\tilde{l}_i^j$ has probability function:
$$pr(\tilde{l}_i^j = x) = \begin{cases} \gamma_0, & x = 0 \\ \gamma_1, & x = 1 \end{cases} \qquad (22)$$
where γ_0, γ_1 ≈ 0.5.
Based on this statistical difference, the problem of distinguishing the true steganographic key from pseudo keys can be converted into a hypothesis testing problem on the distribution of the sequence:
H_0: D = D_0, H_1: D = D_1 (23)
where D is the overall distribution function of the sample, D_0 is the distribution function under the true steganographic key, and D_1 is the distribution function under a pseudo steganographic key.
Let $r_i$ ($i = 1, 2, \dots, n_0$) denote the quantity computed from the i-th sampled bit. Suppose that when H_0 holds the expectation and variance of $r_i$ are μ_0 and σ_0, and that when H_1 holds they are μ_1 and σ_1; these are given by formulas (24) to (27), respectively:
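The definition of r_i and formulas (24) to (27) are not legible in this text. As a hedged illustration only, the sketch below assumes that r_i is the per-bit log-likelihood ratio ln(pr(x | H_0) / pr(x | H_1)), the usual building block of a most powerful test between two simple hypotheses, and computes its mean and standard deviation under each hypothesis. The function name and the values χ_0 = 0.6, γ_0 = 0.5 are hypothetical, and the function returns standard deviations, whereas the text speaks of variances.

```python
import math


def llr_moments(chi0: float, gamma0: float = 0.5):
    """Mean and standard deviation of the assumed r_i under H0 and H1.

    Assumption: r_i = ln(chi_x / gamma_x) for an observed bit x in {0, 1}.
    """
    chi1, gamma1 = 1.0 - chi0, 1.0 - gamma0
    r_if_zero = math.log(chi0 / gamma0)   # value of r_i when the bit is 0
    r_if_one = math.log(chi1 / gamma1)    # value of r_i when the bit is 1

    def mean_std(p_zero: float):
        mean = p_zero * r_if_zero + (1.0 - p_zero) * r_if_one
        var = (p_zero * (r_if_zero - mean) ** 2
               + (1.0 - p_zero) * (r_if_one - mean) ** 2)
        return mean, math.sqrt(var)

    mu0, sigma0 = mean_std(chi0)      # bits drawn under H0 (true key)
    mu1, sigma1 = mean_std(gamma0)    # bits drawn under H1 (pseudo key)
    return mu0, sigma0, mu1, sigma1


mu0, sigma0, mu1, sigma1 = llr_moments(chi0=0.6, gamma0=0.5)
```

Under this assumption μ_0 is positive and μ_1 is negative, so sums of r_i drift in opposite directions for the true key and for pseudo keys, which is what makes a threshold test possible.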
On this basis, as one implementation, step S102 mainly comprises the following sub-steps:
S1021: Calculate the critical values a and b from the given Type I error rate α and Type II error rate β.
Specifically, as noted above, the Type I error rate α is the probability of judging the true steganographic key to be a pseudo steganographic key, and the Type II error rate β is the probability of judging a pseudo steganographic key to be the true steganographic key. Making α small enough therefore ensures that the probability of missing the true steganographic key is sufficiently small. In addition, the expected number of pseudo steganographic keys that are accepted should not exceed 1, that is, β|K| ≤ 1, or β ≤ 1/|K|; β = 1/|K| is generally taken, where |K| is the number of elements in the steganographic key space.
Thus, taking β = 1/|K| and choosing a suitable value of α, the critical values a and b satisfying Φ(a) = α and Φ(b) = 1 − β are calculated.
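This sub-step can be sketched with the Python standard library, assuming that Φ is the standard normal cumulative distribution function (the text does not define Φ explicitly). The function name and the example values α = 0.01 and |K| = 2^20 are hypothetical.

```python
from statistics import NormalDist


def critical_values(alpha: float, key_space_size: int):
    """Return (a, b, beta) with Phi(a) = alpha and Phi(b) = 1 - beta.

    beta is set to 1/|K| so that the expected number of accepted
    pseudo keys, beta * |K|, does not exceed 1.
    """
    beta = 1.0 / key_space_size
    std_normal = NormalDist()            # mean 0, standard deviation 1
    a = std_normal.inv_cdf(alpha)
    b = std_normal.inv_cdf(1.0 - beta)
    return a, b, beta


a, b, beta = critical_values(alpha=0.01, key_space_size=2 ** 20)
```

Because α is small and 1 − β is close to 1, a comes out negative and b positive; a larger key space pushes b further out and therefore demands a larger sample.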
S1022: the sample size N and the threshold T are calculated according to equation (28):
specifically, in the above process, the expiration μ has been counted 0 and σ0 ,μ 1 and σ1 The threshold values a, b are calculated by using the formula (28) to obtain the sample capacity N and the threshold value T.
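Equation (28) is not legible in this text. The sketch below shows one standard way to obtain N and T that is consistent with the conditions Φ(a) = α and Φ(b) = 1 − β: it assumes the statistic is the sum of N independent r_i, approximates that sum as normal, accepts a key as true when the sum is at least T, and treats σ_0 and σ_1 as standard deviations. These are assumptions, not the patent's stated formula, and the numeric inputs are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_and_threshold(a, b, mu0, sigma0, mu1, sigma1):
    """Solve  N*mu0 + a*sigma0*sqrt(N) = T = N*mu1 + b*sigma1*sqrt(N).

    The first equality fixes the Type I error at alpha, the second the
    Type II error at beta; N is rounded up to an integer.
    """
    root_n = (b * sigma1 - a * sigma0) / (mu0 - mu1)
    n = math.ceil(root_n ** 2)
    threshold = n * mu0 + a * sigma0 * math.sqrt(n)
    return n, threshold


# Illustrative inputs (hypothetical, not taken from the patent).
a_ex, b_ex = -2.326, 4.76
mu0_ex, sigma0_ex, mu1_ex, sigma1_ex = 0.0201, 0.1986, -0.0204, 0.2027
N, T = sample_size_and_threshold(a_ex, b_ex, mu0_ex, sigma0_ex, mu1_ex, sigma1_ex)

# Error rates implied by the normal approximation.
phi = NormalDist().cdf
type1 = phi((T - N * mu0_ex) / (sigma0_ex * math.sqrt(N)))
type2 = 1.0 - phi((T - N * mu1_ex) / (sigma1_ex * math.sqrt(N)))
```

Rounding N up keeps the Type I error at Φ(a) and can only lower the Type II error below 1 − Φ(b), so the rounding errs on the safe side.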
S103: each check matrix in the exhaustive check matrix space sequentially performs a statistic calculation process, wherein the statistic calculation process comprises the following steps: extracting a sequence from the encrypted image by using a check matrix enumerated at the current moment, and sampling at intervals of 7 bits from the j-th bit of the first byte of the sequence to obtain a subsequence with the length of NCalculating to obtain statistics->Is a value of (2); generally, j=2;
specifically, consistent most advantageous test statistics are constructed according to equation (29)
wherein ,n long sequence +.>Number of occurrences of 0, < >>N long sequence +.>1 in the number of occurrences of (1).
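Equation (29) is not legible in this text; all that survives is that the statistic depends on the counts of 0 and 1 in the length-N subsequence. As a hedged illustration, the sketch below uses a log-likelihood-ratio statistic built from those two counts, which is the natural count-based form under the two-point distributions described earlier. The function name and the parameter values are assumptions.

```python
import math


def llr_statistic(bits, chi0: float, gamma0: float = 0.5) -> float:
    """Assumed statistic: n_zero*ln(chi0/gamma0) + n_one*ln(chi1/gamma1).

    bits   : subsequence of 0/1 values sampled from the extracted sequence
    chi0   : pr(bit = 0) under the true key (taken from plaintext statistics)
    gamma0 : pr(bit = 0) under a pseudo key (close to 0.5)
    """
    n_zero = bits.count(0)
    n_one = len(bits) - n_zero
    return (n_zero * math.log(chi0 / gamma0)
            + n_one * math.log((1.0 - chi0) / (1.0 - gamma0)))


# A subsequence matching the true-key bias scores higher than a balanced one.
true_like = [0] * 60 + [1] * 40
pseudo_like = [0] * 50 + [1] * 50
score_true = llr_statistic(true_like, chi0=0.6)
score_pseudo = llr_statistic(pseudo_like, chi0=0.6)
```

With this form, larger values favour the true key, which matches the later step of selecting the check matrix whose statistic is largest.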
S104: judging whether N > m is true, if so, turning to step 7; otherwise, turning to step 5;
s105: judging the corresponding check matrix of eachIf so, storing the corresponding check matrix into the key alternative set B; otherwise, discarding the corresponding check matrix;
s106: when judging that all check matrix corresponds toThen, if |b|=1, the check matrix in B is the true steganographic key, and the extraction is successful; if |b|=0, the extraction fails; if |B|>1, making the check matrix space be B, and turning to step S107;
s107: will enableThe check matrix reaching the maximum value is stored in the key alternative set D; if |D|=1, the check matrix in D is the true steganography key, and the extraction is successful; if |D|>1, extraction failure.
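The decision logic of steps S104 to S107 can be sketched as follows. The exact comparison against T is not legible in this text, so the sketch assumes a key passes when its statistic is at least T; the function name, the dictionary interface, and the key labels are hypothetical.

```python
def screen_keys(stat_by_key: dict, n: int, m: int, threshold: float):
    """Return the selected key, or None when extraction fails.

    stat_by_key : maps each enumerated check matrix (any hashable id)
                  to the value of its test statistic
    n, m        : required sample size and estimated message length
    """
    candidates = dict(stat_by_key)
    if not n > m:                                   # S104: threshold test usable
        candidates = {k: v for k, v in candidates.items()
                      if v >= threshold}            # S105 (">=" is an assumption)
        if len(candidates) == 1:                    # S106: unique survivor
            return next(iter(candidates))
    if not candidates:                              # S106: |B| = 0
        return None
    best = max(candidates.values())                 # S107: maximum statistic
    winners = [k for k, v in candidates.items() if v == best]
    return winners[0] if len(winners) == 1 else None


picked = screen_keys({"k1": 5.0, "k2": -3.0, "k3": 0.5},
                     n=100, m=1000, threshold=1.0)
```

When N exceeds the estimated message length m, too few bits are available for the threshold test, so the procedure falls back directly to picking the unique maximum.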
The secret information extraction method provided by the invention is irrelevant to a specific distortion function adopted by the STC-based adaptive steganography, and is applicable to any adaptive steganography algorithm adopting the STC. The airspace steganography algorithm HUGO "T.T.Filler,P.Bas.“Using High-Dimensional Image Models to Perform Highly Undetectable Steganography,”In:Proceedings of the 12th International Workshop on Information Hiding (IH), calgary, canada,2010, pp.161-177' applies STC to adaptive steganography, which has led researchers to pay attention to STC and has proposed many improvements. JPEG domain steganography algorithm J-UNIWARD "V.Holub, J.Fridrich." Digital image steganography using universal distortion, "In: proceedings of the 1st ACM Information Hiding and Multimedia Security Workshop (IH&MMSec), montallier, france,2013, pp.59-68 "is the sum of the relative changes of coefficients in the directional filter bank decomposition of the carrier image. This directionality allows the embedded change regions to be concentrated in areas that are difficult to model in multiple directions, with strong resistance to detection.
In order to verify the effectiveness of the secret information extraction method provided by the invention, the invention also provides the following experimental data.
(I) Experimental objects and experimental environment
The HUGO and J-UNIWARD steganography algorithms are selected as the experimental objects. The experimental environment is as follows: the operating system is Microsoft Windows 10, the CPU is an Intel i5, the memory is 8 GB, and the programming language is MATLAB.
(II) Experimental setup
In the experiment, 80 spatial-domain carrier images are randomly selected from the BOSSBase_1.01 library and then converted into JPEG-domain carrier images with Photoshop at a quality factor of 90. The resulting 160 carrier images are divided into 8 groups of 20 images each, denoted $G_1, G_2, \ldots, G_8$, where $G_1, G_2, G_3, G_4$ are spatial-domain carrier images and $G_5, G_6, G_7, G_8$ are JPEG-domain carrier images. The experiments produced a total of 800 stego images at embedding rates of 0.5 bpp, 0.4 bpp, 0.3 bpp, 0.2 bpp, and 0.1 bpp. The experiments were performed with a sub-check matrix height of 7.
The sub-check matrices adopted in the experiment are: [109,71], [109,79,83], [89,127,99,69], [95,75,121,71,109], [95,107,109,79,117,67,121,123,103,81].
(III) results of experiments
The experimental results are organized as follows. Section (1) studies the probability distribution of each bit of the plaintext message "I Have a Dream", and then the 0/1 frequency distribution of the pixel LSBs of the selected spatial-domain carrier images and of the LSBs of the embeddable DCT coefficients of the JPEG-domain carrier images. Section (2) studies the size of the steganographic key space and the digital characteristics of the statistic. Section (3) first verifies that 0 and 1 occur with approximately equal frequency in the sequences extracted with pseudo steganographic keys, and then, based on Section (2), calculates the hypothesis test statistics to obtain the correct steganographic key.
(1) Probability distribution of plaintext and pixel (DCT) coefficients
First, the probability distribution of each bit of the plaintext message "I Have a Dream" is verified. The results are shown in FIG. 2: for each subsequence, the right column indicates the frequency of 1 and the left column indicates the frequency of 0. The frequency errors are largest at bit 2 and bit 6, but still within 0.1; the frequency errors of the remaining bits are within 0.06. The errors at bit 2 and bit 6 are larger because the sample size is too small.
Second, the frequency distribution of the LSBs of the selected spatial-domain carrier image pixels and of the LSBs of the embeddable DCT coefficients of the JPEG-domain carrier images is studied. The experimental results are shown in FIG. 3, where the upper right corner shows the JPEG image groups and the lower left corner shows the spatial-domain image groups. As can be seen from FIG. 3, for the spatial-domain groups $G_1, G_2, G_3, G_4$ the frequency of 1 in the pixel LSBs is stable at approximately 0.5; for the JPEG-domain groups $G_5, G_6, G_7, G_8$ the frequency of 1 in the LSBs of the embeddable DCT coefficients is stable at approximately 0.7.
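The per-bit frequencies discussed above can be measured with a few lines of code. The sketch below is only an illustration under stated assumptions: the function name `bit_one_frequencies` is hypothetical, bit position 1 is taken to be the most significant bit of each byte, and the short byte string stands in for the actual plaintext used in the experiment, which is not reproduced here.

```python
def bit_one_frequencies(message: bytes) -> list:
    """Frequency of 1 at each of the 8 bit positions of the message bytes.

    Position 1 is assumed to be the most significant bit of each byte.
    """
    if not message:
        raise ValueError("message must not be empty")
    n = len(message)
    return [
        sum((byte >> (8 - j)) & 1 for byte in message) / n
        for j in range(1, 9)
    ]


# Hypothetical stand-in for the plaintext message.
freqs = bit_one_frequencies(b"I have a dream")
```

For ASCII text the first position is always 0, so `freqs[0]` is 0.0; positions whose frequency is exactly 0 or 1 carry no usable statistical contrast of their own.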
(2) Steganographic key space size and digital features
The size of the steganographic key space is calculated as follows. For a sub-check matrix of height $h$ and width $w$, the number of all possible sub-check matrices is $2^{hw}$. A good sub-check matrix should have all ones in its first and last rows and no two identical columns. When the embedding rate $\alpha$ is the reciprocal of an integer, the check matrix is formed by a single sub-check matrix of width $w = 1/\alpha$, and the size of the steganographic key space is

$$|K| = \prod_{i=0}^{w-1} \left(2^{h-2} - i\right).$$

For $h = 7$ and $w = 2$ this gives $32 \times 31 = 992$. When the embedding rate $\alpha$ is not the reciprocal of an integer, the check matrix is formed by two sub-check matrices of widths $w_1 = \lfloor 1/\alpha \rfloor$ and $w_2 = \lceil 1/\alpha \rceil$, and the size of the steganographic key space is

$$|K| = \prod_{i=0}^{w_1-1} \left(2^{h-2} - i\right) \cdot \prod_{i=0}^{w_2-1} \left(2^{h-2} - i\right).$$
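The counting rule for good sub-check matrices can be checked numerically. The sketch below assumes that each column has its first and last entries fixed to 1 (leaving $2^{h-2}$ choices), that columns are pairwise distinct and ordered, and that for a non-reciprocal embedding rate the two sub-check matrices have widths equal to the floor and ceiling of $1/\alpha$. The function names are hypothetical, and the rate is passed as a `Fraction` to avoid floating-point rounding.

```python
from fractions import Fraction
from math import ceil, floor, prod


def sub_matrix_count(h: int, w: int) -> int:
    """Number of ordered choices of w distinct columns of height h
    whose first and last entries are 1."""
    columns = 2 ** (h - 2)
    return prod(columns - i for i in range(w))


def key_space_size(h: int, alpha: Fraction) -> int:
    """Key space size for sub-check matrix height h and embedding rate alpha."""
    inverse = 1 / alpha
    if inverse.denominator == 1:
        # alpha is the reciprocal of an integer: one sub-check matrix.
        return sub_matrix_count(h, inverse.numerator)
    # Otherwise two sub-check matrices (assumed widths floor and ceil of 1/alpha).
    return sub_matrix_count(h, floor(inverse)) * sub_matrix_count(h, ceil(inverse))


size_half = key_space_size(7, Fraction(1, 2))    # 992
size_03 = key_space_size(7, Fraction(3, 10))     # about 2.57e10
```

Under these assumptions the value for 0.5 bpp matches the key space size of 992 used in the experiments, and the value for 0.3 bpp is of the order $2.5 \times 10^{10}$.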
Table 3 shows the size of the steganographic key space at different embedding rates. The sub-check matrix height in this experiment is 7. When the embedding rate is 0.05 bpp, the steganographic key space is smallest, about $10^3$. When the embedding rate is 0.3 bpp, the steganographic key space is largest, about $2.5 \times 10^{10}$. To recover the steganographic key in a reasonable time, the cardinality of the steganographic key space is taken as $|K| = 10^3$, the first-class error rate as $\alpha = 0.01$, and the second-class error rate as $\beta = 1/|K| = 10^{-3}$. The corresponding critical values are $a = -2.33$ and $b = 3.09$.
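The critical values can be reproduced from the two error rates, assuming that $\phi$ denotes the standard normal distribution function and that $a$ and $b$ are defined by $\phi(a) = \alpha$ and $\phi(b) = 1 - \beta$, as stated in the claims. The function name `critical_values` is hypothetical; `statistics.NormalDist` is part of the Python standard library.

```python
from statistics import NormalDist


def critical_values(alpha: float, beta: float) -> tuple:
    """Return (a, b) with Phi(a) = alpha and Phi(b) = 1 - beta,
    where Phi is assumed to be the standard normal CDF."""
    standard_normal = NormalDist()
    a = standard_normal.inv_cdf(alpha)
    b = standard_normal.inv_cdf(1 - beta)
    return a, b


a, b = critical_values(alpha=0.01, beta=1 / 10**3)
print(round(a, 2), round(b, 2))  # -2.33 3.09
```

Under this reading of $\phi$, $\beta = 10^{-3}$ corresponds to the 0.999 quantile of the standard normal distribution.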
Table 3 steganographic key space size at different embedding rates
From formulas (24)-(27), the expectation and variance of $R_i$, $i = 1, 2$, the sample size, and the threshold can be calculated. Table 4 shows the digital characteristics of the statistic for each selected bit. At the most significant bit and the third bit the bit distribution is degenerate (one of the two bit values occurs with probability 0), so formulas (24)-(27) are meaningless there and these bits are not considered. As can be seen from Table 4, the required sample size is smaller when the second bit is selected, because the probability distributions of the subsequences extracted with the true and pseudo steganographic keys differ more at that bit; in general, the larger this difference, the smaller the required sample size.
Table 4 digital characteristics of statistics at different bits
As can be seen from Table 1, the probability p of 0 occurrence in the plaintext message is
The various digital features of the statistics when plaintext is selected are shown in table 5. The sample size required at this time is large because the difference between the two probability distributions to be distinguished is small.
Table 5 digital characteristics of statistics in plaintext
(3) Steganographic key recovery result and method comparison
Six carrier images are randomly selected from the 80 carrier images used in the experiment to present the results; they are shown in FIG. 4 and FIG. 5. The frequency distribution of the subsequences extracted with pseudo steganographic keys is studied first. Taking the second bit as an example, all pseudo steganographic keys are enumerated, and the sequence extracted with each pseudo steganographic key is sampled at intervals of 7 bits starting from the second bit of the first byte to obtain a subsequence. When the embedding rate is 0.5 bpp (bpnzAC) and the sub-check matrix height is 7, the steganographic key space size is 992 and the pseudo steganographic key space size is 991. FIG. 6 shows the frequency of 1 in the subsequences extracted with pseudo steganographic keys, where the abscissa represents the pseudo steganographic keys and the ordinate represents the frequency of 1 in the subsequence. Abscissas 1, 2, 3, …, 991 represent the steganographic keys [65,67], [65,69], [65,71], …, [127,125], respectively. As can be seen from FIG. 6, at an embedding rate of 0.5 bpp (bpnzAC), the frequency of 1 in the subsequences extracted with pseudo steganographic keys is approximately 0.5. The results at other embedding rates and for other stego images are similar. This verifies the earlier conclusion that the subsequences extracted with the true steganographic key and with pseudo steganographic keys have different frequency distributions.
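The subsequence sampling described above can be sketched as follows. The sketch assumes that "sampling at intervals of 7 bits" means skipping 7 bits between consecutive samples, so that the subsequence collects the same bit position of every byte; bits are assumed to be ordered most significant bit first. The helper names and the sample input are hypothetical.

```python
def bytes_to_bits(data: bytes) -> list:
    """Flatten bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def sample_subsequence(bits: list, j: int, n: int) -> list:
    """Take n bits starting at the j-th bit (1-based) of the first byte,
    skipping 7 bits between consecutive samples (assumed stride of 8)."""
    return bits[j - 1::8][:n]


def frequency_of_one(subsequence: list) -> float:
    """Relative frequency of 1 in a non-empty bit subsequence."""
    return sum(subsequence) / len(subsequence)


bits = bytes_to_bits(b"dream")              # hypothetical extracted sequence
second_bits = sample_subsequence(bits, 2, 5)
```

For a sequence extracted with a pseudo key, the frequency returned by `frequency_of_one` would be expected to lie near 0.5, as FIG. 6 reports; for lowercase ASCII text, as in this toy input, the second bit is always 1.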
Building on the results of FIG. 6, the steganographic key is recovered with the secret information extraction method provided by the invention. Denote the $n_0$-th bit of the subsequence $l_2$ by $x$. When the steganographic key is the true key, $x$ obeys a two-point distribution with the probability density function

$$P(x = 0) = \chi_0, \qquad P(x = 1) = \chi_1, \qquad \chi_0, \chi_1 \neq 0.5.$$

From the above experiments, the probability density function of the subsequence extracted with a pseudo steganographic key is

$$P(x = 0) = \gamma_0, \qquad P(x = 1) = \gamma_1,$$

where $\gamma_0, \gamma_1 \approx 0.5$.
Based on this statistical difference, the problem of distinguishing the true steganographic key from pseudo steganographic keys can be converted into a hypothesis testing problem on the sequence distribution:
$H_0: D = D_0, \quad H_1: D = D_1$
A test statistic is constructed from the number of occurrences of 0 in the length-$N$ sequence. This statistic is the consistent most advantageous (uniformly most powerful) test statistic, that is, the test statistic with the smallest probability of error.
When the statistic satisfies the threshold condition with respect to $T$, hypothesis $H_0$ is accepted and the check matrix $H_i$ is stored in the key candidate set as a candidate true steganographic key; otherwise, hypothesis $H_1$ is accepted and $H_i$ is discarded.
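As an illustration only, one standard statistic for separating two two-point distributions is the log-likelihood ratio of the observed counts. It is assumed here for the sake of a concrete example and may differ from the statistic defined by the invention; the function name and the numeric values of $\chi_0$ and the counts are hypothetical.

```python
from math import log


def log_likelihood_ratio(n0: int, n1: int, chi0: float, gamma0: float = 0.5) -> float:
    """Log-likelihood ratio of 'true key' (chi) against 'pseudo key' (gamma)
    for a subsequence containing n0 zeros and n1 ones.

    chi0 and gamma0 are the probabilities of a 0 under each hypothesis
    and must lie strictly between 0 and 1.
    """
    chi1 = 1.0 - chi0
    gamma1 = 1.0 - gamma0
    return n0 * log(chi0 / gamma0) + n1 * log(chi1 / gamma1)


# Hypothetical counts for a subsequence of length 44 with chi0 = 0.8.
skewed = log_likelihood_ratio(35, 9, chi0=0.8)     # positive: favours the true-key model
balanced = log_likelihood_ratio(22, 22, chi0=0.8)  # negative: favours the pseudo-key model
```

A statistic of this form grows when the observed counts resemble the true-key distribution, which is consistent with comparing it against a threshold and, when the sample is too short for that, keeping the key with the largest value.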
FIG. 7 shows the statistic values of the sequences extracted with each steganographic key for spatial-domain and JPEG-domain stego images; the ordinate represents the statistic value and the abscissa represents the steganographic key. When the embedding rate is 0.5 bpp (bpnzAC), the steganographic key space size is 992, and abscissas 1, 2, 3, …, 992 represent the steganographic keys [65,67], [65,69], [65,71], …, [127,125], respectively. As can be seen from Table 4, the threshold is then $T = 8.17$ and the sample size is $N = 44$. All possible steganographic keys are enumerated and the statistic value is calculated for each. The results in FIG. 8 show that the original hypothesis is accepted only at abscissa 686, whose corresponding steganographic key is [109,71]; [109,71] can therefore be considered the correct steganographic key.
At embedding rates of 0.5, 0.4, 0.3, 0.2, and 0.1 bpp (bpnzAC), for both spatial-domain and JPEG-domain stego images, either the statistic value of the correct steganographic key is the only one that satisfies the threshold condition, or the statistic value of the correct steganographic key is the maximum. In every case exactly one steganographic key is identified, and the accuracy reaches 100%.
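The mapping between abscissa and steganographic key can be reproduced by enumerating the key space. The sketch assumes that each integer encodes one column of the sub-check matrix in binary, that good columns are the odd integers with the top bit set (first and last entries equal to 1), and that keys are ordered lexicographically as ordered pairs of distinct columns. The function name is hypothetical.

```python
from itertools import permutations


def enumerate_keys(h: int = 7, w: int = 2) -> list:
    """List all sub-check matrices of height h and width w as tuples of
    column integers, in the assumed lexicographic order."""
    # Columns with first and last bit 1: odd integers in [2^(h-1) + 1, 2^h - 1].
    columns = list(range(2 ** (h - 1) + 1, 2 ** h, 2))
    return list(permutations(columns, w))


keys = enumerate_keys()
position = keys.index((109, 71)) + 1  # 1-based abscissa of the key [109,71]
print(len(keys), keys[0], keys[-1], position)  # 992 (65, 67) (127, 125) 686
```

Under these assumptions the enumeration starts at [65,67], ends at [127,125], and places [109,71] at abscissa 686, in agreement with the values quoted in the text.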
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (1)
1. The secret information extraction method based on the consistent most advantageous test is characterized by comprising the following steps:
step 1: estimating the length m of the secret information according to the pixels or the embeddable DCT coefficients of the stego image;
step 2: given a first-class error rate α and a second-class error rate β, calculating a sample size N and a threshold T; wherein the first-class error rate α represents the probability of judging a true steganographic key to be a pseudo steganographic key, and the second-class error rate β represents the probability of judging a pseudo steganographic key to be a true steganographic key; step 2 specifically comprises the following steps:
step 2.1: according to the given first-class error rate α and second-class error rate β, calculating the critical values a and b;
step 2.2: calculating the sample size N and the threshold T according to equation (28);
wherein a and b satisfy $\phi(a) = \alpha$ and $\phi(b) = 1 - \beta$; $\mu_0$ and $\sigma_0$ respectively represent the expectation and variance of $R_i$ when $H_0$ holds, and $\mu_1$ and $\sigma_1$ respectively represent the expectation and variance of $R_i$ when $H_1$ holds; $H_0$ and $H_1$ represent the hypotheses of the hypothesis testing problem, denoted $H_0: F = F_0$, $H_1: F = F_1$, where $F$ represents the population distribution function of the sample, $F_0$ represents the distribution function under the true steganographic key, and $F_1$ represents the distribution function under a pseudo steganographic key; let the sequence extracted with the steganographic key be $l$; starting from the $j$-th bit, $j = 1, 2, \ldots, 8$, of the first byte, one bit is sampled every 7 bits, $n_0$ bits in total, to obtain the $j$-th subsequence; for the $i$-th bit of the $j$-th subsequence extracted with the true steganographic key, the probability density function takes the values $\chi_0$ and $\chi_1$, with $\chi_0, \chi_1 \neq 0.5$; for the $i$-th bit of the subsequence extracted with a pseudo steganographic key, the probability density function takes the values $\gamma_0$ and $\gamma_1$, with $\gamma_0, \gamma_1 \approx 0.5$; $i = 1, 2, \ldots, n_0$;
Step 3: constructing the consistent most advantageous test statistic according to equation (29), enumerating every check matrix in the check matrix space, and performing a statistic calculation process for each in turn, wherein the statistic calculation process comprises: extracting a sequence from the stego image with the currently enumerated check matrix, sampling at intervals of 7 bits starting from the j-th bit of the first byte of the sequence to obtain a subsequence of length N, and calculating the value of the statistic;
wherein the two counts in the statistic are the number of occurrences of 0 and the number of occurrences of 1 in the subsequence;
Step 4: determining whether N > m holds; if so, going to step 7; otherwise, going to step 5;
step 5: for each check matrix, determining whether its statistic satisfies the threshold condition with respect to T; if so, storing the corresponding check matrix in the key candidate set B; otherwise, discarding the corresponding check matrix;
step 6: after the statistics of all check matrices have been tested, if |B| = 1, the check matrix in B is the true steganographic key and the extraction succeeds; if |B| = 0, the extraction fails; if |B| > 1, setting the check matrix space to B and going to step 7;
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210055235.5A CN114630006B (en) | 2022-01-18 | 2022-01-18 | Secret information extraction method based on consistent most advantageous test |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114630006A CN114630006A (en) | 2022-06-14 |
CN114630006B true CN114630006B (en) | 2023-05-26 |
Family
ID=81898699
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210055235.5A Active CN114630006B (en) | 2022-01-18 | 2022-01-18 | Secret information extraction method based on consistent most advantageous test |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114630006B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106530203A (en) * | 2016-10-28 | 2017-03-22 | 武汉大学 | Texture complexity-based JPEG image adaptive steganography method |
CN110086955A (en) * | 2019-04-29 | 2019-08-02 | 浙江工商职业技术学院 | A kind of large capacity image latent writing method |
CN113032813A (en) * | 2021-04-27 | 2021-06-25 | 河南大学 | Reversible information hiding method based on improved pixel local complexity calculation and multi-peak embedding |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
SG139580A1 (en) * | 2006-07-20 | 2008-02-29 | Privylink Pte Ltd | Method for generating cryptographic key from biometric data |
CN103345767B (en) * | 2013-07-02 | 2016-08-10 | 中国科学技术大学 | A kind of JPEG image steganography method of high security |
CN107689026B (en) * | 2017-08-24 | 2020-05-15 | 中国科学技术大学 | Reversible steganography method based on optimal coding |
CN108271027B (en) * | 2018-01-10 | 2020-06-12 | 中国人民解放军战略支援部队信息工程大学 | Method for extracting image self-adaptive secret information |
CN110365864B (en) * | 2018-04-10 | 2020-09-04 | 北京大学 | Image steganography method, image steganography system, computer device and computer-readable storage medium |
CN108717683B (en) * | 2018-05-16 | 2022-03-29 | 陕西师范大学 | Secret pattern camouflage recovery method combining secret key and random orthogonal tensor base |
FR3087557B1 (en) * | 2018-10-18 | 2021-04-30 | Novatec | PRINTING AND AUTHENTICATION PROCESS OF A PRINTED MARKING |
CN112714231A (en) * | 2020-12-28 | 2021-04-27 | 杭州电子科技大学 | Robust steganography method based on DCT (discrete cosine transformation) symbol replacement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||