CN114630006B - Secret information extraction method based on uniformly most powerful test - Google Patents

Info

Publication number
CN114630006B
Authority
CN
China
Prior art keywords
steganographic
key
sequence
check matrix
steganographic key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210055235.5A
Other languages
Chinese (zh)
Other versions
CN114630006A (en)
Inventor
刘九芬
杜寒松
张祎
罗向阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force
Priority to CN202210055235.5A
Publication of CN114630006A
Application granted
Publication of CN114630006B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/44 Secrecy systems
    • H04N1/4446 Hiding of documents or document information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861 Generation of secret information including derivation or calculation of cryptographic keys or passwords

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

The invention provides a secret information extraction method based on the uniformly most powerful (UMP) test. First, the probability distribution of the different bit positions in the sequence extracted by the true steganographic key is studied. Next, the relation between the probability distribution of the sequence extracted by a pseudo (wrong) steganographic key and that of the stego sequence is established, and, by studying the probability distributions of spatial-domain and JPEG-domain stego sequences, it is shown that the subsequences extracted by the true and the pseudo steganographic keys have different probability distributions. Finally, based on this difference, the correct steganographic key is selected with the uniformly most powerful test. Given the probabilities of a type I error (rejecting the true key) and a type II error (accepting a pseudo key), the threshold and the sample size required by the hypothesis test are derived. Experimental results show that the method can recover the steganographic key from stego images produced by common mainstream steganographic algorithms, thereby realizing extraction of the secret information.

Description

Secret information extraction method based on uniformly most powerful test
Technical Field
The invention relates to the technical field of digital steganography, and in particular to a secret information extraction method based on the uniformly most powerful test.
Background
Digital steganography achieves covert communication over open channels by embedding secret information into multimedia files such as digital images, audio and video. Because it conceals the very existence of the secret information, it is highly deceptive, and in recent years it has become a main means of covert communication and a research hot spot in the field of information security. Digital steganalysis studies how to detect and extract secret information hidden in a digital cover; its final aim is to extract the hidden information and thereby verify the correctness of steganography detection. On the one hand, the communicating parties mostly adopt a "hiding + encryption" scheme, so an attacker who wants to obtain the content of the covert communication and collect evidence of it must first extract the hidden ciphertext and only then consider decryption; extraction of the hidden information is therefore an unavoidable problem. On the other hand, most current detection methods decide whether secret information is hidden by analyzing whether the cover has been modified, but a modified cover does not necessarily contain sensitive information, so the validity of such a decision is open to question. Research on secret information extraction is therefore also necessary.
Currently, adaptive steganography with images as the cover, i.e. image adaptive steganography, has become the dominant research direction in steganography. The embedding process of image adaptive steganography typically consists of two parts, a distortion function and a steganographic code, and may be expressed as "adaptive steganography = distortion function + steganographic coding". The distortion function is used to calculate the distortion of the cover image at different positions. Different steganographic algorithms define the distortion function from different angles, the general principle being that its values are smaller in texture-complex and edge regions and larger in smooth regions. Steganographic coding selects the modification positions according to the calculated distortion so as to embed the secret information at minimal distortion cost. Adaptive steganography uses steganographic codes such as matrix coding and wet paper coding. Document 1 (T. Pevný, T. Filler, P. Bas, "Using High-Dimensional Image Models to Perform Highly Undetectable Steganography," in: Proceedings of the 12th International Workshop on Information Hiding (IH), Calgary, Canada, 2010, pp. 161-177) applied STC (Syndrome-Trellis Codes) to image adaptive steganography for the first time. Since then, because its performance in minimizing embedding distortion approaches the theoretical optimum, STC has become the code of choice for adaptive steganographic algorithms, and STC-based adaptive steganography has become both the focus of improvements to steganographic algorithms and the main difficulty in analyzing them.
Most existing secret information extraction methods work only under specific conditions, and extraction methods for the stego-only condition urgently need to be studied. Document 2 (X. Luo, X. Song, X. Li, et al., "Steganalysis of HUGO steganography based on parameter recognition of Syndrome-Trellis-Codes," Multimedia Tools and Applications, 2016, vol. 75, no. 21, pp. 13557-13583) proposes a secret information extraction method for spatial-domain steganography with plaintext embedding under the stego-only condition. The LSBs (least significant bits) of spatial-domain image pixels behave as random noise, so 0 and 1 bits occur with the same frequency. That method assumes that the sequence extracted by a pseudo steganographic key has the same probability distribution as the pixel LSBs of the spatial-domain cover image, i.e. that the frequencies of 0 and 1 bits are nearly equal. However, most pictures transmitted on the Internet are in JPEG format, and 0 and 1 bits do not occur with the same frequency in the embeddable DCT (Discrete Cosine Transform) coefficients of a JPEG image.
Disclosure of Invention
In order to provide a secret information extraction method that is also suitable for JPEG images under the stego-only condition, the invention provides a secret information extraction method based on the uniformly most powerful test.
The secret information extraction method based on the uniformly most powerful test provided by the invention comprises the following steps:
step 1: estimating the length m of the secret information from the pixels or the embeddable DCT coefficients of the stego image;
step 2: given a type I error rate α and a type II error rate β, calculating a sample size N and a threshold T; wherein the type I error rate α is the probability of judging the true steganographic key to be a pseudo steganographic key, and the type II error rate β is the probability of judging a pseudo steganographic key to be the true steganographic key;
step 3: constructing the uniformly most powerful test statistic
Figure BDA0003475929830000021
and exhaustively enumerating the check matrix space, performing the following statistic calculation for each check matrix in turn: extracting a sequence from the stego image using the currently enumerated check matrix, and, starting from the j-th bit of the first byte of the sequence, sampling at intervals of 7 bits to obtain a subsequence of length N,
Figure BDA0003475929830000022
then calculating the value of the statistic
Figure BDA0003475929830000023
for that subsequence;
step 4: judging whether N > m holds; if so, going to step 7; otherwise, going to step 5;
step 5: for each check matrix, judging whether the condition
Figure BDA0003475929830000024
holds; if so, storing the check matrix in the key candidate set B; otherwise, discarding the check matrix;
step 6: after the
Figure BDA0003475929830000031
corresponding to every check matrix has been judged: if |B| = 1, the check matrix in B is the true steganographic key and the extraction succeeds; if |B| = 0, the extraction fails; if |B| > 1, setting the check matrix space to B and going to step 7;
step 7: storing the check matrices for which
Figure BDA0003475929830000032
reaches its maximum value in the key candidate set D; if |D| = 1, the check matrix in D is the true steganographic key and the extraction succeeds; if |D| > 1, the extraction fails.
Further, step 2 specifically includes:
step 2.1: calculating the critical values a and b according to the given type I error rate α and type II error rate β;
step 2.2: calculating the sample size N and the threshold T according to equation (28):
Figure BDA0003475929830000033
where a and b satisfy Φ(a) = α and Φ(b) = 1 − β; μ_0 and σ_0 denote the expectation and variance of R_i when H_0 holds, and μ_1 and σ_1 denote the expectation and variance of R_i when H_1 holds, with
Figure BDA0003475929830000034
i = 1, 2, …, n_0. H_0 and H_1 denote the hypotheses of the test, written H_0: D = D_0, H_1: D = D_1, where D denotes the population distribution function of the sample, D_0 denotes the distribution function under the true steganographic key, and D_1 denotes the distribution function under a pseudo steganographic key. Let the sequence extracted by the steganographic key be l; starting from the j-th bit (j = 1, 2, …, 8) of the first byte and sampling at intervals of 7 bits, n_0 bits in total, gives the subsequence
Figure BDA0003475929830000035
whose i-th bit is
Figure BDA0003475929830000036
i = 1, 2, …, n_0;
Figure BDA0003475929830000037
denotes the probability density function of the i-th bit of the j-th subsequence extracted with the true steganographic key, the subsequence being
Figure BDA0003475929830000038
and its i-th bit being
Figure BDA0003475929830000039
where χ_0, χ_1 ≠ 0.5;
Figure BDA00034759298300000310
denotes the probability density function of the i-th bit of the subsequence extracted with a pseudo steganographic key, the subsequence being
Figure BDA00034759298300000311
and its i-th bit being
Figure BDA00034759298300000312
where γ_0, γ_1 ≈ 0.5; i = 1, 2, …, n_0.
Further, step 3 further includes: constructing the uniformly most powerful test statistic
Figure BDA0003475929830000041
according to equation (29):
Figure BDA0003475929830000042
where
Figure BDA0003475929830000043
denotes the number of occurrences of 0 in the sequence
Figure BDA0003475929830000044
and
Figure BDA0003475929830000045
denotes the number of occurrences of 1 in the sequence
Figure BDA0003475929830000046
The invention has the following beneficial effects:
While protecting data privacy, image adaptive steganography has also become one of the tools used by hostile forces and terrorists to plan and coordinate criminal activities that endanger China's political security and social stability. The secret information extraction method based on the uniformly most powerful test is applicable to both the spatial domain and the JPEG (Joint Photographic Experts Group) domain, and can recover the steganographic key from stego images produced by common mainstream steganographic algorithms under plaintext embedding, thereby realizing extraction of the secret information, which is of great significance for safeguarding China's national security.
Drawings
Fig. 1 is a flow chart of the secret information extraction method based on the uniformly most powerful test according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the probability distribution of each bit of plaintext according to an embodiment of the present invention;
Fig. 3 is the frequency distribution of 0 and 1 bits in the LSBs of cover images according to an embodiment of the present invention;
Fig. 4 is the frequency distribution of 0 and 1 bits in the LSBs of stego images according to an embodiment of the present invention: (a) at an embedding rate of 0.5 bpp or bpnzac; (b) at an embedding rate of 0.4 bpp or bpnzac; (c) at an embedding rate of 0.3 bpp or bpnzac; (d) at an embedding rate of 0.2 bpp or bpnzac; (e) at an embedding rate of 0.1 bpp or bpnzac;
Fig. 5 is a schematic diagram of a spatial-domain cover image according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of a JPEG-domain cover image according to an embodiment of the present invention;
Fig. 7 is the frequency distribution of 0 and 1 bits in the subsequences extracted by pseudo steganographic keys according to an embodiment of the present invention;
Fig. 8 shows the values of the statistic for the sequences extracted by different steganographic keys according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
The core idea of the invention is as follows: the correct steganographic key is selected with the uniformly most powerful test, based on the difference between the probability distribution of the subsequence extracted from the stego image with the true steganographic key and that of the subsequence extracted from the stego image with a pseudo steganographic key.
The technical scheme of the invention rests on the premise that the sequences extracted by the true and the pseudo steganographic keys have different probability distributions. Since the prior art has neither studied nor stated this premise, it is proved first, before the technical scheme is introduced.
(1) Distribution characteristics of the message extracted by the true steganographic key
When the embedded message is plaintext, the sequence extracted by the true steganographic key is a plaintext sequence. The probabilities of 0 and 1 at each bit position of Chinese and English plaintext bytes are studied next.
In a computer system, English characters are stored as single bytes, and the highest bit of each byte is 0; Chinese characters are stored as two bytes, and the highest bit of each byte is 1 to avoid confusion with English characters. For the lower seven bits of Chinese and English bytes, the probabilities of 0 and 1 were computed from statistics over a large amount of natural-language text; the results are shown in Table 1 (8 denotes the lowest bit, and the remaining bits are 7, 6, 5, 4, 3, 2, 1 in order).
Table 1 probabilities of respective bits of plaintext byte being 0 and 1
Figure BDA0003475929830000051
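The byte-level claim above (highest bit 0 for English, highest bit 1 for both bytes of a Chinese character) can be checked with a minimal Python sketch. The choice of the `gb2312` codec is an assumption made for illustration; the text does not name the character encoding, and the helper name `msb` is hypothetical.

```python
def msb(byte: int) -> int:
    """Return the highest (most significant) bit of an 8-bit value."""
    return (byte >> 7) & 1


# One byte per English character (ASCII).
english = "A".encode("ascii")
# Two bytes per Chinese character; GB2312 is an assumed encoding,
# chosen because it matches the double-byte description in the text.
chinese = "中".encode("gb2312")

print([msb(b) for b in english])  # highest bit of the English byte
print([msb(b) for b in chinese])  # highest bits of the two Chinese bytes
```

Because the highest bit is fixed by the character class, only the lower seven bits carry the language statistics summarized in Table 1.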
For plaintext embedding, the information extracted by the correct steganographic key is a plaintext sequence. Starting from the i-th bit of the first byte of the plaintext sequence and sampling at intervals of 7 bits gives the subsequence L_i, i = 1, 2, …, 8. Table 2 lists the probability of 0 in each subsequence L_i, i = 1, 2, …, 8.
Table 2 Probability of 0 in each subsequence
Figure BDA0003475929830000061
Clearly, the probabilities of 0 and 1 are unbalanced in every L_i, i = 1, 2, …, 8; that is, the probabilities of 0 and 1 bits are unbalanced at the different bit positions of the sequence extracted by the true steganographic key.
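The subsampling described above (start at the i-th bit of the first byte, then skip 7 bits) amounts to taking the same bit position from every byte. The following Python sketch illustrates it, assuming bit 1 is the highest bit and bit 8 the lowest, as in the note to Table 1. The function names and the sample text are hypothetical, and the printed frequencies are those of this short sample only, not the values of Table 2.

```python
def bits_msb_first(data: bytes) -> list[int]:
    """Expand bytes into a flat bit list, highest bit of each byte first."""
    return [(byte >> (7 - k)) & 1 for byte in data for k in range(8)]


def subsequence(data: bytes, j: int) -> list[int]:
    """Take the j-th bit (1 = highest, 8 = lowest) of every byte,
    i.e. start at bit j of the first byte and sample at intervals of 7 bits."""
    if not 1 <= j <= 8:
        raise ValueError("j must be between 1 and 8")
    return bits_msb_first(data)[j - 1::8]


def zero_frequency(bits: list[int]) -> float:
    """Fraction of 0 bits in a non-empty bit list."""
    return bits.count(0) / len(bits)


# Hypothetical sample plaintext, for illustration only.
text = "steganography hides a message inside an image".encode("ascii")
for j in range(1, 9):
    print(j, round(zero_frequency(subsequence(text, j)), 3))
```

For ASCII text the first subsequence is all zeros, which is the most extreme case of the imbalance discussed above.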
(2) Distribution characteristics of the message extracted by a pseudo steganographic key
This section studies the probability distribution of 0 and 1 bits in the sequence extracted by a pseudo steganographic key. First, the relation between the 0/1 bit probabilities in the sequence extracted by the pseudo steganographic key and those in the stego sequence is proved: the difference between the probabilities of 0 and 1 in the extracted sequence equals the r-th power of the difference between the probabilities of 0 and 1 in the stego sequence. Then the probability distributions of 0 and 1 bits in the stego sequences of spatial-domain and JPEG-domain adaptive steganography are studied, showing that the probabilities of 0 and 1 in the sequence extracted by the pseudo steganographic key are close to balanced.
(2.1) The following lemma is presented first:
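Lemma 1: Let $a = (a_1, a_2, \ldots, a_q) \in F_2^q$, where $F_2$ is the binary Galois field. For fixed $q$, if the $a_i$ are independent with $\mathrm{pr}(a_i = 0) = p_0$ and $\mathrm{pr}(a_i = 1) = p_1$, $i = 1, 2, \ldots, q$, then
$$\mathrm{pr}(a_1 \oplus a_2 \oplus \cdots \oplus a_q = 0) - \mathrm{pr}(a_1 \oplus a_2 \oplus \cdots \oplus a_q = 1) = (p_0 - p_1)^q.$$
Proof: Over the binary Galois field, $a_1 \oplus a_2 \oplus \cdots \oplus a_q = 1$ if and only if $(a_1, a_2, \ldots, a_q)$ contains an odd number of 1s, and $a_1 \oplus a_2 \oplus \cdots \oplus a_q = 0$ if and only if $(a_1, a_2, \ldots, a_q)$ contains an even number of 1s. Then:
$$\mathrm{pr}(a_1 \oplus \cdots \oplus a_q = 0) = \sum_{k\ \mathrm{even}} \binom{q}{k} p_1^{k} p_0^{q-k} = \frac{(p_0 + p_1)^q + (p_0 - p_1)^q}{2} = \frac{1}{2} + \frac{1}{2}(p_0 - p_1)^q,$$
$$\mathrm{pr}(a_1 \oplus \cdots \oplus a_q = 1) = \sum_{k\ \mathrm{odd}} \binom{q}{k} p_1^{k} p_0^{q-k} = \frac{(p_0 + p_1)^q - (p_0 - p_1)^q}{2} = \frac{1}{2} - \frac{1}{2}(p_0 - p_1)^q.$$
Thus
$$\mathrm{pr}(a_1 \oplus \cdots \oplus a_q = 0) - \mathrm{pr}(a_1 \oplus \cdots \oplus a_q = 1) = (p_0 - p_1)^q.$$
This completes the proof.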
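As a numerical check of Lemma 1, the sketch below enumerates all q-bit vectors, sums the probability of those with an even number of 1s, and compares the result with the closed form 1/2 + (p_0 − p_1)^q / 2, which is equivalent to the r-th power relation stated in this section. It assumes independent, identically distributed bits; the function name and the value p_0 = 0.6 are chosen only for illustration.

```python
from itertools import product


def xor_zero_probability(p0: float, q: int) -> float:
    """Exact probability that the XOR of q independent bits is 0,
    where each bit is 0 with probability p0."""
    p1 = 1.0 - p0
    total = 0.0
    for bits in product((0, 1), repeat=q):
        if sum(bits) % 2 == 0:  # even number of 1s means the XOR is 0
            prob = 1.0
            for b in bits:
                prob *= p1 if b else p0
            total += prob
    return total


for q in (1, 2, 4, 8):
    exact = xor_zero_probability(0.6, q)
    closed_form = 0.5 + 0.5 * (0.6 - 0.4) ** q
    print(q, exact, closed_form)
```

The two columns agree up to floating-point rounding, and the probability approaches 0.5 quickly as q grows.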
Based on Lemma 1, the following theorem describes the relation between the probability distribution of the sequence extracted by a pseudo steganographic key and that of the stego sequence.
Theorem 1: Let the stego sequence be y = (y_1, y_2, …, y_n), where pr(y_i = 0) = p_0 and pr(y_i = 1) = p_1, i = 1, 2, …, n. Let the sequence extracted by the pseudo steganographic key be s = (s_1, s_2, …, s_t). When the pseudo steganographic key consists of one sub-check matrix, the probability of 0 in the sequence s is:
Figure BDA0003475929830000073
When the pseudo steganographic key consists of two different sub-check matrices, the probability of 0 in the sequence s is:
Figure BDA0003475929830000074
where n_i, i = 1, 2, denotes the number of 1s in the i-th sub-check matrix.
Proof: When the secret information is long enough, the distribution of the first h − 1 bits of the sequence does not affect the overall distribution, where h is the height of the sub-check matrix; for ease of discussion, the first h − 1 bits are therefore ignored. For any j ≥ h, s_j equals the product of the basic row vector and the corresponding part of the stego sequence. Let the non-zero part of the j-th row of the steganographic key be the vector
Figure BDA0003475929830000075
where v_i = 1 when
Figure BDA0003475929830000078
Let the corresponding part of the stego sequence be
Figure BDA0003475929830000076
Then
Figure BDA0003475929830000077
When pr(y_i = 0) = p_0 and pr(y_i = 1) = p_1, Lemma 1 gives:
Figure BDA0003475929830000081
where r denotes the number of 1s in the basic row vector. That is, the difference between the probabilities of 0 and 1 in the sequence extracted by the pseudo steganographic key equals the r-th power of the difference between the probabilities of 0 and 1 in the stego sequence.
When the check matrix consists of one sub-check matrix, let the number of 1s in the sub-check matrix be n_1; then
Figure BDA0003475929830000082
When the check matrix consists of two different sub-check matrices, let the numbers of 1s in the sub-check matrices be n_1 and n_2; then
Figure BDA0003475929830000083
This completes the proof.
(2.2) The frequency distribution of the sequence extracted by the pseudo steganographic key is examined next. The frequency of occurrence of 0 in the sequence is:
Figure BDA0003475929830000084
By Bernoulli's law of large numbers:
Figure BDA0003475929830000085
That is, when the sequence is long enough, the frequency of 0 in the sequence stabilizes at the probability given by Theorem 1.
A good check matrix must have all 1s in its first and last rows, so n_1, n_2 ≥ 4. Since 0 < |p_0 − p_1| < 1, the frequency of 0 in the sequence extracted by the pseudo steganographic key is close to 0.5 provided |p_0 − p_1| is small enough. The difference between the probabilities of 0 and 1 in the stego sequence, i.e. the magnitude of (p_0 − p_1), is studied below for adaptive steganography in the spatial domain and in the JPEG domain respectively.
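To give a feeling for how quickly the r-th power shrinks the imbalance, here is a short worked example. The values p_0 = 0.6, p_1 = 0.4 and r = 4 are assumed purely for illustration and do not come from the measurements in this document; the relation used is the one stated in the proof of Theorem 1, with s_j denoting a bit extracted by the pseudo steganographic key.

```latex
\Pr(s_j = 0) - \Pr(s_j = 1) = (p_0 - p_1)^{r} = (0.6 - 0.4)^{4} = 0.0016,
\qquad
\Pr(s_j = 0) = \tfrac{1}{2} + \tfrac{1}{2}\,(0.0016) = 0.5008 .
```

Even a fairly strong imbalance of 0.2 in the stego sequence leaves the extracted sequence within 0.001 of a balanced 0.5 when the row contains the minimum of four 1s.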
(2.3) For spatial-domain adaptive steganography, the cover (stego) sequence consists of the LSBs of the pixels of the cover (stego) image. Since the LSBs of spatial-domain image pixels behave as random noise, the probabilities of 0 and 1 in the cover and stego sequences are approximately:
pr(y_i = 0) = pr(y_i = 1) = 1/2 (12)
Applying Theorem 1 gives:
pr(m_i = 0) − pr(m_i = 1) = 0 (13)
That is, the frequencies of 0 and 1 bits in the sequence extracted by the pseudo steganographic key are close to balanced.
For JPEG-domain adaptive steganography, the cover (stego) sequence consists of the non-zero alternating-current (AC) coefficients among the quantized DCT coefficients (hereinafter, DCT coefficients) of the cover (stego) image. The DCT coefficients of a JPEG image follow a Laplacian distribution. The probabilities of 0 and 1 in the LSBs of the non-zero AC coefficients of the cover image are estimated from the distribution of the cover image DCT coefficients.
Since the cover image DCT coefficients approximately follow a Laplacian distribution with location parameter 0, their probability density function can be expressed as:
Figure BDA0003475929830000091
Let h(0) denote the frequency with which DCT coefficients of value 0 appear in the cover image. Thus:
Figure BDA0003475929830000092
DCT coefficients of large absolute value occur with small frequency in the cover image. For ease of calculation, the value range of the cover image DCT coefficients is therefore extended to (−∞, +∞) in the following.
Let α_0 and α_1 denote the probabilities of 0 and 1 in the LSBs of the cover image DCT coefficients, respectively; that is, α_0 is the sum of the probabilities of the even DCT coefficient values in the cover image, and α_1 is the sum of the probabilities of the odd DCT coefficient values. Then:
Figure BDA0003475929830000093
Figure BDA0003475929830000094
Let β_0 and β_1 denote the probabilities of 0 and 1 in the LSBs of the non-zero DCT coefficients of the cover image, respectively. Then:
Figure BDA0003475929830000101
Figure BDA0003475929830000102
Let β′_0 and β′_1 denote the probabilities of 0 and 1 in the LSBs of the non-zero AC coefficients of the cover image, respectively. Then:
β′_0 ≈ β_0, β′_1 ≈ β_1 (20)
If the 0/1 bit probability difference of the cover sequence is used to estimate that of the stego sequence, the 0/1 bit probability difference in the sequence extracted by a wrong check matrix is
Figure BDA0003475929830000103
where r denotes the number of 1s in the basic row vector. A good check matrix must have all 1s in its first and last rows, so r ≥ 4. The difference between the probabilities of 0 and 1 in the sequence extracted by the pseudo steganographic key is then small, and the two probabilities are nearly equal. Note that Theorem 1 holds for any j ≥ h, so the above derivation also holds for the subsequences extracted by the pseudo steganographic key.
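The JPEG-domain argument can be illustrated numerically. The sketch below is not the document's derivation: it assumes a discretized Laplacian in which the integer value k receives the probability mass of the interval [k − 0.5, k + 0.5), uses a hypothetical scale parameter of 2.0, and treats the parity of k as its LSB. It then normalizes over the non-zero values to obtain stand-ins for β_0 and β_1 and raises their difference to the power r = 4.

```python
import math


def laplace_cdf(x: float, scale: float) -> float:
    """CDF of a Laplacian distribution with location 0 and the given scale."""
    if x < 0:
        return 0.5 * math.exp(x / scale)
    return 1.0 - 0.5 * math.exp(-x / scale)


def nonzero_lsb_probabilities(scale: float, limit: int = 1000) -> tuple[float, float]:
    """Probabilities of LSB = 0 (even) and LSB = 1 (odd) among non-zero
    integer coefficients, under the assumed discretized Laplacian model."""
    even = odd = 0.0
    for k in range(-limit, limit + 1):
        if k == 0:
            continue  # zero coefficients are not embeddable
        mass = laplace_cdf(k + 0.5, scale) - laplace_cdf(k - 0.5, scale)
        if k % 2 == 0:
            even += mass
        else:
            odd += mass
    total = even + odd
    return even / total, odd / total


beta0, beta1 = nonzero_lsb_probabilities(scale=2.0)  # scale is a hypothetical value
r = 4  # minimum number of 1s in a basic row vector, per the text
print(beta0, beta1, (beta0 - beta1) ** r)
```

Under this assumed model odd values dominate, because ±1 are the most frequent non-zero coefficients, yet the fourth power of the difference is already small.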
Section (1) concluded that the probabilities of 0 and 1 are unbalanced at the different bit positions of the subsequences extracted by the true steganographic key; section (2) concluded that the frequencies of 0 and 1 bits in the subsequences extracted by a pseudo steganographic key are close to balanced. It follows that the sequences extracted by the true and the pseudo steganographic keys have different probability distributions.
Example 1
On the basis of the premise established above, as shown in Fig. 1, an embodiment of the invention provides a secret information extraction method based on the uniformly most powerful test, comprising the following steps:
S101: estimating the length m of the secret information from the pixels or the embeddable DCT coefficients of the stego image;
Specifically, the secret information length m may be estimated with the method of J. Kodovský, J. Fridrich, "Quantitative Steganalysis using rich models," in: Proceedings of SPIE, Electronic Imaging, Media Watermarking, Security, and Forensics XV, San Francisco, CA, 2013, vol. 8665, pp. 866500, which is not described in detail here.
S102: given a type I error rate α and a type II error rate β, calculating a sample size N and a threshold T; wherein the type I error rate α is the probability of judging the true steganographic key to be a pseudo steganographic key, and the type II error rate β is the probability of judging a pseudo steganographic key to be the true steganographic key;
Specifically, let the sequence extracted by the true steganographic key be l. Starting from the j-th bit (j = 1, 2, …, 8) of the first byte and sampling at intervals of 7 bits gives a subsequence of length n_0,
Figure BDA0003475929830000111
whose i-th bit is
Figure BDA0003475929830000112
Then
Figure BDA0003475929830000113
follows a two-point distribution with probability density function:
Figure BDA0003475929830000114
where χ_0, χ_1 ≠ 0.5.
The sequence extracted by a pseudo steganographic key is approximately balanced in 0s and 1s. For this sequence, correspondingly, starting from the j-th bit (j = 1, 2, …, 8) of the first byte and sampling at intervals of 7 bits, n_0 bits in total, gives a subsequence of length n_0,
Figure BDA0003475929830000115
with probability density function:
Figure BDA0003475929830000116
where γ_0, γ_1 ≈ 0.5.
Based on this statistical difference, distinguishing the true steganographic key from pseudo steganographic keys can be converted into a hypothesis testing problem on the distribution of the sequence:
H_0: D = D_0, H_1: D = D_1 (23)
where D denotes the population distribution function of the sample, D_0 denotes the distribution function under the true steganographic key, and D_1 denotes the distribution function under a pseudo steganographic key.
Let
Figure BDA0003475929830000117
(i = 1, 2, …, n_0). Suppose that when H_0 holds the expectation and variance of R_i are μ_0 and σ_0 respectively, and that when H_1 holds they are μ_1 and σ_1 respectively. They are given by equations (24) to (27):
Figure BDA0003475929830000118
Figure BDA0003475929830000119
Figure BDA00034759298300001110
Figure BDA0003475929830000121
On this basis, as one implementation, step S102 mainly comprises the following substeps:
S1021: calculating the critical values a and b according to the given type I error rate α and type II error rate β;
Specifically, as mentioned above, the type I error rate α is the probability of judging the true steganographic key to be a pseudo steganographic key, and the type II error rate β is the probability of judging a pseudo steganographic key to be the true steganographic key. Making α small enough therefore ensures that the probability of missing the true steganographic key is sufficiently small. In addition, the expected number of accepted pseudo steganographic keys should not exceed 1, i.e. β|K| ≤ 1, or equivalently β ≤ 1/|K|; β = 1/|K| is generally taken, where |K| denotes the number of elements in the steganographic key space.
Thus, taking β = 1/|K| and choosing a suitable value of α, the critical values a and b satisfying Φ(a) = α and Φ(b) = 1 − β are calculated.
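A minimal sketch of substep S1021, assuming Φ is the standard normal cumulative distribution function so that a and b are obtained with its inverse. The values α = 0.001 and |K| = 2^20 are hypothetical and chosen only to make the example concrete; the function name is also an assumption.

```python
from statistics import NormalDist


def critical_values(alpha: float, key_space_size: int) -> tuple[float, float, float]:
    """Return (beta, a, b) with beta = 1/|K|, Phi(a) = alpha and Phi(b) = 1 - beta,
    where Phi is taken to be the standard normal CDF (an assumption)."""
    beta = 1.0 / key_space_size
    std_normal = NormalDist()  # mean 0, standard deviation 1
    a = std_normal.inv_cdf(alpha)
    b = std_normal.inv_cdf(1.0 - beta)
    return beta, a, b


# Hypothetical settings, for illustration only.
beta, a, b = critical_values(alpha=0.001, key_space_size=2**20)
print(beta, a, b)
```

Since α is small, a is negative, and since β is small, b is positive; a larger key space pushes b further out and therefore demands more samples.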
S1022: calculating the sample size N and the threshold T according to equation (28):
Figure BDA0003475929830000122
Specifically, the expectations and variances μ_0, σ_0 and μ_1, σ_1 have been calculated above; substituting them, together with the critical values a and b, into equation (28) gives the sample size N and the threshold T.
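One way to arrive at a sample size and threshold of this kind is sketched below. It does not reproduce equation (28). It assumes that R_i is the per-bit log-likelihood ratio ln(f_0(x_i)/f_1(x_i)), that the sum of N such terms is approximately normal, that a key is accepted when the sum is at least T, and that the spread is measured as a standard deviation. Under those assumptions, Φ((T − Nμ_0)/(σ_0√N)) = α and Φ((T − Nμ_1)/(σ_1√N)) = 1 − β give √N = (bσ_1 − aσ_0)/(μ_0 − μ_1) and T = Nμ_0 + aσ_0√N. The function names and the value χ_0 = 0.3 are hypothetical.

```python
import math
from statistics import NormalDist


def llr_moments(chi0: float, gamma0: float) -> tuple[float, float, float, float]:
    """Mean and standard deviation of the assumed per-bit log-likelihood ratio
    under H0 (bit is 0 with probability chi0) and H1 (probability gamma0)."""
    chi1, gamma1 = 1.0 - chi0, 1.0 - gamma0
    r0 = math.log(chi0 / gamma0)  # value of R when the bit is 0
    r1 = math.log(chi1 / gamma1)  # value of R when the bit is 1
    mu0 = chi0 * r0 + chi1 * r1
    mu1 = gamma0 * r0 + gamma1 * r1
    sd0 = math.sqrt(chi0 * chi1) * abs(r0 - r1)
    sd1 = math.sqrt(gamma0 * gamma1) * abs(r0 - r1)
    return mu0, sd0, mu1, sd1


def sample_size_and_threshold(alpha: float, beta: float,
                              chi0: float, gamma0: float = 0.5) -> tuple[int, float]:
    """Solve the two normal-approximation error constraints for N and T."""
    a = NormalDist().inv_cdf(alpha)
    b = NormalDist().inv_cdf(1.0 - beta)
    mu0, sd0, mu1, sd1 = llr_moments(chi0, gamma0)
    root_n = (b * sd1 - a * sd0) / (mu0 - mu1)
    n = math.ceil(root_n ** 2)  # round up so the type II error stays within beta
    t = n * mu0 + a * sd0 * math.sqrt(n)
    return n, t


# Hypothetical probabilities and error rates, for illustration only.
n, t = sample_size_and_threshold(alpha=0.001, beta=2**-20, chi0=0.3)
print(n, t)
```

The closer χ_0 is to 0.5, the smaller μ_0 − μ_1 becomes and the more samples are required, which matches the intuition that a nearly balanced true sequence is harder to tell apart from a pseudo one.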
S103: exhaustively enumerating the check matrix space and performing the following statistic calculation for each check matrix in turn: extracting a sequence from the stego image using the currently enumerated check matrix, and, starting from the j-th bit of the first byte of the sequence, sampling at intervals of 7 bits to obtain a subsequence of length N,
Figure BDA0003475929830000123
then calculating the value of the statistic
Figure BDA0003475929830000124
generally, j = 2;
Specifically, the uniformly most powerful test statistic
Figure BDA0003475929830000125
is constructed according to equation (29):
Figure BDA0003475929830000126
where
Figure BDA0003475929830000127
denotes the number of occurrences of 0 in the sequence of length N
Figure BDA0003475929830000128
and
Figure BDA0003475929830000129
denotes the number of occurrences of 1 in the sequence of length N
Figure BDA00034759298300001210
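Since the statistic is described as depending only on the counts of 0s and 1s in the subsequence, a log-likelihood-ratio form is a natural candidate. The sketch below assumes that form (count of 0s times ln(χ_0/γ_0) plus count of 1s times ln(χ_1/γ_1)); it is an illustration, not a transcription of equation (29). The function name, the value χ_0 = 0.3 and the two sample bit lists are hypothetical.

```python
import math


def llr_statistic(bits: list[int], chi0: float, gamma0: float = 0.5) -> float:
    """Assumed log-likelihood-ratio statistic: large values favour the
    true-key distribution (bit is 0 with probability chi0) over the
    pseudo-key distribution (bit is 0 with probability gamma0)."""
    n_zero = bits.count(0)
    n_one = len(bits) - n_zero
    return (n_zero * math.log(chi0 / gamma0)
            + n_one * math.log((1.0 - chi0) / (1.0 - gamma0)))


# Hypothetical subsequences: one skewed towards 1s, one balanced.
mostly_ones = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
balanced = [0, 1] * 5
print(llr_statistic(mostly_ones, chi0=0.3))
print(llr_statistic(balanced, chi0=0.3))
```

With χ_0 = 0.3 the skewed sequence scores above zero and the balanced one below zero, which is the separation the threshold T is meant to exploit.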
S104: Judge whether N > m holds; if so, go to step S107; otherwise, go to step S105;
S105: For each check matrix, judge whether the corresponding statistic satisfies
Figure BDA0003475929830000131
If so, store the check matrix in the key candidate set B; otherwise, discard the check matrix;
S106: After
Figure BDA0003475929830000132
has been judged for all check matrices: if |B| = 1, the check matrix in B is the true steganographic key and the extraction succeeds; if |B| = 0, the extraction fails; if |B| > 1, let the check matrix space be B and go to step S107;
S107: Store the check matrices for which
Figure BDA0003475929830000133
reaches its maximum value in the key candidate set D; if |D| = 1, the check matrix in D is the true steganographic key and the extraction succeeds; if |D| > 1, the extraction fails.
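The decision logic of steps S104 to S107 can be sketched in Python as follows. This is a minimal sketch with hypothetical names: it assumes the statistic of every enumerated check matrix has already been computed and stored in a dictionary, and that the acceptance condition of S105 is "statistic at least T".

```python
def select_key(stats, n, m, t):
    """Return the recovered key, or None if extraction fails.

    stats: dict mapping each candidate check matrix (any hashable) to its statistic.
    n: required sample size N; m: estimated message length; t: threshold T.
    """
    candidates = dict(stats)
    if n <= m:
        # S105: keep only the check matrices whose statistic reaches the threshold
        # (assumed acceptance rule: statistic >= T).
        candidates = {key: value for key, value in stats.items() if value >= t}
        # S106: decide from the size of the candidate set B.
        if len(candidates) == 0:
            return None
        if len(candidates) == 1:
            return next(iter(candidates))
    if not candidates:
        return None
    # S107: keep the check matrices whose statistic is maximal (set D).
    best = max(candidates.values())
    winners = [key for key, value in candidates.items() if value == best]
    return winners[0] if len(winners) == 1 else None
```

Representing each check matrix as a tuple of column integers, such as `(109, 71)`, keeps the dictionary keys hashable.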
The secret information extraction method provided by the invention is independent of the specific distortion function adopted by STC-based adaptive steganography, and is applicable to any adaptive steganography algorithm that uses STC. The spatial-domain steganography algorithm HUGO (T. Pevný, T. Filler, P. Bas, "Using High-Dimensional Image Models to Perform Highly Undetectable Steganography," In: Proceedings of the 12th International Workshop on Information Hiding (IH), Calgary, Canada, 2010, pp. 161-177) applied STC to adaptive steganography, which drew researchers' attention to STC and led to many improvements. The distortion function of the JPEG-domain steganography algorithm J-UNIWARD (V. Holub, J. Fridrich, "Digital image steganography using universal distortion," In: Proceedings of the 1st ACM Information Hiding and Multimedia Security Workshop (IH&MMSec), Montpellier, France, 2013, pp. 59-68) is the sum of the relative changes of coefficients in a directional filter bank decomposition of the cover image. This directionality concentrates the embedding changes in regions that are difficult to model in multiple directions, giving strong resistance to detection.
In order to verify the effectiveness of the secret information extraction method provided by the invention, the invention also provides the following experimental data.
(I) Experimental objects and experimental environment
The HUGO and J-UNIWARD steganography algorithms are selected as the experimental objects. The experimental environment is as follows: the operating system is Microsoft Windows 10, the CPU is an Intel i5, the memory is 8 GB, and the programming language is MATLAB.
(II) Experimental setup
In the experiment, 80 spatial-domain cover images are randomly selected from the BOSSBase_1.01 library, and these 80 images are then converted into JPEG-domain cover images with Photoshop at a quality factor of 90. The resulting 160 cover images are divided into 8 groups of 20, denoted G_1, G_2, …, G_8, where G_1, G_2, G_3, G_4 are spatial-domain cover images and G_5, G_6, G_7, G_8 are JPEG-domain cover images. Embedding at rates of 0.5 bpp, 0.4 bpp, 0.3 bpp, 0.2 bpp and 0.1 bpp produces a total of 800 stego images. The experiments use a sub-check matrix height of 7.
The sub-check matrices adopted in the experiment are: [109,71], [109,79,83], [89,127,99,69], [95,75,121,71,109], [95,107,109,79,117,67,121,123,103,81].
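Each integer in the lists above can be read as one column of a height-7 binary sub-check matrix. The following Python sketch decodes that notation; the row order (most significant bit on top) is an assumption and the function names are hypothetical. The check `is_good` encodes the requirement stated in part (2) of the experimental results: the first and last rows are all 1s and no two columns are equal.

```python
def column_bits(value, height=7):
    """Binary digits of one column, most significant bit first (assumed row order)."""
    if not 0 <= value < (1 << height):
        raise ValueError("column value does not fit in the given height")
    return [(value >> (height - 1 - row)) & 1 for row in range(height)]


def to_matrix(columns, height=7):
    """Build the height x width binary matrix whose columns are the given integers."""
    cols = [column_bits(value, height) for value in columns]
    return [[col[row] for col in cols] for row in range(height)]


def is_good(columns, height=7):
    """First and last rows all 1s, and no two columns equal."""
    matrix = to_matrix(columns, height)
    return all(matrix[0]) and all(matrix[-1]) and len(set(columns)) == len(columns)
```

Because the requirement only concerns the first and last rows, the result of `is_good` does not depend on which end of the integer is taken as the top row.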
(III) Experimental results
The experimental results are organized as follows: part (1) studies the probability distribution of each bit of the plaintext message "I have a dream", and then the 0/1 frequency distribution of the pixel LSBs of the spatial-domain cover images selected for the experiment and of the LSBs of the embeddable DCT coefficients of the JPEG-domain cover images; part (2) studies the size of the steganographic key space and the numerical characteristics of the statistic; part (3) first verifies that 0 and 1 occur with approximately equal frequency in the sequences extracted with pseudo steganographic keys, and then, based on part (2), calculates the hypothesis test statistic for each steganographic key to obtain the correct steganographic key.
(1) Probability distributions of the plaintext and of the pixels (DCT coefficients)
First, the probability distribution of each bit of the plaintext message "I have a dream" is verified. The results are shown in FIG. 2, where for each subsequence the right bar indicates the frequency of 1 and the left bar indicates the frequency of 0. The errors for bit 2 and bit 6 are the largest, but still within 0.1; the frequency errors of the remaining bits are within 0.06. The frequency errors of bit 2 and bit 6 are larger because the sample size is too small.
Second, the frequency distribution of the pixel LSBs of the spatial-domain cover images selected for the experiment and that of the LSBs of the embeddable DCT coefficients of the JPEG-domain cover images are studied. The experimental results are shown in FIG. 3, where the upper right corner represents the JPEG image groups and the lower left corner represents the spatial-domain image groups. As can be seen from FIG. 3, for the spatial-domain groups G_1, G_2, G_3, G_4 the frequency of 1 in the pixel LSBs is roughly stable around 0.5, and for the JPEG-domain groups G_5, G_6, G_7, G_8 the frequency of 1 in the LSBs of the embeddable DCT coefficients is roughly stable around 0.7.
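The per-bit frequencies of a plaintext message can be measured with a few lines of Python. This is only a sketch with a hypothetical function name: it assumes 8-bit ASCII text and numbers the bit positions from the most significant bit, which may not match the numbering used in FIG. 2.

```python
def bit_one_frequencies(message):
    """Frequency of 1 at each of the 8 bit positions (position 1 = most significant bit)."""
    data = message.encode("ascii")
    if not data:
        raise ValueError("message must not be empty")
    counts = [0] * 8
    for byte in data:
        for position in range(8):
            counts[position] += (byte >> (7 - position)) & 1
    return [count / len(data) for count in counts]
```

Calling `bit_one_frequencies` on a longer text gives smoother estimates; a bit position whose frequency is far from 0.5 is the easiest to distinguish from the output of a pseudo steganographic key.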
(2) Steganographic key space size and numerical characteristics
The size of the steganographic key space is calculated as follows. For a sub-check matrix of height h and width w_1, the number of all possible sub-check matrices is 2^(h·w_1). A good sub-check matrix should have all 1s in its first and last rows and no two identical columns. When the embedding rate α is the reciprocal of an integer, the check matrix is formed from a single sub-check matrix, and the size of the steganographic key space is
Figure BDA0003475929830000151
When the embedding rate α is not the reciprocal of an integer, the check matrix is formed from two sub-check matrices together, and the size of the steganographic key space is
Figure BDA0003475929830000152
Table 3 shows the size of the steganographic key space at different embedding rates. The sub-check matrix height in this experiment is 7. When the embedding rate is 0.5 bpp, the steganographic key space is smallest, about 10^3. When the embedding rate is 0.3 bpp, the steganographic key space is largest, about 2.5×10^10. So that the steganographic key can be recovered in a reasonable time, the cardinality of the steganographic key space is taken as |K| = 10^3, with Type I error rate α = 0.01 and Type II error rate β = 1/|K| = 1/10^3. In this case a = -2.33 and b = 3.90.
Table 3 Steganographic key space size at different embedding rates
Figure BDA0003475929830000153
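The key space counts can be checked by brute force. The Python sketch below enumerates sub-check matrices as ordered tuples of pairwise distinct column integers whose first and last bits are 1; the function names are hypothetical, and the treatment of a non-reciprocal embedding rate as a product over two sub-check matrix widths is an assumption.

```python
from itertools import permutations
from math import perm


def valid_columns(height):
    """Column values whose first and last bits are 1, in increasing order."""
    top = 1 << (height - 1)
    return [value for value in range(top, 1 << height) if value & 1]


def enumerate_keys(height, width):
    """All sub-check matrices of the given size with pairwise distinct valid columns."""
    return list(permutations(valid_columns(height), width))


def key_space_size(height, widths):
    """Number of keys when the check matrix combines sub-matrices of the given widths."""
    size = 1
    for width in widths:
        size *= perm(len(valid_columns(height)), width)
    return size
```

For height 7 and width 2 this enumeration yields 992 keys in the order [65,67], [65,69], …, [127,125], with [109,71] at position 686, which agrees with the key numbering used in part (3). With assumed widths 3 and 4 for an embedding rate of 0.3 bpp, `key_space_size(7, [3, 4])` is roughly 2.57×10^10.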
From equations (24)-(27), the expectation and variance of R_i (i = 1, 2), the sample size and the threshold can be calculated. Table 4 shows the numerical characteristics of the statistic for the different bits selected. The probability that 0 occurs in the most significant bit and in the third bit is 0, so equations (24)-(27) are meaningless for these bits and they are not considered. As can be seen from Table 4, the required sample size is smaller when the second bit is selected, because the probability distributions of the subsequences extracted with the true and pseudo steganographic keys then differ more; the larger this difference, the smaller the required sample size.
Table 4 Numerical characteristics of the statistic at different bits
Figure BDA0003475929830000161
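How the moments depend on the bit probabilities can be illustrated with a short Python sketch. It assumes that R_i is the per-bit log-likelihood ratio between a true-key model, in which a 1 occurs with probability χ_1, and a pseudo-key model, in which a 1 occurs with probability γ_1 ≈ 0.5; equations (24)-(27) may define these quantities differently. The function name and the example probability are hypothetical.

```python
import math


def llr_moments(chi1, gamma1=0.5):
    """Mean and standard deviation of the per-bit log-likelihood ratio R_i.

    chi1: probability of a 1 under the true key; gamma1: under a pseudo key.
    Returns ((mu0, sigma0), (mu1, sigma1)) for H0 (true key) and H1 (pseudo key).
    """
    if not (0.0 < chi1 < 1.0 and 0.0 < gamma1 < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    r0 = math.log((1.0 - chi1) / (1.0 - gamma1))  # value of R_i when the bit is 0
    r1 = math.log(chi1 / gamma1)                  # value of R_i when the bit is 1

    def moments(p1):
        mean = (1.0 - p1) * r0 + p1 * r1
        var = (1.0 - p1) * (r0 - mean) ** 2 + p1 * (r1 - mean) ** 2
        return mean, math.sqrt(var)

    return moments(chi1), moments(gamma1)
```

The gap between the two means widens as χ_1 moves away from 0.5, which is why a more skewed bit position needs fewer samples. The function rejects a probability of exactly 0 or 1 because the logarithm is then undefined.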
As can be seen from Table 1, the probability p that 0 occurs in the plaintext message is
Figure BDA0003475929830000162
The numerical characteristics of the statistic when the plaintext itself is selected are shown in Table 5. The required sample size is then large, because the two probability distributions to be distinguished differ only slightly.
Table 5 Numerical characteristics of the statistic for the plaintext
Figure BDA0003475929830000163
(3) Steganographic key recovery result and method comparison
Six cover images are randomly selected from the 80 cover images used in the experiment to present the experimental results. The cover images are shown in FIG. 4 and FIG. 5. First, the frequency distribution of the subsequences extracted with pseudo steganographic keys is studied. Taking the second bit as an example, the pseudo steganographic keys are enumerated, and the sequence extracted with each pseudo steganographic key is sampled at intervals of 7 bits, starting from the second bit of the first byte, to obtain a subsequence. When the embedding rate is 0.5 bpp (bpnzac) and the sub-check matrix height is 7, the steganographic key space size is 992 and the pseudo steganographic key space size is 991. FIG. 6 shows the frequency of 1 in the subsequences extracted with the pseudo steganographic keys, where the abscissa represents the individual pseudo steganographic keys and the ordinate represents the frequency of 1 in the subsequence. Abscissas 1, 2, 3, …, 991 represent steganographic keys [65,67], [65,69], [65,71], …, [127,125], respectively. As can be seen from FIG. 6, at an embedding rate of 0.5 bpp (bpnzac), the frequency of 1 in the subsequences extracted with pseudo steganographic keys is approximately 0.5. The experimental results for other embedding rates and other stego images are similar. This verifies the conclusion stated above: the subsequences extracted with the true steganographic key and with pseudo steganographic keys have different frequency distributions.
Building on the experimental result of FIG. 6, steganographic key recovery is performed below with the secret information extraction method provided by the invention. Denote the i-th bit of the sequence l_2^{N_0} by
Figure BDA0003475929830000164
When the steganographic key is the true key,
Figure BDA0003475929830000165
obeys a two-point distribution, with the following probability density function:
Figure BDA0003475929830000171
From the above experiments, the probability density function of the subsequence extracted with a pseudo steganographic key is:
Figure BDA0003475929830000172
where γ_0, γ_1 ≈ 0.5.
Based on this statistical difference, the problem of distinguishing the true steganographic key from pseudo steganographic keys can be converted into a hypothesis testing problem on the distribution of the sequence:
H_0: D = D_0, H_1: D = D_1
Construct the statistic:
Figure BDA0003475929830000173
where
Figure BDA0003475929830000174
denotes the number of occurrences of 0 in the length-N sequence
Figure BDA0003475929830000175
The statistic
Figure BDA0003475929830000176
is the uniformly most powerful (consistent most advantageous) test statistic, with the smallest probability of error.
When
Figure BDA0003475929830000177
hypothesis H_0 is accepted and H_i is stored in the key candidate set as a candidate true steganographic key; when
Figure BDA0003475929830000178
hypothesis H_1 is accepted and H_i is discarded.
FIG. 7 shows, for spatial-domain and JPEG-domain stego images, the values of
Figure BDA0003475929830000179
for the sequences extracted with each steganographic key; the ordinate represents the value of the statistic
Figure BDA00034759298300001710
and the abscissa represents the steganographic key. When the embedding rate is 0.5 bpp (bpnzac), the steganographic key space size is 992, and abscissas 1, 2, 3, …, 992 represent steganographic keys [65,67], [65,69], [65,71], …, [127,125], respectively. As can be seen from Table 4, the threshold is then T = 8.17 and the sample size is N = 44. All possible steganographic keys are enumerated and the value of the statistic
Figure BDA00034759298300001711
is calculated for each. The results in FIG. 8 show that only at abscissa 686, where the corresponding steganographic key is [109,71], does
Figure BDA00034759298300001712
hold, so the null hypothesis is accepted and [109,71] can be taken as the correct steganographic key.
At embedding rates of 0.5 bpp, 0.4 bpp, 0.3 bpp, 0.2 bpp and 0.1 bpp (bpnzac), for both spatial-domain and JPEG-domain stego images, the statistic
Figure BDA0003475929830000181
corresponding to the correct steganographic key is the only one whose value satisfies
Figure BDA0003475929830000182
or the value of the statistic
Figure BDA0003475929830000183
corresponding to the correct steganographic key is the largest. The number of steganographic keys identified is 1, and the accuracy reaches 100%.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (1)

1. The secret information extraction method based on the consistent most advantageous test is characterized by comprising the following steps:
Step 1: estimating the length m of the secret information from the pixels or the embeddable DCT coefficients of the stego image;
Step 2: given a Type I error rate α and a Type II error rate β, calculating a sample size N and a threshold T; wherein the Type I error rate α represents the probability of judging a true steganographic key to be a false steganographic key, and the Type II error rate β represents the probability of judging a false steganographic key to be a true steganographic key; step 2 specifically comprises the following steps:
Step 2.1: calculating the critical values a and b according to the given Type I error rate α and Type II error rate β;
Step 2.2: calculating the sample size N and the threshold T according to equation (28):
Figure FDA0004178612630000011
wherein a and b satisfy Φ(a) = α and Φ(b) = 1 - β; μ_0 and σ_0 respectively denote the expectation and variance of R_i when H_0 holds, and μ_1 and σ_1 respectively denote the expectation and variance of R_i when H_1 holds;
Figure FDA0004178612630000012
H_0 and H_1 denote the hypotheses of the hypothesis testing problem H_0: F = F_0, H_1: F = F_1, where F denotes the population distribution function of the sample, F_0 denotes the distribution function under the true steganographic key, and F_1 denotes the distribution function under a pseudo steganographic key; let the sequence extracted with a steganographic key be l; starting from the j-th bit (j = 1, 2, …, 8) of the first byte, sample at intervals of 7 bits, n_0 bits in total, to obtain the subsequence
Figure FDA0004178612630000013
whose i-th bit is
Figure FDA0004178612630000014
Figure FDA0004178612630000015
denotes, for the j-th subsequence
Figure FDA0004178612630000016
extracted with the true steganographic key, the probability density function of its i-th bit
Figure FDA0004178612630000017
where χ_0, χ_1 ≠ 0.5;
Figure FDA0004178612630000018
denotes, for the sequence
Figure FDA0004178612630000019
extracted with a pseudo steganographic key, the probability density function of its i-th bit
Figure FDA00041786126300000110
where γ_0, γ_1 ≈ 0.5; i = 1, 2, …, n_0;
Step 3: constructing consistent most advantageous test statistics according to equation (29)
Figure FDA00041786126300000111
Each check matrix in the exhaustive check matrix space sequentially performs a statistic calculation process, wherein the statistic calculation process comprises the following steps: extracting a sequence from the encrypted image by using the check matrix enumerated at the current moment, and sampling at intervals of 7 bits from the j-th bit of the first byte of the sequence to obtain a subsequence with the length of N->
Figure FDA0004178612630000021
Calculating to obtain statistics->
Figure FDA0004178612630000022
Is a value of (2);
Figure FDA0004178612630000023
wherein ,
Figure FDA0004178612630000024
expression sequence->
Figure FDA0004178612630000025
Number of occurrences of 0, < >>
Figure FDA00041786126300000210
Expression sequence->
Figure FDA0004178612630000026
Number of occurrences of 1 in (2)
Step 4: judging whether N > m holds; if so, going to step 7; otherwise, going to step 5;
Step 5: for each check matrix, judging whether the corresponding statistic satisfies
Figure FDA0004178612630000027
if so, storing the check matrix in the key candidate set B; otherwise, discarding the check matrix;
Step 6: after
Figure FDA0004178612630000028
has been judged for all check matrices: if |B| = 1, the check matrix in B is the true steganographic key and the extraction succeeds; if |B| = 0, the extraction fails; if |B| > 1, letting the check matrix space be B and going to step 7;
Step 7: storing the check matrices for which
Figure FDA0004178612630000029
reaches its maximum value in the key candidate set D; if |D| = 1, the check matrix in D is the true steganographic key and the extraction succeeds; if |D| > 1, the extraction fails.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210055235.5A CN114630006B (en) 2022-01-18 2022-01-18 Secret information extraction method based on consistent most advantageous test

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210055235.5A CN114630006B (en) 2022-01-18 2022-01-18 Secret information extraction method based on consistent most advantageous test

Publications (2)

Publication Number Publication Date
CN114630006A CN114630006A (en) 2022-06-14
CN114630006B true CN114630006B (en) 2023-05-26

Family

ID=81898699

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210055235.5A Active CN114630006B (en) 2022-01-18 2022-01-18 Secret information extraction method based on consistent most advantageous test

Country Status (1)

Country Link
CN (1) CN114630006B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106530203A (en) * 2016-10-28 2017-03-22 武汉大学 Texture complexity-based JPEG image adaptive steganography method
CN110086955A (en) * 2019-04-29 2019-08-02 浙江工商职业技术学院 A kind of large capacity image latent writing method
CN113032813A (en) * 2021-04-27 2021-06-25 河南大学 Reversible information hiding method based on improved pixel local complexity calculation and multi-peak embedding

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG139580A1 (en) * 2006-07-20 2008-02-29 Privylink Pte Ltd Method for generating cryptographic key from biometric data
CN103345767B (en) * 2013-07-02 2016-08-10 中国科学技术大学 A kind of JPEG image steganography method of high security
CN107689026B (en) * 2017-08-24 2020-05-15 中国科学技术大学 Reversible steganography method based on optimal coding
CN108271027B (en) * 2018-01-10 2020-06-12 中国人民解放军战略支援部队信息工程大学 Method for extracting image self-adaptive secret information
CN110365864B (en) * 2018-04-10 2020-09-04 北京大学 Image steganography method, image steganography system, computer device and computer-readable storage medium
CN108717683B (en) * 2018-05-16 2022-03-29 陕西师范大学 Secret pattern camouflage recovery method combining secret key and random orthogonal tensor base
FR3087557B1 (en) * 2018-10-18 2021-04-30 Novatec PRINTING AND AUTHENTICATION PROCESS OF A PRINTED MARKING
CN112714231A (en) * 2020-12-28 2021-04-27 杭州电子科技大学 Robust steganography method based on DCT (discrete cosine transformation) symbol replacement

Also Published As

Publication number Publication date
CN114630006A (en) 2022-06-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant