CN113556439A - Rich Model steganography detection feature selection method based on feature component correlation - Google Patents

Info

Publication number
CN113556439A
Authority
CN
China
Prior art keywords
feature
characteristic
rich model
components
correlation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110638762.4A
Other languages
Chinese (zh)
Inventor
刘粉林
金顺浩
杨春芳
马媛媛
刘媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force filed Critical Information Engineering University of PLA Strategic Support Force
Priority to CN202110638762.4A priority Critical patent/CN113556439A/en
Publication of CN113556439A publication Critical patent/CN113556439A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N1/32101 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N1/32144 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title embedded in the image data, i.e. enclosed or integrated in the image, e.g. watermark, super-imposed logo or stamp
    • H04N1/32352 Controlling detectability or arrangements to facilitate detection or retrieval of the embedded information, e.g. using markers

Abstract

The invention provides a Rich Model steganography detection feature selection method based on feature component correlation. The method comprises the following steps: step 1: decomposing the high-dimensional Rich Model steganography detection feature into a plurality of Rich Model submodels; step 2: for each Rich Model submodel, measuring the separability of each of its feature components, and sorting the feature components in descending order of the separability measure; step 3: for each Rich Model submodel, calculating the correlation between any two of its feature components, and selecting feature components according to the strength of the correlation; step 4: combining the selected Rich Model submodels to form the final steganography detection feature. When applied to Rich Model features of the frequency domain and the spatial domain, the method effectively reduces the Rich Model feature dimension without affecting steganography detection accuracy, and the effect is more pronounced in the frequency domain.

Description

Rich Model steganography detection feature selection method based on feature component correlation
Technical Field
The invention relates to the technical field of steganography detection, in particular to a Rich Model steganography detection feature selection method based on feature component correlation.
Background
Digital steganography is a technique for embedding information in the redundancy of digital media such as images, audio, video and text for the purpose of covert communication. Since the HUGO steganography algorithm was proposed in 2010, adaptive steganography built on the "distortion function design + STC coding" framework has become the mainstream of image steganography, and researchers have successively proposed a series of highly detection-resistant adaptive steganography algorithms based on this framework. These algorithms render most conventional steganography detection algorithms ineffective. The Rich Model feature was proposed by Fridrich et al. in 2012 (reference 1 "Fridrich J, Kodovsky J. Rich Models for Steganalysis of Digital Images [J]. IEEE Transactions on Information Forensics and Security, 2012, 7(3): 868-882"), and it effectively improved the detection of HUGO steganography. Thereafter, steganography detection features such as PSRM (Projection Spatial Rich Model), PHARM, GFR and CCJRM (reference 2 "Kodovsky J, Fridrich J. Steganalysis of JPEG images using rich models [C]. Media Watermarking, Security, and Forensics 2012. International Society for Optics and Photonics, 2012") were proposed in succession. These detection features have high dimensionality, some reaching tens of thousands of dimensions, which imposes large computation and storage costs on classifier training and may even cause the curse of dimensionality. In response to this problem, researchers have carried out a series of studies on both feature transformation and feature selection to reduce feature dimensionality.
Feature dimension reduction based on feature transformation maps the feature vector into another feature space so that the effective information is concentrated in a subset of the transformed components, and then selects the most effective components of the transformed feature. For example: Qin et al. used Principal Component Analysis (PCA) to obtain the principal components of high-dimensional features and achieved a good dimension reduction effect, but PCA is not ideal for features with a nonlinear structure; Wang et al. proposed applying a one-dimensional discrete Fourier transform to the SRM features and keeping only the spectral coefficients on the positive half axis as the feature vector, which effectively reduces the feature dimension; Boroumand et al. obtained a class of nonlinear transformations by approximating symmetric positive semi-definite kernel functions, which reduces the dimension of the transformed features while improving detection efficiency, but the method of Boroumand et al. is only applicable to spatial-domain features.
Dimension reduction based on feature selection selects from the feature vector the subset of feature components that most effectively distinguishes cover images from stego images. For example: Xuan and Jennifer measured the separability of feature components using the Bhattacharyya distance and the Mahalanobis distance, respectively, and selected the combination of feature components that maximizes the distance between cover image features and stego image features, but the dimension reduction achieved is not outstanding; Lu et al. measured the importance of feature components using an improved Fisher criterion and selected the components with the highest Fisher values for detection; Zhang et al. (reference 3 "Zhang Y, Liu F, Jia H, et al. Optimization of rich model based on Fisher criterion for image steganalysis [C]// 2018 Tenth International Conference on Advanced Computational Intelligence (ICACI). 2018") further reduced the feature dimension by applying the idea of Lu et al. in the submodel space; Ma et al. (reference 4 "Y. Y. Ma, X. Y. Luo, X. L. Li, Z. K. Bao, and Y. Zhang, "Selection of Rich Model Steganalysis Features Based on Decision Rough Set α-Positive Region Reduction," IEEE Transactions on Circuits and Systems for Video Technology, vol. 29, no. 2, pp. 336-350, 2019") proposed a feature dimension reduction method based on decision rough set α-positive region reduction, which selects a combination of feature components satisfying both the positive-region non-reduction principle and the independence principle, but this method incurs a large computational overhead during positive region reduction.
Disclosure of Invention
In order to solve the problems that existing dimension reduction methods for steganography detection features are limited in applicability and have difficulty effectively reducing the dimension of multiple redundant feature components with strong correlation, the invention provides a Rich Model steganography detection feature selection method based on feature component correlation.
The invention provides a Rich Model steganography detection feature selection method based on feature component correlation, which comprises the following steps:
step 1: decomposing the high-dimensional Rich Model steganography detection feature into a plurality of Rich Model submodels;
step 2: for each Rich Model submodel, measuring the separability of each of its feature components, and sorting the feature components in descending order of the separability measure;
step 3: for each Rich Model submodel, calculating the correlation between any two of its feature components, and selecting feature components according to the strength of the correlation;
step 4: combining the selected Rich Model submodels to form the final steganography detection feature.
Further, before step 2, the method further comprises:
for each Rich Model submodel, removing the feature components whose sample variance is 0 both before and after steganography.
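The preprocessing described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `drop_constant_components` and the (samples × components) array layout are hypothetical, and a zero sample variance is tested as "maximum equals minimum", which is equivalent for real data and avoids floating-point rounding in the variance itself.

```python
import numpy as np

def drop_constant_components(cover, stego):
    """Remove components that are constant in BOTH the cover and the stego set.

    cover, stego: arrays of shape (n_samples, D) for one submodel (assumed layout).
    Returns the filtered arrays and the indices of the retained components.
    """
    cover = np.asarray(cover, dtype=np.float64)
    stego = np.asarray(stego, dtype=np.float64)
    # A column has zero sample variance exactly when its max equals its min.
    const_cover = cover.max(axis=0) == cover.min(axis=0)
    const_stego = stego.max(axis=0) == stego.min(axis=0)
    keep = ~(const_cover & const_stego)
    return cover[:, keep], stego[:, keep], np.flatnonzero(keep)
```

A component that is constant in only one of the two sets is retained here, since its variance is not 0 both before and after steganography.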
Further, in step 2, for each Rich Model submodel, the separability of each of its feature components is measured according to formula (1):
$$F_{score}(d) = \frac{\left(\mu_d^{C} - \mu_d^{S}\right)^2}{\left(\sigma_d^{C}\right)^2 + \left(\sigma_d^{S}\right)^2} \tag{1}$$
where $F_{score}(d)$ denotes the Fisher value of the d-th feature component in the Rich Model submodel and is used as the separability measure; $\mu_d^{C}$ and $\mu_d^{S}$ denote the means of the d-th feature component over the cover image set C and the stego image set S, respectively; $\sigma_d^{C}$ and $\sigma_d^{S}$ denote the standard deviations of the d-th feature component over the cover image set C and the stego image set S, respectively.
Further, in step 3, for each Rich Model submodel, the correlation between any two feature components is calculated, specifically:
the correlation coefficients $r_{ij}^{C}$ between the feature components in the cover image feature set and the correlation coefficients $r_{ij}^{S}$ between the feature components in the stego image feature set are calculated according to formula (2), giving the correlation coefficient matrix $R_C$ of the cover image feature set and the correlation coefficient matrix $R_S$ of the stego image feature set, both of the form shown in formula (3):
$$r_{ij} = \frac{Cov(X_i, X_j)}{\sigma_{X_i}\,\sigma_{X_j}} = \frac{E\left[(X_i - \mu_{X_i})(X_j - \mu_{X_j})\right]}{\sigma_{X_i}\,\sigma_{X_j}} \tag{2}$$
$$R = \begin{pmatrix} r_{11} & \cdots & r_{1D} \\ \vdots & \ddots & \vdots \\ r_{D1} & \cdots & r_{DD} \end{pmatrix} \tag{3}$$
where $X_i$ and $X_j$ denote the i-th and j-th feature components in the cover image feature set or the stego image feature set; $r_{ij}$ denotes the correlation coefficient between the two feature components, with $-1 \le r_{ij} \le 1$; $Cov$ denotes the covariance between two feature components; $\sigma$ is the standard deviation over the samples of a single feature component; $\mu$ is the sample mean of a single feature component; $E$ denotes expectation; $R_C$ and $R_S$ are both symmetric $D \times D$ matrices, where D is the feature dimension of the Rich Model submodel.
Further, in step 3, feature selection is performed on the feature components according to the strength of the correlation, specifically:
traversing the elements below or above the main diagonal of the correlation coefficient matrices $R_C$ and $R_S$ in the order $r_{11} \to \dots \to r_{D1} \to r_{22} \to \dots \to r_{ij} \to \dots \to r_{D,D-1} \to r_{DD}$;
when the element $r_{ij}$ satisfies the condition shown in formula (4), the feature component pair $(X_i, X_j)$ is selected according to the selection rule:
Figure BDA0003106278800000041
the selection rule is as follows:
when $r_{ij}^{C} \ge T$ and $r_{ij}^{S} \ge T$, a strong positive linear correlation is considered to exist between the feature components $X_i$ and $X_j$; the feature component $X_i$ is removed and the feature component $X_j$ is retained;
when $r_{ij}^{C} \le -T$ and $r_{ij}^{S} \le -T$, a strong negative linear correlation is considered to exist between the feature components $X_i$ and $X_j$; the feature component $X_i$ is removed and the feature component $X_j$ is retained;
when $r_{ij}^{C} \ge T$ and $r_{ij}^{S} \le -T$, or $r_{ij}^{C} \le -T$ and $r_{ij}^{S} \ge T$, both feature components are retained;
when $|r_{ij}^{C}| < T$ or $|r_{ij}^{S}| < T$, both feature components are retained; where T is the truncation threshold.
Further, step 3 further includes: determining an optimal truncation threshold T by bisection, specifically:
step 3.1: taking 9 initial truncation thresholds $T_1, T_2, \dots, T_9$ at intervals of 0.01 in the interval (0.9, 1), and calculating the corresponding detection accuracies $A_1, A_2, \dots, A_9$ as well as the detection accuracy $A_0$ of the original features;
step 3.2: comparing $A_1 \to A_2 \to \dots \to A_9$ in turn with $A_0$; as soon as $A_m - A_0 > -0.005$, stopping the comparison and assigning m-1 and m to x and y, respectively;
step 3.3: letting $T_z = \frac{T_x + T_y}{2}$, calculating the corresponding detection accuracy $A_z$ and comparing it with $A_y$; if $A_z \ge A_y$, assigning $T_z$ to $T_y$; otherwise, assigning $T_z$ to $T_x$;
step 3.4: repeating step 3.3 until $T_z$ has 5 significant digits, then stopping and taking the current $T_z$ as the optimal truncation threshold.
The invention has the beneficial effects that:
(1) a high-dimensional Rich Model feature generally consists of hundreds of submodels and contains a large number of redundant features, which brings the curse of dimensionality and a large amount of computation to steganography detection; existing dimension reduction methods have difficulty effectively reducing the dimension of multiple redundant feature components with strong correlation, whereas the invention, based on the correlation among feature components and the structure of the Rich Model steganography detection feature, can effectively reduce the dimension of such components;
(2) experiments show that the method can be applied to Rich Model features of both the frequency domain and the spatial domain, and can effectively reduce the Rich Model feature dimension without affecting steganography detection accuracy (the dimension of the selected feature vector is far lower than the original feature dimension, less than 50 percent of it), thereby reducing the overhead of the steganography detection process; the effect is more pronounced in the frequency domain.
Drawings
FIG. 1 is a block diagram of a high-dimensional Rich Model steganography detection feature provided by the prior art;
FIG. 2 is a schematic flow chart of a Rich Model steganography detection feature selection method based on feature component correlation according to an embodiment of the present invention;
FIG. 3 is a graph of the classification performance of the s1_minmax22v_q1 feature under different degrees of reduction according to an embodiment of the present invention;
FIG. 4 is a graph of the classification performance of the Ah_T3 feature under different degrees of reduction according to an embodiment of the present invention;
FIG. 5 is a diagram of the CCJRM feature reduction effect at an embedding rate of 0.1 according to an embodiment of the present invention;
FIG. 6 is a diagram of the detection effect before and after CCJRM feature reduction at different embedding rates according to an embodiment of the present invention: (a) embedding rate 0.2; (b) embedding rate 0.3; (c) embedding rate 0.4; (d) embedding rate 0.5;
FIG. 7 is a comparison graph of the reduction effect of CCJRM features at different embedding rates according to an embodiment of the present invention;
FIG. 8 is a comparison graph of CCJRM feature detection accuracy at different embedding rates according to an embodiment of the present invention;
FIG. 9 is a diagram of the detection effect before and after SRM feature reduction at different embedding rates according to an embodiment of the present invention: (a) embedding rate 0.1; (b) embedding rate 0.2; (c) embedding rate 0.3; (d) embedding rate 0.4; (e) embedding rate 0.5;
FIG. 10 is a comparison graph of the reduction effect of SRM features at different embedding rates according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 2, an embodiment of the present invention provides a method for selecting a Rich Model steganography detection feature based on feature component correlation, including:
S101: decomposing the high-dimensional Rich Model steganography detection feature into a plurality of Rich Model submodels;
Specifically, as shown in FIG. 1, document 1 first proposed the main framework of the Rich Model feature in 2012, which mainly comprises residual calculation, quantization and truncation, and co-occurrence matrix extraction. The Rich Model feature uses a rich and diverse set of linear and nonlinear high-pass filters designed for different directions and angles. Filtering the image with these filters yields various types of residual images, which are the high-frequency components of the image. Preliminary feature elements are obtained by extracting a fourth-order co-occurrence matrix from each residual image. Owing to the symmetry of the co-occurrence matrices, some of them can be merged to form new co-occurrence matrices. Finally, the submodels are combined into the complete Rich Model feature, which is the high-dimensional Rich Model steganography detection feature referred to in this step.
S102: for each Rich Model submodel, removing characteristic components with the variance of 0 before and after steganography;
Specifically, in a high-dimensional Rich Model feature, some feature components have a variance of 0 both before and after steganography, which is particularly common in frequency-domain features. These feature components contribute nothing to the training and classification of the ensemble classifier. Therefore, in this embodiment, preprocessing is performed before feature reduction to remove the feature components whose sample variance is 0 both before and after steganography. It will be understood that this step is not necessarily performed in all cases and may be omitted as desired.
S103: for each Rich Model submodel, measuring the separability of each of its feature components, and sorting the feature components in descending order of the separability measure;
S104: for each Rich Model submodel, calculating the correlation between any two of its feature components, and selecting feature components according to the strength of the correlation;
S105: combining the selected Rich Model submodels to form the final steganography detection feature.
The Rich Model steganography detection feature selection method based on feature component correlation provided by this embodiment starts from the correlation among feature components and exploits the specific submodel structure of the Rich Model, so that multiple feature components that have high separability but are strongly correlated and redundant can be effectively reduced in dimension.
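The residual, quantization/truncation and co-occurrence chain described for step S101 can be sketched as follows. This is a toy illustration, not the SRM extractor of document 1: it uses a single first-order horizontal residual, and the function name `cooccurrence_feature`, the quantization step `q` and the truncation value `trunc` are assumptions chosen for the example.

```python
import numpy as np

def cooccurrence_feature(image, q=1.0, trunc=2, order=4):
    """Toy submodel: horizontal residual -> quantize/truncate -> co-occurrence histogram."""
    x = np.asarray(image, dtype=np.float64)
    # First-order horizontal residual: a very simple high-pass filter.
    residual = x[:, 1:] - x[:, :-1]
    # Quantize with step q, then truncate to the integer range [-trunc, trunc].
    r = np.clip(np.round(residual / q), -trunc, trunc).astype(np.int64)
    base = 2 * trunc + 1
    width = r.shape[1] - order + 1
    # Encode every horizontal run of `order` residuals as one integer index.
    idx = np.zeros((r.shape[0], width), dtype=np.int64)
    for k in range(order):
        idx = idx * base + (r[:, k:k + width] + trunc)
    # Normalized co-occurrence histogram with base**order bins (625 for the defaults).
    hist = np.bincount(idx.ravel(), minlength=base ** order).astype(np.float64)
    return hist / hist.sum()
```

With the default values each such submodel has 5^4 = 625 bins before any symmetry-based merging; merging of symmetric bins is not shown here.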
On the basis of the above embodiment, as an implementable manner, in step S103, for each Rich Model submodel, the separability of each of its feature components is measured according to formula (1):
$$F_{score}(d) = \frac{\left(\mu_d^{C} - \mu_d^{S}\right)^2}{\left(\sigma_d^{C}\right)^2 + \left(\sigma_d^{S}\right)^2} \tag{1}$$
where $F_{score}(d)$ denotes the Fisher value of the d-th feature component in the Rich Model submodel and is used as the separability measure; $\mu_d^{C}$ and $\mu_d^{S}$ denote the means of the d-th feature component over the cover image set C and the stego image set S, respectively; $\sigma_d^{C}$ and $\sigma_d^{S}$ denote the standard deviations of the d-th feature component over the cover image set C and the stego image set S, respectively.
Specifically, $\left(\mu_d^{C} - \mu_d^{S}\right)^2$ reflects the between-class scatter of the d-th feature component across the samples of the cover image set C and the stego image set S (also called classes C and S): the larger its value, the larger the between-class difference. $\left(\sigma_d^{C}\right)^2$ and $\left(\sigma_d^{S}\right)^2$ reflect the within-class compactness of the d-th feature component for classes C and S: the smaller their values, the smaller the within-class difference. It follows that, in steganography detection, the larger the Fisher value of a feature component, the greater its contribution to detecting stego images.
The cover image features and the stego image features obtained after the preprocessing in step S102 both have feature dimension D. In step S103, the Fisher value of each preprocessed feature component is computed by formula (1), giving a D-dimensional vector of Fisher values; the feature components of the cover image feature set and of the stego image feature set are then rearranged in descending order of these Fisher values.
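The Fisher-value ranking of step S103 can be sketched as follows. This is a minimal sketch assuming the usual two-class Fisher criterion (squared difference of means over the sum of variances) and population variances; the function names and the (samples × components) layout are hypothetical, and every component is assumed to have non-zero variance in at least one set, as after the preprocessing of step S102.

```python
import numpy as np

def fisher_scores(cover, stego):
    """Fisher value of every component: (mean gap)^2 / (sum of the two variances)."""
    mu_c, mu_s = cover.mean(axis=0), stego.mean(axis=0)
    var_c, var_s = cover.var(axis=0), stego.var(axis=0)
    return (mu_c - mu_s) ** 2 / (var_c + var_s)

def sort_by_fisher(cover, stego):
    """Reorder the columns of both sets by descending Fisher value."""
    scores = fisher_scores(cover, stego)
    # Stable sort so that ties keep their original relative order.
    order = np.argsort(-scores, kind="stable")
    return cover[:, order], stego[:, order], order
```

After this reordering, a lower column index means a more separable component, which is what lets the later correlation-based rule discard the less separable member of a correlated pair.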
On the basis of the above embodiments, as an implementable manner, in step S104, for each Rich Model submodel, the correlation between any two of its feature components is calculated, specifically:
the correlation coefficients $r_{ij}^{C}$ between the feature components in the cover image feature set and the correlation coefficients $r_{ij}^{S}$ between the feature components in the stego image feature set are calculated according to formula (2), giving the correlation coefficient matrix $R_C$ of the cover image feature set and the correlation coefficient matrix $R_S$ of the stego image feature set, both of the form shown in formula (3):
$$r_{ij} = \frac{Cov(X_i, X_j)}{\sigma_{X_i}\,\sigma_{X_j}} = \frac{E\left[(X_i - \mu_{X_i})(X_j - \mu_{X_j})\right]}{\sigma_{X_i}\,\sigma_{X_j}} \tag{2}$$
$$R = \begin{pmatrix} r_{11} & \cdots & r_{1D} \\ \vdots & \ddots & \vdots \\ r_{D1} & \cdots & r_{DD} \end{pmatrix} \tag{3}$$
where $X_i$ and $X_j$ denote the i-th and j-th feature components in the cover image feature set or the stego image feature set; $r_{ij}$ denotes the correlation coefficient between the two feature components $X_i$ and $X_j$, that is, the strength of the linear correlation between them, with $-1 \le r_{ij} \le 1$. The larger the absolute value of $r_{ij}$, the stronger the correlation between $X_i$ and $X_j$: when $r_{ij} = -1$ the two are perfectly negatively correlated, and when $r_{ij} = 1$ the two are perfectly positively correlated; in both cases the correlation is strongest. $Cov$ denotes the covariance between two feature components; $\sigma$ is the standard deviation over the samples of a single feature component; $\mu$ is the sample mean of a single feature component; $E$ denotes expectation; $R_C$ and $R_S$ are both symmetric $D \times D$ matrices, where D is the feature dimension of the Rich Model submodel.
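The two correlation coefficient matrices can be obtained directly with NumPy, as in the sketch below. The function name `correlation_matrices` and the (samples × components) layout are assumptions; `np.corrcoef` computes the Pearson correlation coefficient, and `rowvar=False` tells it that each column is one feature component.

```python
import numpy as np

def correlation_matrices(cover, stego):
    """Pearson correlation matrices R_C and R_S of one submodel (each D x D)."""
    # rowvar=False: columns are variables (feature components), rows are samples.
    r_c = np.corrcoef(cover, rowvar=False)
    r_s = np.corrcoef(stego, rowvar=False)
    return r_c, r_s
```

Components with zero variance in a set would give undefined (NaN) coefficients for that set, so this sketch assumes such components have been handled beforehand.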
As an implementable manner, in step S104, feature reduction is performed on the feature components according to the strength of the correlation, specifically:
traversing the elements below or above the main diagonal of the correlation coefficient matrices $R_C$ and $R_S$ in the order $r_{11} \to \dots \to r_{D1} \to r_{22} \to \dots \to r_{ij} \to \dots \to r_{D,D-1} \to r_{DD}$;
Specifically, since $R_C$ and $R_S$ are symmetric matrices, it suffices to traverse only the elements below or above the main diagonal in this order, rather than all elements.
When the element $r_{ij}$ satisfies the condition shown in formula (4), the feature component pair $(X_i, X_j)$ is selected according to the selection rule:
Figure BDA0003106278800000091
the selection rule is as follows:
when $r_{ij}^{C} \ge T$ and $r_{ij}^{S} \ge T$, a strong positive linear correlation is considered to exist between the feature components $X_i$ and $X_j$; the feature component $X_i$ is removed and the feature component $X_j$ is retained;
when $r_{ij}^{C} \le -T$ and $r_{ij}^{S} \le -T$, a strong negative linear correlation is considered to exist between the feature components $X_i$ and $X_j$; the feature component $X_i$ is removed and the feature component $X_j$ is retained; where T is the truncation threshold.
It is understood that when $r_{ij}^{C} \ge T$ and $r_{ij}^{S} \le -T$, or when $r_{ij}^{C} \le -T$ and $r_{ij}^{S} \ge T$, the sign of the correlation coefficient between the feature components changes before and after steganography; and when $|r_{ij}^{C}| < T$ or $|r_{ij}^{S}| < T$, the pair is not strongly correlated in both sets. In these cases both feature components need to be retained in order to better capture the steganographic noise, and no feature reduction is performed on the pair. The features obtained after reduction based on the condition shown in formula (4) are denoted separately for the cover image set and the stego image set.
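The selection rule can be sketched as a greedy pass over the lower triangle of the two correlation matrices. This is a sketch under stated assumptions: "strong" is taken to mean a coefficient of magnitude at least T in both sets with the same sign, pairs that involve an already removed component are skipped (the text does not say how such pairs are treated), and the function name `select_components` is hypothetical. Components are assumed to be sorted by descending Fisher value, so the column index j always refers to the more separable member of a pair.

```python
import numpy as np

def select_components(r_c, r_s, threshold):
    """Return the indices of the components kept by the correlation rule."""
    d = r_c.shape[0]
    keep = np.ones(d, dtype=bool)
    for j in range(d):                 # column by column: r_1j ... r_Dj
        if not keep[j]:
            continue                   # assumption: removed components remove nothing
        for i in range(j + 1, d):      # strictly below the main diagonal
            if not keep[i]:
                continue
            strong_pos = r_c[i, j] >= threshold and r_s[i, j] >= threshold
            strong_neg = r_c[i, j] <= -threshold and r_s[i, j] <= -threshold
            if strong_pos or strong_neg:
                keep[i] = False        # drop X_i, retain X_j
            # Sign change or weak correlation in either set: keep both.
    return np.flatnonzero(keep)
```

For example, with T = 0.9 a pair whose coefficients are 0.95 (cover) and 0.96 (stego) loses its less separable member, while a pair with -0.97 (cover) and 0.98 (stego) keeps both because the sign changes after embedding.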
In the feature reduction process, the choice of the truncation threshold T for the correlation coefficient r is crucial. The larger the value of T, the fewer feature components are removed, and vice versa. The guiding idea for choosing T is that, while the features are being reduced, the reduced feature set should preserve as much of the diversity of the original feature set as possible and contain as few redundant feature components as possible, so that the detection accuracy remains stable. In addition, to keep the experimental procedure finite, T is selected to a precision of 5 significant digits. Therefore, on the basis of the foregoing embodiments, as an implementable manner, the truncation threshold T in formula (4) is determined by bisection as follows:
S201: taking 9 initial truncation thresholds $T_1, T_2, \dots, T_9$ at intervals of 0.01 in the interval (0.9, 1), and calculating the corresponding detection accuracies $A_1, A_2, \dots, A_9$ as well as the detection accuracy $A_0$ of the original features, that is, the steganography detection accuracy when no feature dimension reduction is applied.
Specifically, $T_1 = 0.91, T_2 = 0.92, \dots, T_9 = 0.99$.
S202: comparing $A_1 \to A_2 \to \dots \to A_9$ in turn with $A_0$; as soon as $A_m - A_0 > -0.005$, stopping the comparison and assigning m-1 and m to x and y, respectively;
S203: letting $T_z = \frac{T_x + T_y}{2}$, calculating the corresponding detection accuracy $A_z$ and comparing it with $A_y$; if $A_z \ge A_y$, assigning $T_z$ to $T_y$; otherwise, assigning $T_z$ to $T_x$;
S204: repeating step S203 until $T_z$ has 5 significant digits, then stopping and taking the current $T_z$ as the optimal truncation threshold.
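Steps S201 to S204 can be sketched as the search below. Several points are assumptions rather than statements of the source: `accuracy_at` is a hypothetical callback that reduces the features with a given T, trains the detector and returns its accuracy; accuracies are evaluated lazily instead of all nine up front; $A_y$ is updated together with $T_y$; when m = 1 the lower bound is taken as 0.90; and an error is raised if no initial threshold comes within 0.005 of the baseline. `Decimal` is used so that the number of significant digits of $T_z$ can be counted exactly.

```python
from decimal import Decimal

def find_truncation_threshold(accuracy_at, baseline_accuracy, tolerance=0.005, digits=5):
    """Bisection search for the truncation threshold T (sketch of S201-S204)."""
    grid = [Decimal("0.91") + Decimal("0.01") * k for k in range(9)]  # T_1 .. T_9
    t_x = t_y = a_y = None
    for m, t in enumerate(grid):
        a_m = accuracy_at(float(t))
        if a_m - baseline_accuracy > -tolerance:      # A_m - A_0 > -0.005
            t_y, a_y = t, a_m
            t_x = grid[m - 1] if m > 0 else Decimal("0.90")  # assumed lower bound
            break
    if t_y is None:
        raise ValueError("no initial threshold is within tolerance of the baseline")
    while True:
        t_z = (t_x + t_y) / 2
        a_z = accuracy_at(float(t_z))
        if a_z >= a_y:
            t_y, a_y = t_z, a_z       # accuracy preserved: move the upper bound down
        else:
            t_x = t_z                 # accuracy dropped: move the lower bound up
        # Stop once T_z carries the requested number of significant digits.
        if len(t_z.normalize().as_tuple().digits) >= digits:
            return float(t_z)
```

Starting from an interval of width 0.01, each halving adds one decimal digit to $T_z$, so the loop runs three times before $T_z$ reaches 5 significant digits.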
In order to verify the effectiveness of the Rich Model steganography detection feature selection method based on feature component correlation provided by the invention, the following analysis process is provided.
Taking the SRM feature in document 1 as an example: first, the high-dimensional SRM feature is decomposed into 106 submodel feature sets; second, the feature components in each submodel are preprocessed by deleting those whose sample variance is 0 in both the cover images and the stego images; then, the feature components of each submodel are ranked by importance using formula (1); next, each submodel is reduced according to the proposed reduction rule; finally, the feature components selected from each submodel are combined into the feature set finally used for steganography detection.
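The decompose, reduce and recombine structure of this procedure can be sketched as follows. The per-submodel work (preprocessing, ranking and correlation-based reduction) is passed in as a callback, so this sketch only shows how the submodels are split and how the selected components are merged back. The names `reduce_rich_model`, `submodel_slices` and `select`, and the submodel keys in the usage note, are hypothetical.

```python
import numpy as np

def reduce_rich_model(cover, stego, submodel_slices, select):
    """Apply a per-submodel selector and concatenate what it keeps.

    cover, stego    : (n_samples, D_total) feature matrices.
    submodel_slices : dict mapping a submodel name to slice(start, stop) of its columns.
    select          : callable(cover_block, stego_block) -> kept column indices,
                      relative to the block.
    """
    kept = []
    for cols in submodel_slices.values():
        local = np.asarray(select(cover[:, cols], stego[:, cols]), dtype=np.intp)
        kept.append(local + cols.start)   # back to indices in the full feature
    kept = np.concatenate(kept)
    return cover[:, kept], stego[:, kept], kept
```

A call such as `reduce_rich_model(cover, stego, {"a": slice(0, 3), "b": slice(3, 5)}, select)` would treat columns 0 to 2 and 3 to 4 as two submodels; for SRM the dictionary would hold 106 entries.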
The relationship between features reduced to different degrees and detection performance is then explored from a statistical point of view. The invention uses the MMD (Maximum Mean Discrepancy) criterion described in document 6 ("Gretton, A., Borgwardt, K. M., Rasch, M., Schölkopf, B., Smola, A. J. (2007). A kernel method for the two-sample-problem. In Advances in Neural Information Processing Systems, 2007, 513-520") and document 7 ("Pevny, T. (2008). Kernel methods in steganalysis (Ph.D. thesis). State University of New York, Binghamton") to measure the similarity of the feature distributions before and after image steganography, and thereby characterize the classification performance of the features. The MMD is calculated as follows:
Figure BDA0003106278800000102
where $x_i$ denotes the feature vector extracted from the i-th cover image and $y_i$ denotes the feature vector extracted from the corresponding i-th stego image; m and n denote the numbers of cover images and stego images, respectively; and $k(\cdot,\cdot)$ is a Radial Basis Function (RBF) kernel. The smaller the MMD value, the more similar the feature distributions before and after steganography and the poorer the classification performance, and vice versa.
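An MMD computation with an RBF kernel can be sketched as follows. Because the exact estimator used here is not reproduced in the text, this sketch assumes the commonly used biased estimator, the square root of mean k(x, x') + mean k(y, y') - 2 mean k(x, y); the kernel width `gamma` and both function names are also assumptions.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """RBF kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))   # clamp tiny negative round-off

def mmd_biased(x, y, gamma=1.0):
    """Biased MMD estimate between cover features x (m x D) and stego features y (n x D)."""
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return float(np.sqrt(max(k_xx + k_yy - 2.0 * k_xy, 0.0)))
```

Identical samples give an MMD of 0, and the value grows as the two feature distributions move apart, which matches the interpretation given above.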
To study the relationship between the features obtained by reduction at different strengths and the MMD, steganography detection is carried out in the spatial domain and the frequency domain respectively, with a steganographic embedding rate of 0.5. The extracted features are reduced to different degrees by the proposed reduction method, and the corresponding MMD values and detection accuracies are calculated and drawn as scatter diagrams. Taking the s1_minmax22v_q1 sub-model feature of the SRM feature and the Ah_T3 sub-model feature of the CCJRM feature in document 2 as examples, the effect graphs shown in fig. 3 and 4 are obtained.
In the effect graphs shown in fig. 3 and 4, ▲ represents the MMD value of the feature after different degrees of reduction, and ● represents the corresponding detection accuracy. Both graphs show that the MMD and the detection accuracy increase with the feature dimension; once the feature dimension reaches a certain limit, both level off, and subsequently adding strongly correlated feature components brings no significant improvement. It can therefore be inferred that redundant information exists between strongly correlated feature components, and that simply adding strongly correlated feature components does not improve the classification performance of the features. The key is to select a suitable feature dimension by adjusting the truncation threshold T of the correlation coefficient.
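The role of the truncation threshold T can be illustrated with a short sketch of the reduction rule stated in the claims: for a pair of components that is strongly correlated with the same sign in both the carrier and the hidden feature sets, the lower-ranked component X_i is rejected and X_j is retained. The conditions themselves appear only as images, so the comparisons (`>= T` and `<= -T`) are assumptions, as is the function name. The columns are assumed to be already sorted by descending separability with zero-variance components removed.

```python
import numpy as np

def reduce_by_correlation(cover, stego, T):
    """Return indices of retained components (columns pre-sorted by separability)."""
    r_c = np.corrcoef(cover, rowvar=False)  # D x D matrix for the carrier set
    r_s = np.corrcoef(stego, rowvar=False)  # D x D matrix for the hidden set
    D = cover.shape[1]
    rejected = np.zeros(D, dtype=bool)
    # Traverse the elements below the main diagonal column by column (i > j).
    for j in range(D):
        for i in range(j + 1, D):
            strong_pos = r_c[i, j] >= T and r_s[i, j] >= T
            strong_neg = r_c[i, j] <= -T and r_s[i, j] <= -T
            if strong_pos or strong_neg:
                rejected[i] = True  # reject X_i, keep X_j
    return np.flatnonzero(~rejected)
```

A larger T rejects fewer components and keeps the dimension closer to the original; a smaller T reduces more aggressively at the risk of discarding useful components.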
In addition, comparison experiments are carried out in the spatial domain and the frequency domain respectively, based on existing typical steganography algorithms and the corresponding Rich Model steganography detection features. The specific experimental setup and comparison results are as follows.
(I) Experimental setup
The images used in the experiments are taken from the BOSSbase-1.01 image library. First, 10000 grayscale images are selected and DCT-transformed to generate a set of 10000 JPEG carrier images with a quality factor of 75. Second, for the two image libraries in the spatial domain and the frequency domain, the S-UNIWARD and J-UNIWARD steganography algorithms are used respectively to construct hidden image sets with embedding rates of 0.1, 0.2, 0.3, 0.4 and 0.5. Then, features are extracted from the carrier images and hidden images using the typical Rich Model feature extraction algorithms SRM (34671 dimensions) and CC-JRM (22510 dimensions) respectively, as shown in Table 1. Finally, the extracted features are reduced using the proposed feature reduction method.
Table 1 images used in the experiments
Figure BDA0003106278800000111
In the experiments, after the steganography detection features are obtained by the proposed feature selection method, an ensemble classifier (document 5 "Kodovsky J, Fridrich J, Holub V. Ensemble Classifiers for Steganalysis of Digital Media [J]. IEEE Transactions on Information Forensics and Security, 2012, 7(2): 432-.") is trained and tested. Each time, the features of 5000 images randomly selected from each of the 10000 carrier images and the 10000 hidden images are used to train the parameters of the ensemble classifier, and the features of the remaining 5000 carrier images and hidden images are used to test its performance.
Detection performance is evaluated by the error rate P_E, which is expressed in terms of the false alarm rate and the missed detection rate:
Figure BDA0003106278800000121
where P_FA is the false alarm rate, i.e. the probability that a carrier image is judged to be a hidden image, and P_MD is the missed detection rate, i.e. the probability that a hidden image is judged to be a carrier image. For each set of experiments, the above process is repeated 10 times, and the median over the 10 runs of the minimum average error rate is used as the measure of detection performance; a smaller value indicates better steganography detection performance.
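The following sketch shows one way to compute such an error rate from classifier scores. It assumes the form commonly used with ensemble classifiers in steganalysis, P_E = min over thresholds of (P_FA + P_MD) / 2; the formula above is available only as an image, so this form, the function name and the score convention (a higher score means "hidden") are assumptions.

```python
import numpy as np

def min_average_error(cover_scores, stego_scores):
    """Minimum over decision thresholds of (P_FA + P_MD) / 2."""
    cover_scores = np.asarray(cover_scores, dtype=float)
    stego_scores = np.asarray(stego_scores, dtype=float)
    observed = np.unique(np.concatenate((cover_scores, stego_scores)))
    candidates = np.concatenate(([-np.inf], observed))
    best = 1.0
    for t in candidates:
        p_fa = np.mean(cover_scores > t)   # carrier judged to be hidden
        p_md = np.mean(stego_scores <= t)  # hidden judged to be carrier
        best = min(best, 0.5 * (p_fa + p_md))
    return float(best)
```

The reported performance figure would then be the median of this value over the 10 random training and testing splits, for example `np.median(errors)` for a list `errors` of the 10 results.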
The effect of the proposed reduction algorithm within individual Rich Model sub-models is verified first. The sub-models used in this experiment are s1-minmax22h of the SRM feature and Dix2_T2 of the CCJRM feature. The experimental results are shown in Tables 2 and 3.
TABLE 2 reduction Effect of the method of the present invention on the s1-minmax22h submodel in SRM
Figure BDA0003106278800000122
TABLE 3 reduction Effect of the method of the present invention on the Dix2_T2 submodel in CCJRM
Figure BDA0003106278800000123
In this experiment, the detection effect of randomly selected features is compared with that of the features reduced by the proposed method. The features reduced by the proposed method maintain the original detection accuracy and achieve a better detection effect than randomly selected features.
(II) comparison experiment before and after reduction of frequency domain steganography detection features
The comparison experiment of the invention starts from different embedding rates and different feature dimensions, and provides comparison data before and after feature reduction. Firstly, the invention describes the comparison experiment before and after CCJRM characteristic reduction under 0.1 embedding rate in detail, and gives the value of the truncation threshold T used in each characteristic reduction.
TABLE 4 accuracy of detection before and after reduction of CCJRM feature with embedding rate of 0.1 (%)
Figure BDA0003106278800000131
As can be seen from Table 4, at an embedding rate of 0.1 the accuracy of the original 22510-dimensional CCJRM feature in steganography detection is 53.06%. The best detection accuracy of the features reduced by the proposed method reaches 53.48%, which is 0.42 percentage points higher than that of the original features; at the same time, the reduced feature has 4436 dimensions, only 19.7% of the original feature dimension, which greatly reduces the cost of classifier training. The proposed method therefore effectively reduces the feature dimension while maintaining the accuracy of steganography detection, and can even improve the accuracy at a low embedding rate, which verifies the effectiveness of the feature reduction method of the invention.
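The figures quoted from Table 4 can be checked with a few lines of arithmetic; the variable names below are illustrative only.

```python
original_dim, reduced_dim = 22510, 4436
original_acc, reduced_acc = 53.06, 53.48  # detection accuracy in percent

ratio = 100 * reduced_dim / original_dim  # share of dimensions retained
gain = reduced_acc - original_acc         # change in percentage points

print(f"{ratio:.1f}% of dimensions, {gain:+.2f} percentage points")
# prints: 19.7% of dimensions, +0.42 percentage points
```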
FIG. 5 is a line graph showing the detection effect at each feature dimension after reduction of the CCJRM feature at an embedding rate of 0.1, where the horizontal axis represents the feature dimension, the vertical axis represents the detection accuracy, and ▲ marks the optimal reduced feature dimension. As can be seen from the figure, when too few feature dimensions are retained, some effective features are not selected, so the detection accuracy does not reach its peak. When too many feature dimensions are retained, many redundant or even harmful feature components are mixed in, which lowers the accuracy of steganography detection and increases the time and complexity of classification. As the number of features approaches the original feature dimension, the steganography detection accuracy no longer changes noticeably.
In order to test the performance of the method under other embedding rates, the invention also performs experiments on steganography detection characteristics with embedding rates of 0.2,0.3,0.4 and 0.5, and the detection effect is shown in fig. 6.
In fig. 6, (a) to (d) compare the detection effects before and after reduction of the CCJRM feature at embedding rates of 0.2, 0.3, 0.4 and 0.5 respectively, where the horizontal axis represents the feature dimension, the vertical axis represents the accuracy of steganography detection, ▲ represents the optimal detection accuracy obtained after feature reduction, and ● represents the detection accuracy obtained with the original feature dimension. Graphs (b), (c) and (d) show that the proposed method effectively reduces the original feature dimension. At the same time, the detection accuracy does not drop noticeably (the difference is within 0.4%), and it even improves at low embedding rates.
The invention is also compared with the feature selection methods of documents 3 and 4. The three methods differ little in overall detection effect, with differences in detection accuracy all within 0.5%. However, the feature dimension obtained by the proposed reduction method is significantly lower than that of the other two methods; the specific results are shown in fig. 7 and 8. The different broken lines in the figures represent the detection accuracy, at different embedding rates, of the features selected by the 3 feature selection methods and of the original features. It can be seen that the feature dimension after reduction by the proposed method is clearly lower than that of the other three feature sets.
(III) comparison experiment before and after reduction of spatial domain steganography detection features
The effectiveness of the method in frequency domain steganography detection has been examined in section (II). To explore the generality of the method, this section examines its effectiveness in spatial domain steganography detection. As in the frequency domain, this section uses experimental groups with embedding rates of 0.1, 0.2, 0.3, 0.4 and 0.5 (bpp) respectively, and the experimental results are shown in fig. 9.
As in section (II), the 5 graphs in fig. 9 compare the detection effect before and after reduction of the SRM feature at embedding rates of 0.1, 0.2, 0.3, 0.4 and 0.5 respectively, where the horizontal axis represents the feature dimension, the vertical axis represents the accuracy of steganography detection, ▲ represents the optimal detection accuracy obtained after feature reduction, and ● represents the detection accuracy obtained with the original feature dimension. The figure shows that the accuracy of steganography detection increases as the feature dimension increases, and that once the feature dimension reaches a certain value the accuracy stabilizes with no further obvious improvement. The experiments show that the difference between the detection accuracy of the feature vector selected by the proposed method and that of the original feature vector is within 0.3%. Moreover, the dimension of the feature vector selected by the method is far lower than the original feature dimension (less than 50%). The method can therefore effectively reduce the dimension of spatial domain Rich Model features without affecting the steganography detection accuracy, thereby reducing the overhead of the steganography detection process.
In the spatial domain, the comparison between the proposed method and the methods proposed by Zhang and Ma gives approximately the same results as in the frequency domain; the experimental results are shown in fig. 10.
As can be seen from fig. 10, the feature dimension after reduction by the proposed method is significantly smaller than that obtained by the other two methods. The difference is particularly obvious at low embedding rates.
In summary, the feature reduction method provided by the invention can achieve effective dimension reduction in both frequency domain and spatial domain. In addition, in the frequency domain steganography detection with low embedding rate, the features reduced by the method are improved to a certain extent compared with the steganography detection accuracy of the original features. In addition, in the aspect of feature dimension reduction, the dimension reduction amplitude of the frequency domain steganography detection feature is larger than that of the spatial domain steganography detection feature. The main reason is that the Rich Model feature of the frequency domain steganography detection has stronger linear correlation, so that the purpose of better dimension reduction is achieved.
Aiming at image self-adaptive steganography detection, the Rich Model steganography detection feature has a good detection effect, but the Rich Model feature has the defects of high dimension, slow training and the like. Under the condition of not influencing the accuracy of steganography detection, aiming at the condition that a large amount of redundancy exists in the Rich Model feature, in order to reduce the huge calculation amount brought by the high-dimensional Rich Model feature, the method firstly introduces the ideas of correlation and Fisher criterion to reduce the feature dimension. Secondly, based on the proposed method, a large number of verification experiments are carried out on typical Rich Model characteristics SRM and CCJRM in the frequency domain and the space domain. The result shows that the method effectively and greatly reduces the dimension of the characteristic while maintaining or improving the steganography detection precision.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (6)

1. A Rich Model steganography detection feature selection method based on feature component correlation, characterized by comprising the following steps:
step 1: decomposing the high-dimensional Rich Model steganography detection feature into a plurality of Rich Model submodels;
step 2: for each Rich Model submodel, measuring the separability of each of its feature components, and sorting the feature components in descending order of the measured separability;
step 3: for each Rich Model submodel, calculating the correlation between any two of its feature components, and performing feature selection on the feature components according to the strength of the correlation;
step 4: combining the Rich Model submodels after selection to serve as the final steganography detection feature.
2. The method of claim 1, further comprising, before step 2:
for each Rich Model submodel, removing the feature components whose sample variance is 0 both before and after steganography.
3. The method of claim 1, wherein in step 2, for each Rich Model submodel, the separability of its respective feature components is measured according to equation (1):
Figure FDA0003106278790000011
wherein Fscore(d) represents the Fisher value of the d-th dimension feature component in the Rich Model submodel, and the Fisher value is used as the measured value of separability;
Figure FDA0003106278790000012
and
Figure FDA0003106278790000013
respectively representing the mean values of the d-dimension characteristic components of the carrier image set C and the hidden image set S;
Figure FDA0003106278790000014
and
Figure FDA0003106278790000015
the standard deviations of the d-th dimension feature components of the carrier image set C and the hidden image set S are respectively represented.
4. The method according to claim 1, wherein in step 3, for each Rich Model submodel, the correlation between any two feature components thereof is calculated, specifically:
respectively calculating the correlation coefficient between the characteristic components in the carrier image characteristic set according to the formula (2)
Figure FDA0003106278790000016
And the correlation coefficient between each characteristic component in the feature set of the hidden image
Figure FDA0003106278790000017
thereby obtaining the correlation coefficient matrix R_C of the carrier image feature set and the correlation coefficient matrix R_S of the hidden image feature set, as shown in formula (3):
Figure FDA0003106278790000018
Figure FDA0003106278790000021
wherein X_i and X_j respectively represent the i-th dimension feature component and the j-th dimension feature component in the carrier image feature set or the hidden image feature set; r represents the correlation coefficient between the two feature components, -1 ≤ r ≤ 1; Cov denotes the covariance between two feature components; σ is the standard deviation over the samples of a single feature component; μ is the sample mean of a single feature component; E represents expectation; R_C and R_S are D × D symmetric matrices, where D represents the feature dimension of the Rich Model submodel.
5. The method according to claim 4, wherein in step 3, feature selection is performed on the feature components according to the strength of the correlation, specifically:
traversing the elements below or above the main diagonal of the correlation coefficient matrices R_C and R_S respectively, in the order r_11 → … → r_D1 → r_22 → … → r_ij → … → r_D,D-1 → r_DD;
when the element r_ij satisfies the condition shown in formula (4), performing selection on the feature component pair (X_i, X_j) according to the selection rule:
Figure FDA0003106278790000022
the selection rule is as follows:
when
Figure FDA0003106278790000023
and
Figure FDA0003106278790000024
the feature components X_i and X_j are considered to have a strong positive linear correlation, and the feature component X_i is rejected and the feature component X_j is retained;
when
Figure FDA0003106278790000025
and
Figure FDA0003106278790000026
the feature components X_i and X_j are considered to have a strong negative linear correlation, and the feature component X_i is rejected and the feature component X_j is retained;
when
Figure FDA0003106278790000027
and
Figure FDA0003106278790000028
or
Figure FDA0003106278790000029
and
Figure FDA00031062787900000210
both feature components are retained;
when
Figure FDA00031062787900000211
or
Figure FDA00031062787900000212
both feature components are retained;
where T is the truncation threshold.
6. The method of claim 5, wherein step 3 further comprises: determining an optimal truncation threshold T based on a bisection method, specifically:
step 3.1: obtaining 9 initial truncation thresholds T_1, T_2, ……, T_9 at intervals of 0.01 in the interval (0.9, 1), and calculating the corresponding detection accuracies A_1, A_2, ……, A_9 as well as the detection accuracy A_0 of the original feature;
step 3.2: comparing A_1 → A_2 → … → A_9 with A_0 in order; when A_m - A_0 > -0.005, stopping the subsequent comparisons and assigning m-1 and m to x and y respectively;
step 3.3: letting
Figure FDA0003106278790000031
calculating the corresponding detection accuracy A_z and comparing it with A_y; if A_z ≥ A_y, assigning T_z to T_y, otherwise assigning T_z to T_x;
step 3.4: repeating step 3.3 until T_z has 5 significant digits, then stopping and taking the T_z at that time as the optimal truncation threshold.
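The threshold search of claim 6 can be sketched as below. Several points are assumptions rather than statements of the claimed method: T_z is taken as the midpoint (T_x + T_y) / 2 because the formula is available only as an image; T_0 is taken as 0.90 for the case m = 1; the "5 significant digits" stopping rule is approximated by an interval width below `tol`; A_y is updated together with T_y; accuracies are evaluated lazily, which gives the same result as evaluating all nine first; and the fallback when no grid threshold passes step 3.2 is invented for completeness. The callable `accuracy(T)` is a hypothetical stand-in for reducing the features with threshold T and then training and testing the classifier.

```python
def find_truncation_threshold(accuracy, a0, tol=1e-5, margin=0.005):
    """Bisection-style search for the truncation threshold T (illustrative sketch)."""
    # grid[0] is the assumed T_0 = 0.90; grid[1..9] are T_1..T_9 (0.91 .. 0.99).
    grid = [round(0.90 + 0.01 * k, 2) for k in range(10)]
    y, a_y = None, None
    for m in range(1, 10):                 # step 3.2: compare A_m with A_0 in order
        a_m = accuracy(grid[m])
        if a_m - a0 > -margin:
            y, a_y = m, a_m
            break
    if y is None:
        return grid[9]                     # assumed fallback, not part of the claim
    t_x, t_y = grid[y - 1], grid[y]
    t_z = t_y
    while t_y - t_x > tol:                 # steps 3.3 and 3.4
        t_z = 0.5 * (t_x + t_y)            # assumed midpoint rule
        a_z = accuracy(t_z)
        if a_z >= a_y:
            t_y, a_y = t_z, a_z            # accuracy held: move the upper end down
        else:
            t_x = t_z                      # accuracy dropped: move the lower end up
    return t_z
```

Each call to `accuracy` involves a full training and testing run, so the search needs at most 9 grid evaluations plus about 10 bisection steps with the default `tol`.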
CN202110638762.4A 2021-06-08 2021-06-08 Rich Model steganography detection feature selection method based on feature component correlation Pending CN113556439A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110638762.4A CN113556439A (en) 2021-06-08 2021-06-08 Rich Model steganography detection feature selection method based on feature component correlation

Publications (1)

Publication Number Publication Date
CN113556439A true CN113556439A (en) 2021-10-26

Family

ID=78102070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110638762.4A Pending CN113556439A (en) 2021-06-08 2021-06-08 Rich Model steganography detection feature selection method based on feature component correlation

Country Status (1)

Country Link
CN (1) CN113556439A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114627340A (en) * 2022-03-19 2022-06-14 河南师范大学 Image steganography detection feature self-adaptive selection method based on triple measurement

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107197297A (en) * 2017-06-14 2017-09-22 中国科学院信息工程研究所 A kind of video steganalysis method of the detection based on DCT coefficient steganography
CN107844795A (en) * 2017-11-18 2018-03-27 中国人民解放军陆军工程大学 Convolutional neural networks feature extracting method based on principal component analysis
CN108009434A (en) * 2017-12-13 2018-05-08 中国人民解放军战略支援部队信息工程大学 Rich model Stego-detection Feature Selection Algorithms based on rough set α-positive domain reduction

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107197297A (en) * 2017-06-14 2017-09-22 中国科学院信息工程研究所 A kind of video steganalysis method of the detection based on DCT coefficient steganography
CN107844795A (en) * 2017-11-18 2018-03-27 中国人民解放军陆军工程大学 Convolutional neural networks feature extracting method based on principal component analysis
CN108009434A (en) * 2017-12-13 2018-05-08 中国人民解放军战略支援部队信息工程大学 Rich model Stego-detection Feature Selection Algorithms based on rough set α-positive domain reduction

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SHUNHAO JIN 等: "Feature Selection of the Rich Model Based on the Correlation of Feature Components", 《HINDAWI》 *
张敏情等: "基于仿射传播聚类的富模型降维方法", 《四川大学学报(工程科学版)》 *
李薇等: "基于降维共生特征的JPEG通用隐写分析", 《火力与指挥控制》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114627340A (en) * 2022-03-19 2022-06-14 河南师范大学 Image steganography detection feature self-adaptive selection method based on triple measurement
CN114627340B (en) * 2022-03-19 2024-04-30 河南师范大学 Image steganography detection feature self-adaptive selection method based on triple measurement

Similar Documents

Publication Publication Date Title
CN111832608B (en) Iron spectrum image multi-abrasive particle identification method based on single-stage detection model yolov3
Zois et al. A comprehensive study of sparse representation techniques for offline signature verification
CN108280480B (en) Latent image carrier security evaluation method based on residual error co-occurrence probability
CN110069630B (en) Improved mutual information feature selection method
CN110968845B (en) Detection method for LSB steganography based on convolutional neural network generation
CN111556016B (en) Network flow abnormal behavior identification method based on automatic encoder
CN109949200B (en) Filter subset selection and CNN-based steganalysis framework construction method
CN112950445B (en) Compensation-based detection feature selection method in image steganalysis
CN112615881B (en) Data flow detection system based on block chain
CN111415323A (en) Image detection method and device and neural network training method and device
CN108154186B (en) Pattern recognition method and device
CN113556439A (en) Rich Model steganography detection feature selection method based on feature component correlation
CN115600194A (en) Intrusion detection method, storage medium and device based on XGboost and LGBM
Sarrafzadeh et al. Detecting different sub-types of acute myelogenous leukemia using dictionary learning and sparse representation
CN112132279A (en) Convolutional neural network model compression method, device, equipment and storage medium
CN104463922A (en) Image feature coding and recognizing method based on integrated learning
CN108009434B (en) Rich model steganography detection feature selection method based on rough set α -positive domain reduction
CN114037001A (en) Mechanical pump small sample fault diagnosis method based on WGAN-GP-C and metric learning
CN113542525B (en) Steganography detection feature selection method based on MMD residual error
CN109508350B (en) Method and device for sampling data
Yang et al. Relative entropy multilevel thresholding method based on genetic optimization
Kubal et al. Image Manipulation Detection Using Error Level Analysis and Deep Learning
CN113159181A (en) Industrial control network anomaly detection method and system based on improved deep forest
Sharifi et al. Prunedcaps: A case for primary capsules discrimination
CN111931757A (en) Finger vein quick sorting method and device based on MDLBP block histogram and PCA dimension reduction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211026
