CN113556439A - Rich Model steganography detection feature selection method based on feature component correlation

Info

Publication number: CN113556439A
Application number: CN202110638762.4A
Authority: CN
Inventors: 刘粉林; 金顺浩; 杨春芳; 马媛媛; 刘媛
Original assignee: Information Engineering University of PLA Strategic Support Force
Current assignee: Information Engineering University of PLA Strategic Support Force
Priority date: 2021-06-08
Filing date: 2021-06-08
Publication date: 2021-10-26

Abstract

The invention provides a Rich Model steganography detection feature selection method based on feature component correlation. The method comprises the following steps: step 1: decomposing the high-dimensional Rich Model steganography detection feature into a plurality of Rich Model submodels; step 2: for each Rich Model submodel, measuring the separability of each of its feature components, and sorting the feature components in descending order of the measured separability; step 3: for each Rich Model submodel, calculating the correlation between any two of its feature components, and selecting feature components according to the strength of the correlation; and step 4: combining the selected Rich Model submodels to serve as the final steganography detection feature. When the method is applied to frequency-domain and spatial-domain Rich Model features, the feature dimension can be effectively reduced without degrading the steganography detection accuracy, and the effect is more pronounced in the frequency domain.

Description

Rich Model steganography detection feature selection method based on feature component correlation

Technical Field

The invention relates to the technical field of steganography detection, in particular to a Rich Model steganography detection feature selection method based on feature component correlation.

Background

Digital steganography is a technique that embeds information in the redundancy of digital images, audio, video, text and other media for the purpose of covert communication. Since the HUGO steganography algorithm was proposed in 2010, adaptive steganography built on the framework of 'distortion function design + STC coding' has become the mainstream of image steganography, and researchers have successively proposed a series of adaptive steganography algorithms with high resistance to detection based on this framework. These algorithms render most conventional steganography detection algorithms ineffective. The Rich Model feature was proposed by Fridrich et al. in 2012 (reference 1: "Fridrich J, Kodovsky J. Rich Models for Steganalysis of Digital Images [J]. IEEE Transactions on Information Forensics and Security, 2012, 7(3): 868-882"), and it effectively improved the detection of HUGO steganography. Thereafter, steganography detection features such as PSRM (Projection Spatial Rich Model), PHARM, GFR and CCJRM (reference 2: "Kodovsky J, Fridrich J. Steganalysis of JPEG images using rich models [C]. Media Watermarking, Security, and Forensics 2012. International Society for Optics and Photonics, 2012") were proposed in succession. These detection features have high dimensionality, some reaching tens of thousands of dimensions, which brings large computation and storage costs to classifier training and can even cause the curse of dimensionality. In response, researchers have carried out a series of studies on feature transformation and feature selection to reduce feature dimensionality.

Feature dimension reduction based on feature transformation maps a feature vector into another feature space so that the effective information is concentrated in a subset of the transformed components, and then selects the most effective of those components. For example: Qin et al. used Principal Component Analysis (PCA) to extract the principal components of high-dimensional features and achieved good dimension reduction, but PCA performs poorly on features with nonlinear structure; Wang et al. proposed applying a one-dimensional discrete Fourier transform to the SRM features and keeping only the spectral coefficients on the positive half axis as the feature vector, which effectively reduces the feature dimension; Boroumand et al. obtained a class of nonlinear transformations by approximating symmetric positive semi-definite kernel functions, which reduces the dimension of the transformed features while improving detection efficiency, but their method is only suitable for spatial-domain features.

Dimension reduction based on feature selection selects, from the feature vector, the feature components that most effectively distinguish carrier (cover) images from hidden (stego) images. For example: Xua and Jennifer measured the separability of feature components with the Bhattacharyya distance and the Mahalanobis distance respectively, and selected the combination of feature components that maximizes the distance between the carrier image features and the hidden image features, but the resulting dimension reduction is not outstanding; Lu et al. measured the importance of feature components with an improved Fisher criterion and selected those with the highest Fisher values for detection; Zhang et al. (reference 3: "Zhang Y, Liu F, Jia H, et al. Optimization of rich model based on Fisher criterion for image steganalysis [C]// 2018 Tenth International Conference on Advanced Computational Intelligence (ICACI). 2018") further reduced the feature dimension by applying the idea of Lu et al. within the submodel space; Ma et al. (reference 4: "Y. Y. Ma, X. Y. Luo, X. L. Li, Z. K. Bao, and Y. Zhang. Selection of Rich Model Steganalysis Features Based on Decision Rough Set α-Positive Region Reduction. IEEE Transactions on Circuits and Systems for Video Technology, vol. 29, no. 2, pp. 336-350, 2019") proposed a feature dimension reduction method based on decision rough set α-positive region reduction, which selects a combination of feature components that satisfies both the positive-region non-reduction principle and the independence principle, but this method incurs a large computational overhead during positive region reduction.

Disclosure of Invention

In order to solve the problems that traditional dimension reduction methods for steganography detection features have limited applicability and have difficulty effectively reducing multiple redundant, strongly correlated feature components, the invention provides a Rich Model steganography detection feature selection method based on feature component correlation.

The invention provides a Rich Model steganography detection feature selection method based on feature component correlation, which comprises the following steps of:

Step 1: decomposing the high-dimensional Rich Model steganography detection feature into a plurality of Rich Model submodels;

Step 2: for each Rich Model submodel, measuring the separability of each of its feature components, and sorting the feature components in descending order of the measured separability;

Step 3: for each Rich Model submodel, calculating the correlation between any two of its feature components, and selecting feature components according to the strength of the correlation;

Step 4: combining the selected Rich Model submodels to serve as the final steganography detection feature.

Further, before the step 2, the method further comprises:

for each Rich Model submodel, removing the feature components whose sample variance is 0 both before and after steganography.

Further, in step 2, for each Rich Model submodel, the separability of each of its feature components is measured according to formula (1):

$$\mathrm{Fscore}(d) = \frac{\left(\mu_d^{C} - \mu_d^{S}\right)^2}{\left(\sigma_d^{C}\right)^2 + \left(\sigma_d^{S}\right)^2} \tag{1}$$

wherein $\mathrm{Fscore}(d)$ represents the Fisher value of the d-th dimension feature component in the Rich Model submodel, and the Fisher value is used as the measure of separability; $\mu_d^{C}$ and $\mu_d^{S}$ respectively represent the means of the d-th dimension feature component over the carrier image set C and the hidden image set S; $\sigma_d^{C}$ and $\sigma_d^{S}$ respectively represent the standard deviations of the d-th dimension feature component over the carrier image set C and the hidden image set S.
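As an illustration of the separability measure, the following minimal sketch computes a Fisher value per feature component and the descending ordering used in step 2. It assumes the common form of the Fisher criterion (squared gap between class means divided by the sum of class variances); the function names `fisher_scores` and `descending_order` and the toy data are hypothetical and do not come from the patent.

```python
from statistics import fmean, pstdev


def fisher_scores(cover, stego):
    """Fisher value of every feature component (column).

    cover, stego: lists of feature vectors, one per image, same width.
    Assumed form: (mean_C - mean_S)^2 / (std_C^2 + std_S^2).
    """
    scores = []
    for d in range(len(cover[0])):
        c = [row[d] for row in cover]
        s = [row[d] for row in stego]
        gap = (fmean(c) - fmean(s)) ** 2
        spread = pstdev(c) ** 2 + pstdev(s) ** 2
        # Components with zero variance in both sets are assumed to have been
        # removed beforehand; they are scored 0 here to avoid dividing by zero.
        scores.append(gap / spread if spread > 0 else 0.0)
    return scores


def descending_order(scores):
    """Indices of feature components sorted by decreasing Fisher value."""
    return sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)


# Toy data: component 1 separates the two classes far better than component 0.
cover = [[0.0, 1.0], [2.0, 1.2], [4.0, 0.8]]
stego = [[0.5, 3.0], [2.5, 3.2], [4.5, 2.8]]
scores = fisher_scores(cover, stego)
order = descending_order(scores)
```

In this toy case the second component is ranked first because its class means are far apart relative to its within-class spread.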

Further, in step 3, for each Rich Model submodel, the correlation between any two feature components is calculated, specifically:

calculating, according to formula (2), the correlation coefficients $r_{ij}^{C}$ between the feature components in the carrier image feature set and the correlation coefficients $r_{ij}^{S}$ between the feature components in the hidden image feature set:

$$r_{ij} = \frac{\mathrm{Cov}(X_i, X_j)}{\sigma_{X_i}\,\sigma_{X_j}} = \frac{E\left[(X_i - \mu_{X_i})(X_j - \mu_{X_j})\right]}{\sigma_{X_i}\,\sigma_{X_j}} \tag{2}$$

thereby obtaining the correlation coefficient matrix $R^{C}$ of the carrier image feature set and the correlation coefficient matrix $R^{S}$ of the hidden image feature set shown in formula (3):

$$R^{C} = \begin{pmatrix} r_{11}^{C} & \cdots & r_{1D}^{C} \\ \vdots & \ddots & \vdots \\ r_{D1}^{C} & \cdots & r_{DD}^{C} \end{pmatrix}, \qquad R^{S} = \begin{pmatrix} r_{11}^{S} & \cdots & r_{1D}^{S} \\ \vdots & \ddots & \vdots \\ r_{D1}^{S} & \cdots & r_{DD}^{S} \end{pmatrix} \tag{3}$$

wherein $X_i$ and $X_j$ respectively represent the i-th and j-th dimension feature components in the carrier image feature set or the hidden image feature set; r represents the correlation coefficient between the two feature components, with $-1 \le r \le 1$; Cov denotes the covariance between two feature components; σ is the standard deviation over the samples of a single feature component; μ is the sample mean of a single feature component; E represents expectation; $R^{C}$ and $R^{S}$ are symmetric D × D matrices, D being the feature dimension of the Rich Model submodel.
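The correlation coefficient matrix described above is the ordinary Pearson correlation matrix of the feature components. A minimal sketch follows; the function name `correlation_matrix` is hypothetical, and it assumes that constant (zero-variance) components have already been removed so that no standard deviation is zero. It would be called once on the carrier image feature set and once on the hidden image feature set.

```python
from math import sqrt


def correlation_matrix(features):
    """Pearson correlation matrix (D x D) of the columns of `features`.

    features: list of feature vectors, one per image.
    Assumes every column has non-zero variance.
    """
    n, dim = len(features), len(features[0])
    cols = [[row[d] for row in features] for d in range(dim)]
    means = [sum(c) / n for c in cols]
    centered = [[v - m for v in c] for c, m in zip(cols, means)]
    norms = [sqrt(sum(v * v for v in c)) for c in centered]
    r = [[0.0] * dim for _ in range(dim)]
    for i in range(dim):
        for j in range(i + 1):
            dot = sum(a * b for a, b in zip(centered[i], centered[j]))
            # The 1/n factors of covariance and standard deviations cancel.
            r[i][j] = r[j][i] = dot / (norms[i] * norms[j])
    return r


# Column 1 is exactly twice column 0, so their coefficient is 1.
features = [[1.0, 2.0, 3.0], [2.0, 4.0, 1.0], [3.0, 6.0, 2.0]]
r = correlation_matrix(features)
```

Only the lower triangle is computed and mirrored, which matches the observation that the matrix is symmetric.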

Further, in step 3, feature selection is performed on the feature components according to the strength of the correlation, specifically:

traversing, in the order $r_{11} \to \dots \to r_{D1} \to r_{22} \to \dots \to r_{ij} \to \dots \to r_{D,D-1} \to r_{DD}$, the elements of the correlation coefficient matrices $R^{C}$ and $R^{S}$ below or above the main diagonal;

when an element $r_{ij}$ satisfies the condition shown in formula (4), selecting from the feature component pair $(X_i, X_j)$ according to the selection rule;

the selection rule is as follows:

when $r_{ij}^{C} \ge T$ and $r_{ij}^{S} \ge T$, the feature components $X_i$ and $X_j$ are considered to have a strong positive linear correlation, the feature component $X_i$ is removed and the feature component $X_j$ is retained;

when $r_{ij}^{C} \le -T$ and $r_{ij}^{S} \le -T$, the feature components $X_i$ and $X_j$ are considered to have a strong negative linear correlation, the feature component $X_i$ is removed and the feature component $X_j$ is retained;

when $r_{ij}^{C} \ge T$ and $r_{ij}^{S} \le -T$, or $r_{ij}^{C} \le -T$ and $r_{ij}^{S} \ge T$, both feature components are retained;

when $|r_{ij}^{C}| < T$ or $|r_{ij}^{S}| < T$, both feature components are retained; where T is the truncation threshold.
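The selection rule can be sketched as a walk over the lower triangle of the two correlation matrices. This is a minimal sketch under stated assumptions: the components are already sorted by descending separability (so in a pair with i > j, component i is the less separable one), a pair counts as strongly correlated when both coefficients reach the threshold with the same sign, and pairs involving an already removed component are skipped. The last point is not specified in the text and is an assumption, as is the function name `select_components`.

```python
def select_components(r_cover, r_stego, threshold):
    """Return the indices of the feature components kept after reduction.

    r_cover, r_stego: D x D correlation matrices of the carrier and hidden
    image feature sets; threshold: truncation threshold T.
    """
    dim = len(r_cover)
    removed = set()
    for j in range(dim):                 # column j, walked top to bottom
        for i in range(j + 1, dim):      # rows strictly below the diagonal
            if i in removed or j in removed:
                continue                 # assumption: skip resolved pairs
            rc, rs = r_cover[i][j], r_stego[i][j]
            strong_positive = rc >= threshold and rs >= threshold
            strong_negative = rc <= -threshold and rs <= -threshold
            if strong_positive or strong_negative:
                removed.add(i)           # drop X_i, keep X_j
            # Sign flips and weak correlations keep both components.
    return [d for d in range(dim) if d not in removed]


r_cover = [[1.0, 0.95, 0.2], [0.95, 1.0, -0.97], [0.2, -0.97, 1.0]]
r_stego = [[1.0, 0.96, 0.1], [0.96, 1.0, 0.98], [0.1, 0.98, 1.0]]
kept = select_components(r_cover, r_stego, threshold=0.9)
```

Here component 1 is dropped because it is strongly positively correlated with component 0 in both sets, while component 2 survives.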

Further, step 3 further includes: determining an optimal truncation threshold T based on a bisection method, specifically:

Step 3.1: taking 9 initial truncation thresholds $T_1, T_2, \dots, T_9$ at intervals of 0.01 in the interval (0.9, 1), and calculating the corresponding detection accuracies $A_1, A_2, \dots, A_9$ as well as the original feature detection accuracy $A_0$;

Step 3.2: comparing $A_1 \to A_2 \to \dots \to A_9$ in turn with $A_0$; as soon as $A_m - A_0 > -0.005$, stopping the subsequent comparisons and assigning m-1 and m to x and y respectively;

Step 3.3: letting $T_z = (T_x + T_y)/2$, calculating the corresponding detection accuracy $A_z$ and comparing it with $A_y$; if $A_z \ge A_y$, assigning $T_z$ to $T_y$; otherwise, assigning $T_z$ to $T_x$;

Step 3.4: repeating step 3.3 until $T_z$ reaches 5 significant digits, then stopping and taking the $T_z$ at that time as the optimal truncation threshold.
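The threshold search in steps 3.1 to 3.4 can be sketched as a coarse grid scan followed by bisection. The sketch below makes several assumptions that are not in the text: the midpoint rule $T_z = (T_x + T_y)/2$, the value $T_0 = 0.90$ for the case m = 1, updating $A_y$ together with $T_y$, and a caller-supplied function `accuracy_at(T)` that reduces the features with threshold T and returns the detection accuracy as a fraction. `find_threshold`, `significant_digits` and `toy_accuracy` are hypothetical names. `Decimal` is used so that counting significant digits is exact.

```python
from decimal import Decimal


def significant_digits(value):
    """Number of significant digits of a Decimal, ignoring trailing zeros."""
    return len(value.normalize().as_tuple().digits)


def find_threshold(accuracy_at, baseline, tolerance=0.005, digits=5):
    """Grid scan over 0.91..0.99, then bisection to `digits` significant digits.

    accuracy_at: callable T -> detection accuracy (assumed, supplied by caller).
    baseline: accuracy A0 of the unreduced feature.
    Returns None when no grid threshold stays within `tolerance` of A0.
    """
    grid = [Decimal("0.90") + Decimal("0.01") * k for k in range(10)]  # T0..T9
    bracket = None
    for m in range(1, 10):
        a_m = accuracy_at(float(grid[m]))
        if a_m - baseline > -tolerance:
            bracket = (grid[m - 1], grid[m], a_m)
            break
    if bracket is None:
        return None
    t_x, t_y, a_y = bracket
    while True:
        t_z = (t_x + t_y) / 2
        if significant_digits(t_z) >= digits:
            return float(t_z)
        a_z = accuracy_at(float(t_z))
        if a_z >= a_y:
            t_y, a_y = t_z, a_z      # accuracy held: move the upper end down
        else:
            t_x = t_z                # accuracy dropped: move the lower end up


def toy_accuracy(t):
    """Stand-in for a real train/test run: accuracy recovers above 0.9337."""
    return 0.80 if t >= 0.9337 else 0.70


best = find_threshold(toy_accuracy, baseline=0.80)
```

Because the grid spacing is 0.01, each halving adds one significant digit, so the loop performs two classifier evaluations after the grid scan before the 5-digit midpoint is returned.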

The invention has the beneficial effects that:

(1) a high-dimensional Rich Model feature generally consists of hundreds of submodels and contains a large number of redundant feature components, which brings the curse of dimensionality and a heavy computational load to steganography detection; existing dimension reduction methods have difficulty effectively reducing multiple strongly correlated redundant feature components, whereas the invention, based on the correlation between feature components and on the structure of the Rich Model steganography detection feature, can effectively reduce them;

(2) experiments show that the method is applicable to both frequency-domain and spatial-domain Rich Model features, and effectively reduces the Rich Model feature dimension without degrading the steganography detection accuracy (the dimension of the selected feature vector is far lower than the original feature dimension, less than 50% of it), thereby reducing the overhead of steganography detection; the effect is more pronounced in the frequency domain.

Drawings

FIG. 1 is a block diagram of a high-dimensional Rich Model steganography detection feature provided by the prior art;

FIG. 2 is a schematic flow chart of a Rich Model steganography detection feature selection method based on feature component correlation according to an embodiment of the present invention;

FIG. 3 is a graph of the classification performance of the s1_minmax22v_q1 feature under different degrees of reduction, provided by an embodiment of the present invention;

FIG. 4 is a graph of the classification performance of the Ah_T3 feature under different degrees of reduction, provided by an embodiment of the present invention;

FIG. 5 is a diagram of the CCJRM feature reduction effect at an embedding rate of 0.1, provided by an embodiment of the present invention;

FIG. 6 is a diagram of the detection effects before and after CCJRM feature reduction at different embedding rates, provided by an embodiment of the present invention: (a) embedding rate of 0.2; (b) embedding rate of 0.3; (c) embedding rate of 0.4; (d) embedding rate of 0.5;

FIG. 7 is a comparison graph of the reduction effect of the CCJRM feature at different embedding rates, provided by an embodiment of the present invention;

FIG. 8 is a comparison graph of the detection accuracy of the CCJRM feature at different embedding rates, provided by an embodiment of the present invention;

FIG. 9 is a diagram of the detection effects before and after SRM feature reduction at different embedding rates, provided by an embodiment of the present invention: (a) embedding rate of 0.1; (b) embedding rate of 0.2; (c) embedding rate of 0.3; (d) embedding rate of 0.4; (e) embedding rate of 0.5;

FIG. 10 is a comparison graph of the reduction effect of the SRM feature at different embedding rates, provided by an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As shown in fig. 2, an embodiment of the present invention provides a method for selecting a Rich Model steganography detection feature based on feature component correlation, including:

S101: decomposing the high-dimensional Rich Model steganography detection feature into a plurality of Rich Model submodels;

Specifically, as shown in fig. 1, reference 1 first proposed the main framework of the Rich Model feature in 2012; it consists of key stages such as residual calculation, quantization and truncation, and co-occurrence matrix extraction. The Rich Model feature designs a rich and diverse set of linear and nonlinear high-pass filters covering different directions and angles. Filtering the image with these filters yields various types of residual images, which are the high-frequency components of the image, and preliminary feature elements are obtained by extracting the fourth-order co-occurrence matrix of each residual image. Owing to the symmetry of the co-occurrence matrices, some of them can be merged to form new co-occurrence matrices, which serve as the submodels. Finally, the submodels are combined into the complete Rich Model feature, which is the high-dimensional Rich Model steganography detection feature referred to in this step.

S102: for each Rich Model submodel, removing the feature components whose variance is 0 both before and after steganography;

Specifically, in high-dimensional Rich Model features, some feature components have a variance of 0 both before and after steganography, which is particularly common in frequency-domain features. These feature components contribute nothing to the training and classification of the ensemble classifier. Therefore, in this embodiment, a preprocessing step before feature reduction removes the feature components whose sample variance is 0 both before and after steganography. It will be understood that this step need not be performed in all cases and may be omitted as desired.
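The preprocessing step can be sketched as follows. A component has sample variance 0 exactly when all of its values are equal, so the sketch tests for a single distinct value instead of computing a variance. The function names are hypothetical, and the toy data are only for illustration.

```python
def constant_in_both(cover, stego, d):
    """True when component d never varies in either image set (variance 0)."""
    return (len({row[d] for row in cover}) == 1
            and len({row[d] for row in stego}) == 1)


def drop_constant_components(cover, stego):
    """Remove components that are constant in both sets.

    Returns the kept column indices and the two reduced feature matrices.
    """
    keep = [d for d in range(len(cover[0]))
            if not constant_in_both(cover, stego, d)]
    reduced_cover = [[row[d] for d in keep] for row in cover]
    reduced_stego = [[row[d] for d in keep] for row in stego]
    return keep, reduced_cover, reduced_stego


# Component 0 is constant in both sets; component 2 varies only after embedding.
cover = [[0.0, 1.0, 5.0], [0.0, 2.0, 5.0]]
stego = [[0.0, 1.5, 5.0], [0.0, 2.5, 6.0]]
keep, cover_r, stego_r = drop_constant_components(cover, stego)
```

A component that is constant in only one of the two sets is kept, since its variance is not 0 both before and after steganography.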

S103: for each Rich Model submodel, measuring the separability of each of its feature components, and sorting the feature components in descending order of the measured separability;

S104: for each Rich Model submodel, calculating the correlation between any two of its feature components, and selecting feature components according to the strength of the correlation;

S105: combining the selected Rich Model submodels to serve as the final steganography detection feature.

According to the Rich Model steganography detection feature selection method based on feature component correlation provided by the embodiment of the invention, starting from the correlation between feature components and exploiting the specific submodel structure of the Rich Model, multiple feature components that are highly separable but strongly correlated, and therefore redundant, can be effectively reduced.

On the basis of the above embodiment, as an implementable manner, in step S103, for each Rich Model submodel, the separability of each of its feature components is measured according to formula (1):

$$\mathrm{Fscore}(d) = \frac{\left(\mu_d^{C} - \mu_d^{S}\right)^2}{\left(\sigma_d^{C}\right)^2 + \left(\sigma_d^{S}\right)^2} \tag{1}$$

wherein $\mathrm{Fscore}(d)$ represents the Fisher value of the d-th dimension feature component in the Rich Model submodel, used as the measure of separability; $\mu_d^{C}$ and $\mu_d^{S}$ respectively represent the means of the d-th dimension feature component over the carrier image set C and the hidden image set S; $\sigma_d^{C}$ and $\sigma_d^{S}$ respectively represent the corresponding standard deviations.

Specifically, $\left(\mu_d^{C} - \mu_d^{S}\right)^2$ reflects the inter-class dispersion between the d-th dimension feature components of the samples of the carrier image set C and the hidden image set S (also called classes C and S): the larger its value, the larger the inter-class difference. $\left(\sigma_d^{C}\right)^2$ and $\left(\sigma_d^{S}\right)^2$ reflect the intra-class cohesion of the d-th dimension feature component of the samples of classes C and S: the smaller their values, the smaller the corresponding intra-class difference. It is understood that in steganography detection, the larger the Fisher value, the greater the contribution of the feature component to detecting hidden images.

The carrier image features and hidden image features obtained after the preprocessing in step S102 each have feature dimension D. In step S103, the Fisher value of each preprocessed feature component is computed by formula (1), giving a D-dimensional Fisher value vector; the feature components of both feature sets are then rearranged in descending order of Fisher value.

On the basis of the above embodiments, as an implementable manner, in step S104, for each Rich Model submodel, the correlation between any two of its feature components is calculated, specifically: the correlation coefficients $r_{ij}^{C}$ between the feature components in the carrier image feature set and $r_{ij}^{S}$ between the feature components in the hidden image feature set are calculated according to formula (2), giving the correlation coefficient matrices $R^{C}$ and $R^{S}$ of formula (3):

$$r_{ij} = \frac{\mathrm{Cov}(X_i, X_j)}{\sigma_{X_i}\,\sigma_{X_j}} = \frac{E\left[(X_i - \mu_{X_i})(X_j - \mu_{X_j})\right]}{\sigma_{X_i}\,\sigma_{X_j}} \tag{2}$$

$$R^{C} = \left(r_{ij}^{C}\right)_{D \times D}, \qquad R^{S} = \left(r_{ij}^{S}\right)_{D \times D} \tag{3}$$

wherein $X_i$ and $X_j$ respectively represent the i-th and j-th dimension feature components in the carrier image feature set or the hidden image feature set; $r_{ij}$ represents the correlation coefficient between the two feature components $X_i$ and $X_j$, i.e. the strength of the linear correlation between them, with $-1 \le r_{ij} \le 1$; the larger the absolute value of $r_{ij}$, the stronger the correlation between $X_i$ and $X_j$; when $r_{ij} = -1$ the two are completely negatively correlated, and when $r_{ij} = 1$ the two are completely positively correlated, the correlation being strongest in both cases; Cov denotes the covariance between two feature components; σ is the standard deviation over the samples of a single feature component; μ is the sample mean of a single feature component; E represents expectation; $R^{C}$ and $R^{S}$ are symmetric D × D matrices, D being the feature dimension of the Rich Model submodel.

As an implementable manner, in step S104, feature reduction is performed on the feature components according to the strength of the correlation, specifically:

Since $R^{C}$ and $R^{S}$ are symmetric matrices, it suffices to traverse, in the order $r_{11} \to \dots \to r_{D1} \to r_{22} \to \dots \to r_{ij} \to \dots \to r_{D,D-1} \to r_{DD}$, the elements of $R^{C}$ and $R^{S}$ below or above the main diagonal, without traversing all elements.

The selection rule is as follows:

when $r_{ij}^{C} \ge T$ and $r_{ij}^{S} \ge T$, the feature components $X_i$ and $X_j$ are considered to have a strong positive linear correlation, the feature component $X_i$ is removed and the feature component $X_j$ is retained;

when $r_{ij}^{C} \le -T$ and $r_{ij}^{S} \le -T$, the feature components $X_i$ and $X_j$ are considered to have a strong negative linear correlation, the feature component $X_i$ is removed and the feature component $X_j$ is retained, wherein T is the truncation threshold.

It is understood that when $r_{ij}^{C} \ge T$ and $r_{ij}^{S} \le -T$, or when $r_{ij}^{C} \le -T$ and $r_{ij}^{S} \ge T$, the sign of the correlation coefficient changes between before and after steganography, and both feature components need to be retained to better capture the steganographic noise; when $|r_{ij}^{C}| < T$ or $|r_{ij}^{S}| < T$, the correlation is not strong in at least one of the two feature sets. In all of these cases no feature reduction is performed on the feature component pair. The features retained under the condition shown in formula (4) form the reduced carrier image feature set and the reduced hidden image feature set.

In the feature reduction process, the choice of the truncation threshold T of the correlation coefficient r is crucial. The larger the value of T, the fewer feature components are removed by the reduction, and vice versa. The guiding idea for choosing T is that, while the features are reduced, the reduced feature set should retain as much of the diversity of the original feature set as possible and contain as few redundant feature components as possible, so that the detection accuracy remains stable. In addition, to keep the experimental procedure finite, the threshold T is selected to a precision of 5 significant digits. Therefore, on the basis of the foregoing embodiments, as an implementable manner, the optimal truncation threshold T in formula (4) is determined by a bisection method, comprising:

S201: taking 9 initial truncation thresholds $T_1, T_2, \dots, T_9$ at intervals of 0.01 in the interval (0.9, 1), and calculating the corresponding detection accuracies $A_1, A_2, \dots, A_9$ as well as the original feature detection accuracy $A_0$; the original feature detection accuracy is the steganography detection accuracy when no feature dimension reduction method is applied.

Specifically, $T_1 = 0.91$, $T_2 = 0.92$, ..., $T_9 = 0.99$.

S202: comparing $A_1 \to A_2 \to \dots \to A_9$ in turn with $A_0$; as soon as $A_m - A_0 > -0.005$, stopping the subsequent comparisons and assigning m-1 and m to x and y respectively;

S203: letting $T_z = (T_x + T_y)/2$, calculating the corresponding detection accuracy $A_z$ and comparing it with $A_y$; if $A_z \ge A_y$, assigning $T_z$ to $T_y$; otherwise, assigning $T_z$ to $T_x$;

S204: repeating step S203 until $T_z$ reaches 5 significant digits, then stopping and taking the $T_z$ at that time as the optimal truncation threshold.

In order to verify the effectiveness of the Rich Model steganography detection feature selection method based on feature component correlation provided by the invention, the following analysis process is provided.

Taking the SRM feature in reference 1 as an example: first, the high-dimensional SRM feature is decomposed into 106 submodel feature sets; second, the feature components in each submodel are preprocessed, deleting those whose sample variance is 0 in both the carrier images and the hidden images; then, the feature components of each submodel feature set are ranked by importance using formula (1); next, each submodel is reduced according to the proposed reduction rule; finally, the feature components selected from each submodel are combined into the feature set finally used for steganography detection.
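The decompose, reduce and recombine flow described above can be sketched as plain column slicing. The submodel names, column ranges and kept indices below are hypothetical placeholders (a real SRM layout has 106 submodels), and `split_submodels` and `recombine` are illustrative names rather than part of the patent.

```python
def split_submodels(features, layout):
    """Split a feature matrix into submodels.

    layout: dict mapping submodel name -> (start, stop) column range.
    """
    return {name: [row[a:b] for row in features]
            for name, (a, b) in layout.items()}


def recombine(selected):
    """Concatenate per-submodel feature matrices column-wise, in dict order."""
    names = list(selected)
    n_images = len(selected[names[0]])
    return [[v for name in names for v in selected[name][i]]
            for i in range(n_images)]


# Hypothetical layout: two submodels of width 2 and 3.
layout = {"sub_a": (0, 2), "sub_b": (2, 5)}
features = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
parts = split_submodels(features, layout)

# Hypothetical outcome of the per-submodel reduction: kept column indices.
kept = {"sub_a": [0], "sub_b": [0, 2]}
selected = {name: [[row[d] for d in kept[name]] for row in rows]
            for name, rows in parts.items()}
final = recombine(selected)
```

Keeping the reduction per submodel means each correlation matrix stays small, at the size of one submodel rather than the full feature.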

The relation between the features obtained by reducing correlated features at different strengths and the detection performance is explored from the perspective of statistical phenomena. The invention adopts the MMD (Maximum Mean Discrepancy) criterion described in reference 6 ("Gretton, A., Borgwardt, K. M., Rasch, M., Schölkopf, B., Smola, A. J. (2007). A kernel method for the two-sample-problem. In Advances in Neural Information Processing Systems, 2007, 513-520") and reference 7 ("Pevny, T. (2008). Kernel methods in steganalysis (Ph.D. thesis). State University of New York, Binghamton") to measure the similarity of the feature distributions before and after image steganography, and thereby characterize the classification performance of the features. The MMD is calculated from the following quantities: $x_i$ denotes the feature vector extracted from the i-th carrier image and $y_i$ the feature vector extracted from the corresponding i-th hidden image; m and n are the numbers of carrier images and hidden images respectively; and $k(\cdot,\cdot)$ is a Radial Basis Function (RBF) kernel. The smaller the MMD value, the more similar the feature distributions before and after steganography and the poorer the classification effect, and vice versa.
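For orientation, one standard empirical estimate of the MMD with an RBF kernel is sketched below. It is the biased estimate from the two-sample-test literature, $\mathrm{MMD}^2 = \frac{1}{m^2}\sum k(x_i,x_j) - \frac{2}{mn}\sum k(x_i,y_j) + \frac{1}{n^2}\sum k(y_i,y_j)$, and may differ from the exact variant used in the experiments. The kernel width `gamma` and the function names are assumptions.

```python
from math import exp, sqrt


def rbf(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2)."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd(xs, ys, gamma=1.0):
    """Biased empirical MMD between two sets of feature vectors."""
    m, n = len(xs), len(ys)
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / (m * m)
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / (n * n)
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (m * n)
    # Clamp tiny negative values caused by floating-point rounding.
    return sqrt(max(kxx + kyy - 2.0 * kxy, 0.0))


same = mmd([[0.0], [1.0]], [[0.0], [1.0]])
apart = mmd([[0.0]], [[1.0]])
```

Identical sets give an MMD of zero, and the value grows as the two feature distributions move apart, which is the property the scatter plots rely on.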

In order to study the relation between the features obtained by reduction at different strengths and the MMD, steganography detection is carried out in the spatial domain and the frequency domain respectively, with a steganographic image embedding rate of 0.5. The extracted features are reduced to different degrees by the reduction method provided by the invention, and the corresponding MMD values and detection accuracies are calculated and drawn as scatter diagrams. The plots shown in fig. 3 and fig. 4 are obtained by taking, as examples, the s1_minmax22v_q1 submodel feature of the SRM feature and the Ah_T3 submodel feature of the CCJRM feature in reference 2.

In the plots shown in fig. 3 and fig. 4, one series gives the MMD value of the feature after different degrees of reduction, and ● marks the corresponding detection accuracy. The two plots show that the MMD and the detection accuracy both increase with the feature dimension; once the feature dimension reaches a certain limit, both level off, and adding further strongly correlated feature components brings no substantial improvement. It can therefore be inferred that redundant information exists between strongly correlated feature components, and that simply adding strongly correlated feature components does not improve the classification performance of the features. Selecting a suitable feature dimension by adjusting the truncation threshold T of the correlation coefficient is thus critical.

In addition, comparison experiments based on existing typical steganography algorithms and the corresponding Rich Model steganography detection features are carried out in the spatial domain and the frequency domain respectively. The specific experimental setup and comparison results are as follows.

(I) Experimental setup

The images used in the experiments come from the BOSSbase-1.01 image library. First, the 10000 grayscale images are DCT-transformed to generate a set of 10000 JPEG carrier images with a quality factor of 75. Second, for the two image libraries (spatial domain and frequency domain), the S-UNIWARD and J-UNIWARD steganography algorithms are used respectively to construct hidden image sets with embedding rates of 0.1, 0.2, 0.3, 0.4 and 0.5. Then, features are extracted from the carrier images and hidden images with the typical Rich Model feature extraction algorithms SRM (34671 dimensions) and CCJRM (22510 dimensions), as shown in Table 1. Finally, the extracted features are reduced with the feature reduction method provided by the invention.

Table 1 images used in the experiments

In the experiments, after the steganography detection features are obtained by the feature selection method provided by the invention, an ensemble classifier (reference 5: "Kodovsky J, Fridrich J, Holub V. Ensemble Classifiers for Steganalysis of Digital Media [J]. IEEE Transactions on Information Forensics and Security, 2012, 7(2): 432-") is trained and tested. Each time, the features of 5000 images are randomly selected from the 10000 carrier images and the 10000 hidden images to train the parameters of the ensemble classifier, and the features of the remaining 5000 carrier images and hidden images are used to test performance.

The test performance is evaluated by the error rate $P_E$, which is expressed in terms of the false alarm rate and the missed detection rate:

$$P_E = \min_{P_{FA}} \frac{1}{2}\left(P_{FA} + P_{MD}\right)$$

wherein $P_{FA}$ is the false alarm rate, i.e. the probability of judging a carrier image to be a hidden image, and $P_{MD}$ is the missed detection rate, i.e. the probability of judging a hidden image to be a carrier image. For each set of experiments, the above process is repeated 10 times, and the median over the 10 runs of the minimum global average error rate is used as the measure of detection performance; a smaller value indicates better steganography detection performance.
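The error measure can be sketched as a threshold sweep over classifier scores, assuming the usual definition of the minimum average error, $P_E = \min \frac{1}{2}(P_{FA} + P_{MD})$, and assuming that a higher score means "hidden image". The function name, the scores and the per-run error values are hypothetical illustration data, not experimental results.

```python
from statistics import median


def min_average_error(cover_scores, stego_scores):
    """Smallest value of (P_FA + P_MD) / 2 over all decision thresholds.

    An image is declared a hidden image when its score exceeds the threshold.
    """
    thresholds = [float("-inf")] + sorted(set(cover_scores) | set(stego_scores))
    best = 1.0
    for t in thresholds:
        p_fa = sum(s > t for s in cover_scores) / len(cover_scores)
        p_md = sum(s <= t for s in stego_scores) / len(stego_scores)
        best = min(best, 0.5 * (p_fa + p_md))
    return best


# One run on toy scores, then the median over several (made-up) runs.
pe = min_average_error([0.1, 0.2, 0.6], [0.4, 0.7, 0.9])
reported = median([0.21, 0.19, 0.20])
```

Reporting the median over repeated random splits makes the figure less sensitive to a single lucky or unlucky split.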

The effect of the proposed reduction algorithm within individual Rich Model submodels is verified first. The submodels used in this experiment are the s1-minmax22h submodel of the SRM feature and the Dix2_T2 submodel of the CCJRM feature. The results are shown in Tables 2 and 3.

Table 2 Reduction effect of the method of the present invention on the s1-minmax22h submodel in SRM

Table 3 Reduction effect of the method of the present invention on the Dix2_T2 submodel in CCJRM

In the experiments, the detection performance of randomly selected features is compared with that of the features reduced by the method of the invention. The features reduced by the method retain the original detection accuracy and detect better than the randomly selected features.

(II) comparison experiment before and after reduction of frequency domain steganography detection features

The comparison experiments cover different embedding rates and different feature dimensions, and give comparison data before and after feature reduction. First, the comparison experiment before and after CCJRM feature reduction at an embedding rate of 0.1 is described in detail, together with the value of the truncation threshold T used in each feature reduction.

Table 4. Detection accuracy (%) before and after reduction of the CCJRM feature at an embedding rate of 0.1

As can be seen from Table 4, at an embedding rate of 0.1 the original 22510-dimensional CCJRM feature achieves a steganography detection accuracy of 53.06%. The best detection accuracy of the features reduced by the method is 53.48%, which is 0.42 percentage points higher than that of the original features. At the same time, the reduced feature has 4436 dimensions, only 19.7% of the original number, which greatly reduces the cost of classifier training. The method therefore effectively reduces the feature dimension while preserving the steganography detection accuracy, and can even improve the accuracy at a low embedding rate, which verifies the effectiveness of the feature reduction method of the invention.
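The two derived figures in this paragraph follow directly from the four reported values, as this short check shows. The variable names are illustrative only.

```python
# Values reported in the text for the CCJRM feature at an embedding rate of 0.1.
orig_dim, reduced_dim = 22510, 4436
orig_acc, reduced_acc = 53.06, 53.48   # detection accuracy in percent

dim_ratio = reduced_dim / orig_dim * 100   # share of dimensions retained, in percent
acc_gain = reduced_acc - orig_acc          # change in accuracy, in percentage points

print(f"{dim_ratio:.1f}% of the original dimensions are retained")
print(f"accuracy changes by {acc_gain:+.2f} percentage points")
```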

FIG. 5 is a line graph of the detection performance at each feature dimension after reduction of the CCJRM feature at an embedding rate of 0.1, in which the horizontal axis represents the feature dimension, the vertical axis represents the detection accuracy, and ▲ marks the optimal reduced feature dimension. As can be seen from the figure, when too few dimensions are retained, some effective feature components are not selected, so the detection accuracy does not reach its peak. When too many dimensions are retained, many redundant and even harmful feature components are mixed in, which lowers the steganography detection accuracy and also increases the time and complexity of classification. As the number of retained features approaches the original feature dimension, the steganography detection accuracy no longer changes noticeably.
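One way to read an optimal point off such a curve is to take the highest accuracy and, among ties, the smallest dimension. The sketch below illustrates that rule only. The rule itself is an assumption, the function name `best_point` is hypothetical, and the numbers in `curve` are made-up placeholders, not values from FIG. 5.

```python
def best_point(curve):
    """Pick the (dimension, accuracy) pair with the highest accuracy.

    Assumption: ties in accuracy are broken in favour of the smaller
    dimension, since fewer features mean cheaper classifier training.
    """
    return max(curve, key=lambda p: (p[1], -p[0]))

# Placeholder curve with the shape described in the text: accuracy rises,
# peaks, then drops slightly as redundant components are added.
curve = [(1000, 0.51), (2000, 0.53), (4000, 0.55), (8000, 0.55), (16000, 0.54)]
dim, acc = best_point(curve)
```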

To test the performance of the method at other embedding rates, experiments are also performed on steganography detection features with embedding rates of 0.2, 0.3, 0.4 and 0.5; the detection results are shown in FIG. 6.

In FIG. 6, (a) to (d) compare the detection performance before and after reduction of the CCJRM feature at embedding rates of 0.2, 0.3, 0.4 and 0.5, respectively, where the horizontal axis represents the feature dimension, the vertical axis represents the steganography detection accuracy, ▲ marks the optimal detection accuracy obtained after feature reduction, and ● marks the detection accuracy obtained with the original feature dimension. Diagrams (b), (c) and (d) show that the original feature dimension is effectively reduced after the features are reduced by the method of the present invention. At the same time, the detection accuracy does not drop noticeably: the differences are within 0.4%, and the accuracy even improves at a low embedding rate.

The present invention is also compared with the feature selection methods of references 3 and 4. The three methods differ little in overall detection performance, and the differences in detection error are all within 0.5%. However, the feature dimension obtained by the feature reduction method of the present invention is significantly lower than that of the other two methods; the specific results are shown in FIGS. 7 and 8. The different lines in the figures represent the detection accuracy of the features selected by the three feature selection methods and of the original features at different embedding rates, and it can be seen that the feature dimension obtained by the method of the invention is clearly lower than that of the other three.

(III) comparison experiment before and after reduction of spatial domain steganography detection features

The effectiveness of the method in frequency domain steganography detection was examined in section (II). To explore the generality of the method, this section examines its effectiveness in spatial domain steganography detection. As in the frequency domain, this section uses experimental groups with embedding rates of 0.1, 0.2, 0.3, 0.4 and 0.5 (bpp), and the experimental results are shown in FIG. 9.

As in section (II), the 5 graphs in FIG. 9 compare the detection performance before and after reduction of the SRM feature at embedding rates of 0.1, 0.2, 0.3, 0.4 and 0.5, respectively. The horizontal axis represents the feature dimension, the vertical axis represents the steganography detection accuracy, ▲ marks the optimal detection accuracy obtained after feature reduction, and ● marks the detection accuracy obtained with the original feature dimension. The figure shows that the steganography detection accuracy increases as the feature dimension increases, and that once the feature dimension reaches a certain value the accuracy stabilizes and shows no further obvious improvement. The experiments show that the detection accuracy of the feature vector selected by the method differs from that of the original feature vector by no more than 0.3%. Moreover, the dimension of the selected feature vector is far lower than the original feature dimension (less than 50%). Therefore, without affecting the steganography detection accuracy, the method can effectively reduce the dimension of the spatial domain Rich Model feature and thereby reduce the overhead of steganography detection.

In the spatial domain, the comparison between the method of the present invention and the methods proposed by Zhang and Ma gives approximately the same results as in the frequency domain; the experimental results are shown in FIG. 10.

As can be seen from FIG. 10, the feature dimension obtained by the method of the present invention is significantly smaller than that obtained by the other two methods. The difference is particularly obvious at low embedding rates.

In summary, the feature reduction method provided by the invention achieves effective dimension reduction in both the frequency domain and the spatial domain. In frequency domain steganography detection at low embedding rates, the features reduced by the method give a somewhat higher detection accuracy than the original features. In terms of dimension reduction, the reduction is larger for frequency domain steganography detection features than for spatial domain steganography detection features. The main reason is that the components of the frequency domain Rich Model feature are more strongly linearly correlated, so more of them can be removed.

For adaptive image steganography detection, Rich Model steganography detection features detect well, but they suffer from high dimension, slow training and similar drawbacks. Because Rich Model features contain a large amount of redundancy, and in order to reduce the heavy computation caused by their high dimension without affecting the steganography detection accuracy, the method first introduces the ideas of correlation and the Fisher criterion to reduce the feature dimension. Second, based on the proposed method, a large number of verification experiments are carried out on the typical Rich Model features SRM and CCJRM in the spatial domain and the frequency domain. The results show that the method greatly reduces the feature dimension while maintaining or improving the steganography detection accuracy.
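The combination of the Fisher criterion and correlation summarized above can be illustrated for a single submodel with the following sketch. It is an illustration under stated assumptions, not the claimed procedure: the separability measure is assumed to be the squared difference of class means divided by the sum of class variances, the correlation is assumed to be the Pearson coefficient computed over cover and stego samples together, and the selection rule is assumed to be a greedy pass that drops a component whose absolute correlation with an already kept component reaches the threshold. The symbol T is borrowed from the text, but how T is actually applied is an assumption, and all function names are hypothetical.

```python
from statistics import mean, pvariance

def fisher_score(cover_vals, stego_vals):
    """Assumed separability: (difference of class means)^2 / (sum of class variances)."""
    num = (mean(cover_vals) - mean(stego_vals)) ** 2
    den = pvariance(cover_vals) + pvariance(stego_vals)
    return num / den if den > 0 else 0.0

def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def select_components(cover, stego, T):
    """cover, stego: lists of feature vectors (one row per image) of one submodel.

    Returns the indices of the kept components in descending separability order.
    """
    n_comp = len(cover[0])
    cols_c = [[row[j] for row in cover] for j in range(n_comp)]
    cols_s = [[row[j] for row in stego] for j in range(n_comp)]
    scores = [fisher_score(cols_c[j], cols_s[j]) for j in range(n_comp)]
    # Sort components by separability, most separable first.
    order = sorted(range(n_comp), key=lambda j: scores[j], reverse=True)
    pooled = [cols_c[j] + cols_s[j] for j in range(n_comp)]
    kept = []
    for j in order:
        # Keep a component only if it is weakly correlated with every kept one.
        if all(abs(pearson(pooled[j], pooled[k])) < T for k in kept):
            kept.append(j)
    return kept
```

Running `select_components` on each submodel and concatenating the kept components would give a reduced feature in the spirit of the four steps listed in the abstract.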

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. The Rich Model steganography detection feature selection method based on feature component correlation is characterized by comprising the following steps of:

2. The method of claim 1, further comprising, before the step 2:

3. The method of claim 1, wherein in step 2, for each Rich Model submodel, the separability of its respective feature components is measured according to equation (1):

and

and

4. The method according to claim 1, wherein in step 3, for each Rich Model submodel, the correlation between any two feature components thereof is calculated, specifically:

5. The method according to claim 4, wherein in step 3, feature selection is performed on the feature components according to the strength of the correlation, specifically: