CN117079669A - Feature vector extraction method for LSB audio steganography with low embedding rate - Google Patents

Feature vector extraction method for LSB audio steganography with low embedding rate

Info

Publication number: CN117079669A
Application number: CN202311336594.9A
Authority: CN (China)
Prior art keywords: audio, steganography, lsb, feature vector, time
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 林萍萍, 王忠臣, 章云鹏
Assignees: Shandong Future Network Research Institute Industrial Internet Innovation Application Base Of Zijinshan Laboratory; Boshang Shandong Network Technology Co., Ltd.
Application filed by Shandong Future Network Research Institute Industrial Internet Innovation Application Base Of Zijinshan Laboratory and Boshang Shandong Network Technology Co., Ltd.; priority to CN202311336594.9A.

Classifications

    • G10L 25/30 — Speech or voice analysis techniques characterised by the analysis technique, using neural networks
    • G06N 3/0464 — Convolutional networks [CNN, ConvNet]
    • G06N 3/08 — Neural networks; learning methods
    • G10L 25/03 — Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L 25/51 — Speech or voice analysis techniques specially adapted for comparison or discrimination


Abstract

The invention discloses a feature vector extraction method for LSB audio steganography with a low embedding rate, comprising the following steps: feature vectors are extracted based on conditional probability. The noise of adjacent speech sampling points within a short time is strongly correlated, and the conditional probability P(X_{m+1} = b | X_m = a) represents the likelihood that the system transitions to another state at time m+1 given that it is in a known state at time m, where the sample value at time m is a and the sample value at time m+1 is b. According to the invention, the correlation of noise signals at different positions is characterized by a first-order discrete random variable conditional distribution law; a probability matrix of the audio digital coding sequence is constructed and converted into a high-dimensional feature vector. This feature vector effectively captures the local micro-differences introduced before and after steganography. Used as the input for training a CNN model, it effectively improves the recognition accuracy of LSB audio steganography with a low embedding rate, and provides a probability-distribution modeling method for the noise sequences introduced by LSB steganography.

Description

Feature vector extraction method for LSB audio steganography with low embedding rate
Technical Field
The invention relates to the technical field of LSB audio steganography detection, in particular to a feature vector extraction method for LSB audio steganography with low embedding rate.
Background
1. LSB audio steganography detection technology
With the rapid development of computers and the internet, people communicate and exchange information rapidly over networks, and digital multimedia content such as audio and video has become a main carrier of communication. Owing to the digitization of the audio carrier and the redundancy of its information encoding, hidden information can be embedded into an audio file without being perceptible to human hearing, realizing covert transmission of data. LSB audio steganography embeds the hidden information in the Least Significant Bits (LSBs) of an audio file, because modifying the least significant bits has little impact on audio quality.
The LSB audio steganography detection technology is a countermeasure technology of the LSB audio steganography technology, and is used for analyzing suspicious audio to judge whether an audio file is steganographically by the LSB.
2. CNN (Convolutional Neural Network)
In recent years, convolutional neural networks (CNNs) have been widely used in fields such as image and speech recognition. Detection based on convolutional neural networks is currently the most advanced LSB audio steganography detection method: compared with traditional manual feature extraction methods such as chi-square detection and SPA detection, it automates the extraction of diverse features; compared with Bagging detection, which extracts only partial features, it achieves comprehensive high-dimensional feature abstraction.
3. Feature vector
The acquisition of a digital speech signal first requires sampling the continuous speech signal at regular intervals, then rounding and quantizing the sample values, and finally binary coding. However, because ideal acquisition conditions do not exist, noise is necessarily introduced into the digital speech signal during acquisition.
The speech signal acquisition sequence x(n) can be expressed as x(n) = s(n) + w(n), where s(n) denotes the interference-free speech sample value at time n and w(n) is the noise signal value at time n. If LSB steganography is regarded as artificial noise, this provides a new way of extracting distribution features in our steganalysis.
For an LSB-steganographed speech signal, the acquisition sequence can be updated to x'(n) = s(n) + w'(n), where w'(n) is the composite noise after LSB audio steganography. That is, if LSB steganography is regarded as a kind of noise, the noise signal value changes from w(n) to w'(n). This transition necessarily changes some of the original distribution characteristics of the pre-steganography noise. If this distribution characteristic can be quantified, an effective high-dimensional feature vector can be extracted and used as a training sample for a CNN steganography detection model. In CNN model training, a high-quality feature-vector training set is the key to improving training quality and prediction accuracy.
There are two ways to quantify the noise sequence feature variation introduced by LSB steganography at present, namely noise sequence estimation based on local correlation and noise sequence estimation of wavelet signal reconstruction, respectively.
1) Noise sequence estimation based on local correlation
Because of the short-term correlation of speech, adjacent speech signal samples can be considered equal within tolerance, i.e. s(n) ≈ s(n+1). Noise randomly affects the speech sample values; accordingly, it can be assumed that the noise is superimposed on only one of x(n) and x(n+1), so that the difference between the two is a noise value. The difference between adjacent speech sample values can therefore be used as the noise, as shown in the following formula:

w(n) ≈ x(n+1) − x(n)
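As a concrete illustration, this differential estimator can be sketched in a few lines of Python; the carrier values and the perturbation pattern below are hypothetical, chosen only to show the mechanics:

```python
def estimate_noise_by_difference(samples):
    """Estimate the noise sequence as the difference of adjacent samples:
    w(n) ~ x(n+1) - x(n).  Under short-term stationarity the clean
    components of neighbouring samples cancel, so the difference is
    dominated by the noise superimposed on one of the two samples."""
    return [nxt - cur for cur, nxt in zip(samples, samples[1:])]

# A slowly varying carrier with LSB-style perturbations on odd positions.
carrier = [100, 100, 101, 101, 102, 102]
noisy = [c + (i % 2) for i, c in enumerate(carrier)]
noise_seq = estimate_noise_by_difference(noisy)
```

The resulting differential sequence exposes the alternating single-LSB perturbations even though they are invisible in the raw amplitudes.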
2) Noise sequence estimation for wavelet signal reconstruction
Wavelet denoising reconstructs the signal by selecting appropriate wavelet coefficients in the wavelet space, so as to eliminate noise. Studies have proposed denoising with an improved wavelet transform (SGWT). The main reason is that the output of a conventional wavelet filter is not an integer but a floating-point number, so when the wavelet-transformed data is compressed and quantized there is a large error and serious audio distortion. The filter constructed by the lifting wavelet transform, also called the second-generation wavelet transform, outputs integers, avoiding the floating-point problem. If a two-level wavelet decomposition is performed on the speech signal, high-frequency and low-frequency sub-signals are obtained at each level. The low-frequency part reflects the average of the signal and the high-frequency part reflects the signal's detail differences, so the noise of the signal is mainly represented in the high-frequency part. In the high-frequency part, the wavelet coefficients are thresholded; the processing methods are hard thresholding and soft thresholding. A hard threshold can make the denoised signal oscillate near singular points, so soft thresholding is generally adopted for the first high-frequency band: when the absolute value of a wavelet coefficient is smaller than the threshold, the coefficient is set to zero; when it is larger, the difference between the coefficient and the threshold replaces the original coefficient. The wavelet signal is then reconstructed to obtain an estimate ŝ(n) of the original signal, and the noise sequence is taken as:

w(n) = x(n) − ŝ(n)
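A minimal sketch of this pipeline, assuming a one-level integer Haar lifting step as a stand-in for the second-generation wavelet transform (the patent does not name a specific wavelet, and the threshold value below is likewise illustrative):

```python
def haar_lifting_forward(x):
    """One level of the integer (lifting / second-generation) Haar transform.
    All outputs are integers, which is the property the text cites."""
    even, odd = x[0::2], x[1::2]
    detail = [o - e for o, e in zip(odd, even)]            # high-frequency part
    approx = [e + (d >> 1) for e, d in zip(even, detail)]  # low-frequency part
    return approx, detail

def haar_lifting_inverse(approx, detail):
    """Exact integer reconstruction (undoes haar_lifting_forward)."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out

def soft_threshold(coeffs, t):
    """Soft thresholding: zero small coefficients, shrink large ones toward 0."""
    return [0 if abs(c) <= t else (c - t if c > 0 else c + t) for c in coeffs]

# Denoise, reconstruct, and take the residual as the noise-sequence estimate.
x = [10, 13, 9, 12, 11, 14, 8, 15]
approx, detail = haar_lifting_forward(x)
x_hat = haar_lifting_inverse(approx, soft_threshold(detail, 2))
noise_estimate = [a - b for a, b in zip(x, x_hat)]
```

The lifting scheme guarantees perfect integer round-trip reconstruction when no thresholding is applied; thresholding only the detail (high-frequency) coefficients matches the observation that audio noise concentrates in the high-frequency part.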
Noise sequence estimation based on local correlation, in short, takes the difference between adjacent samples: since speech is short-term stationary, adjacent samples can be considered equal, i.e. s(n) ≈ s(n+1). The embedded secret information is superimposed on certain sample values, making them unequal to their neighbors. The difference x(n+1) − x(n) can therefore be regarded as noise, and a differential sequence of sample values can be constructed to quantify this feature. This quantization approach is simple in concept and highly operable. The other approach quantizes the noise introduced by LSB steganography based on the wavelet transform, using the noise sequence of the wavelet signal reconstruction; since the noise of an audio signal is mainly represented in the high-frequency part, only the high-frequency part is thresholded. Finally, the inverse wavelet transform yields an estimate ŝ(n) of the original signal x(n), and the differential sequence obtained by differencing the two is the feature vector quantifying the feature change. However, the sensitivity to steganography of the feature vectors extracted by both methods is positively correlated with the embedding rate of the secret information: the higher the embedding rate, the better the detection performance. Their detection of LSB audio steganography at low embedding rates is poor. LSB steganography targets the least significant bit, which changes the sample amplitude only very weakly, and a too-low embedding rate makes it difficult for the feature vector's characterization ability to take effect.
Since the secret information embedded by steganography disturbs the overall content only very slightly, more extensive research and higher-level mathematical analysis are required to find features, in some imperceptible dimension, that distinguish a normal audio carrier from a steganographic audio carrier.
There is currently no effective solution to the above problems.
Disclosure of Invention
Aiming at the technical problems in the related art, the invention provides a feature vector extraction method aiming at LSB audio steganography with low embedding rate, which can overcome the defects in the prior art.
In order to achieve the technical purpose, the technical scheme of the invention is realized as follows:
a feature vector extraction method for LSB audio steganography with low embedding rate comprises the following steps:
S1, extracting feature vectors based on conditional probability: the noise of adjacent speech sampling points within a short time is strongly correlated, and the conditional probability P(X_{m+1} = b | X_m = a) indicates the likelihood that the system transitions to another state at time m+1, given that it is in a certain state at time m, where the sample value at time m is a and the sample value at time m+1 is b;
The specific steps for extracting the feature vector based on the conditional probability in the S1 are as follows:
s11, acquiring a digital coding sequence of a voice fragment;
s12, determining an audio sampling discrete value range;
S13, counting the number of occurrences and the proportion of each discrete point of the audio sampling discrete value domain in the speech segment;
S14, evaluating the correlation of the sampling points using the conditional probability distribution law P(X_{m+1} = b | X_m = a);
S15, substituting the probability values calculated in step S14 into the probability matrix;
s16, converting the probability matrix into a target feature vector;
s2, a complete LSB steganography detection model training process;
s21, setting a data set;
S22, training of the CNN model: extracting a high-dimensional feature vector based on the conditional probability distribution model and training the CNN network model with the extracted vector as the input value; the specific steps are as follows:
S221, extracting feature vectors Xi and X'i of the same dimension for equal numbers of original audio and steganographic audio, respectively;
S222, training the CNN with {Xi, X'i} as input vectors and the corresponding class labels as return values.
Further, in step S14, P(X_{m+1} = b | X_m = a) is calculated by classical probability.
Further, the specific steps of setting the data set in S21 are as follows:
S211, randomly selecting uncompressed speech fragments from a public data set, isochronously segmenting the original speech into a number of small fragments of equal duration, acquiring a certain number of speech fragments as the data set and backing it up; the original data set serves as the normal, non-steganographic data set, and LSB audio steganography is performed on the backup;
s212, performing steganography operation on the backed-up data set by using an LSB audio steganography algorithm to obtain the same number of normal audio and steganography audio, wherein half of the normal audio and steganography audio are used for training, and the rest half of the normal audio and steganography audio are used for testing.
Further, the duration of each audio clip in step S211 is 5 s.
Further, the sampling frequency of the sample in step S211 is 16kHz.
Further, the embedding rate of LSB steganography in step S212 is 5%.
Further, the specific steps of S222 are as follows:
S2221, preprocessing the input values using a hyper-parametric convolution kernel;
S2222, performing the superposition operation of convolution groups on the preprocessed data output by step S2221, realizing layer-by-layer dimension reduction and extraction of high-level semantics of the data;
S2223, inputting the data output by step S2222 into the classifier: the result is processed through a fully connected layer and input into the softmax layer, and finally the recognition result is output as a probability.
The invention has the beneficial effects that: according to the invention, the correlation of noise signals at different positions is represented by using a first-order discrete random variable conditional distribution law, a probability matrix of an audio digital coding sequence is constructed and is converted into a high-dimensional feature vector, the feature vector can effectively capture local micro differences introduced before and after steganography, the feature vector is used as an input value for training a CNN model, the recognition accuracy of LSB audio steganography with low embedding rate can be effectively improved, and a probability distribution modeling means is provided for noise sequences introduced by LSB steganography.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a complete LSB steganography detection process for a feature vector extraction method for low-embedding-rate LSB audio steganography according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the invention, fall within the scope of protection of the invention.
The explanation and meaning of the nouns referred to in the invention are as follows:
LSB: Least Significant Bit
CNN: Convolutional Neural Network
SGWT: Second Generation Wavelet Transform (lifting wavelet transform)
ReLU: Rectified Linear Unit (linear rectification function)
pool: pooling layer
conv: convolution
channel: the number of channels, also called the width of the neural network
We propose, for the first time, a method of characterizing the correlation of adjacent audio signals using the discrete random variable conditional distribution law for LSB audio steganography with a low embedding rate. Since LSB steganography at a low embedding rate introduces few modifications, the degree of interference with the original audio content after data hiding is very low. Thus the means that attempt to capture specific noise content of the audio by quantifying noise distribution characteristics are unsuitable for this low-embedding-rate detection problem. Considering the short-term stationarity of speech, adjacent speech sampling points are necessarily strongly correlated. The invention uses a first-order discrete random variable conditional distribution law to characterize the correlation of noise signals at different positions, constructs a probability matrix of the audio digital coding sequence, and converts it into a high-dimensional feature vector that effectively captures the local micro-differences introduced before and after steganography. Using it as the input for training a CNN model effectively improves the recognition accuracy of LSB audio steganography with a low embedding rate.
1. Feature vector extraction based on conditional probability
Modern speech coding is basically audio digital coding: a continuously varying analog signal is converted into a digital code through the three steps of sampling, quantization and coding. We can regard such a digital code sequence as a random sequence that is discrete in both time and state.
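As an illustration of the three-step digitization, the following sketch samples a hypothetical 440 Hz tone at 16 kHz and quantizes it to 16-bit codes; the signal frequency and duration are assumptions, not values from the patent:

```python
import math

def sample_and_quantize(duration_s=0.01, fs=16000, bits=16, freq=440.0):
    """Simulate the three digitization steps: sample, quantize, encode."""
    n_samples = int(duration_s * fs)
    full_scale = 2 ** (bits - 1) - 1
    codes = []
    for n in range(n_samples):
        analog = math.sin(2 * math.pi * freq * n / fs)   # 1) sampling
        q = round(analog * full_scale)                   # 2) rounding/quantization
        codes.append(q & 0xFFFF)                         # 3) 16-bit binary code
    return codes

codes = sample_and_quantize()
```

Each element of `codes` is one state of the resulting discrete-time, discrete-state random sequence that the conditional-probability analysis below operates on.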
The speech sampling environment is stable within a certain short time range, so the noise of adjacent speech sampling points within a short time is strongly correlated. The conditional probability P(X_{m+1} = b | X_m = a) indicates the likelihood that the system transitions to another state at time m+1, given that it is in a known state at time m.
In view of this, the feature vector extraction based on conditional probability is specifically as follows:
1. acquiring a digital coding sequence of a voice fragment;
2. determining an audio sampling discrete value range;
3. counting the number of occurrences and the proportion of each discrete point (sampling value) of the value domain in the speech segment;
4. evaluating the correlation of the sampling points by using a conditional probability distribution law (formula 2-1);
5. bringing the probability value calculated in the step 4 into a probability matrix;
6. the probability matrix is converted into a target feature vector.
The following we describe this process by way of an example:
we use the sequence of letters to simulate the quantized speech sample sequence, assuming the digital coding sequence of the speech segment S isFirst of all its state space is +.>I.e. the audio discrete sample value range, the actual audio sample discrete value range is much larger than this. The number and duty cycle of occurrences of each discrete point in the statistical value domain in the speech segment:
Using equation 2-1, if the sample value at time m is a and the sample value at time m+1 is b, the probability is:

P(X_{m+1} = b | X_m = a) = P(X_m = a, X_{m+1} = b) / P(X_m = a)    (2-1)
where P(X_m = a, X_{m+1} = b) is the probability that the sample value at time m is a and the sample value at time m+1 is b, calculated by classical probability. We need to count, for current state a, how many times the next state is each of a, b, c, d, e, f, g:
a→a: 0 times; a→b: 1 time; a→c: 1 time; a→d: 2 times; a→e: 0 times; a→f: 1 time; a→g: 0 times;
so that P(a|a) = 0, P(b|a) = 1/5, P(c|a) = 1/5, P(d|a) = 2/5, P(e|a) = 0, P(f|a) = 1/5, P(g|a) = 0.
In this way, we evaluate the correlation between all the sampling points of the audio segment and construct a probability matrix, as shown in the figure:
This 7×7 matrix is converted into a feature vector:
thus we succeeded in extracting a 49-dimensional feature vector from a speech segment. We find that the dimension of the extracted feature vector is exactly the square of the discrete points of the speech encoded discrete value range.
2. Complete LSB steganography detection model training process
1. Data set arrangement
Because of the popularity of speech research and the openness of audio data, we can conveniently acquire, through public data sets on the network, an audio data set suitable for network model training. The audio coding standard used in the data set is G.711 speech coding, stored in PCM format. The data set is set up as follows:
1) Randomly select uncompressed speech fragments from the public data set and isochronously slice the original audio into 20000 small fragments. The duration of each audio clip is 5 seconds, and the sampling frequency is 16 kHz. After acquiring the 20000 five-second clips, back them up; use the original data as the normal, non-steganographic data set and perform LSB audio steganography on the backup;
2) Perform the steganography operation on the backed-up data set using an LSB audio steganography algorithm with an embedding rate of 5%, obtaining 20000 pairs of normal audio and steganographic audio in total. One half is used for training and the other for testing. In the training phase, 4000 pairs are set aside for post-training verification and the remaining 16000 pairs are used to train the neural network.
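A minimal sketch of the LSB embedding step at a 5% rate, with hypothetical cover samples and a fixed random seed for reproducibility (the patent does not specify how embedding positions are chosen; random positions are assumed here):

```python
import random

def lsb_embed(samples, bits, rate=0.05, seed=1234):
    """Embed one secret bit into the LSB of a fraction `rate` of the samples."""
    rng = random.Random(seed)
    stego = list(samples)
    n_embed = int(len(samples) * rate)
    positions = rng.sample(range(len(samples)), n_embed)
    for pos, bit in zip(positions, bits):
        stego[pos] = (stego[pos] & ~1) | bit   # overwrite the least significant bit
    return stego

cover = [100, 101, 102, 103] * 500            # 2000 hypothetical PCM samples
srng = random.Random(7)
secret = [srng.randint(0, 1) for _ in range(100)]
stego = lsb_embed(cover, secret)              # 5% of samples carry one bit each
```

Note how weak the disturbance is: at most 100 of the 2000 samples change, and each by at most one quantization step, which is exactly why low-embedding-rate detection is hard.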
2. Training of the CNN model
The invention is described in further detail below with reference to fig. 1 of the accompanying drawings. As shown in fig. 1, a high-dimensional feature vector is extracted based on a conditional probability distribution model, and the extracted vector is used as an input value to perform CNN network model training, and the complete flow comprises the following steps:
1. Since the G.711 coded discrete value domain has 130 discrete points, the conditional-probability method above constructs a 130×130 probability matrix, which is converted into a feature vector of dimension 16900. To facilitate the dimension-reduction calculation in CNN model training, 50 values at the head and 50 at the tail of the vector are removed; i.e. we extract 16800-dimensional feature vectors Xi and X'i for each of the 16000 original audio and steganographic audio clips, respectively.
2. Train the CNN with {Xi, X'i} as input vectors and the corresponding class labels as return values;
2.1. Preprocess the input values using hyper-parametric convolution kernels;
2.2. Perform the superposition operation of 7 convolution groups (G1-G7) on the preprocessed data output by step 2.1 to realize layer-by-layer dimension reduction of the data and extraction of high-level semantics. Each convolution group is described in detail below.
The convolution groups G1 and G2 each use three convolution layers with different kernel sizes, channel numbers and strides: a convolution kernel with channel number 1, a convolution kernel with channel number 8, and a convolution kernel with channel number 1 and stride 2.
Convolution groups G3, G4, G5 and G6 use the same layer structure. First a convolution kernel with channel number 1 is applied and its output is passed through the activation function ReLU; after activation, the data is input to a convolution kernel with channel number 2, whose output again passes through ReLU; the result is then fed into a pooling layer for further dimension reduction. The pooling layer is a max-pooling layer with stride 2.
The convolution group G7 adopts a global average pooling strategy: a single kernel reduces the dimensionality of the data from the previous layer to 1 in one step, summarizing the feature distributions learned by all previous layers.
2.3. The data output by step 2.2 is input into the classifier: first a fully connected layer is applied, the processed result is input into a softmax layer, and finally the recognition result is output as a probability.
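The classifier head of step 2.3, a fully connected layer followed by softmax, can be sketched as follows; the pooled-feature size and the random weights are hypothetical placeholders, not values from the patent:

```python
import math
import random

def dense(x, weights, biases):
    """Fully connected layer: one linear output per class."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def softmax(z):
    """Convert class scores into a probability distribution."""
    m = max(z)                          # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

rng = random.Random(0)
pooled = [rng.random() for _ in range(8)]                       # stand-in G7 output
W = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(2)]  # 2 classes
b = [0.0, 0.0]
probs = softmax(dense(pooled, W, b))    # [P(normal), P(steganographic)]
```

Softmax guarantees the two outputs are non-negative and sum to 1, which is what allows the recognition result to be read directly as a probability.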
In summary, by means of the technical scheme, the correlation of noise signals at different positions is represented by using a first-order discrete random variable conditional distribution law, a probability matrix of an audio digital coding sequence is constructed and is converted into a high-dimensional feature vector, the feature vector can effectively capture local micro-differences introduced before and after steganography, the feature vector is used as an input value for training a CNN model, the recognition accuracy of LSB audio steganography with low embedding rate can be effectively improved, and a probability distribution modeling means is provided for noise sequences introduced by LSB steganography.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims (7)

1. The feature vector extraction method for LSB audio steganography with low embedding rate is characterized by comprising the following steps:
S1, extracting feature vectors based on conditional probability: the noise of adjacent speech sampling points within a short time is strongly correlated, and the conditional probability P(X_{m+1} = b | X_m = a) indicates the likelihood that the system transitions to another state at time m+1, given that it is in a certain state at time m, where the sample value at time m is a and the sample value at time m+1 is b;
The specific steps for extracting the feature vector based on the conditional probability in the S1 are as follows:
s11, acquiring a digital coding sequence of a voice fragment;
s12, determining an audio sampling discrete value range;
S13, counting the number of occurrences and the proportion of each discrete point of the audio sampling discrete value domain in the speech segment;
S14, evaluating the correlation of the sampling points using the conditional probability distribution law P(X_{m+1} = b | X_m = a);
S15, substituting the probability values calculated in step S14 into the probability matrix;
s16, converting the probability matrix into a target feature vector;
s2, a complete LSB steganography detection model training process;
s21, setting a data set;
S22, training a CNN model: extracting a high-dimensional feature vector based on the conditional probability distribution model and training the CNN model with the extracted vector as the input value; the specific steps are as follows:
s221, extracting feature vectors Xi and Xi with the same dimension for the same number of original audio and hidden audio respectively
S222 toCNNs are trained as input vectors and return values.
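The conditional-probability feature extraction of steps S11-S16 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the bin count `n_bins`, the min-max quantization, and all function names are assumptions introduced here.

```python
import numpy as np

def conditional_probability_features(samples, n_bins=16):
    """Sketch of S11-S16: estimate the first-order conditional
    distribution law P(X_{m+1}=j | X_m=i) over a discretized sample
    value domain and flatten the probability matrix into a feature
    vector. n_bins is an illustrative choice."""
    samples = np.asarray(samples, dtype=float)
    # S12: map the audio sampling range onto a small discrete value domain
    lo, hi = samples.min(), samples.max()
    states = ((samples - lo) / (hi - lo + 1e-12) * n_bins).astype(int)
    states = np.clip(states, 0, n_bins - 1)
    # S13/S14: count transitions i -> j between adjacent sampling points
    counts = np.zeros((n_bins, n_bins))
    for i, j in zip(states[:-1], states[1:]):
        counts[i, j] += 1
    # classical (frequency-count) estimate of P(X_{m+1}=j | X_m=i)
    row_sums = counts.sum(axis=1, keepdims=True)
    prob_matrix = np.divide(counts, row_sums,
                            out=np.zeros_like(counts), where=row_sums > 0)
    # S16: flatten the probability matrix into the target feature vector
    return prob_matrix.ravel()
```

For a clean tone the mass concentrates near the matrix diagonal (adjacent samples stay in nearby bins); LSB flipping perturbs exactly these local transition probabilities, which is what the feature vector is meant to expose.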
2. The feature vector extraction method for low-embedding-rate LSB audio steganography according to claim 1, characterized in that in step S14 the conditional probability P(X_{m+1} = j | X_m = i) is calculated by the classical (frequency-count) probability model.
3. The feature vector extraction method for low-embedding-rate LSB audio steganography according to claim 1, wherein the setting of the data set in S21 specifically includes the steps of:
s211, randomly selecting uncompressed voice fragments from a public data set, isochronously segmenting an original voice fragment into a plurality of small fragments, wherein the duration time of each audio is the same, acquiring a certain number of voice fragments as data sets and backing up, and performing LSB audio steganography on the backed up data sets by using the original data sets as normal non-steganography data sets;
s212, performing steganography operation on the backed-up data set by using an LSB audio steganography algorithm to obtain the same number of normal audio and steganography audio, wherein half of the normal audio and steganography audio are used for training, and the rest half of the normal audio and steganography audio are used for testing.
4. A feature vector extraction method for low-embedding-rate LSB audio steganography according to claim 3, characterized in that the duration of each audio clip in step S211 is 5 s.
5. A feature vector extraction method for low-embedding-rate LSB audio steganography according to claim 3, characterized in that the sampling frequency of the samples in step S211 is 16 kHz.
6. The feature vector extraction method for low-embedding-rate LSB audio steganography according to claim 3, wherein the embedding rate of LSB steganography in step S212 is 5%.
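The LSB embedding of claims 4-6 (5 s clips at 16 kHz, i.e. 80,000 samples per clip, with a 5% embedding rate, about 4,000 carrier positions) can be sketched as below. The random position selection, the `seed` parameter, and the function name are illustrative assumptions, not details fixed by the claims.

```python
import numpy as np

def lsb_embed(samples, message_bits, embed_rate=0.05, seed=0):
    """Sketch of the steganography step in S212: overwrite the least
    significant bit of a randomly chosen fraction (the embedding rate,
    5% in claim 6) of 16-bit samples with message bits."""
    stego = np.array(samples, dtype=np.int16, copy=True)
    rng = np.random.default_rng(seed)
    # number of carrier positions allowed by the embedding rate
    n_embed = min(int(len(stego) * embed_rate), len(message_bits))
    positions = rng.choice(len(stego), size=n_embed, replace=False)
    for pos, bit in zip(positions, message_bits[:n_embed]):
        # clear the LSB, then set it to the message bit
        stego[pos] = (stego[pos] & ~1) | bit
    return stego
```

Each modified sample changes by at most 1 quantization level, which is why low-rate LSB steganography is inaudible and must be detected statistically rather than perceptually.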
7. The feature vector extraction method for low-embedding-rate LSB audio steganography according to claim 1, wherein the specific steps of S222 are as follows:
s2221 uses the super-parameter convolution to check the input value for preprocessing;
s2222 performs superposition operation of a convolution group on the preprocessed data output in the step S2221, so as to realize extraction of layer-by-layer degradation and high-level semantics of the data;
s2223 inputs the data output in the step S2222 into the classifier, the result after processing is input into the softmax layer through a full connection layer, and finally the recognition result is output in a probability mode.
CN202311336594.9A 2023-10-17 2023-10-17 Feature vector extraction method for LSB audio steganography with low embedding rate Pending CN117079669A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311336594.9A CN117079669A (en) 2023-10-17 2023-10-17 Feature vector extraction method for LSB audio steganography with low embedding rate

Publications (1)

Publication Number Publication Date
CN117079669A true CN117079669A (en) 2023-11-17

Family

ID=88713764

Country Status (1)

Country Link
CN (1) CN117079669A (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080193031A1 (en) * 2007-02-09 2008-08-14 New Jersey Institute Of Technology Method and apparatus for a natural image model based approach to image/splicing/tampering detection
CN102063907A (en) * 2010-10-12 2011-05-18 武汉大学 Steganalysis method for audio spread-spectrum steganography
CN102930495A (en) * 2012-10-16 2013-02-13 中国科学院信息工程研究所 Steganography evaluation based steganalysis method
CN105894437A (en) * 2016-03-31 2016-08-24 柳州城市职业学院 Steganography algorithm based on Markov chain
CN107610711A (en) * 2017-08-29 2018-01-19 中国民航大学 G.723.1 voice messaging steganalysis method based on quantization index modulation QIM
CN108073570A (en) * 2018-01-04 2018-05-25 焦点科技股份有限公司 A kind of Word sense disambiguation method based on hidden Markov model
CN108462708A (en) * 2018-03-16 2018-08-28 西安电子科技大学 A kind of modeling of the behavior sequence based on HDP-HMM and detection method
US20190313114A1 (en) * 2018-04-06 2019-10-10 Qatar University System of video steganalysis and a method of using the same
US20190356476A1 (en) * 2017-01-31 2019-11-21 Agency For Science, Technology And Research Method and apparatus for generatng a cover image for steganography
CN110968845A (en) * 2019-11-19 2020-04-07 天津大学 Detection method for LSB steganography based on convolutional neural network generation
KR102103306B1 (en) * 2020-01-28 2020-04-23 국방과학연구소 Steganography Discrimination Apparatus and Method
US20200356827A1 (en) * 2019-05-10 2020-11-12 Samsung Electronics Co., Ltd. Efficient cnn-based solution for video frame interpolation
CN115295018A (en) * 2022-08-04 2022-11-04 浙江农林大学暨阳学院 Bayesian network-based pitch period modulation information hiding detection method
CN115861407A (en) * 2023-02-28 2023-03-28 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Safe distance detection method and system based on deep learning
CN116049467A (en) * 2023-01-17 2023-05-02 华中科技大学 Non-supervision image retrieval method and system based on label visual joint perception
CN116110565A (en) * 2022-08-19 2023-05-12 常州大学 Method for auxiliary detection of crowd depression state based on multi-modal deep neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王忠臣 (Wang Zhongchen): "Research on detection techniques for low-embedding-rate LSB audio steganography", CNKI Outstanding Master's Theses Full-text Database, no. 1, pages 5-57 *

Similar Documents

Publication Publication Date Title
CN112464837B (en) Shallow sea underwater acoustic communication signal modulation identification method and system based on small data samples
CN114564991B (en) Electroencephalogram signal classification method based on transducer guided convolutional neural network
CN110248190B (en) Multilayer residual coefficient image coding method based on compressed sensing
CN109785847B (en) Audio compression algorithm based on dynamic residual error network
CN110490816B (en) Underwater heterogeneous information data noise reduction method
CN110968845B (en) Detection method for LSB steganography based on convolutional neural network generation
CN113990330A (en) Method and device for embedding and identifying audio watermark based on deep network
CN116939226A (en) Low-code-rate image compression-oriented generated residual error repairing method and device
CN117743768B (en) Signal denoising method and system based on denoising generation countermeasure network and diffusion model
Zhu et al. A novel asymmetrical autoencoder with a sparsifying discrete cosine Stockwell transform layer for gearbox sensor data compression
Fleig et al. Edge-aware autoencoder design for real-time mixture-of-experts image compression
CN116434759B (en) Speaker identification method based on SRS-CL network
CN117079669A (en) Feature vector extraction method for LSB audio steganography with low embedding rate
Raj et al. Multilayered convolutional neural network-based auto-CODEC for audio signal denoising using mel-frequency cepstral coefficients
Movaghar et al. A new approach for digital image watermarking to predict optimal blocks using artificial neural networks
Amirtharajan et al. Info Hide–A Cluster Cover Approach
Hu et al. Adaptive Image Zooming based on Bilinear Interpolation and VQ Approximation
CN112927700B (en) Blind audio watermark embedding and extracting method and system
CN116029887A (en) Image high-capacity robust watermarking method based on wavelet neural network
CN115547344A (en) Training method of voiceprint recognition feature extraction model and voiceprint recognition system
CN116012662A (en) Feature encoding and decoding method, and method, device and medium for training encoder and decoder
Naik et al. Joint Encryption and Compression scheme for a multimodal telebiometric system
Yang et al. A robust blind audio watermarking scheme based on singular value decomposition and neural networks
CN115457985B (en) Visual audio steganography method based on convolutional neural network
Wu et al. A Fast Audio Digital Watermark Method Based on Counter-propagation Neural Networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination