CN114339258B

CN114339258B - Information steganography method and device based on video carrier

Info

Publication number: CN114339258B
Application number: CN202111626840.5A
Authority: CN
Inventors: 钮可; 林洋平; 刘佳; 陈培; 张明书; 李秀广
Original assignee: Engineering University of Chinese Peoples Armed Police Force
Current assignee: Engineering University of Chinese Peoples Armed Police Force
Priority date: 2021-12-28
Filing date: 2021-12-28
Publication date: 2024-05-10
Anticipated expiration: 2041-12-28
Also published as: CN114339258A

Abstract

The invention discloses an information steganography method and device based on a video carrier, wherein the method comprises the following steps: generating a modification probability matrix according to a pre-trained hidden sketch reactance network based on a carrier video, wherein the carrier video is a pseudo video which is generated by the network and accords with natural semantics; the modification probability matrix is adaptively generated into an optimal embedding modification position diagram through an optimal binary embedding function; and embedding secret information in a position corresponding to the optimal embedded modification position diagram of the carrier video to generate the encrypted video. The invention can effectively solve the problem that the existing network cannot directly process the video carrier with large data volume and the problem that the optimization modification strategy is difficult to construct in the traditional video information hiding algorithm, and realize information steganography by taking the video as the carrier.

Description

Information steganography method and device based on video carrier

Technical Field

The invention relates to the technical field of information security, in particular to an information steganography method and device based on a video carrier.

Background

Steganography is a secret communication technology which conceals secret information in carrier information so that an attacker cannot know whether the carrier contains the secret information or not, and therefore the purpose of concealing transmission is achieved. Compared with encryption technology, the method is not easy to analyze and detect by malicious attackers because of imperceptibility, is a popular subject in the field of information security, and has important application value in departments such as military and intelligence.

However, most of the current digital steganography uses digital images as carriers, the research of digital video steganography is relatively less, and considering that digital video has larger absolute data volume compared with images, the embedded capacity and the security of the digital video have better performance, along with the development of a 5G high-speed network, a large amount of video medium information is rapidly spread in the Internet, and digital video is already a commonly used medium form, so that the digital video steganography is an ideal data hiding carrier, and the advantages of the video steganography in terms of carrier quantity and spreading security are increasingly highlighted.

The existing video steganography technology generally selects a certain position in DCT (Discrete Cosine Transform ) coefficients or motion vectors in a compressed video through a designed rule, and realizes information embedding by modifying data of the position; and relatively few video steganography algorithms based on antagonistic neural networks.

The existing video steganography technology mainly has the following defects: 1. the existing video steganography algorithm improves the security of the steganography algorithm by making the statistical distortion caused by embedding as small as possible through a certain modification strategy, but the modification strategy is generally constructed through experience, and the design of the optimal modification strategy is very difficult to manually realize. 2. The existing neural network-based steganography algorithm is mainly applied to image and text steganography, and cannot be directly applied to digital videos due to network construction, running speed, data processing capacity and the like. Meanwhile, the invisibility and the safety of the existing hidden-write algorithm based on the neural network are still to be improved. And pay attention to embedding secret information into a single image, the fact that each frame image of a video sequence has different bottom characteristic information is not considered, information cannot be reasonably distributed into the whole video sequence, and different secret information is embedded into different video frame images.

Therefore, how to effectively use video as a carrier for information steganography is a problem to be solved.

Disclosure of Invention

In view of the above, the present invention provides an information steganography method based on a video carrier, which can effectively use video as a carrier to steganographically.

The invention provides an information steganography method based on a video carrier, which comprises the following steps:

Generating a modification probability matrix according to a pre-trained hidden sketch reactance network based on a carrier video, wherein the carrier video is a pseudo video which is generated by the network and accords with natural semantics;

the modification probability matrix is adaptively generated into an optimal embedding modification position diagram through an optimal binary embedding function;

and embedding secret information in a position corresponding to the optimal embedded modification position diagram of the carrier video to generate a video containing the secret.

Preferably, before generating the modification probability matrix according to the pre-training generated steganography resistant network based on the carrier video, the method further comprises:

Generating a carrier video of the countermeasure network according to the video generated by the pre-training.

Preferably, the generating the countermeasure network generating carrier video according to the pre-training generated video includes:

and taking noise as an input of a video generation countermeasure network generated by the pre-training, and generating the carrier video composed of a foreground, a background and a mask.

Preferably, said generating noise as input to said pre-training generated video generation countermeasure network, generating said carrier video of foreground, background and mask composition, comprises:

And taking noise as an input of the pre-training generated video generation countermeasure network, generating the foreground and the mask through a foreground generator in the pre-training generated video generation countermeasure network, and generating the background through a background generator in the pre-training generated video generation countermeasure network, wherein the foreground, the background and the mask form the carrier video.

Preferably, the generating the modification probability matrix according to the pre-trained hidden sketching reactance network based on the carrier video comprises:

And taking the foreground in the carrier video as the input of the pre-generated hidden-sketch countering network, and generating a modification probability matrix through a hidden-sketch generator in the pre-generated hidden-sketch countering network.

An information steganography apparatus based on a video carrier, comprising:

The hidden photo generation contrast network generated by pre-training is used for generating a modification probability matrix based on a carrier video, wherein the carrier video is a pseudo video conforming to natural semantics;

The hidden photo generation reactance network generated by pre-training is also used for adaptively generating an optimal embedding modification position diagram through the optimal binary embedding function by the modification probability matrix;

The pre-trained hidden photo generation reactance network is also used for embedding secret information in the position corresponding to the optimal embedded modification position diagram of the carrier video to generate a secret-containing video.

Preferably, the apparatus further comprises:

The pre-trained generated video generates an countermeasure network for generating a carrier video.

Preferably, the pre-training generated video generation countermeasure network is specifically used for:

The carrier video is generated with the foreground, background and mask composition using noise as input.

Preferably, the pre-training generated video generation countermeasure network includes: a foreground generator and a background generator; wherein:

The foreground generator is used for taking noise as input and generating the foreground and the mask;

The background generator is used for taking noise as input to generate the background, wherein the carrier video is composed of the foreground, the background and a mask.

Preferably, the pre-training generated steganography contrast network comprises: a steganographic generator; wherein:

the hidden-write generator is used for taking the foreground in the carrier video as input to generate a modification probability matrix.

In summary, the invention discloses an information steganography method based on a video carrier, when information steganography is needed by taking a video as a carrier, a modified probability matrix is generated on the basis of the carrier video according to a steganography resistant network generated by pre-training, wherein the carrier video is a pseudo video conforming to natural semantics; and then, the modification probability matrix is adaptively generated into an optimal embedding modification position diagram through an optimal binary embedding function, secret information is embedded in a position corresponding to the optimal embedding modification position diagram of the carrier video, and the encrypted video is generated. The invention can effectively solve the problem that the existing network cannot directly process the video carrier with large data volume and the problem that the optimization modification strategy is difficult to construct in the traditional video information hiding algorithm, and realize information steganography by taking the video as the carrier.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of an embodiment of a method for steganography of information based on a video carrier in accordance with the present disclosure;

FIG. 2 is a schematic diagram of an embodiment of a video generator in a video generation countermeasure network according to the present disclosure;

FIG. 3 is a schematic diagram of the separation and merging of three channels of the steganographic arbiter of the present disclosure;

fig. 4 is a schematic structural diagram of an embodiment of an information steganography apparatus based on a video carrier according to the present disclosure.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

As shown in fig. 1, a method for information steganography based on a video carrier according to the present disclosure may include the following steps:

s101, generating a carrier video of an countermeasure network according to a video generated by pre-training;

When information steganography is needed by taking a video as a carrier, firstly generating a carrier video through a video generation countermeasure network, wherein the generated carrier video is a pseudo video which is generated by the network and accords with natural semantics. Wherein the video generation countermeasure network is pre-trained.

Specifically, the pre-training generated video generation countermeasure network includes: a video generator consisting of a foreground generator and a background generator. Generating a foreground and a mask by using noise as input through a foreground generator; the background is generated by the background generator taking noise as input. The generated foreground, background and mask form the carrier video.

Compared with the prior art, the prior art is a single-stream generation model for directly generating video samples, and the embodiment adopts a double-stream generation model, wherein the generated video model comprises a foreground generator and a background generator, the former generates a foreground and a space-time mask, the latter generates a background, the foreground is motion information, and the background is static information. And synthesizing the foreground and background information into a complete video through the space-time mask information. Compared with a single-stream model, the double-stream model is more flexible to use the video generated by the single-stream model; and in order to increase the resolution of the video, to improve the visual quality and the embedding capacity, on the basis of the existing proposed dual-stream video generation network, the number of deconvolution layers of the foreground generator and the background generator is increased by one layer, resolution ranges from 64 x 64 to 128 x 128, the size of the convolution kernel is set to 3×3×3, which is more suitable for processing data.

S102, generating a modification probability matrix according to a pre-trained hidden sketch reactance network based on a carrier video;

And then, taking the carrier video as input of a steganography reactance network, and generating a modification probability matrix through the steganography reactance network.

Specifically, the steganography-resistant network includes: a steganographic generator; and generating a modification probability matrix by using the foreground in the carrier video as input through the steganography generator.

Compared with the prior art, the hidden-sketching reactance-oriented network is responsible for generating an embedded modification probability matrix, which is called a hidden-sketching generator. Unlike image steganography networks, the video information stream mainly contains two types of information, including time domain information in addition to spatial domain information. In the image airspace steganography, airspace information of a single carrier image is analyzed, and modification probability is generated for steganography. Video information is often continuous sequence information, and a large amount of redundant information is contained in a static space and a motion space and can be used for embedding information, but compared with images, the video data has larger data volume and a characteristic extraction network is less easy to train. Therefore, in the architecture proposed in this embodiment, the foreground generated by the video generation countermeasure network is directly used as the input of the steganography generator, and the feature information of the foreground is extracted by using the convolution network. The two-dimensional convolution layer of the traditional processing diagram is expanded into a three-dimensional convolution layer, the method is suitable for processing video data, and the designed convolution kernel is 1 multiplied by 5.

S103, adaptively generating an optimal embedding modification position diagram through an optimal binary embedding function by using the modification probability matrix;

After the modification probability matrix is generated, the optimal embedding modification position diagram is further generated in a self-adaptive mode through the optimal binary embedding function.

S104, embedding secret information in a position corresponding to the optimal embedding modification position diagram of the carrier video to generate a video containing the secret.

And then, modifying and embedding secret information on the pixel least significant bit of the corresponding position of each channel of the video frame in the carrier video by utilizing the optimal embedding modification position diagram to obtain the video containing the secret.

Further, the sender may transmit the encrypted video to the receiver via the public channel, and the embedded modified location map may be transmitted to the receiver via the secret channel. And after receiving the video containing the secret, the receiver takes the modified position diagram transmitted by the video sender through the secret channel as an extraction reference to extract the secret information of the lowest bit of the pixel at the corresponding position in the video. The specific extraction process is that the modified position diagram is a matrix diagram with the same size as the video pixel, each modified position marked on the modified position diagram is embedded with information, and the specific position of the information is a value on the lowest position of the marked pixel.

Specifically, in the above embodiment, as shown in fig. 2, the input to the video generator is a low-dimensional noise, which in most cases may be sampled from a distribution, such as a gaussian distribution. As shown in fig. 2, the foreground generator includes a four-dimensional tensor (tensor), such as 2×4×4 (512), which represents a time dimension of 2, a row dimension of 4, a column dimension of 4, and a channel number of 512. Between the squares, there is a transposed three-dimensional convolution layer, the convolution kernel size being 3x 3, s=2 representing a convolution layer step size of 2, similarly, the background generator can be explained, but a two-dimensional convolution layer is arranged between the blocks of the background generator. The end of the last layer of the dual stream generator is added with the output of one layer of activation function to the convolution layer, and Tanh, sigmoldo in the figure are two different activation functions to control the output of the whole video generator. (to increase the resolution of the video to increase the embedding capacity, the number of deconvolution layers of the foreground generator and the background generator can be increased by one layer, and the resolution is from 64×64 to 128×128.) the flow design principle of video generation in a double-flow architecture is shown in the following formula:

G(z)＝m(z)⊙f(z)+(1-m(z))⊙b(z)

Where, as the Hadamard product, m (z) can be regarded as a spatio-temporal mask, and f (z) and b (z) represent foreground (foreground) information and background (background) information, respectively. For the position and time step of each pixel in the video, a spatiotemporal mask is used to select foreground information or background information, where foreground represents information of motion in the video and background represents stationary background in the video. To generate background information in a video time series, the background learns the static pixel information in the video image by two-dimensional convolution to generate a plane image that replicates over time. The foreground here is a four-dimensional spatio-temporal tensor representing the spatio-temporal information of each pixel.

When training the generated video to generate the countermeasure network, this embodiment uses a 5-layer spatio-temporal 3D convolutional network as a arbiter for the video generation countermeasure network, the convolution kernel is 3 x 3, the step size is 2. Thus, the convolution layer can learn statistical information in the video background and can also learn the space-time relationship of the object motion. The arbiter designed by the embodiment is opposite to a convolution layer structure of a generating prospect in a video generator, and the three-dimensional convolution replaces three-dimensional transpose convolution to process video characteristics. The input of the discriminator is a real video sample and a generated non-load-density sample, and the output is a category label and a category probability logic, and when the category label is 1, the real video sample is represented; when the tag is 0, the generated video sample is represented, and the classification probability logic is a value between 0 and 1. Except for sidmiod functions used after the last layer of convolution in the arbiter network, leakyReLU is used as an activation function for each of the first four layers of convolution operations. loss is a loss function of the discriminator, sigmoid cross entropy is adopted to train the discriminator, loss represents a specific performance index of the video generation network, and when the loss is smaller, the better the generation quality of the video generation network is reflected, the more the discriminator can be "deceived", so that the false label is output. The specific formula is as follows:

loss＝-[y*ln(p)+(1-y)ln(1-p)]

The invention takes the foreground information generated by the video generation network as the input of the steganography generator in the steganography countering network generated by pre-training, avoids the processing of the steganography countering network on the whole video data, and effectively solves the problem of overlarge video data volume.

Specifically, in the above embodiment, a generator suitable for video information embedding is constructed in a steganography-resistant network, and an image-based two-dimensional convolution layer is extended to a three-dimensional convolution layer to implement information embedding of a video carrier, which includes a set of steganography generator and steganography discriminator to generate an optimal modification probability matrix. Wherein, the hidden-write generator structure is shown in table 1:

Table 1 steganography generator structure

As shown in table 1, the input layer first shifts the pixel value of the input video from [0: the 255 interval is transformed into the 0:1 interval, (the pixel values are generally distributed in the interval of 0 to 255, the normalization processing is carried out for the sake of data processing, the whole is converted into 0 to 1), 3DConv represents a three-dimensional convolution layer, 3Ddeconv represents a transposed three-dimensional convolution layer, concat (L7) represents that the output of the layer is spliced with the output of the 7 th layer in the last dimension, for example, the ninth layer is an output structure of (n, h/2,w/2,128), and the input of the tenth layer is changed into (n, h/2,w/2,256) after Concat operation. Without Concat operated convolutional layers, the output of the upper layer is directly taken as the input of the lower layer. In the framework, 16 three-dimensional convolution layers are included in total, each of the first 8 three-dimensional convolution layers comprises a three-dimensional convolution layer with a convolution kernel of 1 multiplied by 5, a BN layer is connected to the three-dimensional convolution layers, a leakage-ReLU activation function is adopted in the first 8 groups, a ReLU activation function and a transposed three-dimensional convolution layer are adopted in the last 8 groups, a sigmoid activation function is used for output of the last group, characteristics extracted by a convolution network are mapped, the characteristics are converted into probabilities with values ranging from 0 to 1 through the sigmoid function, and in order to prevent the safety of embedded messages from being reduced due to overlarge modification probability, the intervals are controlled to be within a range from 0 to 0.5 by subtracting 0.5 from each probability. In Table 1, n, h, w represent the number of frames, height, width of the video, respectively, h/2 represents half the height of the input video, the remainder w/4,h/4 …, and so on.

When training to generate a steganography resistant network, the steganography discriminator is used for analyzing the correlation and the anti-detection capability between each channel in the color steganography sample. After game countertraining, the secret sample corresponding to the modified probability matrix generated by the steganography generator strengthens the capability of resisting steganography analysis and detection.

The embodiment designs a steganographic discriminator based on three-dimensional convolution. As shown in fig. 3, considering the space-time dimension and the number of channels of the video signal, in order to use the high-pass filter in the analysis and detection of the color three-channel dense video frame, the embodiment expands the high-pass filter in the time dimension and the channel dimension, and respectively pre-processes three channel samples of the image of each frame (the video frame image is composed of a red channel, a green channel and a blue channel) through the high-pass filter, so that the characteristic information of each channel can be respectively processed and the correlation among the three channels can be analyzed, the original high-pass filter is 6 convolution kernels of 5×5, the convolution kernels which are formed into a three-dimensional convolution layer are input into the three channels of each frame image to pre-process data, and the following formulas are respectively from left to right as the first convolution kernel to six convolution kernels:

Finally, combining the characteristic images into a characteristic image with the number of channels being 18 and the time dimension being 32 so as to process continuous video signals, inputting the residual characteristics obtained by the characteristic images into two 6-layer three-dimensional convolution networks to analyze the space-time characteristics of the characteristic images, wherein one network structure is responsible for extracting accurate characteristics of channel correlation in the encrypted video, judging the difference of the channel correlation between the encrypted video and an original video image, so that embedding capacity is distributed in three channels of each frame image, and the other network structure is responsible for extracting related information existing between frames of the encrypted video, further promoting a steganography generator to reasonably embed information in each frame image of a video sequence, and judging whether the input is embedded with secret information. The three-dimensional convolution kernel of the first convolution network has a size of 1 x 5, three-dimensional of a second convolutional network size of convolution kernel is 5×5×5. The convolution step length of the two is set to be 1, and the number of convolution kernels is as follows: the first layer is 16, the second layer is 32, the third layer is 64, the fourth layer is 128, the fifth layer is 256, and the sixth layer is 512. The result of convolution output is processed by a batch normalization layer and an activation function, the activation function of the first two layers is set as a tanh activation function, relu activation functions are used for the third to fifth layers, the last layer does not adopt an activation function, the output of two convolution networks is combined together in the last dimension, and finally, the space-time characteristics are mapped into identification probability values by using a full connection layer.

In two stages of training, two countermeasure networks are respectively trained, and discriminators of the two countermeasure networks have differences in structural design, but label information output by the networks is used as key reference information for measuring the performance of the discriminators, and the loss of the steganography discriminators is defined as:

in the above-mentioned loss function, y' is defined as the activation function output in the steganographic discriminator, which is the recognition probability of the steganographic discriminator on the input sample, and takes a value between 0 and 1. y _i is the classification label of the steganographic discriminator on the carrier sample and the secret sample, the label value of 1 represents the carrier sample, and the label value of 0 represents the secret sample.

For the loss function of the steganographic generator, it is defined as:

l_G＝-α×l_D+β×(C-3×N×H×W×Q)²

C＝C₁+C₂+C₃

Where l _D represents the loss of the steganographic discriminant, nxh x W x Q is the expected load set before training, N is the number of video frames, H is the video height, W is the video width, Q is the channel embedding rate, which is the average number of secret message bits embedded per channel.

Different from the image steganography network, because the network uses a three-dimensional convolution network, the set embedding capacity is distributed in the whole video sequence, the embedding capacity of the steganography generator in each frame of image in the video sequence is automatically adjusted according to the airspace characteristic information of the steganography generator, the size of the embedding capacity in each frame is automatically adjusted through the learning of the network, and the 32-frame video image evenly distributes the number of secret message bits to be embedded in the whole video.

In the training of two phases, β is set to 0 and α is set to a constant 1 in the first phase; in the next stage, the actual training requirement is reasonably set to ensure optimization of the objective function, and the training process parameter β is set to 10 ^-7.C_k as secret message loads in three channels of the embedded sample, which is defined as:

P _i,j,k in the above refers to the steganography generator generating a modified probability value for the kth channel of the corresponding pixel x _i,j, The probabilities of the embedding values of +1, -1 and 0 are respectively indicated, the probabilities of +1 and-1 are respectively indicated as the probabilities of the embedding mode being added by one or subtracted by one, and 0 represents that no embedding is performed.

After video steganography challenge modification probability matrices, a corresponding embedding modification position map is needed for embedding the secret message, where the embedding modification position map is generated by the following optimal embedding simulator.

Wherein i, j, k and c respectively represent the i-th row and j-th column pixels in the c-th channel in the image of the k-th frame, and p ^c _i,j,k is the embedding probability in the modification probability matrix generated by the steganography generator; n ^c _i,j,k is a random number generated by uniform distribution between 0 and 1;

m ^c _i,j,k is a modification policy, when it is 0, the least significant bit of the pixel is not modified, the pixel is skipped when secret information is embedded, when it is not 0, the least significant bit of the pixel is compared with the information bit, if it is the same, it is not modified; if not, modifying the lowest bit of the pixel according to m ^c _i,j,k, namely adding one to the pixel when m ^c _i,j,k is 1; when m ^c _i,j,k is-1, the pixel is decremented by one.

However, this function cannot generate a gradient counter-propagating transfer gradient of the network in actual training, resulting in an excessively long training time. In order to solve the problem of discontinuous embedding function, an optimal embedding activation function based on the tanh function is introduced as follows:

tanh is the hyperbolic tangent function:

Where λ is the scaling factor, controlling the gradient of the function change in the step state, different scaling factors corresponding to the change of the function.

The modified position diagram is formed by a modification strategy, when the modification strategy is 0, the pixels containing the dense video are not embedded with the message, and the bit is skipped during extraction; and when the modification strategy is not 0, the lowest pixel bit of the video with secret is the secret message bit.

Training of generating an countermeasure network in model training is divided into two stages, wherein a first video generating countermeasure network is trained firstly to enable the first video generating countermeasure network to generate a pseudo video conforming to natural semantics, after 3000 rounds of iterative training, a second generating network is trained to be responsible for generating an embedded modification matrix, and the training is performed for 800 times. For training in the first network, training of the discriminator is performed after training the generator once. In the training, an Adam optimizer with a learning rate of 0.0002 is used for training a model, and the optimizer is responsible for adjusting parameters in the neural network so as to optimize the output of the neural network to minimize a loss function. Each batch of video samples was trained 3800 times for a total of 500 epochs.

In summary, the invention can effectively solve the problem that the existing network cannot directly process the video carrier with large data volume and the problem that the optimization modification strategy is difficult to construct in the traditional video information hiding algorithm, and effectively realizes information hiding by taking video as the carrier.

As shown in fig. 4, a schematic structural diagram of an embodiment of an information steganography apparatus based on a video carrier according to the present invention may include:

The pre-trained generated video generation countermeasure network 401 is used for generating carrier videos;

the pre-training generated hidden photo generation reactance network 402 is used for generating a modification probability matrix based on the carrier video;

the pre-training generated hidden-sketching reactance network 402 is further used for adaptively generating an optimal embedding modification position diagram through an optimal binary embedding function by the modification probability matrix;

The pre-trained and generated steganography reactance network 402 is further used for embedding secret information in a position corresponding to the optimal embedding modification position diagram of the carrier video to generate a secret-containing video.

The working principle of the information steganography device based on the video carrier disclosed in this embodiment is the same as that of the above embodiment of the information steganography method based on the video carrier, and will not be described herein again.

In the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.

Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method of information steganography based on a video carrier, comprising:

generating a modification probability matrix according to a pre-trained hidden sketch reactance network based on a carrier video, wherein the carrier video is a pseudo video conforming to natural semantics;

Embedding secret information in a position corresponding to the optimal embedded modification position diagram of the carrier video to generate a secret-containing video;

before generating the modification probability matrix according to the pre-trained steganography reactance network, the method further comprises the following steps:

Generating a carrier video of the countermeasure network according to the video generated by the pre-training;

The generating the countermeasure network generating carrier video according to the pre-training generated video comprises:

Generating noise as input of an countermeasure network by using the pre-training generated video, and generating the carrier video composed of a foreground, a background and a mask;

-said generating noise as input to said pre-training generated video against a network, -generating said carrier video of foreground, background and mask composition comprising:

Generating a foreground and the mask through a foreground generator in the pre-training generated video generation countermeasure network, and generating the background through a background generator in the pre-training generated video generation countermeasure network, wherein the carrier video is composed of the foreground, the background and the mask;

the generating the modification probability matrix based on the carrier video according to the pre-trained hidden photo generation reactance network comprises the following steps:

the foreground in the carrier video is used as the input of the pre-generated hidden-sketch countering network, and a modification probability matrix is generated by a hidden-sketch generator in the pre-generated hidden-sketch countering network;

Selecting foreground information or background information by using a space-time mask, wherein the foreground represents information of motion in the video, and the background represents a static background in the video;

When training to generate a video to generate an countermeasure network, the input of the discriminator is a real video sample and a generated non-load sample, the output is a class label and a class probability logic, and when the class label is 1, the input represents the real video sample; when the tag is 0, representing the generated video sample, wherein the classification probability logic is a value between 0 and 1;

loss is a loss function of the discriminator, sigmoid cross-over is adopted to train the discriminator, and loss represents a specific performance index of the video generation network;

Constructing a generator suitable for video information embedding in a hidden-generation-resistant network, expanding a two-dimensional convolution layer based on an image into a three-dimensional convolution layer, and realizing information embedding of a video carrier, wherein the generator comprises a group of hidden generators and hidden discriminators, so as to generate an optimal modification probability matrix, the hidden discriminators are based on three-dimensional convolution, and the loss function of the hidden generators is defined as:

l_G＝-α×l_D+β×(C-3×N×H×W×Q)²

C＝C₁+C₂+C₃

Wherein l _D represents the loss of the steganographic discriminant, nxh x W x Q is the expected load set before training, N is the number of video frames, H is the video height, W is the video width, Q is the channel embedding rate, which is the average number of secret message bits embedded per channel;

In the training of two phases, β is set to 0 and α is set to a constant 1 in the first phase; in the next stage, the actual training requirement is reasonably set to ensure optimization of the objective function, and the training process parameter β is set to 10 ^-7;C_k as secret message loads in three channels of the embedded sample, which is defined as:

p _i,j,k in the above refers to that the steganographic generator generates a modified probability value corresponding to the kth channel of the pixel x _i,j, The probability of the embedding values of +1, -1 and 0 is respectively indicated, the probability of the embedding mode of +1 and-1 being one plus or one minus is corresponding, and 0 represents that no embedding is performed;

The embedding capacity of the hidden photo generator in each frame of image in the video sequence is that the size of the embedding capacity in each frame is automatically adjusted through the learning of a network according to the airspace characteristic information of the hidden photo generator, and the 32 frames of video images evenly distribute the bit number of secret information to be embedded in the whole video;

After the video steganography challenge-modifying probability matrix, a corresponding embedding modification position diagram needs to be obtained to embed the secret message, wherein the embedding modification position diagram is generated by the following optimal embedding simulator;

Wherein i, j, k, c respectively represent the i-th row and j-th column pixels in the c-th channel in the image of the k-th frame, Embedding probabilities in a modified probability matrix generated for the steganographic generator; /(I)Random numbers generated for 0 to 1 are uniformly distributed; λ is a scaling factor, controlling the gradient of the function in the step state, the different scaling factors corresponding to the change of the function; /(I)In order to modify the strategy, the modified position diagram is formed by the modified strategy, when the modified strategy is 0, the pixels containing the dense video are not embedded with the message, and the bit is skipped during extraction; when the modification strategy is not 0, the lowest pixel bit of the video with secret is the secret information bit; tan h is a hyperbolic tangent function, and has the following specific formula:

2. an information steganography device based on a video carrier, comprising:

the pre-trained hidden-sketching reactance network is also used for adaptively generating an optimal embedding modification position diagram through the optimal binary embedding function by the modification probability matrix;

The pre-training generated hidden photo generation reactance network is further used for embedding secret information in a position corresponding to the optimal embedding modification position diagram of the carrier video to generate a secret-containing video;

Wherein, still include:

pre-training the generated video to generate an countermeasure network for generating a carrier video;

the pre-training generated video generation countermeasure network is specifically used for:

taking noise as input, and generating the carrier video composed of a foreground, a background and a mask;

the pre-training generated video generation countermeasure network comprises: a foreground generator and a background generator; wherein:

The background generator is used for taking noise as input to generate the background, wherein the carrier video is composed of the foreground, the background and a mask;

the pre-training generated steganography reactance network comprises the following components: a steganographic generator; wherein:

the hidden-write generator is used for taking the foreground in the carrier video as input to generate a modification probability matrix;

l_G＝-α×l_D+β×(C-3×N×H×W×Q)²

C＝C₁+C₂+C₃