CN112911341B - Image processing method, decoder network training method, device, equipment and medium - Google Patents

Image processing method, decoder network training method, device, equipment and medium

Info

Publication number
CN112911341B
Authority
CN
China
Prior art keywords
image
encrypted image
network
encrypted
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110138975.0A
Other languages
Chinese (zh)
Other versions
CN112911341A (en)
Inventor
Xia Dong
Wang He
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110138975.0A priority Critical patent/CN112911341B/en
Publication of CN112911341A publication Critical patent/CN112911341A/en
Application granted granted Critical
Publication of CN112911341B publication Critical patent/CN112911341B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H04N21/2343 Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N21/2347 Processing of video elementary streams involving video stream encryption
    • H04N21/4402 Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/4405 Processing of video elementary streams involving video stream decryption

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present disclosure provides an image processing method. The image processing method comprises the following steps: acquiring a first image, wherein the first image comprises first steganographic information; and decoding the first image by using a decoder network to obtain the first steganographic information. Training the decoder network comprises: acquiring at least one second image; adding second steganographic information to the second image using an encoder network to obtain a first encrypted image; performing physicalization processing on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing causes the second encrypted image to exhibit the environmental disturbance effects that are generated when the first encrypted image is displayed in a physical environment and then captured; and training the decoder network with at least one second encrypted image. The disclosure also provides a decoder network training method, a device, an electronic device, and a storage medium.

Description

Image processing method, decoder network training method, device, equipment and medium
Technical Field
The present disclosure relates to the field of information security, and more particularly, to an image processing method, a decoder network training method, an apparatus, an electronic device, and a storage medium.
Background
Image steganography embeds information into an image in a manner that is difficult to perceive, so as to transmit messages covertly. For example, deep learning techniques can be used both to perform image steganography and to perform steganalysis.
In the course of implementing the disclosed concept, the inventors found that there are at least the following problems in the prior art:
the existing deep-learning-based image steganography and steganalysis techniques lack robustness in real environments, so the steganographic information extracted from an image is often inaccurate.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide an image processing method, a decoder network training method, an apparatus, an electronic device, and a storage medium that can achieve higher decoding accuracy in a real environment.
An aspect of the embodiments of the present disclosure provides an image processing method. The image processing method comprises the following steps: acquiring a first image, wherein the first image comprises first steganographic information; and decoding the first image by using a decoder network to obtain the first steganographic information. Training the decoder network comprises: acquiring at least one second image; adding second steganographic information to the second image using an encoder network to obtain a first encrypted image; performing physicalization processing on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing causes the second encrypted image to exhibit the environmental disturbance effects that are generated when the first encrypted image is displayed in a physical environment and then captured; and training the decoder network with at least one second encrypted image.
According to an embodiment of the present disclosure, said training said decoder network with at least one said second encrypted image comprises: taking at least one second encrypted image as the input of the decoder network to obtain third steganographic information; obtaining a decoding loss of the decoder network based on the third steganographic information and the second steganographic information corresponding to the third steganographic information; and training the decoder network based on a first objective function, wherein the first objective function includes the decoding loss.
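As a rough illustration of the decoding loss described above (a hypothetical sketch, not the patent's actual implementation), such a loss is commonly a binary cross-entropy between the decoder's per-bit outputs (third steganographic information) and the ground-truth bit string (second steganographic information):

```python
import math

def decoding_loss(logits, target_bits):
    """Mean binary cross-entropy between decoder outputs (logits) and the
    ground-truth steganographic bit string (0/1 values). Hypothetical
    sketch; a real implementation would use a deep-learning framework and
    batch over many second encrypted images."""
    assert len(logits) == len(target_bits)
    total = 0.0
    for z, b in zip(logits, target_bits):
        p = 1.0 / (1.0 + math.exp(-z))      # sigmoid
        p = min(max(p, 1e-12), 1 - 1e-12)   # numerical safety
        total += -(b * math.log(p) + (1 - b) * math.log(1 - p))
    return total / len(logits)

# Confident, correct logits give a near-zero loss.
loss = decoding_loss([10.0, -10.0, 10.0], [1, 0, 1])
```

Minimizing this quantity over the training set is one way the first objective function could penalize decoding errors.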
According to an embodiment of the disclosure, training the decoder network further comprises: obtaining an encoding loss of the encoder network based on the quality difference between at least one second image and the first encrypted image corresponding to each second image; and training the decoder network and the encoder network simultaneously based on a second objective function, wherein the second objective function includes the decoding loss and the encoding loss.
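The quality difference above could be measured in several ways; as a simple stand-in (the patent does not specify the exact distortion measure), a per-pixel mean squared error between the second image and its first encrypted image:

```python
def encoding_loss(second_image, first_encrypted_image):
    """Mean squared error between the original (second) image and the
    encrypted image produced by the encoder network, as a simple proxy
    for their quality difference. Hypothetical sketch; images are
    grayscale nested lists here for brevity."""
    flat_a = [p for row in second_image for p in row]
    flat_b = [p for row in first_encrypted_image for p in row]
    assert len(flat_a) == len(flat_b)
    return sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)

# Identical images incur zero encoding loss, so minimizing this term
# pushes the encoder toward visually indistinguishable encrypted images.
```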
According to an embodiment of the present disclosure, a critic network forms an adversarial network with the encoder network during training of the decoder network. Training the decoder network further comprises: obtaining, with the critic network, the pixel distributions of the first encrypted image and of the second image corresponding to it; calculating a distance measure between the two pixel distributions to obtain a critic loss; and training the decoder network, the encoder network, and the critic network simultaneously based on a third objective function, wherein the third objective function includes the second objective function and the critic loss.
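One distance measure a critic network may be trained to approximate between two pixel distributions is the Wasserstein-1 (earth mover's) distance; for two equally sized 1-D pixel samples it reduces to the mean absolute difference of the sorted samples. A sketch of the distance itself (not of the critic network, whose architecture the passage above leaves open):

```python
def wasserstein_1(pixels_a, pixels_b):
    """1-D Wasserstein-1 distance between two equally sized pixel
    samples, e.g. from the first encrypted image and its corresponding
    second image. Hypothetical sketch; an actual adversarial setup would
    estimate this with a trained critic over image batches."""
    assert len(pixels_a) == len(pixels_b)
    pairs = zip(sorted(pixels_a), sorted(pixels_b))
    return sum(abs(a - b) for a, b in pairs) / len(pixels_a)
```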
According to an embodiment of the disclosure, the decoding, with a decoder network, the first image to obtain the first steganographic information includes: pre-processing the first image with a detector network; and decoding the preprocessed first image by using the decoder network to obtain the first steganographic information.
According to an embodiment of the present disclosure, the training the decoder network comprises: processing at least one of said second encrypted images with said detector network to obtain at least one third encrypted image, comprising: detecting at least one first region from the second encrypted image; determining a second area from at least one of the first areas, wherein the second area includes the content of the first encrypted image; cropping the second encrypted image based on the second region to obtain a third encrypted image; and training the decoder network using at least one of the third encrypted images.
According to an embodiment of the present disclosure, the cropping the second encrypted image based on the second area to obtain a third encrypted image includes: fitting the contour of the second region by a polygon; cropping the second encrypted image based on the polygon; and performing homography transformation on the cut polygonal image to obtain the third encrypted image.
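The homography transformation in the step above can be illustrated by how a single 2-D point maps through a 3x3 homography matrix (a hypothetical minimal sketch; real pipelines typically warp whole images with a library such as OpenCV's cv2.getPerspectiveTransform and cv2.warpPerspective):

```python
def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography matrix H (row-major
    nested lists), as used to rectify the cropped polygonal region into
    the third encrypted image. Division by the homogeneous coordinate
    `denom` is what lets a homography model perspective distortion."""
    x, y = point
    denom = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / denom
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / denom
    return (u, v)

# The identity homography leaves a point unchanged; a pure translation
# shifts it; general H additionally rotates, scales, and skews.
```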
According to an embodiment of the present disclosure, the training the decoder network further comprises: obtaining a detection loss of the detector network based on a perceptual loss of the third encrypted image and the first encrypted image corresponding thereto; and training the detector network based on the detection loss.
According to an embodiment of the present disclosure, the physicalization processing of the first encrypted image includes: performing at least one of perspective transformation, motion blur, or defocus blur on the first encrypted image.
According to an embodiment of the present disclosure, the physicalization processing of the first encrypted image further includes: performing at least one of affine color transformation, Gaussian noise, or JPEG compression on the first encrypted image.
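Two of the physicalization operations named above can be sketched on a grayscale image stored as nested lists (a hypothetical simplification; production code would use numpy or a deep-learning framework and also apply perspective transforms, color shifts, and JPEG compression):

```python
import random

def add_gaussian_noise(image, sigma=5.0, seed=0):
    """Add i.i.d. Gaussian noise to each pixel, clamped to [0, 255],
    mimicking sensor noise introduced when a displayed encrypted image
    is photographed."""
    rng = random.Random(seed)
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]

def horizontal_motion_blur(image, k=3):
    """Crude horizontal motion blur: average each pixel with its k-1
    right-hand neighbors (window clamped at the row edge), mimicking
    camera shake during capture."""
    out = []
    for row in image:
        blurred = []
        for i in range(len(row)):
            window = row[i:i + k]
            blurred.append(sum(window) / len(window))
        out.append(blurred)
    return out
```

Applying such operations to the first encrypted image yields a second encrypted image that already carries capture-like disturbances at training time.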
According to an embodiment of the present disclosure, the adding, with the encoder network, second steganographic information to the second image includes: embedding a bit string obtained by encoding into the second image.
Another aspect of an embodiment of the present disclosure provides a decoder network training method. The decoder network training method comprises the following steps: acquiring at least one second image; adding second steganographic information to the second image using an encoder network to obtain a first encrypted image; performing physicalization processing on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing causes the second encrypted image to exhibit the environmental disturbance effects that are generated when the first encrypted image is displayed in a physical environment and then captured; and training the decoder network with at least one second encrypted image.
Another aspect of the disclosed embodiments provides an image processing apparatus. The image processing apparatus comprises an acquisition module and a decoding module. The acquisition module is used for acquiring a first image, and the first image comprises first steganographic information. The decoding module is used for decoding the first image by utilizing a decoder network to obtain the first steganographic information. Training the decoder network comprises: acquiring at least one second image; adding second steganographic information to the second image using an encoder network to obtain a first encrypted image; performing physicalization processing on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing causes the second encrypted image to exhibit the environmental disturbance effects that are generated when the first encrypted image is displayed in a physical environment and then captured; and training the decoder network with at least one second encrypted image.
Another aspect of the disclosed embodiments provides a decoder network training apparatus. The decoder network training apparatus comprises an acquisition module, an encoding module, a processing module, and a training module. The acquisition module is used for acquiring at least one second image. The encoding module is configured to add second steganographic information to the second image using an encoder network to obtain a first encrypted image. The processing module is used for performing physicalization processing on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing causes the second encrypted image to exhibit the environmental disturbance effects that are generated when the first encrypted image is displayed in a physical environment and then captured. The training module is configured to train the decoder network using at least one second encrypted image.
Another aspect of the embodiments of the present disclosure provides an electronic device. The electronic device includes one or more memories, and one or more processors. The memory has stored thereon computer-executable instructions. The processor executes the instructions to implement the method as described above.
Another aspect of the embodiments of the present disclosure provides a computer-readable storage medium having stored thereon executable instructions, which when executed by a processor, cause the processor to perform the method as described above.
Yet another aspect of embodiments of the present disclosure provides a computer program product comprising computer programs/instructions which, when executed by a processor, implement the method as described above.
One or more of the above-described embodiments may provide the following advantages or benefits: the insufficient robustness of existing image steganography techniques in real environments can be at least partially remedied. By performing physicalization processing on the first encrypted image and training the decoder network with second encrypted images that carry the environmental disturbance effects of a physical environment, the decoder network attains higher decoding accuracy, i.e. higher robustness, in a real environment.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
fig. 1 schematically illustrates an application scenario in which a deep learning based image steganography technique may be applied, according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates an exemplary system architecture to which an image processing method may be applied, according to an embodiment of the disclosure;
FIG. 3 schematically shows a flow chart of an image processing method according to an embodiment of the present disclosure;
fig. 4 schematically shows a flow diagram of a decoder network training method according to an embodiment of the present disclosure;
fig. 5 schematically shows a schematic diagram of the physicalization processing of a first encrypted image according to an embodiment of the present disclosure;
fig. 6 schematically shows a flow diagram of a decoder network training method according to another embodiment of the present disclosure;
fig. 7 schematically shows a flow diagram of a decoder network training method according to a further embodiment of the present disclosure;
fig. 8 schematically shows a flow diagram of a decoder network training method according to a further embodiment of the present disclosure;
FIG. 9 schematically shows a flow chart of an image processing method according to another embodiment of the present disclosure;
fig. 10 schematically shows a flow diagram of a decoder network training method according to a further embodiment of the present disclosure;
FIG. 11 schematically shows a flow chart for obtaining a third encrypted image according to an embodiment of the disclosure;
FIG. 12 schematically illustrates a flow diagram for obtaining a third encrypted image according to yet another embodiment of the present disclosure;
fig. 13 schematically shows a flow chart of a decoder network training method according to a further embodiment of the present disclosure;
fig. 14 schematically shows a block diagram of an image processing apparatus according to an embodiment of the present disclosure;
fig. 15 schematically shows a block diagram of a decoder network training apparatus according to an embodiment of the present disclosure; and
FIG. 16 schematically illustrates a block diagram of a computer system suitable for implementing the object processing method and system according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
In those instances where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). Where a convention analogous to "at least one of A, B, or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.).
An embodiment of the present disclosure provides an image processing method. The image processing method comprises the following steps: a first image is acquired, wherein the first image comprises first steganographic information. The first image is decoded using a decoder network to obtain the first steganographic information. The decoder network is trained as follows: at least one second image is acquired, and second steganographic information is added to the second image using the encoder network to obtain a first encrypted image. Physicalization processing is performed on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing causes the second encrypted image to exhibit the environmental disturbance effects generated in the process of capturing the first encrypted image while it is displayed in a physical environment. The decoder network is then trained using the at least one second encrypted image.
With the image processing method of the embodiments of the present disclosure, the first encrypted image is subjected to physicalization processing so that the second encrypted image exhibits the environmental disturbance effects generated when the first encrypted image is captured in a physical environment; the decoder network is then trained with the second encrypted image, giving the decoder network higher decoding accuracy in a real environment.
The embodiments of the present disclosure also provide a decoder network training method. The decoder network training method comprises the following steps: at least one second image is acquired, and second steganographic information is added to the second image using the encoder network to obtain a first encrypted image. Physicalization processing is performed on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing causes the second encrypted image to exhibit the environmental disturbance effects generated when the first encrypted image is displayed in a physical environment and then captured. The decoder network is trained using at least one second encrypted image.
Fig. 1 schematically illustrates an application scenario 100 in which a deep learning based image steganography technique may be applied according to an embodiment of the present disclosure.
As shown in fig. 1, in the application scenario 100, first, a user 107 acquires an original image 101 and inputs it into the encoder network 103. Then, the encoder network 103 embeds the steganographic information 102 (i.e., the first steganographic information in this document) that the user 107 wants to send into the original image 101, and outputs an encrypted image 104. The encrypted image 104 is then presented in a physical environment; for example, the encrypted image 104 may be printed out or presented on a display. As shown in fig. 1, the user 108 captures the printed encrypted image 104 to obtain an encrypted image 105 (i.e., a first image in this document). In the process of shooting the printed encrypted image 104 to obtain the encrypted image 105, factors such as shooting range, lighting, equipment, camera shake, and electronic data transmission may cause the encrypted image 104 to occupy only part of the area of the encrypted image 105, and the shape and angle at which the encrypted image 104 appears in the encrypted image 105 may change. Finally, the encrypted image 105 is input into the decoder network 106 to obtain the decoded information 109, whose content is the content of the steganographic information 102. The user 108 learns the message that the user 107 wants to deliver by reading the decoded information 109.
According to an embodiment of the present disclosure, the encoder network 103 and the decoder network 106 may implement the encoding and decoding functions through a deep neural network, respectively.
Fig. 2 schematically illustrates an exemplary system architecture 200 to which the image processing method may be applied, according to an embodiment of the present disclosure. It should be noted that fig. 2 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios. Where the encrypted image 206 is one embodiment of the first image herein.
As shown in fig. 2, the system architecture 200 according to this embodiment may include a terminal device 203, a network 204, and a server 205. The network 204 is used to provide a medium for communication links between the terminal device 203 and the server 205. Network 204 may include various connection types, such as wired or wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal device 203 to interact with the server 205 via the network 204 to receive or send messages or the like. The terminal device 203 may have installed thereon various communication client applications, such as a shopping application, a web browser application, a search application, an instant messaging tool, a mailbox client, social platform software, etc. (just examples), wherein the various communication client applications may have a function of calling a camera to capture an image in a real environment, for example.
The terminal device 203 may be various electronic devices having a camera or a display screen and supporting web browsing, including but not limited to a smart phone, a tablet computer, a laptop portable computer, a desktop computer, and the like.
The server 205 may be a server that provides various services, such as a background management server (for example only) that provides support for websites browsed by users using the terminal devices 203. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (for example, decoding information, a webpage, information, or data obtained, generated, or sent according to the user request) to the terminal device.
Referring to fig. 2, first, an encrypted image 202 may be displayed on a screen of the device 201. Then, the user can photograph the encrypted image 202 with a camera using the terminal device 203. During shooting, the device 201 and its surroundings may fall within the shooting range, so the encrypted image 206 that the user obtains through the terminal device 203 contains the encrypted image 202 in only part of its area, at a changed angle and with a changed shape. Next, the user can use the terminal device 203 to transmit the encrypted image 206 to the server 205 and obtain the decoded information using the decoder network.
It should be noted that the image processing method or the decoder network training method provided by the embodiment of the present disclosure may be generally executed by the server 205. Accordingly, the image processing apparatus provided by the embodiment of the present disclosure may be generally disposed in the server 205. The image processing method or the decoder network training method provided by the embodiment of the present disclosure may also be performed by a server or a server cluster that is different from the server 205 and is capable of communicating with the terminal device 203 and/or the server 205. Accordingly, the image processing apparatus provided in the embodiment of the present disclosure may also be disposed in a server or a server cluster that is different from the server 205 and is capable of communicating with the terminal device 203 and/or the server 205.
It should be understood that the number of terminal devices, networks, encrypted images, and servers in fig. 2 are merely illustrative. There may be any number of terminal devices, networks, encrypted images, and servers, as desired for implementation.
Fig. 3 schematically shows a flow chart of an image processing method according to an embodiment of the present disclosure.
As shown in fig. 3, the image processing method according to an embodiment of the present disclosure may include operations S310 to S360.
First, in operation S310, a first image is acquired, where the first image includes first steganographic information. In the application scenario 100, this first image is for example an encrypted image 105. The first image may be, for example, an encrypted image 206 in the system architecture 200.
Then, in operation S320, the first image is decoded using a decoder network to obtain first steganographic information.
According to the embodiment of the disclosure, the decoder network can be trained through a deep learning method.
Fig. 4 schematically shows a flow chart of a decoder network training method according to yet another embodiment of the present disclosure.
As shown in fig. 4, the decoder network training method may include operations S410 to S440.
In operation S410, at least one second image is acquired.
In operation S420, second steganographic information is added to the second image using the encoder network to obtain a first encrypted image.
According to an embodiment of the disclosure, adding the second steganographic information to the second image with the encoder network comprises embedding an encoded bit string into the second image.
According to the embodiment of the present disclosure, the steganographic information may be in a text form or a picture form, for example. For example, a unique bit string (i.e., a sequence of binary bits, each bit having a value of 0 or 1) may be generated for steganographic information by way of BCH (Bose-Chaudhuri-Hocquenghem) encoding, and then the bit string may be embedded into the original image (e.g., the second image) using the encoder network, thereby generating an encrypted image (e.g., the first image or the first encrypted image) that is visually indistinguishable from the original image.
According to an embodiment of the present disclosure, the encoder network may embed bit strings of different lengths into the second image; for example, the steganographic information may be encoded with a BCH code into a bit string at least 56 bits in length. In one embodiment, the bit string may be made 100 bits long, which allows the encoder network to provide good image quality while the decoder network achieves good decoding accuracy.
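The bit-packing step described above can be sketched as follows. This is a non-authoritative illustration: a production system would append BCH error-correction parity bits via a BCH codec library, which is omitted here; the function names and the example secret are hypothetical, and only the conversion of text to a fixed 100-bit string is shown.

```python
# Sketch: converting steganographic text to a fixed-length bit string.
# Real BCH error correction (parity bits) is intentionally omitted; this
# only illustrates packing a short secret into the 100-bit budget above.

def text_to_bits(secret: str, total_bits: int = 100) -> list:
    """Encode ASCII text as a list of 0/1 bits, zero-padded to total_bits."""
    bits = []
    for ch in secret.encode("ascii"):
        bits.extend((ch >> i) & 1 for i in range(7, -1, -1))  # MSB first
    if len(bits) > total_bits:
        raise ValueError("secret too long for the bit budget")
    return bits + [0] * (total_bits - len(bits))

def bits_to_text(bits: list) -> str:
    """Inverse of text_to_bits; trailing zero padding decodes to NUL and is stripped."""
    chars = bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits) - 7, 8)
    )
    return chars.rstrip(b"\x00").decode("ascii")
```

With a 100-bit budget, up to 12 ASCII characters fit before padding; a BCH code would trade some of that capacity for error tolerance.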
According to an embodiment of the present disclosure, for example, a plurality of data sets may be produced, each data set comprising at least one second image. Specifically, for example, at least one second image may be randomly selected from the database and resized to 400 × 400 × 3 to form a data set.
In operation S430, the first encrypted image is physically processed to obtain a second encrypted image, where the physical processing is used to enable the second encrypted image to have an environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment.
According to embodiments of the present disclosure, the presentation of the first encrypted image in the physical environment may include, for example, printing the first encrypted image or displaying it on an electronic display, thereby presenting it in the real world. The user can photograph the printed first encrypted image in a real environment or photograph the first encrypted image on a display. During shooting, an environmental disturbance effect is produced and reflected in the captured image. Thus, the first encrypted image may be physically processed to mimic the characteristics of the first encrypted image as it propagates in the real world, and sample data sets for training the decoder network may be generated in bulk to improve the robustness of the decoder network in the real world.
According to an embodiment of the present disclosure, the performing the physical processing on the first encrypted image includes: the first encrypted image is subjected to at least one of perspective transformation, motion blur, or defocus blur.
According to an embodiment of the present disclosure, physically processing the first encrypted image further includes: performing at least one of affine color transformation, Gaussian noise addition, or JPEG compression on the first encrypted image.
The detailed process of physically processing the first encrypted image will be further described with reference to fig. 5.
Fig. 5 schematically shows a schematic diagram of physically processing a first encrypted image according to an embodiment of the present disclosure.
As shown in fig. 5, picture 510 may be, for example, an encrypted image output from an encoder network.
According to an embodiment of the disclosure, a perspective transformation operation is performed on the picture 510 to obtain a picture 520. When a user photographs the picture 510 in the real world, the camera is rarely aligned precisely with the picture 510, which produces a perspective transformation effect. Assuming a pinhole camera is used for shooting, any two images of the same plane in space are related by a homography. Thus, for example, a randomly generated homography may be employed to simulate the perspective change: the positions of the four corners are perturbed uniformly at random within a fixed range (±10%, i.e., ±40 pixels), the homography matrix mapping the corners of picture 510 to the new positions is solved, and bilinear resampling generates the perspective-transformed picture 520.
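The corner-perturbation and homography-solving steps above can be sketched as follows. This is an illustrative approximation, not the patented implementation: the function names and the fixed ±max_shift parameter are assumptions, the 8-DOF homography is solved by a direct linear transform, and nearest-neighbour resampling is used for brevity where the text describes bilinear resampling.

```python
# Sketch of the perspective-transform augmentation: jitter the four image
# corners, solve the homography relating the jittered and original corners,
# then resample the image through that homography.
import numpy as np

def solve_homography(src, dst):
    """Solve H (3x3, h22 = 1) such that dst ~ H @ src for 4 correspondences."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def random_perspective(img, max_shift=40, rng=None):
    """Warp img by a homography whose corners are jittered by +/- max_shift px."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    src = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]], float)
    dst = src + rng.uniform(-max_shift, max_shift, size=(4, 2))
    H = solve_homography(dst, src)  # maps output coordinates back to input
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    mapped = H @ pts
    xi = np.clip(np.round(mapped[0] / mapped[2]).astype(int), 0, w - 1)
    yi = np.clip(np.round(mapped[1] / mapped[2]).astype(int), 0, h - 1)
    return img[yi, xi].reshape(img.shape)  # nearest-neighbour for brevity
```

Inverse mapping (output pixel to input pixel) is used so every output pixel gets a value without holes.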
According to the embodiment of the disclosure, the picture 530 can be obtained by performing a motion blur or defocus blur operation on the picture 520. Camera motion and defocus during shooting can cause blurring. Thus, for example, motion blur may be simulated by sampling a random angle and applying a straight-line blur kernel whose width is randomly sampled between 3 and 7 pixels. Defocus may be simulated using, for example, a Gaussian blur whose standard deviation is randomly sampled between 1 and 3 pixels. It should be understood that in practical applications, motion blur and defocus blur may be applied together or alternatively.
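The two blur kernels described above can be sketched as follows. This is a non-authoritative sketch: the kernel size, the line-thickening method, and the function names are assumptions; the width and sigma ranges follow the text, and applying the kernel by 2-D convolution is left to any standard convolution routine.

```python
# Sketch of the two blur kernels: a straight-line (motion) kernel at a given
# angle, and an isotropic Gaussian (defocus) kernel. Both are normalized so
# that convolution preserves mean brightness.
import numpy as np

def line_kernel(width: int, angle_rad: float, size: int = 9) -> np.ndarray:
    """Normalized straight-line blur kernel of the given pixel width."""
    k = np.zeros((size, size))
    c = size // 2
    for t in np.linspace(-c, c, 4 * size):            # sample along the line
        x = int(round(c + t * np.cos(angle_rad)))
        y = int(round(c + t * np.sin(angle_rad)))
        if 0 <= x < size and 0 <= y < size:
            k[y, x] = 1.0
    # thicken to the requested width by summing vertically shifted copies
    thick = sum(np.roll(k, s, axis=0) for s in range(-(width // 2), width // 2 + 1))
    return thick / thick.sum()

def gaussian_kernel(sigma: float, size: int = 9) -> np.ndarray:
    """Normalized isotropic Gaussian (defocus) kernel."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma ** 2))
    return g / g.sum()
```

During augmentation one would sample width in [3, 7] and sigma in [1, 3] per image, matching the ranges above.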
According to the embodiment of the disclosure, the picture 540 may be obtained by performing a random affine color transformation operation on the picture 530. The color gamut of printers and displays is quite limited compared to the full RGB color space, while cameras modify the captured image with exposure settings, white balance, and a color correction matrix before output. Thus, for example, the perturbations produced in the camera output image can be approximated by a random affine color transform (constant over the entire image). Specifically, it may include: 1) hue shift: adding a random color offset to each RGB channel (sampled uniformly from [-0.1, 0.1]); 2) desaturation: random linear interpolation between the full RGB image and its grayscale equivalent; 3) brightness and contrast adjustment: an affine histogram adjustment a·x + b, where a adjusts the contrast and is sampled uniformly from [0.5, 1.5], b adjusts the brightness and is sampled uniformly from [-0.3, 0.3], and x is an element of the pixel matrix of the picture 530.
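The three color perturbations above can be sketched in one function. This is an illustrative approximation: the function name is an assumption, and the hue-shift range [-0.1, 0.1] is reconstructed from the truncated range in the text; the contrast and brightness ranges follow the text directly.

```python
# Sketch of the random affine color transform: per-channel hue shift,
# desaturation toward the gray image, then contrast/brightness a*x + b.
# The sampled parameters are constant over the whole image.
import numpy as np

def affine_color_transform(img: np.ndarray, rng=None) -> np.ndarray:
    """img: float array in [0, 1], shape (H, W, 3). Returns a perturbed copy."""
    rng = rng or np.random.default_rng()
    out = img + rng.uniform(-0.1, 0.1, size=3)        # 1) per-channel hue shift
    gray = out.mean(axis=2, keepdims=True)            # 2) desaturation toward gray
    t = rng.uniform(0.0, 1.0)
    out = (1 - t) * out + t * gray
    a = rng.uniform(0.5, 1.5)                         # 3) contrast a ...
    b = rng.uniform(-0.3, 0.3)                        #    ... and brightness b
    return np.clip(a * out + b, 0.0, 1.0)
```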
According to an embodiment of the present disclosure, Gaussian noise is added to picture 540 to obtain picture 550. During shooting, imaging noise caused by the camera may include photon noise, dark noise, and shot noise. In non-low-light imaging scenes, Gaussian noise can be used to simulate this imaging noise (with the standard deviation uniformly sampled in [0, 0.2]).
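The noise step can be sketched in a few lines. Note that the upper bound 0.2 on the standard deviation is an assumption reconstructed from the truncated range in the text, and the function name is illustrative.

```python
# Sketch of the imaging-noise step: additive Gaussian noise with a randomly
# sampled standard deviation, clipped back to the valid pixel range.
import numpy as np

def add_gaussian_noise(img: np.ndarray, rng=None) -> np.ndarray:
    """img: float array in [0, 1]. Returns a noisy, clipped copy."""
    rng = rng or np.random.default_rng()
    sigma = rng.uniform(0.0, 0.2)
    return np.clip(img + rng.normal(0.0, sigma, size=img.shape), 0.0, 1.0)
```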
According to an embodiment of the present disclosure, a JPEG compression operation is performed on picture 550 to obtain picture 560. Images taken by cameras are typically stored in a lossy format (e.g., JPEG), which compresses the image by computing a discrete cosine transform of each 8 × 8 block of picture 550, quantizing the resulting coefficients, and rounding them to the nearest integer, where the JPEG quality is randomly sampled in [50, 100].
It should be noted that, although the picture 510 shows a snake, the picture 510 is only an example for explaining a detailed process of physically processing the first encrypted image, and the content and the shape of the first encrypted image are not limited in the present disclosure.
By using the image processing method of the embodiment of the disclosure, the distortion caused by physical display and imaging can be approximated by physically processing the first encrypted image so that the second encrypted image has the environmental disturbance effect generated in the process of capturing the first encrypted image in a real environment. Therefore, a plurality of second encrypted images can be generated in batch to train the decoder network, so that the decoder network can achieve higher decoding accuracy when applied in a real environment.
In operation S440, the decoder network is trained using at least one second encrypted image.
In one embodiment of the present disclosure, the image processing method may be used, for example, instead of the prior-art approach of spreading information in the form of two-dimensional codes or bar codes. For example, in a payment scene, a landscape picture may replace a payment two-dimensional code, which is more aesthetically pleasing. As another example, a picture similar or identical to the surrounding environment may be placed in a field environment, so that the steganographic information is acquired using the decoder network with a degree of concealment and strong robustness in the field environment.
By using the image processing method of the embodiment of the disclosure, steganographic information can be encoded into any natural image. Because the environmental disturbance effect is added in the training process, the robustness of the decoder network in the real world can be improved, and the decoding precision is higher. The decoder network can obtain accurate steganographic information from the captured encrypted image even in the physical display scenarios of different types of printer, display and camera combinations.
Fig. 6 schematically shows a flow diagram of a decoder network training method according to another embodiment of the present disclosure.
As shown in fig. 6, the method for training a decoder network may further include operations S641 through S643 in addition to operations S410 through S430.
In operation S410, at least one second image is acquired.
In operation S420, second steganographic information is added to the second image using the encoder network to obtain a first encrypted image.
In operation S430, the first encrypted image is physically processed to obtain a second encrypted image, where the physical processing is used to make the second encrypted image have an environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment.
Operations S410 to S430 are the same as above, and are not described herein again.
Next, in operation S641, at least one second encrypted image is used as an input of the decoder network, resulting in third steganographic information.
In operation S642, a decoding loss of the decoder network is obtained based on the third steganographic information and its corresponding second steganographic information.
In operation S643, a decoder network is trained based on a first objective function, wherein the first objective function includes a decoding loss.
According to an embodiment of the disclosure, the goal of the decoder network is to recover the second steganographic information from the second encrypted image. The image may first be processed, for example, using a Spatial Transformer Network (STN) to make the decoder network spatially invariant and robust against the perspective transform present in the second encrypted image. The processed image will then pass through a series of convolutional layers, fully-connected layers, and Sigmoid functions to generate third steganographic information of the same length as the second steganographic information.
According to embodiments of the present disclosure, a cross-entropy loss function may be used as the decoding loss in the training process of the decoder network. The cross entropy measures the closeness of the third steganographic information to its corresponding second steganographic information.
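The decoding loss above can be sketched as a bitwise binary cross-entropy between the decoder's per-bit sigmoid outputs and the embedded bits. This is a standard formulation, not quoted from the patent; the function name and the clamping constant are assumptions.

```python
# Sketch of the bitwise cross-entropy decoding loss: the decoder's sigmoid
# outputs are probabilities per bit, compared against the 0/1 bits of the
# second steganographic information.
import math

def decoding_loss(pred_probs, target_bits, eps=1e-12):
    """Mean binary cross-entropy between predicted bit probabilities and 0/1 targets."""
    total = 0.0
    for p, y in zip(pred_probs, target_bits):
        p = min(max(p, eps), 1.0 - eps)       # clamp for numerical safety
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(target_bits)
```

A maximally uncertain decoder (all probabilities 0.5) yields a loss of ln 2 per bit, while a perfect decoder's loss approaches zero.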
Fig. 7 schematically shows a flow diagram of a decoder network training method according to yet another embodiment of the present disclosure.
As shown in fig. 7, the method for training a decoder network may further include operations S710 to S720, in addition to operations S410 to S430 and operations S641 to S642.
In operation S410, at least one second image is acquired.
In operation S420, second steganographic information is added to the second image using the encoder network to obtain a first encrypted image.
In operation S430, the first encrypted image is physically processed to obtain a second encrypted image, where the physical processing is used to make the second encrypted image have an environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment.
In operation S641, at least one second encrypted image is used as an input to the decoder network, resulting in third steganographic information.
In operation S642, a decoding loss of the decoder network is obtained based on the third steganographic information and its corresponding second steganographic information.
Operations S410 to S430 and operations S641 to S642 are the same as above, and are not described herein again.
Next, in operation S710, an encoding loss of the encoder network is obtained based on the quality difference of the at least one second image and the first encrypted image corresponding to each second image.
In operation S720, the decoder network and the encoder network are simultaneously trained based on a second objective function, wherein the second objective function includes a decoding loss and an encoding loss.
According to an embodiment of the present disclosure, a goal of training the encoder network is to embed the second steganographic information in the second image while minimizing the quality difference between the first encrypted image and the second image. The encoder network can be implemented, for example, using a U-Net architecture whose inputs comprise the second image (three RGB channels) and the second steganographic information.
According to an embodiment of the present disclosure, to accelerate convergence, the encoder network may preprocess the second steganographic information, for example by passing a bit string of length 100 bits through a fully connected layer to form a tensor and then upsampling it. The preprocessed second steganographic information is then embedded into the second image.
According to the embodiment of the present disclosure, a residual regularization term may also be introduced to obtain the encoding loss, for example by calculating the squared pixel differences between each second image and its corresponding first encrypted image.
According to an embodiment of the present disclosure, the second objective function may be of the form:
L = λ1·L1 + λ2·L2 (formula 1)

wherein L1 is the decoding loss, L2 is the encoding loss, and λ1 and λ2 are coefficients for adjusting the degree of influence of the decoding loss and the encoding loss on the second objective function, where each of λ1 and λ2 is less than or equal to 1.
Fig. 8 schematically shows a flow chart of a decoder network training method according to a further embodiment of the present disclosure.
As shown in fig. 8, the method for training a decoder network may further include operations S810 to S830 in addition to operations S410 to S430, operations S641 to S642, and operation S710. The critic network and the encoder network can form an adversarial network during the training of the decoder network, and the decoder network, the encoder network, and the critic network are trained simultaneously.
In operation S410, at least one second image is acquired.
In operation S420, second steganographic information is added to the second image using the encoder network to obtain a first encrypted image.
In operation S430, the first encrypted image is physically processed to obtain a second encrypted image, where the physical processing is used to make the second encrypted image have an environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment.
In operation S641, at least one second encrypted image is used as an input to the decoder network, resulting in third steganographic information.
In operation S642, a decoding loss of the decoder network is obtained based on the third steganographic information and its corresponding second steganographic information.
In operation S710, an encoding loss of the encoder network is obtained based on a quality difference of the at least one second image and the first encrypted image corresponding to each second image.
Operations S410 to S430, S641 to S642, and S710 are the same as above, and are not described herein again.
Next, in operation S810, pixel distributions of the first encrypted image and the second image corresponding thereto are respectively obtained using the critic network.
In operation S820, a distance metric between the pixel distributions of the first encrypted image and its corresponding second image is calculated to obtain a critic loss.
In operation S830, a decoder network, an encoder network, and a critic network are simultaneously trained based on a third objective function, wherein the third objective function includes a second objective function and a critic loss.
According to an embodiment of the present disclosure, a critic network may be used to predict whether the second steganographic information was successfully embedded into the second image. The encoder network and the critic network form an adversarial pair: the second image or the first encrypted image is input into the critic network, which consists of a series of convolutional layers and a max-pooling layer, so that the critic network evaluates the embedding effect of the encoder network. For example, the critic network may determine whether the input image is the second image or the first encrypted image and feed the result back to the encoder network, so that the first encrypted image output by the encoder network gradually comes to deceive the critic network (that is, the critic network classifies the first encrypted image as the second image).
According to an embodiment of the present disclosure, in training the critic network, a Wasserstein loss function may be used as the critic loss. Specifically, the critic network may obtain the pixel distributions of the first encrypted image and its corresponding second image, and calculate the Wasserstein distance between them to construct the critic loss. The Wasserstein distance characterizes the similarity of the two pixel probability distributions and serves as a training index of the critic network.
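The critic objective above can be sketched with the standard Wasserstein-GAN formulation: the critic's loss is the difference between its mean score on encrypted (fake) images and its mean score on original (real) images. This is the usual WGAN form, not quoted from the patent; the Lipschitz constraint (weight clipping or a gradient penalty) that the formulation requires is omitted from the sketch.

```python
# Sketch of the Wasserstein critic loss. scores_real are critic outputs on
# second images; scores_fake are critic outputs on first encrypted images.
# Minimizing this loss drives the critic to score real images higher,
# approximating the Wasserstein distance between the two distributions.

def critic_loss(scores_real, scores_fake):
    """Wasserstein critic loss: E[D(fake)] - E[D(real)] (to be minimized)."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(scores_fake) - mean(scores_real)
```

The encoder's adversarial term is the negation of the fake-score mean, pushing encrypted images toward the real distribution.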
According to an embodiment of the present disclosure, the third objective function may be of the form:
L′ = L + λ3·L3 = λ1·L1 + λ2·L2 + λ3·L3 (formula 2)

wherein L3 is the critic loss, and λi (i = 1, 2, 3) are coefficients for adjusting the degree of influence of the decoding loss, the encoding loss, and the critic loss on the third objective function, where each λi (i = 1, 2, 3) is less than or equal to 1.
According to the embodiment of the disclosure, before the encoder network, the decoder network, and the critic network are trained simultaneously, one of the networks can first be trained separately so that it has its basic function. For example, the critic network may be trained separately so that it can correctly classify the second image and the first encrypted image, and then trained adversarially against the encoder network. When training the critic network alone, λ1 and λ2 are set equal to zero.
By utilizing the image processing method of the embodiment of the disclosure, the encoder network, the decoder network, and the critic network are trained simultaneously, so that the encrypted image output by the encoder network has better quality (a smaller difference from the original image) while the decoder network has higher decoding accuracy. A balance between the encoder network and the decoder network can thus be reached, that is, the quality of the encrypted image and the decoding accuracy are taken into account simultaneously.
Fig. 9 schematically shows a flow chart of an image processing method according to another embodiment of the present disclosure.
As shown in fig. 9, the image processing method may include operation S310 and operations S921 to S922.
In operation S310, a first image including first steganographic information is acquired.
In operation S921, the first image is preprocessed using a detector network.
In operation S922, first steganographic information is decoded from the preprocessed first image by using a decoder network.
According to an embodiment of the present disclosure, referring to fig. 1, the user 108 captures the printed encrypted image 104 to obtain the encrypted image 105. It can be seen that the encrypted image 105 contains the content of the encrypted image 104 along with other redundant content, i.e., the encrypted image 105 has a larger field of view than the encrypted image 104. If the decoder network directly processes the encrypted image 105 with its larger field of view, the decoding accuracy may be substantially degraded. Therefore, to improve robustness in the real world, the large-field-of-view image needs to be detected and corrected to recover the encrypted image 104 before decoding.
According to embodiments of the present disclosure, the detector network may segment out regions containing encrypted image 104 content using the semantic segmentation network BiSeNet.
Fig. 10 schematically shows a flow chart of a decoder network training method according to yet another embodiment of the present disclosure.
As shown in fig. 10, the method of training a decoder network may further include operations S1010 to S1020, in addition to operations S410 to S430.
In operation S410, at least one second image is acquired.
In operation S420, second steganographic information is added to the second image using the encoder network to obtain a first encrypted image.
In operation S430, the first encrypted image is physically processed to obtain a second encrypted image, where the physical processing is used to make the second encrypted image have an environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment.
Next, in operation S1010, the at least one second encrypted image is processed using the detector network to obtain at least one third encrypted image.
In operation S1020, the decoder network is trained using at least one third encrypted image.
The process of obtaining the third encrypted image in operation S1010 is described in detail below with reference to fig. 11 and 12.
Fig. 11 schematically shows a flowchart for obtaining a third encrypted image according to an embodiment of the present disclosure.
As shown in fig. 11, operation S1010 may further include, for example, operations S1111 through S1113.
In operation S1111, at least one first region is detected from the second encrypted image.
In operation S1112, a second area is determined from the at least one first area, wherein the second area includes the content of the first encrypted image.
In operation S1113, the second encrypted image is cropped based on the second area to obtain a third encrypted image.
According to an embodiment of the present disclosure, referring to fig. 1, for example, when the encrypted image 105 is detected by a detector network, a plurality of first areas are detected. Further, a second area is determined from the plurality of first areas, the second area containing the content of the encrypted image 104. Then, the second area is cut out from the encrypted image 105, and the encrypted image 104 is obtained.
It should be noted that, during the physical processing of the first encrypted image, other image content may be added around the first encrypted image, so that the resulting second encrypted image has a larger field-of-view area similar to the encrypted image 105, thereby introducing the detector network into the training process of the decoder network.
According to the embodiment of the disclosure, after the first encrypted image is physically processed, the region of the second encrypted image containing the content of the first encrypted image may additionally be partially occluded, and the second encrypted image is then used to train the decoder network. This better simulates the environmental disturbance produced when capturing the encrypted image in a real environment and improves the robustness of the decoder network.
Fig. 12 schematically shows a flowchart for obtaining a third encrypted image according to still another embodiment of the present disclosure.
As shown in fig. 12, operation S1113 may further include, for example, operations S1210 to S1230.
In operation S1210, a contour of the second region is fitted by a polygon.
In operation S1220, the second encrypted image is cropped based on the polygon.
In operation S1230, the cropped polygon image is homography-transformed to obtain a third encrypted image.
In accordance with embodiments of the present disclosure, a fine-tuned semantic segmentation network BiSeNet may be used in the design of the detector network. The data set required for fine-tuning may be synthesized, for example, by randomly compositing encoded images into DIV2K high-resolution images. When the detector network is used, the contour of the proposed region (for example, an image region containing steganographic information) can be fitted by a quadrilateral, a homography matrix is then computed, and the quadrilateral is homography-transformed to provide the input to the decoder network.
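The quadrilateral-fitting step above can be sketched with a simple corner heuristic. This is an illustrative approximation, not the patented method: a real pipeline would fit a polygon to the segmented contour (e.g., with a contour-approximation routine) before solving the homography, whereas this sketch estimates the four corners of a roughly convex mask from the extreme points of x+y and x−y; the function name is an assumption.

```python
# Sketch: estimate the four corners of the detector's segmented region so a
# homography can rectify it. Works for roughly axis-aligned convex regions.
import numpy as np

def mask_to_quad(mask: np.ndarray) -> np.ndarray:
    """mask: boolean (H, W). Returns 4 corners (x, y) as TL, TR, BR, BL."""
    ys, xs = np.nonzero(mask)
    s, d = xs + ys, xs - ys
    return np.array([
        [xs[s.argmin()], ys[s.argmin()]],   # top-left: min(x + y)
        [xs[d.argmax()], ys[d.argmax()]],   # top-right: max(x - y)
        [xs[s.argmax()], ys[s.argmax()]],   # bottom-right: max(x + y)
        [xs[d.argmin()], ys[d.argmin()]],   # bottom-left: min(x - y)
    ])
```

The returned quadrilateral would then serve as the source corners when solving the homography that maps the region to an upright rectangle for the decoder.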
When the first encrypted image is displayed in a physical environment, the content of the first encrypted image only occupies a part of the area of the second encrypted image in the acquisition process, and the original angle and the original area of the image content displayed in the area may change compared with the first encrypted image (refer to fig. 1). By using the image processing method of the embodiment of the disclosure, the encrypted image can be detected and processed in advance by using the detector network, and then the encrypted image is input into the decoder network, so as to ensure the decoding precision of the decoder network.
Fig. 13 schematically shows a flow chart of a decoder network training method according to yet another embodiment of the present disclosure.
As shown in fig. 13, training the decoder network using the third encrypted image may further include operations S1310 to S1320, in addition to operations S410 to S430 and operations S1010 to S1020, wherein the detector network is trained through operations S1310 to S1320.
In operation S410, at least one second image is acquired.
In operation S420, second steganographic information is added to the second image using the encoder network to obtain a first encrypted image.
In operation S430, the first encrypted image is physically processed to obtain a second encrypted image, where the physical processing is used to make the second encrypted image have an environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment.
In operation S1010, the at least one second encrypted image is processed using the detector network to obtain at least one third encrypted image.
In operation S1020, the decoder network is trained using at least one third encrypted image.
Operations S410 to S430 and operations S1010 to S1020 are the same as above, and are not described herein again.
Next, in operation S1310, a detection loss of the detector network is obtained based on the perceptual loss of the third encrypted image and its corresponding first encrypted image.
According to an embodiment of the present disclosure, the detector network may obtain the detection loss using the LPIPS (Learned Perceptual Image Patch Similarity) perceptual loss. The third encrypted image cropped out by the detector network is compared with its corresponding first encrypted image for perceptual similarity: for example, image features of the third encrypted image and of its corresponding first encrypted image can be extracted by a neural network, and an LPIPS score is obtained from the difference of these image features, so as to construct the detection loss.
In operation S1320, a detector network is trained based on the detection loss.
It is understood that the sequence of steps shown in fig. 13 is only an example, and operations S1310 to S1320 of the training detector network and operation S1020 of the training decoder network in the embodiment of the present disclosure may not have a specific sequential order, and may be parallel, for example.
Fig. 14 schematically shows a block diagram of an image processing apparatus 1400 according to an embodiment of the present disclosure.
As shown in fig. 14, the image processing apparatus 1400 may include an acquisition module 1410 and an encoding module 1420.
The acquisition module 1410 may perform operation S310, for example, for acquiring a first image, the first image including first steganographic information.
The decoding module 1420 may, for example, perform operation S320 for decoding the first image with a decoder network to obtain first steganographic information. The decoder network is trained by: acquiring at least one second image, and adding second steganographic information to the second image using the encoder network to obtain a first encrypted image; physically processing the first encrypted image to obtain a second encrypted image, wherein the physical processing is used to give the second encrypted image the environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment; and training the decoder network using at least one second encrypted image.
According to an embodiment of the present disclosure, the image processing apparatus 1400 may further include a first training sub-module, which may perform operations S641 to S643, for example, to take at least one second encrypted image as an input of the decoder network, obtain third steganographic information, obtain a decoding loss of the decoder network based on the third steganographic information and a corresponding second steganographic information thereof, and train the decoder network based on a first objective function, where the first objective function includes the decoding loss.
According to an embodiment of the present disclosure, the image processing apparatus 1400 may further include a second training sub-module, which may perform, for example, operations S710 to S720, for obtaining an encoding loss of the encoder network based on a quality difference of at least one second image and the first encrypted image corresponding to each second image, and training the decoder network and the encoder network simultaneously based on a second objective function, wherein the second objective function includes a decoding loss and an encoding loss.
According to an embodiment of the present disclosure, the image processing apparatus 1400 may further include a third training sub-module, which causes the critic network and the encoder network to form an adversarial network during training of the decoder. The third training sub-module may perform operations S810 to S830, for example, to obtain, using the critic network, the pixel distributions of the first encrypted image and its corresponding second image, and calculate a distance metric between these pixel distributions to obtain the critic loss. The decoder network, the encoder network, and the critic network are then trained simultaneously based on a third objective function, where the third objective function comprises the second objective function and the critic loss.
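The critic step can be sketched with a toy linear critic. A Wasserstein-style estimate (mean critic score on encrypted images minus mean score on cover images) is one plausible distance metric between the two pixel distributions; the disclosure does not fix the metric, so everything below is an assumption:

```python
import numpy as np

def critic(img, w):
    """Toy linear 'critic network': maps an image to a scalar score."""
    return float(np.sum(w * img))

def critic_loss(covers, encrypteds, w):
    """Wasserstein-style distance estimate between the two pixel distributions:
    mean critic score on encrypted images minus mean score on cover images."""
    return float(np.mean([critic(e, w) for e in encrypteds])
                 - np.mean([critic(c, w) for c in covers]))

w = np.ones((4, 4))                               # hypothetical critic weights
covers = [np.zeros((4, 4)), np.zeros((4, 4))]     # second images
encrypteds = [np.full((4, 4), 0.1)] * 2           # corresponding first encrypted images
loss = critic_loss(covers, encrypteds, w)         # 0.1 * 16 = 1.6 for this toy data
```

In the adversarial setup the critic is trained to maximize this separation while the encoder is trained (via the third objective) to minimize it, pushing the encrypted images toward statistical indistinguishability from the covers.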
According to an embodiment of the present disclosure, the image processing apparatus 1400 may further include a first preprocessing module, which may perform operations S921 to S922, for example, to preprocess the first image by using the detector network and decode the preprocessed first image by using the decoder network to obtain the first steganographic information.
According to an embodiment of the present disclosure, the image processing apparatus 1400 may further include a second preprocessing module, which may perform operations S1010 to S1020, for example, for processing the at least one second encrypted image using the detector network to obtain at least one third encrypted image. The processing includes: detecting at least one first region from the second encrypted image; determining a second region from the at least one first region, where the second region contains the content of the first encrypted image; and cropping the second encrypted image based on the second region to obtain a third encrypted image. The decoder network is then trained with the at least one third encrypted image.
According to an embodiment of the present disclosure, the image processing apparatus 1400 may further include a third preprocessing module, which may perform operations S1210 to S1230, for example, for fitting the contour of the second region with a polygon, cropping the second encrypted image based on the polygon, and performing a homography transformation on the cropped polygonal image to obtain a third encrypted image.
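The rectification step (crop the fitted polygon, then warp it back to a canonical rectangle) can be sketched without any vision library by solving the four-point homography directly. This is a minimal NumPy illustration, assuming the detector has already supplied the four polygon corners; a production system would typically use a library routine such as OpenCV's perspective warp instead of the nearest-neighbour loop below:

```python
import numpy as np

def homography_from_points(src, dst):
    """Solve the 8-DOF homography mapping 4 src corners to 4 dst corners (DLT)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_to_rect(image, quad, out_w, out_h):
    """Rectify the quadrilateral region of `image` into an out_h x out_w image
    (nearest-neighbour sampling via the inverse homography)."""
    dst = [(0, 0), (out_w - 1, 0), (out_w - 1, out_h - 1), (0, out_h - 1)]
    H_inv = np.linalg.inv(homography_from_points(quad, dst))
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for v in range(out_h):
        for u in range(out_w):
            x, y, w = H_inv @ np.array([u, v, 1.0])   # map output pixel back to source
            xi, yi = int(round(x / w)), int(round(y / w))
            if 0 <= yi < image.shape[0] and 0 <= xi < image.shape[1]:
                out[v, u] = image[yi, xi]
    return out

image = np.zeros((20, 20))
image[5:15, 3:13] = 1.0                        # bright region standing in for the code area
quad = [(3, 5), (12, 5), (12, 14), (3, 14)]    # its corners in (x, y) order
rectified = warp_to_rect(image, quad, 10, 10)  # third-encrypted-image analogue
```

Undoing the perspective distortion before decoding means the decoder sees the steganographic pattern in its original geometry, which is what makes capture from an oblique viewing angle recoverable.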
According to an embodiment of the present disclosure, the image processing apparatus 1400 may further include a fourth training sub-module, which may perform operations S1310 to S1320, for example, to obtain a detection loss of the detector network based on the perceptual loss between the third encrypted image and its corresponding first encrypted image, and train the detector network based on the detection loss.
Fig. 15 schematically shows a block diagram of a decoder network training apparatus 1500 according to an embodiment of the present disclosure.
As shown in fig. 15, the decoder network training apparatus 1500 may include an acquisition module 1510, an encoding module 1520, a processing module 1530, and a training module 1540.
The acquisition module 1510 may perform operation S410, for example, for acquiring at least one second image.
The encoding module 1520 may perform, for example, operation S420 for adding the second steganographic information to the second image using the encoder network to obtain the first encrypted image.
The processing module 1530 may, for example, perform operation S430 for performing physicalization processing on the first encrypted image to obtain a second encrypted image, where the physicalization processing gives the second encrypted image the environmental disturbance effects that arise when the first encrypted image is displayed in a physical environment and captured.
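A concrete physicalization stack can be assembled from the perturbations the claims name (motion blur, affine color transformation, Gaussian noise). The sketch below is an illustrative NumPy composition of three of them on a single-channel image; parameter values are assumptions, and a real pipeline would also include perspective transformation, defocus blur, and JPEG compression:

```python
import numpy as np

rng = np.random.default_rng(42)

def motion_blur(img, k=5):
    """Horizontal motion blur: average each pixel with its neighbours along the row."""
    kernel = np.ones(k) / k
    return np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, img)

def affine_color(img, gain=0.9, bias=0.05):
    """Affine colour transform: display/camera brightness and contrast shift."""
    return np.clip(gain * img + bias, 0.0, 1.0)

def gaussian_noise(img, sigma=0.02):
    """Sensor noise picked up when the displayed image is captured."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def physicalize(first_encrypted):
    """Stack of perturbations approximating display-and-capture conditions."""
    return gaussian_noise(affine_color(motion_blur(first_encrypted)))

second_encrypted = physicalize(rng.random((16, 16)))
```

Applying these perturbations differentiably (or at least at training time) is what lets the decoder become robust to them, instead of only decoding pristine digital copies.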
The training module 1540 may, for example, perform operation S440 for training the decoder network with at least one second encrypted image.
Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or by any other reasonable means of hardware or firmware for integrating or packaging a circuit, or by any one of or a suitable combination of any of software, hardware, and firmware. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.
For example, the modules in the image processing apparatus 1400 or the decoder network training apparatus 1500 may be combined and implemented in one module, or any one of the modules may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the modules in the image processing apparatus 1400 or the decoder network training apparatus 1500 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of or a suitable combination of three implementations of software, hardware, and firmware. Alternatively, at least one of the modules in the image processing apparatus 1400 or the decoder network training apparatus 1500 may be at least partially implemented as a computer program module, which when executed, may perform a corresponding function.
FIG. 16 schematically illustrates a block diagram of a computer system suitable for implementing the image processing method and the decoder network training method according to an embodiment of the present disclosure. The computer system illustrated in FIG. 16 is only one example and should not impose any limitations on the scope of use or functionality of embodiments of the disclosure.
As shown in fig. 16, a computer system 1600 according to an embodiment of the present disclosure includes a processor 1601, which can perform various suitable actions and processes according to a program stored in a Read Only Memory (ROM) 1602 or a program loaded from a storage portion 1608 into a Random Access Memory (RAM) 1603. Processor 1601 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or related chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. Processor 1601 may also include on-board memory for caching purposes. Processor 1601 may include a single processing unit or multiple processing units for performing different actions of a method flow according to embodiments of the present disclosure.
In the RAM 1603, various programs and data necessary for the operation of the system 1600 are stored. The processor 1601, the ROM 1602, and the RAM 1603 are connected to each other by a bus 1604. The processor 1601 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 1602 and/or the RAM 1603. It is to be noted that the program may also be stored in one or more memories other than the ROM 1602 and the RAM 1603. The processor 1601 may also perform various operations of the method flows according to the embodiments of the present disclosure by executing programs stored in one or more memories.
In accordance with an embodiment of the present disclosure, the system 1600 may also include an input/output (I/O) interface 1605, which is also connected to the bus 1604. The system 1600 may also include one or more of the following components connected to the I/O interface 1605: an input portion 1606 including a keyboard, a mouse, and the like; an output portion 1607 including a display device such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a speaker; a storage portion 1608 including a hard disk and the like; and a communication portion 1609 including a network interface card such as a LAN card or a modem. The communication portion 1609 performs communication processing via a network such as the Internet. A drive 1610 is also connected to the I/O interface 1605 as needed. A removable medium 1611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1610 as necessary, so that a computer program read therefrom can be installed into the storage portion 1608 as needed.
According to an embodiment of the present disclosure, the method flow according to an embodiment of the present disclosure may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such embodiments, the computer program may be downloaded and installed from a network through the communication part 1609, and/or installed from the removable medium 1611. The computer program, when executed by the processor 1601, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In accordance with embodiments of the present disclosure, program code for the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, the "C" language, and the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the latter case, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include one or more memories other than ROM 1602 and/or RAM 1603 and/or ROM 1602 and RAM 1603 described above.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It will be appreciated by those skilled in the art that various combinations of the features recited in the various embodiments and/or claims of the present disclosure may be made, even if such combinations are not explicitly recited in the present disclosure. In particular, various combinations of the features recited in the various embodiments and/or claims of the present disclosure may be made without departing from the spirit or teaching of the present disclosure. All such combinations fall within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (14)

1. An image processing method comprising:
acquiring a first image, wherein the first image comprises first steganographic information;
after the first image is subjected to cropping and homography transformation preprocessing using a detector network, decoding the preprocessed first image using a decoder network to obtain the first steganographic information; wherein training the decoder network comprises:
acquiring at least one second image;
adding second steganographic information to the second image using an encoder network to obtain a first encrypted image;
performing physicalization processing and/or partial occlusion on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing comprises perspective transformation, and the physicalization processing is used to give the second encrypted image an environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment;
training the detector network and the decoder network with at least one third encrypted image;
wherein, prior to training the detector network and the decoder network with the at least one third encrypted image, the method further comprises processing at least one of the second encrypted images with the detector network to obtain the at least one third encrypted image, comprising:
fitting a contour of a second region by a polygon, wherein the second region is determined from the second encrypted image by a detector network;
cropping the second encrypted image based on the polygon; and
performing homography transformation on the cropped polygonal image to obtain the third encrypted image.
2. The image processing method of claim 1, wherein said training the decoder network with at least one of the second encrypted images comprises:
taking at least one second encrypted image as the input of the decoder network to obtain third steganographic information;
obtaining a decoding loss of the decoder network based on the third steganographic information and the second steganographic information corresponding to the third steganographic information; and
training the decoder network based on a first objective function, wherein the first objective function includes the decoding loss.
3. The image processing method of claim 2, wherein training the decoder network further comprises:
obtaining an encoding loss for the encoder network based on a difference in quality of at least one of the second images and the first encrypted image corresponding to each of the second images; and
training the decoder network and the encoder network simultaneously based on a second objective function, wherein the second objective function includes the decoding penalty and the encoding penalty.
4. The image processing method according to claim 3, wherein a critic network forms an adversarial network with the encoder network during training of the decoder; wherein said training said decoder network further comprises:
respectively obtaining the pixel distribution of the first encrypted image and the second image corresponding to the first encrypted image by using the critic network;
calculating a distance metric between the pixel distributions of the first encrypted image and its corresponding second image to obtain a critic loss; and
simultaneously training the decoder network, the encoder network, and the critic network based on a third objective function, wherein the third objective function includes the second objective function and the critic loss.
5. The image processing method of claim 1, wherein said processing at least one of said second encrypted images with said detector network to obtain at least one third encrypted image further comprises:
detecting at least one first region from the second encrypted image;
determining a second area from at least one of the first areas, wherein the second area includes the content of the first encrypted image;
cropping the second encrypted image based on the second region to obtain a third encrypted image;
and
training the decoder network with at least one of the third encrypted images.
6. The image processing method of claim 5, wherein the training of the detector network and the decoder network with at least one third encrypted image comprises:
obtaining a detection loss of the detector network based on a perceptual loss of the third encrypted image and the first encrypted image corresponding thereto; and
training the detector network based on the detection loss.
7. The image processing method according to claim 1, wherein said performing physicalization processing on the first encrypted image further comprises:
applying at least one of motion blur or defocus blur to the first encrypted image.
8. The image processing method according to claim 7, wherein said performing physicalization processing on the first encrypted image further comprises:
applying at least one of affine color transformation, Gaussian noise, or JPEG compression to the first encrypted image.
9. The image processing method of claim 1, wherein said adding second steganographic information to the second image with an encoder network comprises:
embedding a bit string obtained by encoding into the second image.
10. A decoder network training method, comprising:
acquiring at least one second image;
adding second steganographic information to the second image using an encoder network to obtain a first encrypted image;
performing physicalization processing and/or partial occlusion on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing comprises perspective transformation, and the physicalization processing is used to give the second encrypted image an environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment;
training a detector network and the decoder network with at least one third encrypted image;
wherein, prior to training the detector network and the decoder network with the at least one third encrypted image, the method further comprises processing at least one of the second encrypted images with the detector network to obtain the at least one third encrypted image, comprising:
fitting a contour of a second region by a polygon, wherein the second region is determined from the second encrypted image by a detector network;
cropping the second encrypted image based on the polygon; and
performing homography transformation on the cropped polygonal image to obtain the third encrypted image.
11. An image processing apparatus comprising:
the acquisition module is used for acquiring a first image, and the first image comprises first steganographic information;
the decoding module is used for, after the first image is subjected to cropping and homography transformation preprocessing using a detector network, decoding the preprocessed first image using a decoder network to obtain the first steganographic information; wherein training the decoder network comprises:
acquiring at least one second image;
adding second steganographic information to the second image using an encoder network to obtain a first encrypted image;
performing physicalization processing and/or partial occlusion on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing comprises perspective transformation, and the physicalization processing is used to give the second encrypted image an environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment;
training the detector network and the decoder network with at least one third encrypted image;
wherein, before training the detector network and the decoder network with the at least one third encrypted image, the at least one second encrypted image is processed with the detector network to obtain the at least one third encrypted image, comprising:
fitting a contour of a second region by a polygon, wherein the second region is determined from the second encrypted image by a detector network;
cropping the second encrypted image based on the polygon; and
performing homography transformation on the cropped polygonal image to obtain the third encrypted image.
12. A decoder network training apparatus, comprising:
an acquisition module for acquiring at least one second image;
an encoding module for adding second steganographic information to the second image using an encoder network to obtain a first encrypted image;
the processing module is used for performing physicalization processing and/or partial occlusion on the first encrypted image to obtain a second encrypted image, wherein the physicalization processing comprises perspective transformation, and the physicalization processing is used to give the second encrypted image an environmental disturbance effect generated in the process of capturing the first encrypted image when the first encrypted image is displayed in a physical environment;
a training module for training a detector network and the decoder network with at least one third encrypted image;
wherein, before training the detector network and the decoder network with the at least one third encrypted image, the at least one second encrypted image is processed with the detector network to obtain the at least one third encrypted image, comprising:
fitting a contour of a second region by a polygon, wherein the second region is determined from the second encrypted image by a detector network;
cropping the second encrypted image based on the polygon; and
performing homography transformation on the cropped polygonal image to obtain the third encrypted image.
13. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-10.
14. A computer readable medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 10.
CN202110138975.0A 2021-02-01 2021-02-01 Image processing method, decoder network training method, device, equipment and medium Active CN112911341B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110138975.0A CN112911341B (en) 2021-02-01 2021-02-01 Image processing method, decoder network training method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110138975.0A CN112911341B (en) 2021-02-01 2021-02-01 Image processing method, decoder network training method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN112911341A CN112911341A (en) 2021-06-04
CN112911341B true CN112911341B (en) 2023-02-28

Family

ID=76122612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110138975.0A Active CN112911341B (en) 2021-02-01 2021-02-01 Image processing method, decoder network training method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN112911341B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114037596A (en) * 2022-01-07 2022-02-11 湖南菠萝互娱网络信息有限公司 End-to-end image steganography method capable of resisting physical transmission deformation
CN116132645B (en) * 2023-01-06 2024-08-09 清华大学 Image processing method, device, equipment and medium based on deep learning
CN117675418B (en) * 2024-02-02 2024-05-10 吉林省建兴智能科技有限公司 Data transmission system and method based on non-physical medium intrusion prevention

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11080809B2 (en) * 2017-05-19 2021-08-03 Google Llc Hiding information and images via deep learning
CN111028308B (en) * 2019-11-19 2022-11-04 珠海涵辰科技有限公司 Steganography and reading method for information in image
CN112132737B (en) * 2020-10-12 2023-11-07 中国人民武装警察部队工程大学 Image robust steganography method without reference generation

Also Published As

Publication number Publication date
CN112911341A (en) 2021-06-04

Similar Documents

Publication Publication Date Title
CN112911341B (en) Image processing method, decoder network training method, device, equipment and medium
CN109784181B (en) Picture watermark identification method, device, equipment and computer readable storage medium
Zhang et al. Viscode: Embedding information in visualization images using encoder-decoder network
US10863202B2 (en) Encoding data in a source image with watermark image codes
US20190042817A1 (en) Deconvolution of digital images
CN112561766B (en) Image steganography and extraction method and device and electronic equipment
JP7499402B2 (en) End-to-End Watermarking System
CN113409188A (en) Image background replacing method, system, electronic equipment and storage medium
US20230325959A1 (en) Zoom agnostic watermark extraction
Fang et al. An optimization model for aesthetic two-dimensional barcodes
CN113011254B (en) Video data processing method, computer equipment and readable storage medium
CN113469869A (en) Image management method and device
US11212527B2 (en) Entropy-inspired directional filtering for image coding
Gourrame et al. Fourier image watermarking: Print-cam application
US20230325961A1 (en) Zoom agnostic watermark extraction
Hu et al. StegaEdge: learning edge-guidance steganography
US20210203994A1 (en) Encoding data in a source image with watermark image codes
CN114549270A (en) Anti-shooting monitoring video watermarking method combining depth robust watermarking and template synchronization
CN114299105A (en) Image processing method, image processing device, computer equipment and storage medium
US20240087075A1 (en) End-to-end watermarking system
Fu et al. Dual-layer barcodes
CN116912345B (en) Portrait cartoon processing method, device, equipment and storage medium
Li et al. Aesthetic QR Code Authentication Based on Directed Periodic Texture Pattern
CN117156160A (en) Image compression method, device, equipment and medium based on semantic image
Tang et al. A Unified Model Fusing Region of Interest Detection and Super Resolution for Video Compression

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant