CN113362403A - Training method and device of image processing model - Google Patents

Training method and device of image processing model

Info

Publication number
CN113362403A
CN113362403A
Authority
CN
China
Prior art keywords: model, result, image, image processing, processing model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110820836.6A
Other languages
Chinese (zh)
Inventor
程正雪
符婷
胡家鹏
周大江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202110820836.6A
Publication of CN113362403A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00: Image coding
    • G06T 9/002: Image coding using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods


Abstract

According to the method, an encoder included in the image processing model is used to compress a sample image to obtain compressed data, and a decoder included in the image processing model is used to decompress and restore the compressed data to obtain a restored image; the restored image is input into a discrimination model, which judges the authenticity attribute of the restored image to obtain a first result; and, based on at least the first result, model parameters of the image processing model are adjusted.

Description

Training method and device of image processing model
Technical Field
One or more embodiments of the present disclosure relate to the field of machine learning technologies, and in particular, to a method and an apparatus for training an image processing model.
Background
The image encoding and decoding technology is one of the key technologies in the field of internet multimedia, and greatly influences the transmission and use of images. For example, in some cases, it is necessary to compress an image, transmit the compressed data to improve transmission efficiency, and then restore the image. There are currently schemes for image compression and restoration by neural network models.
Disclosure of Invention
One or more embodiments of the present specification provide a method and an apparatus for training an image processing model.
According to a first aspect, there is provided a training method of an image processing model, comprising:
compressing a sample image by using an encoder included in the image processing model to obtain compressed data, and decompressing and restoring the compressed data by using a decoder included in the image processing model to obtain a restored image;
inputting the restored image into a discrimination model, and judging the authenticity attribute of the restored image by the discrimination model to obtain a first result;
based on at least the first result, model parameters of the image processing model are adjusted.
Optionally, the method further includes:
inputting the sample image into the discrimination model, and judging the authenticity attribute of the sample image by the discrimination model to obtain a second result;
based on at least the first result, adjusting model parameters of the image processing model, including:
based on the first result and the second result, model parameters of the image processing model are adjusted.
Optionally, the adjusting model parameters of the image processing model based on the first result and the second result includes:
determining a first loss comprising a first loss term; the first loss term is obtained by adopting a relative average least square method based on the first result and the second result;
based on the first loss, model parameters of the image processing model are adjusted.
Optionally, the first loss further includes a second loss term; the second loss term is derived based on a loss of quality of the restored image relative to the sample image.
Optionally, the method further includes:
and adjusting the model parameters of the discriminant model based on the first result and the second result.
Optionally, wherein the method further comprises:
inputting the compressed data into the discriminant model;
wherein the determining the authenticity attribute of the restored image comprises:
judging the authenticity attribute of the restored image based on the compressed data;
wherein the determining the authenticity attribute of the sample image comprises:
and judging the authenticity attribute of the sample image based on the compressed data.
Optionally, the encoder is a first neural network; the decoder is a second neural network.
Optionally, the adjusting the model parameters of the image processing model includes:
network parameters of the first neural network and the second neural network are adjusted respectively.
Optionally, the encoder includes a compressor adopting a preset compression standard; the decoder comprises a decompressor and a third neural network, wherein the decompressor adopts a decompression standard corresponding to the preset compression standard.
Optionally, the adjusting the model parameters of the image processing model includes:
adjusting a network parameter of the third neural network.
According to a second aspect, there is provided an apparatus for training an image processing model, comprising:
the processing module is used for compressing the sample image by using an encoder included in the image processing model to obtain compressed data, and decompressing and restoring the compressed data by using a decoder included in the image processing model to obtain a restored image;
the first judging module is used for inputting the restored image into a judging model, and judging the authenticity attribute of the restored image by the judging model to obtain a first result;
a first adjusting module for adjusting model parameters of the image processing model based at least on the first result.
Optionally, the apparatus further comprises:
the second judging module is used for inputting the sample image into the judging model, and judging the authenticity attribute of the sample image by the judging model to obtain a second result;
wherein the first adjusting module comprises:
an adjusting sub-module for adjusting model parameters of the image processing model based on the first result and the second result.
Optionally, the adjusting sub-module is configured to:
determining a first loss comprising a first loss term; the first loss term is obtained by adopting a relative average least square method based on the first result and the second result;
based on the first loss, model parameters of the image processing model are adjusted.
Optionally, the first loss further includes a second loss term; the second loss term is derived based on a loss of quality of the restored image relative to the sample image.
Optionally, the apparatus further comprises:
and the second adjusting module is used for adjusting the model parameters of the discriminant model based on the first result and the second result.
Optionally, the apparatus further comprises:
the input module is used for inputting the compressed data into the discrimination model;
wherein the first determination module is configured to:
judging the authenticity attribute of the restored image based on the compressed data;
wherein the second determination module is configured to:
and judging the authenticity attribute of the sample image based on the compressed data.
Optionally, the encoder is a first neural network; the decoder is a second neural network.
Optionally, the first adjusting module is configured to:
network parameters of the first neural network and the second neural network are adjusted respectively.
Optionally, the encoder includes a compressor adopting a preset compression standard; the decoder comprises a decompressor and a third neural network, wherein the decompressor adopts a decompression standard corresponding to the preset compression standard.
Optionally, the first adjusting module is configured to:
adjusting a network parameter of the third neural network.
According to a third aspect, there is provided a computer readable storage medium, storing a computer program which, when executed by a processor, implements the method of any of the first aspects above.
According to a fourth aspect, there is provided a computing device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any of the first aspects when executing the program.
The technical solutions provided in the embodiments of this specification may have the following beneficial effects:
in the method and apparatus for training an image processing model provided in an embodiment of this specification, user perceptual quality is taken into account: a sample image is compressed by an encoder included in the image processing model to obtain compressed data; the compressed data is decompressed and restored by a decoder included in the image processing model to obtain a restored image; the restored image is input into a discrimination model, which judges the authenticity attribute of the restored image to obtain a first result; and the model parameters of the image processing model are adjusted based on at least the first result. Therefore, after the trained image processing model compresses and restores an image, the texture details of the original image are better preserved, the sharpness of the restored image is improved, and the restored image is closer to the original image.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description are only some embodiments of the present application; other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1a is a block diagram of a training image processing model architecture, shown in accordance with an exemplary embodiment of the present description;
FIG. 1b is a block diagram illustrating the architecture of an image processing model according to an exemplary embodiment of the present description;
FIG. 1c is a block diagram of another image processing model shown in accordance with an exemplary embodiment of the present description;
FIG. 2 is a flow diagram illustrating a method of training an image processing model according to an exemplary embodiment of the present description;
FIG. 3 is a block diagram of an image processing model training apparatus according to an exemplary embodiment shown in the present specification;
FIG. 4 is a block diagram illustrating a computing device according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings, in which the same numerals refer to the same or similar elements across the different views unless otherwise specified. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this specification; rather, they are merely examples of apparatus and methods consistent with certain aspects of this specification, as recited in the appended claims.
The terminology used in the description herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. The word "if" as used herein may be interpreted as "when" or "upon" or "in response to a determination," depending on the context.
In the related art, image compression and decompression are performed by a neural network model that is generally optimized against objective metrics such as PSNR and SSIM. Images decompressed and restored by a model optimized this way tend to have low perceptual quality (including poor texture detail), resulting in a poor user experience.
In the embodiments of this specification, the image processing model for image compression and decompression is trained with the aid of a discrimination model, which judges the authenticity of the restored images output by the image processing model. As a result, the trained image processing model better preserves the texture details of the original image after compression and decompression, improving perceptual quality.
FIG. 1a is a block diagram illustrating a framework for training an image processing model according to an exemplary embodiment.
As shown in fig. 1a, the discrimination model may be a previously trained discriminator or a discriminator to be trained. The image processing model may include an encoder and a decoder; in one implementation, both the encoder and the decoder may be convolutional neural networks. As shown in fig. 1b, the encoder is a first neural network and the decoder is a second neural network. In another implementation, the encoder may be a compressor using a common image compression standard, and the decoder may include a decompressor corresponding to the compressor together with a convolutional neural network. As shown in fig. 1c, the encoder is a compressor and the decoder comprises a decompressor and a third neural network.
During the model training process, the following steps may be performed iteratively: first, a sample image may be taken from the sample data set and input to an encoder included in the image processing model. And compressing the sample image by using the encoder to obtain compressed data. Then, the compressed data is input to a decoder included in the image processing model, and the compressed data is decompressed and restored by the decoder to obtain a restored image.
Then, the restored image is input into a discrimination model, and the discrimination model judges the authenticity attribute of the restored image to obtain a first result aiming at the restored image. Optionally, the sample image may be input into a discrimination model, and the discrimination model may determine the authenticity attribute of the sample image to obtain a second result for the sample image. Further optionally, the compressed data may be input to a discrimination model, so that the discrimination model can determine the authenticity attribute of the sample image and the restored image based on the compressed data, thereby improving the accuracy of discrimination model determination.
Finally, model parameters of the image processing model may be adjusted based on the obtained first and second results. Specifically, a first loss for the image processing model may be determined based on the obtained first and second results, and the convolutional neural network included in the image processing model may be adjusted based on the first loss. Optionally, if the discrimination model is a discriminator to be trained, a second loss for the discrimination model may be determined based on the obtained first and second results, and the model parameters of the discriminator may be adjusted based on the second loss.
The above steps are iterated until a stop condition is met, yielding the trained image processing model. When an image needs to be compressed, it can be compressed by the encoder included in the image processing model to obtain compressed data. When decompression is needed, the decoder included in the image processing model can decompress the compressed data to obtain a restored image close to the original image.
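The iterative procedure above can be sketched as a minimal, hypothetical training skeleton. The linear maps below are stand-ins of my own choosing for the encoder, decoder, and discrimination model (a real implementation would use convolutional networks and would backpropagate a loss to adjust parameters, which this sketch omits):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: linear maps instead of the convolutional
# encoder, decoder, and discrimination model described above.
w_enc = rng.normal(size=(8, 2))   # "compresses" 8-dim samples to 2 dims
w_dec = rng.normal(size=(2, 8))   # restores 2 dims back to 8
w_dis = rng.normal(size=(8,))     # maps an image to a realness score

for step in range(3):                 # iterate until a stop condition in practice
    x = rng.normal(size=(4, 8))       # a batch of sample "images"
    z = x @ w_enc                     # encoder: compress to obtain compressed data
    x_hat = z @ w_dec                 # decoder: decompress/restore
    first_result = x_hat @ w_dis      # discrimination model on restored images
    second_result = x @ w_dis         # discrimination model on sample images
    # Model parameters would be adjusted here from a loss built on the
    # two results; this sketch only evaluates a reconstruction error.
    recon_error = float(np.mean((x - x_hat) ** 2))
```

The two discriminator outputs per iteration correspond to the first and second results used later to form the losses.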
The embodiments provided in the present specification will be described in detail with reference to specific examples.
FIG. 2 is a flow diagram illustrating a method of training an image processing model according to an exemplary embodiment. As shown in FIG. 2, the method may be applied to any device, platform, server, or cluster of devices with computing and processing capabilities. The method performs an update operation over a plurality of iterations, where each update operation comprises the following steps:
in step 201, an encoder included in the image processing model is used to compress the sample image to obtain compressed data, and a decoder included in the image processing model is used to decompress and restore the compressed data to obtain a restored image.
In this embodiment, the image processing model to be trained may include an encoder and a decoder, where the encoder may be used to compress an image and the decoder may decompress the data compressed by the encoder. In one implementation, the encoder may be embodied as a first neural network and the decoder as a corresponding second neural network. In another implementation, the encoder may be a compressor adopting a preset compression standard, and the decoder may include a decompressor adopting the corresponding decompression standard together with a third neural network. The preset compression standard may include, but is not limited to, VVC (Versatile Video Coding), HEVC (High Efficiency Video Coding), WebP, JPEG, and the like.
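For the second variant (a standard compressor plus a neural decompression stage), the data flow can be sketched as follows. Note the assumptions: zlib stands in for the preset compression standard purely so the sketch runs (the patent names VVC, HEVC, WebP, and JPEG), and the third neural network that would enhance the decompressed output is reduced to an identity placeholder:

```python
import zlib
import numpy as np

def compress_image(img: np.ndarray) -> bytes:
    # Stand-in for the compressor adopting a preset compression standard.
    return zlib.compress(img.astype(np.uint8).tobytes())

def decompress_image(data: bytes, shape: tuple) -> np.ndarray:
    # Decompressor adopting the corresponding decompression standard...
    raw = np.frombuffer(zlib.decompress(data), dtype=np.uint8)
    restored = raw.reshape(shape).astype(np.float32)
    # ...followed by the third neural network, which would refine the
    # restored image; here it is an identity placeholder.
    return restored

img = (np.arange(64).reshape(8, 8) % 256).astype(np.uint8)
data = compress_image(img)
restored = decompress_image(data, img.shape)
```

In this variant only the neural stage is trainable; the compressor and decompressor are fixed, which is why training later adjusts only the third neural network's parameters.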
In the model training process, each update operation acquires a batch of sample images from the sample data set. Each sample image is input into the encoder included in the image processing model, and the encoder compresses it to obtain the compressed data corresponding to that sample image. The decoder included in the image processing model then decompresses and restores each piece of compressed data to obtain the restored image corresponding to each sample image.
In step 202, the restored image is input to the discrimination model, and the discrimination model determines the authenticity attribute of the restored image to obtain a first result.
In this embodiment, the discrimination model may be a discriminator trained in advance or a discriminator to be trained. The discrimination model judges the authenticity attribute of an image, and its result indicates the probability that the image is real or fake. Each restored image produced by the image processing model can be input into the discrimination model, which judges the authenticity attribute of each restored image to obtain the first result corresponding to each restored image.
Optionally, in step 203, the sample image is input into the discrimination model, and the authenticity attribute of the sample image is determined by the discrimination model to obtain a second result.
In this embodiment, each sample image may be input to the discrimination model, and the discrimination model may determine the authenticity attribute of each sample image to obtain each second result corresponding to each sample image.
Optionally, for any sample image, the compressed data corresponding to the sample image may be input to the discrimination model, so that the discrimination model can determine the authenticity attribute of the sample image and the authenticity attribute of the restored image corresponding to the sample image based on the compressed data, thereby obtaining a more accurate determination result.
In step 204, model parameters of the image processing model are adjusted based on at least the first result.
In one implementation, if only the restored image is input into the discrimination model, so that only the first result output by the discrimination model is obtained, the model parameters of the image processing model may be adjusted based solely on the first result. In another implementation, if the sample image is also input into the discrimination model, so that a second result output by the discrimination model is obtained, adjusting the model parameters of the image processing model based on at least the first result may include: a substep 211 of adjusting the model parameters of the image processing model based on the first result and the second result.
Specifically, first, a first loss may be determined based on the first result and the second result, and then, based on the first loss, the model parameters of the image processing model are adjusted. Optionally, a first loss term included in the first loss may be obtained based on the first result and the second result by using a relative average least squares method. Further optionally, a second loss term included in the first loss may also be derived based on a quality loss of the restored image relative to the sample image. For example, the first loss may be calculated by the following formula:

$$\mathcal{L}_1 = \mathbb{E}_{x}\big[(D(x) - \mathbb{E}_{\hat{x}}[D(\hat{x})] + 1)^2\big] + \mathbb{E}_{\hat{x}}\big[(D(\hat{x}) - \mathbb{E}_{x}[D(x)] - 1)^2\big] + \lambda\,\mathbb{E}\big[\lVert x - \hat{x}\rVert^2\big] + \beta R$$

wherein L1 represents the first loss; x represents any one sample image in the current batch and x̂ the corresponding restored image; D(x) is the second result output by the discrimination model for a sample image, and D(x̂) is the first result output for a restored image; E[·] denotes the average over the current batch. The first two terms together form the first loss term: for each sample image, the difference between its second result and the batch average of the first results is summed with 1, squared, and averaged over the batch; for each restored image, the difference between its first result and the batch average of the second results is reduced by 1, squared, and averaged over the batch. The term E[||x - x̂||²], the mean square error between all sample images of the current batch and the corresponding restored images, is the second loss term; R represents the code rate of the compressed data; and λ and β are weighting coefficients.
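Under the reading that the "relative average least square method" denotes a relativistic-average least-squares adversarial term, the first loss might be computed as in the sketch below. The weighting coefficients `lam` and `beta` and the function names are assumptions of this sketch, not taken from the patent:

```python
import numpy as np

def first_loss(d_real, d_fake, x, x_hat, rate, lam=1.0, beta=1.0):
    """First loss for the image processing model.

    d_real: second results (discrimination model applied to sample images).
    d_fake: first results (discrimination model applied to restored images).
    x, x_hat: the batch of sample images and corresponding restored images.
    rate: code rate of the compressed data.
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    # First loss term: relative average least squares over the batch.
    adv = (np.mean((d_real - d_fake.mean() + 1.0) ** 2)
           + np.mean((d_fake - d_real.mean() - 1.0) ** 2))
    # Second loss term: quality loss of restored images vs. sample images.
    mse = np.mean((np.asarray(x, dtype=float)
                   - np.asarray(x_hat, dtype=float)) ** 2)
    return adv + lam * mse + beta * rate
```

With perfect reconstruction and zero-centered scores, only the adversarial term contributes; as reconstruction quality or code rate worsens, the loss grows.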
In one implementation, if the encoder is embodied as a first neural network and the decoder as a corresponding second neural network, the network parameters of the first and second neural networks may each be adjusted based on the first loss. In another implementation, if the encoder is a compressor adopting a preset compression standard and the decoder includes a decompressor adopting the corresponding decompression standard together with a third neural network, the network parameters of the third neural network may be adjusted based on the first loss.
Optionally, in step 205, based on the first result and the second result, the model parameters of the discriminant model are adjusted.
In this embodiment, if the discrimination model is a discriminator to be trained, the model parameters of the discrimination model may also be adjusted based on the first result and the second result.
Specifically, first, a second loss may be determined based on the first result and the second result, and then, based on the second loss, the model parameters of the discrimination model are adjusted. Optionally, a relative average least squares method may be used to obtain the second loss based on the first result and the second result. For example, the second loss may be calculated by the following formula:

$$\mathcal{L}_2 = \mathbb{E}_{x}\big[(D(x) - \mathbb{E}_{\hat{x}}[D(\hat{x})] - 1)^2\big] + \mathbb{E}_{\hat{x}}\big[(D(\hat{x}) - \mathbb{E}_{x}[D(x)] + 1)^2\big]$$

wherein L2 represents the second loss; x represents any one sample image in the current batch and x̂ the corresponding restored image; D(x) is the second result, D(x̂) is the first result, and E[·] denotes the average over the current batch.
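The second loss can be sketched in the same style as the first; it mirrors the adversarial term with the signs of the ±1 targets swapped, so that sample images score above the batch average of restored-image scores and restored images score below the batch average of sample-image scores (the function name is an assumption of this sketch):

```python
import numpy as np

def second_loss(d_real, d_fake):
    """Second loss for the discrimination model.

    d_real: second results (discrimination model applied to sample images).
    d_fake: first results (discrimination model applied to restored images).
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return (np.mean((d_real - d_fake.mean() - 1.0) ** 2)
            + np.mean((d_fake - d_real.mean() + 1.0) ** 2))
```

When the discriminator scores sample images at +1 and restored images at -1, this loss is small, which is the behavior the adjustment of the discrimination model's parameters encourages.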
After the updating operation is executed in multiple iterations, if the condition of stopping the iteration is determined to be met, the training of the image processing model is completed. The image processing model comprises an encoder for compressing the image and a decoder for decompressing the image.
In the method for training an image processing model according to the above embodiments of this specification, user perceptual quality is taken into account: a sample image is compressed by an encoder included in the image processing model to obtain compressed data; the compressed data is decompressed and restored by a decoder included in the image processing model to obtain a restored image; the restored image is input into a discrimination model, which judges the authenticity attribute of the restored image to obtain a first result; and the model parameters of the image processing model are adjusted based on at least the first result. Therefore, after the trained image processing model compresses and restores an image, the texture details of the original image are better preserved, the sharpness of the restored image is improved, and the restored image is closer to the original image.
It should be noted that the training method of the image processing model provided in fig. 2 may be performed on a training device. After the training device has trained the image processing model, it may provide the encoder and the decoder included in the image processing model to terminal devices, which may include, but are not limited to, mobile terminals such as smart phones, smart wearable devices, and image capture devices, as well as computers, servers, and so on. For example, when playing a video, a video playing platform may compress each frame of the video with the encoder included in the image processing model to obtain a compressed code stream. The compressed code stream is then transmitted to a user terminal, which decompresses it with the decoder included in the image processing model to obtain the restored image of each frame. The video can thus be played with sharper images, and its image quality is closer to that of the original video.
It should be noted that although in the above embodiments, the operations of the methods of the embodiments of the present specification have been described in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Rather, the steps depicted in the flowcharts may change the order of execution. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
Corresponding to the foregoing embodiments of the training method of the image processing model, the present specification also provides embodiments of a training apparatus of the image processing model.
As shown in fig. 3, fig. 3 is a block diagram of an apparatus for training an image processing model according to an exemplary embodiment, and the apparatus may include: a processing module 301, a first determining module 302 and a first adjusting module 303.
The processing module 301 is configured to compress the sample image by using an encoder included in the image processing model to obtain compressed data, and decompress and restore the compressed data by using a decoder included in the image processing model to obtain a restored image.
The first determining module 302 is configured to input the restored image into the determination model, and determine the authenticity attribute of the restored image by the determination model to obtain a first result.
A first adjusting module 303, configured to adjust a model parameter of the image processing model based on at least the first result.
In some embodiments, the apparatus may further comprise: a second decision block (not shown).
And the second judging module is used for inputting the sample image into the judging model, and judging the authenticity attribute of the sample image by the judging model to obtain a second result.
Wherein, the first adjusting module may include: a tuning sub-module (not shown).
And the adjusting submodule is used for adjusting the model parameters of the image processing model based on the first result and the second result.
In other embodiments, the adjustment submodule is configured to: and determining a first loss comprising a first loss term, wherein the first loss term is obtained by adopting a relative average least square method based on the first result and the second result. Based on the first loss, model parameters of the image processing model are adjusted.
In other embodiments, the first loss further includes a second loss term, which is derived from the quality loss of the restored image relative to the sample image.
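Reading the "relative average least square method" as the relativistic average least-squares (RaLSGAN) formulation is an assumption, since the patent does not spell out the formula; likewise, mean squared error stands in for the unspecified quality loss. Under those assumptions the first loss could be sketched as:

```python
import numpy as np

def rals_term(d_restored, d_sample):
    """First loss term: relativistic average least squares over a batch.

    d_restored -- discriminator scores for restored images (first results)
    d_sample   -- discriminator scores for sample images (second results)
    """
    d_restored = np.asarray(d_restored, dtype=float)
    d_sample = np.asarray(d_sample, dtype=float)
    # Each side is compared against the batch mean of the other side.
    return (np.mean((d_sample - d_restored.mean() + 1.0) ** 2)
            + np.mean((d_restored - d_sample.mean() - 1.0) ** 2))

def first_loss(d_restored, d_sample, restored, sample, quality_weight=1.0):
    # First loss = RaLS adversarial term + quality loss of the restored
    # image relative to the sample image (MSE as a stand-in).
    quality = np.mean((np.asarray(restored, float) - np.asarray(sample, float)) ** 2)
    return rals_term(d_restored, d_sample) + quality_weight * quality
```

With equal scores on both sides, e.g. `rals_term([0.0, 0.0], [0.0, 0.0])`, the adversarial term evaluates to 2.0: the value reached when the discrimination model cannot separate restored images from sample images.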
In other embodiments, the apparatus may further comprise a second adjustment module (not shown).
The second adjustment module is configured to adjust the model parameters of the discrimination model based on the first result and the second result.
In other embodiments, the apparatus may further comprise an input module (not shown).
The input module is configured to input the compressed data into the discrimination model.
In this case, the first determination module is configured to judge the authenticity attribute of the restored image based on the compressed data, and the second determination module is configured to judge the authenticity attribute of the sample image based on the compressed data.
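A discrimination model that also receives the compressed data can condition its judgment on that data, for example by concatenating the code with the image features. This is a minimal sketch; the flatten-and-concatenate scheme and the linear scoring head are illustrative assumptions:

```python
import numpy as np

def conditional_discriminator(image, code, weights):
    # Judge the authenticity attribute of `image` conditioned on the
    # compressed data `code`: both are flattened and scored jointly.
    features = np.concatenate([np.ravel(image), np.ravel(code)])
    return float(features @ weights)

# The same weights score both the restored image and the sample image,
# each paired with the same compressed data.
w = np.ones(16 + 4)
restored_score = conditional_discriminator(np.zeros(16), np.ones(4), w)  # first result
sample_score = conditional_discriminator(np.ones(16), np.ones(4), w)     # second result
```

Conditioning on the code lets the discriminator penalize restored images that are inconsistent with the compressed data they were decoded from, not merely unrealistic ones.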
In other embodiments, the encoder is a first neural network and the decoder is a second neural network.
In other embodiments, the first adjustment module is configured to adjust network parameters of the first neural network and of the second neural network, respectively.
In other embodiments, the encoder includes a compressor that employs a predetermined compression standard, and the decoder includes a decompressor that employs a decompression standard corresponding to the predetermined compression standard and a third neural network.
In other embodiments, the first adjustment module is configured to adjust network parameters of the third neural network.
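In this variant only the third neural network learns, while the compressor and decompressor follow a fixed preset standard. In the sketch below, coarse quantisation stands in for a real compression standard such as JPEG (an assumption), and a single learned residual layer `enhance` stands in for the third neural network:

```python
import numpy as np

STEP = 16.0

def standard_compress(image):
    # Stand-in for a compressor following a preset compression standard:
    # coarse quantisation discards information, as lossy coding does.
    return np.round(image / STEP).astype(np.int32)

def standard_decompress(code):
    # Matching decompressor for the preset standard; it has no trainable
    # parameters, so it is never adjusted during training.
    return code.astype(float) * STEP

def enhance(decoded, w):
    # Third neural network: the only trainable part. A learned residual
    # correction (single linear layer) stands in for a restoration net.
    return decoded + decoded @ w

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 255.0, size=8)    # a tiny flattened "image"
w = np.zeros((8, 8))                       # only these parameters receive updates
restored = enhance(standard_decompress(standard_compress(image)), w)
```

With `w` at zero, the enhancement is the identity and `restored` equals the plain standard round trip; training then adjusts only `w` to undo the quantisation damage.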
It should be understood that the above apparatus may be preset in a device or server with computing and processing capabilities, or may be loaded into such a device or server, for example by downloading. The modules of the apparatus may cooperate with modules in the device or server to implement the training scheme for the image processing model.
Since the apparatus embodiments substantially correspond to the method embodiments, reference may be made to the description of the method embodiments for the relevant details. The apparatus embodiments described above are merely illustrative: units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of one or more embodiments of this specification. A person of ordinary skill in the art can understand and implement the embodiments without inventive effort.
One or more embodiments of this specification further provide a computer-readable storage medium storing a computer program which, when executed, performs the method for training an image processing model provided in any one of the embodiments corresponding to fig. 2.
Corresponding to the above method for training an image processing model, one or more embodiments of this specification also propose a computing device, whose schematic block diagram according to an exemplary embodiment is shown in fig. 4. Referring to fig. 4, at the hardware level the computing device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, and may also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and runs it, forming the apparatus for training the image processing model at the logic level. Besides software implementations, the one or more embodiments of this specification do not exclude other implementations, such as logic devices or combinations of software and hardware; that is, the execution subject of the processing flows is not limited to logic units and may also be hardware or logic devices.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiment is described relatively briefly because it is substantially similar to the method embodiment; for relevant details, reference may be made to the description of the method embodiment.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Those of ordinary skill in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above in terms of their functions. Whether such functions are implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present application. Software modules may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above embodiments further describe the objects, technical solutions, and advantages of the present application in detail. It should be understood that the above are merely exemplary embodiments of the present application and are not intended to limit its scope; any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the present application shall fall within its scope.

Claims (22)

1. A method of training an image processing model, the method comprising:
compressing a sample image by using an encoder included in the image processing model to obtain compressed data, and decompressing and restoring the compressed data by using a decoder included in the image processing model to obtain a restored image;
inputting the restored image into a discrimination model, and judging the authenticity attribute of the restored image by the discrimination model to obtain a first result;
based on at least the first result, model parameters of the image processing model are adjusted.
2. The method of claim 1, wherein the method further comprises:
inputting the sample image into the discrimination model, and judging the authenticity attribute of the sample image by the discrimination model to obtain a second result;
based on at least the first result, adjusting model parameters of the image processing model, including:
based on the first result and the second result, model parameters of the image processing model are adjusted.
3. The method of claim 2, wherein said adjusting model parameters of the image processing model based on the first result and the second result comprises:
determining a first loss comprising a first loss term, wherein the first loss term is obtained, based on the first result and the second result, using a relative average least squares method;
adjusting model parameters of the image processing model based on the first loss.
4. The method of claim 3, wherein the first loss further comprises a second loss term; the second loss term is derived based on a loss of quality of the restored image relative to the sample image.
5. The method of claim 2, wherein the method further comprises:
and adjusting the model parameters of the discriminant model based on the first result and the second result.
6. The method of claim 2, wherein the method further comprises:
inputting the compressed data into the discrimination model;
wherein the determining the authenticity attribute of the restored image comprises:
judging the authenticity attribute of the restored image based on the compressed data;
wherein the determining the authenticity attribute of the sample image comprises:
and judging the authenticity attribute of the sample image based on the compressed data.
7. The method of claim 1, wherein the encoder is a first neural network; the decoder is a second neural network.
8. The method of claim 7, wherein said adjusting model parameters of said image processing model comprises:
adjusting network parameters of the first neural network and of the second neural network, respectively.
9. The method of claim 1, wherein the encoder comprises a compressor that employs a preset compression standard; the decoder comprises a decompressor and a third neural network, wherein the decompressor adopts a decompression standard corresponding to the preset compression standard.
10. The method of claim 9, wherein said adjusting model parameters of said image processing model comprises:
adjusting a network parameter of the third neural network.
11. An apparatus for training an image processing model, the apparatus comprising:
the processing module is used for compressing the sample image by using an encoder included in the image processing model to obtain compressed data, and decompressing and restoring the compressed data by using a decoder included in the image processing model to obtain a restored image;
the first judging module is used for inputting the restored image into a judging model, and judging the authenticity attribute of the restored image by the judging model to obtain a first result;
a first adjusting module for adjusting model parameters of the image processing model based at least on the first result.
12. The apparatus of claim 11, wherein the apparatus further comprises:
the second judging module is used for inputting the sample image into the judging model, and judging the authenticity attribute of the sample image by the judging model to obtain a second result;
wherein the first adjusting module comprises:
an adjusting sub-module for adjusting model parameters of the image processing model based on the first result and the second result.
13. The apparatus of claim 12, wherein the adjustment sub-module is configured to:
determine a first loss comprising a first loss term, wherein the first loss term is obtained, based on the first result and the second result, using a relative average least squares method; and
adjust model parameters of the image processing model based on the first loss.
14. The apparatus of claim 13, wherein the first loss further comprises a second loss term; the second loss term is derived based on a loss of quality of the restored image relative to the sample image.
15. The apparatus of claim 12, wherein the apparatus further comprises:
and the second adjusting module is used for adjusting the model parameters of the discriminant model based on the first result and the second result.
16. The apparatus of claim 12, wherein the apparatus further comprises:
the input module is used for inputting the compressed data into the discrimination model;
wherein the first determination module is configured to:
judging the authenticity attribute of the restored image based on the compressed data;
wherein the second determination module is configured to:
and judging the authenticity attribute of the sample image based on the compressed data.
17. The apparatus of claim 11, wherein the encoder is a first neural network; the decoder is a second neural network.
18. The apparatus of claim 17, wherein the first adjustment module is configured to:
adjust network parameters of the first neural network and of the second neural network, respectively.
19. The apparatus of claim 11, wherein the encoder comprises a compressor that employs a preset compression standard; the decoder comprises a decompressor and a third neural network, wherein the decompressor adopts a decompression standard corresponding to the preset compression standard.
20. The apparatus of claim 19, wherein the first adjustment module is configured to:
adjusting a network parameter of the third neural network.
21. A computer-readable storage medium, having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any one of claims 1-10.
22. A computing device comprising a memory having executable code stored therein and a processor that, when executing the executable code, implements the method of any of claims 1-10.
CN202110820836.6A 2021-07-20 2021-07-20 Training method and device of image processing model Pending CN113362403A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110820836.6A CN113362403A (en) 2021-07-20 2021-07-20 Training method and device of image processing model


Publications (1)

Publication Number Publication Date
CN113362403A true CN113362403A (en) 2021-09-07

Family

ID=77540011

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110820836.6A Pending CN113362403A (en) 2021-07-20 2021-07-20 Training method and device of image processing model

Country Status (1)

Country Link
CN (1) CN113362403A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109040763A (en) * 2018-08-07 2018-12-18 北京飞搜科技有限公司 A kind of method for compressing image and system based on production confrontation network
CN109377532A (en) * 2018-10-18 2019-02-22 众安信息技术服务有限公司 Image processing method and device neural network based
US10284432B1 (en) * 2018-07-03 2019-05-07 Kabushiki Kaisha Ubitus Method for enhancing quality of media transmitted via network
US20190206091A1 (en) * 2017-12-29 2019-07-04 Baidu Online Network Technology (Beijing) Co., Ltd Method And Apparatus For Compressing Image
CN110442804A (en) * 2019-08-13 2019-11-12 北京市商汤科技开发有限公司 A kind of training method, device, equipment and the storage medium of object recommendation network
CN110598843A (en) * 2019-07-23 2019-12-20 中国人民解放军63880部队 Generation countermeasure network organization structure based on discriminator sharing and training method thereof
CN110880193A (en) * 2019-12-03 2020-03-13 山东浪潮人工智能研究院有限公司 Image compression method using depth semantic segmentation technology
CN111050174A (en) * 2019-12-27 2020-04-21 清华大学 Image compression method, device and system
CN111738351A (en) * 2020-06-30 2020-10-02 创新奇智(重庆)科技有限公司 Model training method and device, storage medium and electronic equipment
CN112529058A (en) * 2020-12-03 2021-03-19 北京百度网讯科技有限公司 Image generation model training method and device and image generation method and device
CN112565777A (en) * 2020-11-30 2021-03-26 通号智慧城市研究设计院有限公司 Deep learning model-based video data transmission method, system, medium and device



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210907