CN114845069A - Video processing method and device, electronic equipment and storage medium - Google Patents

Video processing method and device, electronic equipment and storage medium

Info

Publication number
CN114845069A
CN114845069A (application CN202110142063.0A)
Authority
CN
China
Prior art keywords
image
hidden
preset
style
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110142063.0A
Other languages
Chinese (zh)
Inventor
徐璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan TCL Group Industrial Research Institute Co Ltd
Original Assignee
Wuhan TCL Group Industrial Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan TCL Group Industrial Research Institute Co Ltd filed Critical Wuhan TCL Group Industrial Research Institute Co Ltd
Priority to CN202110142063.0A priority Critical patent/CN114845069A/en
Publication of CN114845069A publication Critical patent/CN114845069A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00: Details of television systems
    • H04N5/222: Studio circuitry; Studio devices; Studio equipment
    • H04N5/262: Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/268: Signal distribution or switching
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146: Data rate or code amount at the encoder output
    • H04N19/149: Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00: Details of colour television systems
    • H04N9/64: Circuits for processing colour signals
    • H04N9/646: Circuits for processing colour signals for image enhancement, e.g. vertical detail restoration, cross-colour elimination, contour correction, chrominance trapping filters

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Image Processing (AREA)

Abstract

The application is applicable to the technical field of image processing, and provides a video processing method, a video processing device, an electronic device and a storage medium. The method comprises the following steps: acquiring an initial video, the frame number of which is a first number; determining a second hidden coding sequence according to the first number and a first hidden coding sequence of a target image style; inputting each frame of initial image contained in the initial video and the corresponding second hidden coding sequence into a preset image color transformation model for processing, and outputting a transformed image corresponding to each frame of initial image; and synthesizing all the transformed image frames to obtain the target video. Because the transformed image corresponding to each initial image frame is obtained through the trained image color transformation model, a video with rich colors can be produced, and abnormalities such as color spots, bright spots, color imbalance and artifacts during video color transformation, which sometimes even appear as large-area color blocks and render the result completely abnormal, are avoided.

Description

Video processing method and device, electronic equipment and storage medium
Technical Field
The present application belongs to the field of image processing technologies, and in particular, to a video processing method and apparatus, an electronic device, and a storage medium.
Background
With the development of electronic equipment, more and more people use electronic equipment such as mobile phones and unmanned aerial vehicles to shoot outdoor images, but due to factors such as the shooting time and shooting duration, the obtained images are often not rich enough in color. After capturing an image, a user may therefore wish to perform a style or color transformation on the original image. The existing approach to performing style and color transformation on an initial image mainly relies on deep learning, obtaining an image color transformation model by training a neural network. However, such training typically uses only a simple image training set, so the obtained model, when used, produces abnormalities such as color spots, bright spots, color imbalance and artifacts, and sometimes even large-area color blocks that render the result completely abnormal.
Disclosure of Invention
The embodiments of the application provide a video processing method, a video processing device, an electronic device and a storage medium, which can solve the problem that a model obtained by the existing way of training an image color transformation model produces abnormalities such as color spots, bright spots, color imbalance and artifacts when used, and sometimes even large-area color blocks that render the result completely abnormal.
In a first aspect, an embodiment of the present application provides a video processing method, including:
acquiring an initial video, wherein the frame number of the initial video is a first number;
determining a second hidden coding sequence according to the first number and a first hidden coding sequence of the target image style;
inputting each frame of initial image contained in the initial video and the corresponding second hidden coding sequence into a preset image color transformation model for processing, and outputting a transformed image corresponding to each frame of initial image;
and synthesizing all the frame transformation images to obtain the target video.
In a second aspect, an embodiment of the present application provides a video processing apparatus, including:
the first obtaining unit is used for obtaining an initial video, and the number of frames of the initial video is a first number;
the determining unit is used for determining a second hidden coding sequence according to the first number and the first hidden coding sequence of the target image style;
the processing unit is used for inputting each frame of initial image contained in the initial video and the corresponding second hidden coding sequence into a preset image color transformation model for processing and outputting a transformation image corresponding to each frame of initial image;
and the synthesizing unit is used for synthesizing all the frame conversion images to obtain the target video.
In a third aspect, an embodiment of the present application provides an electronic device, where the electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor, when executing the computer program, implements the video processing method according to the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, where a computer program is stored, and when executed by a processor, the computer program implements the video processing method according to the first aspect.
In the embodiment of the application, an initial video is obtained; a second hidden coding sequence is determined according to the first number and a first hidden coding sequence of the target image style; each frame of initial image contained in the initial video and the corresponding second hidden coding sequence are input into a preset image color transformation model for processing, and a transformed image corresponding to each frame of initial image is output; and all the transformed image frames are synthesized to obtain the target video. Because the transformed image corresponding to each initial image frame is obtained through the trained image color transformation model, a video with rich colors can be produced, and abnormalities such as color spots, bright spots, color imbalance and artifacts during video color transformation, which sometimes even appear as large-area color blocks and render the result completely abnormal, are avoided.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic flow chart of a video processing method according to a first embodiment of the present application;
fig. 2 is a schematic flowchart of a refinement of S102 in a video processing method according to a first embodiment of the present application;
fig. 3 is a schematic flowchart of S105 to S106 in a video processing method according to a first embodiment of the present application;
fig. 4 is a schematic flowchart of a refinement of S106 in a video processing method according to a first embodiment of the present application;
fig. 5 is a schematic flowchart of a refinement at S1061 in a video processing method according to a first embodiment of the present application;
fig. 6 is a schematic diagram of a video processing apparatus according to a second embodiment of the present application;
fig. 7 is a schematic diagram of an electronic device provided in a third embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Referring to fig. 1, fig. 1 is a schematic flowchart of a video processing method according to a first embodiment of the present application. In this embodiment, an execution subject of the video processing method is a device with a video color transformation function, such as a desktop computer and a server. The video processing method as shown in fig. 1 may include:
s101: an initial video is obtained, and the frame number of the initial video is a first number.
In practice, the initial video may be, for example, a landscape video; the type of the initial video is not limited herein.
The device obtains an initial video. The initial video may be captured by a built-in camera of the local terminal device, may be a local video stored on the local terminal device, or may be an initial video sent by another device and received by the local terminal device.
The device acquires a first number, denoted F, of initial image frames comprised by the initial video.
S102: a second steganographic encoding sequence is determined based on the first number and the first steganographic encoding sequence of the target image style.
The device obtains a first hidden coding sequence of the target image style, where the target image style is the style into which the initial video is to be converted. The first hidden coding sequence of the target image style is pre-stored in the device.
The first hidden coding sequence of the target image style can be obtained by processing with the image color transformation model obtained by the training method described in this embodiment, and once obtained in this way it can be stored in the local terminal device for video color transformation.
The device determines a second hidden coding sequence according to the first number and the first hidden coding sequence of the target image style, where the length of the second hidden coding sequence is the same as the first number. Specifically, the length of the second hidden coding sequence is made equal to the first number, i.e. the number of initial image frames of the initial video, by means of interpolation. Alternatively, the difference between the length of the second hidden coding sequence and the first number may be kept within a preset range.
In one embodiment, S102 may include S1021 to S1024, and as shown in fig. 2, S1021 to S1024 are as follows:
s1021: and acquiring the length of the first implicit coding sequence of the target image style.
The device obtains the sequence length of the first hidden coding sequence of the target image style, denoted q.
S1022: and calculating the length of the sequence to be inserted in the first hidden coding sequence according to the length and the first number of the first hidden coding sequence, and determining the position to be inserted in the first hidden coding sequence according to a preset insertion rule.
The device calculates the length P_i of the sequence to be inserted in each interval of the first hidden coding sequence according to the sequence length and the first number. Specifically, the average length of the sequence inserted between adjacent codes of the first hidden coding sequence is:
s = int((F - q) / (q - 1))
and the remaining length is:
t = F - q - s * (q - 1)
where the remaining insertion length is allocated to the first t intervals, so that P_i = s + 1 for the first t intervals and P_i = s for the remaining intervals.
S1023: and determining the hidden codes to be inserted corresponding to the positions to be inserted according to the hidden codes on the two sides of the positions to be inserted and the length of the sequence to be inserted.
The device determines the hidden codes to be inserted at each position to be inserted according to the hidden codes on the two sides of the position to be inserted and the length of the sequence to be inserted. Specifically, the j-th hidden code inserted in the i-th interval is calculated according to the following formulas:
λ_j = j / (P_i + 1)
z_ij = (1 - λ_j) z_1 + λ_j z_2
where z_1 and z_2 are the hidden codes on the two sides of the position to be inserted.
S1024: and inserting the hidden codes to be inserted into the positions to be inserted to obtain second hidden coding sequences, wherein the length of the second hidden coding sequences is the same as the first number.
The device inserts the hidden codes to be inserted at the positions to be inserted to obtain the second hidden coding sequence. This ensures that the length of the second hidden coding sequence is consistent with the number of frames of the initial video, which facilitates the subsequent style transformation of the video using the preset image color transformation model.
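For clarity, the following is a minimal Python sketch of the interpolation described in S1022 and S1023; the function name interpolate_latent_sequence, the NumPy representation and the example shapes are illustrative assumptions rather than part of the patent.

```python
import numpy as np

def interpolate_latent_sequence(first_codes: np.ndarray, frame_count: int) -> np.ndarray:
    """Expand the first hidden coding sequence (length q) into a second hidden
    coding sequence whose length equals the frame count F (S1022-S1023)."""
    q = len(first_codes)                     # length of the first hidden coding sequence
    F = frame_count                          # first number: frame count of the initial video
    s = (F - q) // (q - 1)                   # average number of codes inserted per interval
    t = F - q - s * (q - 1)                  # remaining length, allocated to the first t intervals

    second_codes = []
    for i in range(q - 1):
        z1, z2 = first_codes[i], first_codes[i + 1]   # hidden codes on both sides of the gap
        p_i = s + 1 if i < t else s                   # P_i: number of codes inserted in the i-th interval
        second_codes.append(z1)
        for j in range(1, p_i + 1):
            lam = j / (p_i + 1)                       # interpolation coefficient λ_j
            second_codes.append((1 - lam) * z1 + lam * z2)
    second_codes.append(first_codes[-1])
    return np.stack(second_codes)                     # shape (F, code_dim)

# Example: a 5-code style sequence of 8-dimensional vectors expanded to a 150-frame video.
codes = np.random.randn(5, 8)
assert interpolate_latent_sequence(codes, 150).shape == (150, 8)
```

The total length works out to q original codes plus s*(q-1) + t inserted codes, which equals F, matching the frame count as required.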
S103: and inputting each frame of initial image contained in the initial video and the corresponding second hidden coding sequence into a preset image color conversion model for processing, and outputting a conversion image corresponding to each frame of initial image.
The device inputs each frame of initial image and the corresponding hidden code in the second hidden coding sequence into the preset image color transformation model for processing to obtain the transformed image corresponding to each frame of initial image. Specifically, the initial image and the corresponding hidden code are input into the preset image color transformation model to obtain feature information; the feature information is input into the attention mechanism module to obtain a target color transformation parameter matrix, which is decomposed into a target weight parameter and a target bias parameter; and the initial image is transformed according to a preset transformation rule, the target weight parameter and the target bias parameter to obtain the transformed image corresponding to the initial image.
S104: and synthesizing all the frame transformation images to obtain the target video.
And synthesizing all the frame transformation images by the equipment to obtain the target video.
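As an illustration of S103 and S104, one possible sketch of applying a trained model frame by frame and re-synthesizing the video is given below; the model interface model(frame, code) returning a weight map w and a bias map b, and the use of OpenCV for video input and output, are assumptions made for the example rather than requirements of the method.

```python
import cv2
import numpy as np

def stylize_video(in_path, out_path, model, second_codes):
    """Apply the preset image color transformation model to each frame (S103)
    and synthesize the transformed frames into the target video (S104)."""
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    writer = None
    for code in second_codes:                       # one hidden code per frame
        ok, frame = cap.read()
        if not ok:
            break
        w, b = model(frame, code)                   # assumed interface: weight and bias maps
        out = np.clip(w * frame + b, 0, 255).astype(np.uint8)   # I' = w ⊙ I + b
        if writer is None:
            h, wd = out.shape[:2]
            writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (wd, h))
        writer.write(out)
    cap.release()
    if writer is not None:
        writer.release()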
Before S101, this embodiment may further include S105 to S106, as shown in fig. 3, S105 to S106 are specifically as follows:
s105: and acquiring a sample training set, wherein the sample training set comprises content images and style images corresponding to the content images.
The device obtains a sample training set, where the sample training set includes content images and the style images corresponding to the content images. Since the method in this embodiment operates on images, both the content image and its corresponding style image are single images. The content image may be a reference image, and the style image corresponding to the content image is an image obtained by performing color transformation on the content image.
In the present embodiment, the style transformation is actually a color transformation performed on the image, so that the style of the image is changed. For example, ordinary photos can be converted into the styles of Van Gogh paintings, Monet paintings or cartoon drawings; alternatively, a picture taken in the morning can be converted into a picture taken in the evening. All of these can be achieved by a color transformation.
In one embodiment, the device may process initial images in a preset manner to generate the sample training set. First, the device may acquire an initial image, for example an image taken in the daytime. A preset style migration network is then used to obtain initial style images at different moments corresponding to the initial image. The device may select high-quality initial images as the content images, select the initial style images corresponding to those high-quality initial images as the style images corresponding to the content images, and thereby generate the sample training set.
The high-quality initial image and the corresponding initial style image refer to images without obvious abnormality, color deviation and bright spots.
S106: and inputting each content image and the style image corresponding to the content image into a preset neural network for training to obtain a preset image color transformation model for outputting a transformation image.
The device inputs each content image and the style image corresponding to the content image into a preset neural network for training to obtain an image color transformation model for outputting transformed images. During training, the input of the preset neural network is a content image and its corresponding style image, and the output of the preset neural network is the transformed image corresponding to the content image. The training process mainly consists of inputting the content images into the preset neural network, computing the loss through forward inference of the preset neural network, and updating the parameters of the preset neural network by backpropagating the loss; as the parameters are updated, the loss continuously decreases until the training stopping condition is met, at which point the image color transformation model for outputting transformed images is obtained.
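A compact sketch of this training loop follows; the PyTorch framework, the model interface (returning the transformed image together with the hidden codes z and z' used by the hidden coding loss) and the function total_loss, which is sketched after the loss functions below, are assumptions made for illustration. The overall flow of forward inference, loss computation, backpropagation and stopping condition follows the description above.

```python
import torch

def train(model, dataloader, num_epochs=100, stop_loss=1e-3, lr=1e-4):
    """Train the preset neural network on (content image, style image) pairs (S106)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(num_epochs):
        running = 0.0
        for content, style in dataloader:
            transformed, z, z_prime = model(content, style)    # forward inference
            loss = total_loss(transformed, content, style, z_prime, z)
            optimizer.zero_grad()
            loss.backward()                                    # backpropagate the loss
            optimizer.step()                                   # update the network parameters
            running += loss.item()
        if running / len(dataloader) < stop_loss:              # preset training stopping condition
            break
    return model                                               # preset image color transformation model
```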
Specifically, to obtain a more accurate image color transformation model, S106 may include S1061 to S1064, as shown in fig. 4, where S1061 to S1064 are specifically as follows:
s1061: and inputting the content image and the style image corresponding to the content image into a preset neural network for processing, and outputting a transformation image corresponding to the content image.
The device inputs the content image and the style image corresponding to the content image into the preset neural network for processing to obtain the transformed image corresponding to the content image. The preset neural network may include an input layer, a hidden layer and an output layer (a loss function layer). The input layer includes input layer nodes for receiving the input content image and its corresponding style image from the outside. The hidden layer is used for processing the content image and the style image corresponding to the content image. The output layer is used for outputting the transformed image corresponding to the content image.
In one embodiment, the predetermined neural network includes an encoder module, a codec module, and an attention mechanism module, and S1061 may include S10611 to S10614, as shown in fig. 5, where S10611 to S10614 are specifically as follows:
s10611: and inputting the style image into an encoder module for encoding, and outputting a first hidden code.
The device inputs the style image into the encoder module for encoding to obtain a first hidden code. The first hidden code is a vector; for example, the first hidden code is an 8-dimensional vector composed of floating point numbers.
S10612: and inputting the content image and the first hidden code into a coder-decoder module for processing, and outputting first characteristic information.
In this embodiment, the codec module is used to extract features. The device inputs the content image and the first hidden code into the codec module for processing to obtain first characteristic information.
S10613: and inputting the first characteristic information into an attention mechanism module for processing, outputting a color transformation parameter matrix, and decomposing the color transformation parameter matrix into weight parameters and bias parameters.
And the equipment inputs the first characteristic information into an attention mechanism module for processing to obtain a color transformation parameter matrix, and the color transformation parameter matrix is marked as C.
The device then decomposes the color transformation parameter matrix into a weight parameter w and a bias parameter b.
S10614: and transforming the content image according to a preset transformation rule, the weight parameter and the bias parameter to obtain a transformed image corresponding to the content image.
The device pre-stores a preset transformation rule, and transforms the content image according to the preset transformation rule, the weight parameter and the bias parameter to obtain the transformed image corresponding to the content image. Specifically, the preset transformation rule may be:
I' = w ⊙ I + b
where I' is the transformed image corresponding to the content image, I is the content image, w is the weight parameter, b is the bias parameter, and ⊙ denotes the Hadamard (element-wise) product of matrices.
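As a concrete illustration of this preset transformation rule, the element-wise transform can be written in a few lines; treating w and b as per-pixel, per-channel maps over a normalized image is an assumption made for the example.

```python
import numpy as np

def apply_color_transform(content: np.ndarray, w: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Preset transformation rule I' = w ⊙ I + b (Hadamard product plus bias)."""
    transformed = w * content + b            # element-wise (Hadamard) multiplication, then bias
    return np.clip(transformed, 0.0, 1.0)    # keep the result in the valid image range
```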
S1062: and acquiring difference information between the style image and the transformed image according to a preset loss function.
The device obtains difference information between the stylized image and the transformed image according to a preset loss function.
Specifically, a preset loss function is prestored in the device, and a loss value can be calculated according to the preset loss function, wherein the loss value is the difference information between the style image and the transformation image.
In one mode, the preset loss function of the image color transformation model in the training process includes a style loss function, a spatial color loss function, a content loss function, a regularization loss function, a recovery loss function or a hidden coding loss function.
The style loss function may be expressed as:
L_s = Σ_l || G(F_l(I')) - G(F_l(I_s)) ||
where F_l is the layer-l feature of the VGG network, G is the Gram matrix operation, I_s is the style image, I is the content image, and I' is the transformed image corresponding to I.
The spatial color loss function can be expressed as:
L_sp = || SP(I') - SP(I) ||
where SP denotes the spatial pyramid pooling function, I' is the transformed image corresponding to the content image, and I is the content image.
The content loss function can be expressed as:
L_c = Σ_l || F_l(I') - F_l(I) ||
where F_l is the layer-l feature of the VGG network, I' is the transformed image corresponding to the content image, and I is the content image.
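The following sketch shows how Gram-matrix style and feature-space content losses of this general form are typically computed from VGG features; the choice of VGG-19, the layer indices and the squared-error norms are placeholders rather than the exact settings of the patent.

```python
import torch
import torchvision

vgg = torchvision.models.vgg19(weights="DEFAULT").features.eval()
LAYERS = [3, 8, 17, 26]                       # assumed VGG layer indices l

def vgg_features(x):
    feats, h = [], x
    for idx, layer in enumerate(vgg):
        h = layer(h)
        if idx in LAYERS:
            feats.append(h)
    return feats

def gram(f):                                   # Gram matrix operation G
    n, c, hh, ww = f.shape
    f = f.view(n, c, hh * ww)
    return f @ f.transpose(1, 2) / (c * hh * ww)

def style_loss(transformed, style_img):        # L_s: Gram-matrix distance per layer
    return sum(torch.mean((gram(a) - gram(b)) ** 2)
               for a, b in zip(vgg_features(transformed), vgg_features(style_img)))

def content_loss(transformed, content_img):    # L_c: feature distance per layer
    return sum(torch.mean((a - b) ** 2)
               for a, b in zip(vgg_features(transformed), vgg_features(content_img)))
```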
The regularization loss function is a smoothness constraint on the color transformation parameter matrix C obtained by the neural network prediction: for the content image I, it penalizes, for each position p and each position q in the neighborhood N(p) of p, the difference between the values of C at p and at q, where the positions include the x direction and the y direction and q is the position next to p.
The recovery loss function can be expressed as:
L_rec = || I' - I ||_1
where I' is the transformed image corresponding to the content image, and I is the content image.
The hidden coding loss function can be expressed as:
L_z = || z' - z ||_1
where z is the hidden code of the style image and z' is the hidden code of the transformed image.
During the training process, the preset loss function may include one or more of the above-mentioned loss functions. It can be understood that each loss function is calculated to obtain a corresponding loss value, and when the preset loss function is a plurality of loss functions, the difference information can be calculated by performing weighted summation on all the obtained loss values.
For example, the difference information may be:
L_total = λ_s L_s + λ_sp L_sp + λ_c L_c + λ_tv L_tv + λ_rec L_rec + λ_z L_z
where each λ is a weighting coefficient; the values may be, for example, 1.0, 0.01, 1e-5, 0.1, 0.02 and 0.1, respectively.
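Putting the terms together, a weighted total loss of this form could be sketched as follows; the recovery and hidden coding terms are written out because they are simple L1 norms, while style_loss and content_loss refer to the earlier sketch, and the spatial color and regularization terms are left out because their exact formulas are not reproduced here. The weight values are the example coefficients listed above, and their assignment to particular terms is an assumption.

```python
import torch

W_S, W_SP, W_C, W_TV, W_REC, W_Z = 1.0, 0.01, 1e-5, 0.1, 0.02, 0.1   # example λ values

def recovery_loss(transformed, content):
    return torch.mean(torch.abs(transformed - content))       # L_rec = ||I' - I||_1

def hidden_code_loss(z_prime, z):
    return torch.mean(torch.abs(z_prime - z))                  # L_z = ||z' - z||_1

def total_loss(transformed, content, style, z_prime, z):
    """Difference information L_total as a weighted sum of loss terms."""
    # W_SP and W_TV would weight the spatial color and regularization terms when added.
    return (W_S * style_loss(transformed, style)               # Gram-matrix style loss
            + W_C * content_loss(transformed, content)         # VGG feature content loss
            + W_REC * recovery_loss(transformed, content)
            + W_Z * hidden_code_loss(z_prime, z))
```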
S1063: and if the difference information meets the preset training stopping condition, stopping training, and taking the current preset neural network as a preset image color transformation model.
A preset training stopping condition is set in the device. If the difference information meets the preset training stopping condition, the current preset neural network has converged sufficiently; training is stopped, and the current preset neural network is taken as the image color transformation model.
S1064: and if the difference information does not meet the preset training stopping condition, adjusting the network parameters of the preset neural network, returning to the step of inputting the content image and the style image corresponding to the content image into the preset neural network for processing, and outputting the transformation image corresponding to the content image.
And if the difference information does not meet the preset training stopping condition, adjusting the network parameters of the preset neural network, returning to input the content image and the style image corresponding to the content image into the preset neural network for processing to obtain a transformation image corresponding to the content image, and performing iterative training again.
In the embodiment of the application, an initial video is obtained; a second hidden coding sequence is determined according to the first number and a first hidden coding sequence of the target image style; each frame of initial image contained in the initial video and the corresponding second hidden coding sequence are input into a preset image color transformation model for processing, and a transformed image corresponding to each frame of initial image is output; and all the transformed image frames are synthesized to obtain the target video. Because the transformed image corresponding to each frame of initial image is obtained through the trained image color transformation model, a video with rich colors can be produced, and abnormalities such as color spots, bright spots, color imbalance and artifacts during video color transformation, which sometimes even appear as large-area color blocks and render the result completely abnormal, are avoided.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Referring to fig. 6, fig. 6 is a schematic view of a video processing apparatus according to a second embodiment of the present application. The units are used for executing the steps in the embodiment corresponding to the figures 1-5. Please refer to fig. 1-5 for related descriptions. For convenience of explanation, only the portions related to the present embodiment are shown. Referring to fig. 6, the video processing apparatus 6 includes:
a first obtaining unit 610, configured to obtain an initial video, where a frame number of the initial video is a first number;
a determining unit 620, configured to determine a second hidden coding sequence according to the first number and the first hidden coding sequence of the target image style;
a processing unit 630, configured to input each frame of initial image included in the initial video and the corresponding second hidden coding sequence into a preset image color transformation model for processing, and output a transformed image corresponding to each frame of initial image;
and a synthesizing unit 640, configured to synthesize all the frame transformation images to obtain a target video.
Further, the determining unit 620 is specifically configured to:
acquiring the length of a first hidden coding sequence of a target image style;
calculating the length of a sequence to be inserted in the first hidden coding sequence according to the length and the first quantity of the first hidden coding sequence, and determining the position to be inserted in the first hidden coding sequence according to a preset insertion rule;
determining the hidden codes to be inserted corresponding to the positions to be inserted according to the hidden codes on the two sides of the positions to be inserted and the length of the sequences to be inserted;
and inserting the hidden codes to be inserted into the positions to be inserted to obtain second hidden coding sequences, wherein the length of the second hidden coding sequences is the same as the first number.
Further, the video processing apparatus 6 further includes:
the second acquisition unit is used for acquiring a sample training set, and the sample training set comprises content images and style images corresponding to the content images;
and the training unit is used for inputting each content image and the style image corresponding to the content image into a preset neural network for training to obtain a preset image color transformation model for outputting a transformation image.
Further, the training unit is specifically configured to:
inputting the content image and the style image corresponding to the content image into a preset neural network for processing, and outputting a transformation image corresponding to the content image;
acquiring difference information between the style image and the transformed image according to a preset loss function;
if the difference information meets the preset training stopping condition, stopping training, and taking the current preset neural network as a preset image color transformation model; or,
and if the difference information does not meet the preset training stopping condition, adjusting the network parameters of the preset neural network, returning to the step of inputting the content image and the style image corresponding to the content image into the preset neural network for processing, and outputting the transformation image corresponding to the content image.
Furthermore, the preset neural network comprises an encoder module, a coder-decoder module and an attention mechanism module;
a training unit, specifically configured to:
inputting the style image into an encoder module for encoding, and outputting a first hidden code;
inputting the content image and the first hidden code into a coder-decoder module for processing, and outputting first characteristic information;
inputting the first characteristic information into an attention mechanism module for processing, outputting a color transformation parameter matrix, and decomposing the color transformation parameter matrix into a weight parameter and a bias parameter;
and transforming the content image according to a preset transformation rule, the weight parameter and the bias parameter to obtain a transformed image corresponding to the content image.
Further, the preset loss function of the image color transformation model in the training process comprises a style loss function, a space color loss function, a content loss function, a regularization loss function, a recovery loss function or a hidden coding loss function.
Fig. 7 is a schematic diagram of an electronic device provided in a third embodiment of the present application. As shown in fig. 7, the electronic apparatus 7 of this embodiment includes: a processor 70, a memory 71, and a computer program 72, such as a video processing program, stored in the memory 71 and executable on the processor 70. The processor 70, when executing the computer program 72, implements the steps in the various video processing method embodiments described above, such as steps 101 to 104 shown in fig. 1. Alternatively, the processor 70, when executing the computer program 72, implements the functions of the modules/units in the above-described device embodiments, such as the functions of the modules 610 to 640 shown in fig. 6.
Illustratively, the computer program 72 may be divided into one or more modules/units, which are stored in the memory 71 and executed by the processor 70 to accomplish the present application. One or more of the modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 72 in the electronic device 7. For example, the computer program 72 may be divided into a first acquiring unit, a determining unit, a processing unit, and a synthesizing unit, and the specific functions of each unit are as follows:
the first obtaining unit is used for obtaining an initial video, and the number of frames of the initial video is a first number;
the determining unit is used for determining a second hidden coding sequence according to the first number and the first hidden coding sequence of the target image style;
the processing unit is used for inputting each frame of initial image contained in the initial video and the corresponding second hidden coding sequence into a preset image color transformation model for processing and outputting a transformation image corresponding to each frame of initial image;
and the synthesizing unit is used for synthesizing all the frame transformation images to obtain the target video.
The electronic device may include, but is not limited to, a processor 70, a memory 71. It will be appreciated by those skilled in the art that fig. 7 is merely an example of the electronic device 7 and does not constitute a limitation of the electronic device 7 and may include more or less components than those shown, or combine certain components, or different components, e.g. the electronic device may also include input output devices, network access devices, buses, etc.
The Processor 70 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The storage 71 may be an internal storage unit of the electronic device 7, such as a hard disk or a memory of the electronic device 7. The memory 71 may also be an external storage device of the electronic device 7, such as a plug-in hard disk provided on the electronic device 7, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the electronic device 7 may also include both an internal storage unit and an external storage device of the electronic device 7. The memory 71 is used for storing computer programs and other programs and data required by the electronic device. The memory 71 may also be used to temporarily store data that has been output or is to be output.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
An embodiment of the present application further provides an electronic device, including: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, where the processor implements the steps of any of the above method embodiments when executing the computer program.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps that can be implemented in the above method embodiments.
The embodiments of the present application provide a computer program product, which when running on a mobile terminal, enables the mobile terminal to implement the steps in the above method embodiments when executed.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and can implement the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium. Such as a usb-disk, a removable hard disk, a magnetic or optical disk, etc. In certain jurisdictions, computer-readable media may not be an electrical carrier signal or a telecommunications signal in accordance with legislative and patent practice.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/device and method may be implemented in other ways. For example, the above-described apparatus/device embodiments are merely illustrative, and for example, a module or a unit may be divided into only one logic function, and may be implemented in other ways, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A video processing method, comprising:
acquiring an initial video, wherein the frame number of the initial video is a first number;
determining a second hidden coding sequence according to the first number and the first hidden coding sequence of the target image style;
inputting each frame of initial image contained in the initial video and the corresponding second hidden coding sequence into a preset image color transformation model for processing, and outputting a transformation image corresponding to each frame of initial image;
and synthesizing all the frame transformation images to obtain the target video.
2. The method of claim 1, wherein the determining a second hidden coding sequence according to the first number and the first hidden coding sequence of the target image style comprises:
acquiring the length of a first hidden coding sequence of a target image style;
calculating the length of a sequence to be inserted in the first hidden coding sequence according to the length of the first hidden coding sequence and the first quantity, and determining the position of the sequence to be inserted in the first hidden coding sequence according to a preset insertion rule;
determining the hidden codes to be inserted corresponding to the positions to be inserted according to the hidden codes on the two sides of the positions to be inserted and the length of the sequences to be inserted;
and inserting the hidden codes to be inserted into the positions to be inserted to obtain second hidden coding sequences, wherein the length of the second hidden coding sequences is the same as the first number.
3. The method of claim 1 or 2, wherein prior to said obtaining the initial video, the method further comprises:
acquiring a sample training set, wherein the sample training set comprises content images and style images corresponding to the content images;
and inputting each content image and the style image corresponding to the content image into a preset neural network for training to obtain the preset image color transformation model for outputting the transformation image.
4. The method of claim 3, wherein the inputting each of the content images and the corresponding style image thereof into a predetermined neural network for training to obtain the predetermined image color transformation model for outputting a transformed image comprises:
inputting the content image and the style image corresponding to the content image into a preset neural network for processing, and outputting a transformation image corresponding to the content image;
acquiring difference information between the style image and the transformation image according to a preset loss function;
if the difference information meets a preset training stopping condition, stopping training, and taking a current preset neural network as the preset image color transformation model; or,
and if the difference information does not meet the preset training stopping condition, adjusting the network parameters of the preset neural network, returning to the step of inputting the content image and the style image corresponding to the content image into the preset neural network for processing, and outputting the transformation image corresponding to the content image.
5. The method of claim 4, wherein the preset neural network comprises an encoder module, a codec module and an attention mechanism module;
the step of inputting the content image and the style image corresponding to the content image into a preset neural network for processing and outputting a transformation image corresponding to the content image comprises the following steps:
inputting the style image into the encoder module for encoding, and outputting a first hidden code;
inputting the content image and the first implicit code into the codec module for processing, and outputting first characteristic information;
inputting the first characteristic information into the attention mechanism module for processing, outputting a color transformation parameter matrix, and decomposing the color transformation parameter matrix into weight parameters and bias parameters;
and transforming the content image according to a preset transformation rule, the weight parameter and the bias parameter to obtain a transformed image corresponding to the content image.
6. The method of claim 4 or 5, wherein the preset loss function of the image color transformation model in the training process comprises a style loss function, a spatial color loss function, a content loss function, a regularization loss function, a recovery loss function, or a hidden coding loss function.
7. A video processing apparatus, comprising:
the device comprises a first acquisition unit, a second acquisition unit and a third acquisition unit, wherein the first acquisition unit is used for acquiring an initial video, and the frame number of the initial video is a first number;
the determining unit is used for determining a second hidden coding sequence according to the first number and the first hidden coding sequence of the target image style;
the processing unit is used for inputting each frame of initial image contained in the initial video and the corresponding second hidden coding sequence into a preset image color transformation model for processing and outputting a transformation image corresponding to each frame of initial image;
and the synthesizing unit is used for synthesizing all the frame conversion images to obtain the target video.
8. The apparatus according to claim 7, wherein the determining unit is specifically configured to:
acquiring the length of a first hidden coding sequence of a target image style;
calculating the length of a sequence to be inserted in the first hidden coding sequence according to the length of the first hidden coding sequence and the first quantity, and determining the position of the sequence to be inserted in the first hidden coding sequence according to a preset insertion rule;
determining the hidden codes to be inserted corresponding to the positions to be inserted according to the hidden codes on the two sides of the positions to be inserted and the length of the sequences to be inserted;
and inserting the hidden codes to be inserted into the positions to be inserted to obtain second hidden coding sequences, wherein the length of the second hidden coding sequences is the same as the first number.
9. An electronic device, characterized in that the electronic device comprises a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the method according to any of claims 1-6 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
CN202110142063.0A 2021-02-02 2021-02-02 Video processing method and device, electronic equipment and storage medium Pending CN114845069A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110142063.0A CN114845069A (en) 2021-02-02 2021-02-02 Video processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110142063.0A CN114845069A (en) 2021-02-02 2021-02-02 Video processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114845069A true CN114845069A (en) 2022-08-02

Family

ID=82561301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110142063.0A Pending CN114845069A (en) 2021-02-02 2021-02-02 Video processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114845069A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117314890A (en) * 2023-11-07 2023-12-29 东莞市富明钮扣有限公司 Safety control method, device, equipment and storage medium for button making processing
CN117314890B (en) * 2023-11-07 2024-04-23 东莞市富明钮扣有限公司 Safety control method, device, equipment and storage medium for button making processing


Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination