WO2022105779A1 - Image processing method, model training method, apparatus, medium, and device
- Publication number: WO2022105779A1 (PCT/CN2021/131155)
- Authority: WO (WIPO PCT)
Classifications
- G06T3/4038: Image mosaicing, e.g. composing plane images from plane sub-images
- G06F18/253: Fusion techniques of extracted features
- G06N3/045: Combinations of networks
- G06N3/047: Probabilistic or stochastic networks
- G06N3/08: Learning methods
- G06N3/088: Non-supervised learning, e.g. competitive learning
- G06T3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046: Scaling of whole images or parts thereof using neural networks
- G06T5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06V10/454: Integrating biologically inspired filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
- G06V10/751: Comparing pixel values or feature values having positional relevance, e.g. template matching
- G06V10/764: Image or video recognition or understanding using classification, e.g. of video objects
- G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V10/82: Image or video recognition or understanding using neural networks
- G06T2207/10016: Video; Image sequence
- G06T2207/20221: Image fusion; Image merging
- G06T2207/30201: Face
Definitions
- the present application is based on CN application No. 202011298847.4, filed on November 18, 2020, and claims priority thereto; the disclosure of that CN application is hereby incorporated into the present application in its entirety.
- the present disclosure relates to the technical field of image processing, and in particular, to an image processing method, a model training method, an apparatus, a medium, and a device.
- the quality of an image is affected by many factors; for example, noise, image compression, and other causes can produce blurring, ripples, and dense noise.
- in the related art, the restoration and processing of images is mainly performed manually by technicians.
- manual processing is time-consuming and labor-intensive, and its efficiency is low, especially when a large number of image frames in videos or movies must be processed.
- current image restoration methods usually process the entire image directly as a whole, so the processing effect on specific objects in the image is poor.
- the present disclosure provides an image processing method, the method comprising: extracting a first target object image from an image to be processed; inputting the first target object image into a target object image processing model to obtain a second target object image output by the target object image processing model, wherein the resolution of the second target object image is higher than that of the first target object image; and performing image fusion according to the second target object image and the image to be processed to obtain a target image.
- the target object image processing model is a generative adversarial network model including a generator, and is obtained by training in the following way: using a low-resolution image of an original training sample image as the input of the generator to obtain the high-resolution image output after the generator processes the low-resolution image; determining, according to target difference information between the high-resolution image and the original training sample image, whether training of the model is completed, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image; and, in response to the completion of model training, obtaining the target object image processing model.
- the present disclosure provides a training method for a target object image processing model, wherein the target object image processing model is a generative adversarial network model including a generator.
- the method includes: using the low-resolution image of an original training sample image as the input of the generator to obtain the high-resolution image output by the generator after processing the low-resolution image; determining, according to the target difference information between the high-resolution image and the original training sample image, whether training of the model is completed, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image; and, in response to the completion of model training, obtaining the target object image processing model.
- the present disclosure provides an image processing apparatus, the apparatus comprising: an extraction module for extracting a first target object image from an image to be processed; an input module for inputting the first target object image into a target object image processing model to obtain a second target object image output by the target object image processing model, wherein the resolution of the second target object image is higher than that of the first target object image; and an image fusion module for performing image fusion according to the second target object image and the image to be processed to obtain a target image.
- the target object image processing model is a generative adversarial network model including a generator, and is obtained by training with a training device for the target object image processing model.
- the training device for the target object image processing model includes: an image obtaining module, configured to use the low-resolution image of the original training sample image as the input of the generator to obtain a high-resolution image output by the generator after processing the low-resolution image; a determining module, configured to determine, according to the target difference information between the high-resolution image and the original training sample image, whether model training is completed, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image; and a model obtaining module, configured to obtain the target object image processing model in response to the completion of model training.
- the present disclosure provides a training device for a target object image processing model, wherein the target object image processing model is a generative adversarial network model including a generator.
- the device includes: an image acquisition module, configured to use the low-resolution image of the original training sample image as the input of the generator to obtain a high-resolution image output by the generator after processing the low-resolution image; a determining module, configured to determine, according to the target difference information between the high-resolution image and the original training sample image, whether training of the model is completed, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image; and a model obtaining module, configured to obtain the target object image processing model in response to the completion of model training.
- the present disclosure provides a computer-readable medium on which a computer program is stored; when the program is executed by a processing device, the steps of the method provided in the first aspect of the present disclosure are implemented.
- the present disclosure provides a computer-readable medium on which a computer program is stored; when the program is executed by a processing device, the steps of the method provided in the second aspect of the present disclosure are implemented.
- the present disclosure provides an electronic device, comprising: a storage device on which a computer program is stored; and a processing device for executing the computer program in the storage device, so as to implement the steps of the method provided in the first aspect of the present disclosure.
- the present disclosure provides an electronic device, comprising: a storage device on which a computer program is stored; and a processing device for executing the computer program in the storage device, so as to implement the steps of the method provided in the second aspect of the present disclosure.
- the first target object image extracted from the image to be processed is input into the target object image processing model to obtain a second target object image with higher resolution; after this individual processing, the higher-resolution second target object image is fused with the image to be processed to obtain a target image, so the target object in the target image is clearer and its details are more realistic.
- when the model is trained through the feature point difference information or the specified feature area difference information, the entire image does not need to be compared, so model training is faster; alternatively, the target difference information can include both kinds of difference information, in which case the difference information considered is more comprehensive and the difference between the high-resolution image and the original training sample image can be characterized more accurately.
- training the model according to the target difference information makes the difference between the high-resolution image generated by the generator and the original training sample image smaller, that is, the generated image is more accurate.
- Fig. 1 is a flowchart of a training method of a target object image processing model according to an exemplary embodiment.
- Fig. 2 is a flowchart of an image processing method according to an exemplary embodiment.
- Fig. 3 is a block diagram of an image processing apparatus according to an exemplary embodiment.
- Fig. 4 is a block diagram of an apparatus for training a target object image processing model according to an exemplary embodiment.
- Fig. 5 is a schematic structural diagram of an electronic device according to an exemplary embodiment.
- the term "including" and variations thereof are open-ended inclusions, i.e., "including but not limited to".
- the term "based on" means "based at least in part on".
- the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
- the target object can be, for example, a person, a face, a building, a plant, or an animal.
- the present disclosure does not make specific restrictions on the target object.
- the model can be used to perform resolution enhancement processing on the target object image.
- Fig. 1 is a flowchart of a training method of a target object image processing model according to an exemplary embodiment; the method can be applied to an electronic device with processing capability.
- the target object image processing model may be a generative adversarial network model including a generator, such as an enhanced super-resolution generative adversarial network (ESRGAN, Enhanced Super-Resolution Generative Adversarial Networks) or a super-resolution generative adversarial network (SRGAN, Super-Resolution Generative Adversarial Networks).
- the method may include S101 to S103.
- the low-resolution image of the original training sample image is used as the input of the generator, and the high-resolution image output by the generator after processing the low-resolution image is obtained.
- the original training sample image may be any preset image, and the original training sample image may be an individual image or an image frame in a video file.
- an image including the target object in the image may be used as the original training sample image, for example, if the target object is a human face, the image including the human face may be used as the original training sample image.
- the low-resolution image of the original training sample image can be obtained in various ways, for example, by down-sampling the original training sample image, or by blurring the original training sample image; the resolution of the low-resolution image is not specifically limited in the present disclosure.
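for illustration only (not part of the patent), the down-sampling option above can be sketched with a simple block-average in NumPy; the function name and the factor are hypothetical choices:

```python
import numpy as np

def downsample(image: np.ndarray, factor: int) -> np.ndarray:
    """Create a low-resolution training input by average-pooling
    non-overlapping factor x factor blocks. This is one simple option;
    the disclosure also mentions blurring as an alternative."""
    h, w = image.shape[:2]
    h, w = h - h % factor, w - w % factor  # crop so dimensions divide evenly
    image = image[:h, :w]
    return image.reshape(h // factor, factor, w // factor, factor, -1).mean(axis=(1, 3))

# toy 8x8 single-channel "original training sample image"
hr = np.arange(64, dtype=np.float64).reshape(8, 8, 1)
lr = downsample(hr, 2)  # 4x4 low-resolution input for the generator
```

a real pipeline would feed `lr` to the generator and keep `hr` as the reference for the difference information described below.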
- the generator in the generative adversarial network model can be used to perform super-resolution processing on an image, that is, to improve its resolution so that the image is clearer and more realistic; after the low-resolution image of the original training sample image is input to the generator, the high-resolution image output by the generator is obtained.
- target difference information between the high-resolution image and the original training sample image can be obtained.
- the target difference information may include at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image.
- the difference information may include, for example, pixel difference information, color difference information, and the like.
- the feature points in the image can be key points in the image, and the feature points can be obtained by performing feature point detection on the image.
- the feature points in the image may include key points in the human face.
- feature points can be extracted from the high-resolution image and from the original training sample image, and the difference between the two sets of extracted feature points is determined as the feature point difference information.
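as a minimal sketch (not from the patent), once a landmark detector has produced corresponding feature points for both images, their difference can be summarized as a mean Euclidean distance; the detector itself and both arrays are assumed:

```python
import numpy as np

def feature_point_difference(points_hr: np.ndarray, points_orig: np.ndarray) -> float:
    """Mean Euclidean distance between corresponding feature points detected
    in the generated high-resolution image and in the original training
    sample image. Both arrays are assumed to be (N, 2) pixel coordinates
    in the same coordinate frame; the landmark detector is external."""
    return float(np.linalg.norm(points_hr - points_orig, axis=1).mean())

hr_pts   = np.array([[10.0, 10.0], [30.0, 12.0]])  # hypothetical detections
orig_pts = np.array([[10.0, 13.0], [34.0, 12.0]])
diff = feature_point_difference(hr_pts, orig_pts)
```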
- the designated feature region is a specified characteristic region of the target object in the image.
- for a face image, the designated feature regions may be regions such as the eyes, the mouth, and the nose.
- for a plant image, the designated feature region may be the leaves or flowers.
- the specified feature region can be extracted from the high-resolution image and from the original training sample image; taking a face image as an example, the region where the eyes are located is extracted from both images, and the difference information between the two extracted regions is used as the specified feature area difference information.
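one possible way to realize this (an illustrative sketch; the region box is assumed to come from a separate detector, and pixel MSE is only one choice of difference measure):

```python
import numpy as np

def region_difference(hr_image: np.ndarray, orig_image: np.ndarray, box) -> float:
    """Pixel MSE over one designated feature region (e.g. the eye box of a
    face). `box` = (top, left, height, width) is assumed to be provided by
    a region detector applied to both images at the same resolution."""
    t, l, h, w = box
    a = hr_image[t:t + h, l:l + w].astype(np.float64)
    b = orig_image[t:t + h, l:l + w].astype(np.float64)
    return float(np.mean((a - b) ** 2))
```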
- the target difference information may include either the feature point difference information or the specified feature area difference information; when the model is trained with only one of them, the entire image does not need to be compared, so model training is faster.
- the target difference information may also include both, in which case the difference information considered is more comprehensive and the difference between the high-resolution image and the original training sample image can be represented more accurately.
- the generative adversarial network also includes a discriminator, which is used to distinguish the authenticity of the high-resolution images generated by the generator.
- training the generator and the discriminator according to the target difference information makes the difference between the high-resolution image generated by the generator and the original training sample image smaller, that is, the generated image is more accurate and its details are more realistic, and the discrimination capability of the discriminator can also be improved.
- whether model training is completed can be determined according to the target difference information between the high-resolution image and the original training sample image; if the difference between the two is large, it indicates that the image generated by the generator is not yet accurate and realistic enough, and the model needs to continue training.
- the conditions for the completion of model training may include that the difference between the high-resolution image and the original training sample image is small and that the discriminator judges the high-resolution image generated by the generator to be real.
- in S103, in response to the completion of model training, the target object image processing model is obtained.
- the target difference information may include at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image.
- when the model is trained through the feature point difference information or the specified feature area difference information, the entire image does not need to be compared, so model training is faster; alternatively, the target difference information can include both, so that the difference information considered is more comprehensive and the difference between the high-resolution image and the original training sample image can be characterized more accurately. Training the model according to the target difference information makes the difference between the high-resolution image generated by the generator and the original training sample image smaller, that is, the image is more accurate and the image details are more realistic.
- the target difference information between the high-resolution image and the original training sample image may further include: overall difference information between the high-resolution image adjusted to a preset resolution and the original training sample image adjusted to the preset resolution.
- after the high-resolution image generated by the generator is obtained, the high-resolution image can be adjusted to the preset resolution and the original training sample image can also be adjusted to the same preset resolution, that is, the resolutions of the two are made consistent, and the overall difference information between the two adjusted images is then compared.
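the adjust-then-compare step can be sketched as follows (illustration only; nearest-neighbor resampling and pixel MSE are stand-ins for whatever resizing routine and overall difference measure an implementation actually uses, and the preset resolution is arbitrary):

```python
import numpy as np

def resize_nearest(image: np.ndarray, size) -> np.ndarray:
    """Nearest-neighbor resize to a preset (height, width)."""
    h, w = image.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return image[rows][:, cols]

def overall_difference(hr_image: np.ndarray, orig_image: np.ndarray, preset=(8, 8)) -> float:
    """Adjust both images to the same preset resolution, then compare them
    as a whole (here: pixel MSE)."""
    a = resize_nearest(hr_image, preset).astype(np.float64)
    b = resize_nearest(orig_image, preset).astype(np.float64)
    return float(np.mean((a - b) ** 2))
```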
- the present disclosure does not specifically limit the value of the preset resolution.
- in this case, the target difference information in the present disclosure may include the overall difference information together with at least one of the feature point difference information and the specified feature area difference information.
- the target difference information is thus more comprehensive, and the model can be trained not only from the perspective of the overall difference but also according to the difference information of the feature points and/or the specified feature regions, so that the high-resolution image generated by the generator is closer to the original training sample image and the image details are more realistic.
- in a possible implementation, the target object image processing model may further include a discriminator, and the above-mentioned S102 may include: determining that model training is completed when the degree of difference represented by each type of difference information included in the target difference information is smaller than the corresponding difference degree threshold and the discriminator's true-false determination result for the high-resolution image is true.
- taking target difference information that includes the overall difference information, the feature point difference information, and the specified feature area difference information as an example: if the degree of difference represented by the overall difference information is less than the corresponding first difference degree threshold, the degree of difference represented by the feature point difference information is less than the corresponding second difference degree threshold, and the degree of difference represented by the specified feature area difference information is less than the corresponding third difference degree threshold, the difference between the high-resolution image and the original training sample image is considered small and the high-resolution image generated by the generator is considered accurate; if, at this time, the discriminator's true-false determination result for the high-resolution image is also true, that is, the discriminator cannot distinguish whether the high-resolution image generated by the generator is real or fake, it can be determined that model training is completed, and the target object image processing model is obtained.
- the first difference degree threshold, the second difference degree threshold, and the third difference degree threshold may be pre-calibrated, and may be the same or different.
- if the target difference information contains any difference information whose degree of difference is greater than or equal to the corresponding difference degree threshold, or if the discriminator's true-false determination result for the high-resolution image is false, it indicates that model training is not yet completed; another original training sample image can then be obtained to continue training the model.
- in another possible implementation, the target object image processing model further includes a discriminator, and the above-mentioned S102 may include: fusing each type of difference information included in the target difference information to obtain fusion difference information; and determining that model training is completed when the degree of difference represented by the fusion difference information is less than a preset fusion difference degree threshold and the discriminator's true-false determination result for the high-resolution image is true.
- the manner of fusing each kind of difference information may be to perform weighted processing on the difference degrees represented by each kind of difference information, and the weight of each kind of difference information is not specifically limited. If the degree of difference represented by the fusion difference information is less than the preset fusion difference degree threshold, the high-resolution image generated by the generator is considered accurate; if, at this time, the discriminator's true-false determination result for the high-resolution image is true, it can be determined that model training is completed, that is, the target object image processing model is obtained.
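the weighted fusion described above might look like this (sketch only; the weights are placeholders, since the disclosure does not constrain them):

```python
def fused_difference(differences: dict, weights: dict) -> float:
    """Fuse the individual difference degrees into a single value by a
    weighted sum; the result is compared against one preset fusion
    difference degree threshold instead of per-type thresholds."""
    return sum(weights[k] * differences[k] for k in differences)

diffs   = {"overall": 0.8, "feature_point": 1.2, "region": 0.5}  # hypothetical degrees
weights = {"overall": 0.5, "feature_point": 0.3, "region": 0.2}  # hypothetical weights
fused = fused_difference(diffs, weights)
```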
- conversely, if the discriminator's true-false determination result for the high-resolution image is false, or the degree of difference represented by the fusion difference information is greater than or equal to the preset fusion difference degree threshold, that is, the difference between the high-resolution image and the original training sample image is still large, the model needs to continue training.
- the condition for determining whether the model is trained may be either that the degree of difference represented by each type of difference information included in the target difference information is smaller than the corresponding difference degree threshold, or that the degree of difference represented by the fusion difference information is smaller than the preset fusion difference degree threshold. In this way, the target difference information can include various kinds of difference information between the high-resolution image and the original training sample image, and training the model according to these various kinds of difference information makes the obtained target object image processing model more accurate.
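As an illustration only, the two completion conditions above can be sketched as follows; the function names, weights, and thresholds are hypothetical, since the disclosure does not fix them:

```python
# Hypothetical sketch of the training-completion checks described above.
# The weighting scheme and all threshold values are illustrative assumptions.

def fuse_difference_info(diff_degrees, weights):
    """Weighted fusion of the degrees of difference carried by each kind of
    difference information (the disclosure leaves the weights unspecified)."""
    return sum(w * d for w, d in zip(weights, diff_degrees)) / sum(weights)

def per_type_check(diff_degrees, thresholds, discriminator_says_real):
    """Completion condition 1: every degree of difference is below its own
    threshold AND the discriminator judges the generated image to be real."""
    return all(d < t for d, t in zip(diff_degrees, thresholds)) and discriminator_says_real

def fused_check(diff_degrees, weights, fusion_threshold, discriminator_says_real):
    """Completion condition 2: the fused degree of difference is below the
    preset fusion threshold AND the discriminator judges the image real."""
    return fuse_difference_info(diff_degrees, weights) < fusion_threshold and discriminator_says_real
```

If either condition fails, training continues with another original training sample image, as described above.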
- Fig. 2 is a flowchart of an image processing method according to an exemplary embodiment. As shown in Fig. 2 , the image processing method may include S201-S203.
- a first target object image is extracted from the image to be processed.
- the image to be processed may be a pre-stored image or an image captured by a user in real time, and may also be an image frame in a video file. If the image to be processed is an image frame of a video file, multiple image frames may be processed separately.
- the first target object image may be an image of a target object detected from the image to be processed. For example, if the target object is a human face, the first target object image may be an extracted face image; if the target object is a building, the first target object image may be an extracted building image.
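For instance, once a detector has located the target object (a face detector is only an assumed choice here, and all names below are illustrative), extracting the first target object image reduces to cropping the detected bounding box:

```python
import numpy as np

def extract_target_object(image, bbox):
    """Crop the detected target object region from the image.
    `bbox` is (x, y, w, h) as returned by a detector (e.g., a face detector)."""
    x, y, w, h = bbox
    return image[y:y + h, x:x + w]

# Stand-in for an image frame; a real detector would supply the bounding box.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
first_target_object_image = extract_target_object(frame, (100, 50, 64, 64))
```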
- the first target object image is input into the target object image processing model to obtain the second target object image output by the target object image processing model.
- the resolution of the second target object image is higher than that of the first target object image.
- after the first target object image is extracted from the image to be processed, it can be input into the target object image processing model, which performs resolution enhancement processing on the first target object image and outputs a second target object image with higher resolution and greater clarity.
- the target object image processing model is a generative adversarial network model including a generator, and the target object image processing model may be obtained by training in the manner shown in FIG. 1:
- the low-resolution image of the original training sample image is used as the input of the generator, and the high-resolution image output by the generator after processing the low-resolution image is obtained.
- the target difference information may include at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image.
- a target object image processing model is obtained in response to the completion of the model training.
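The two kinds of target difference information can be illustrated with a small numpy sketch; the choice of metric (mean landmark distance, mean absolute pixel difference over a region) is an assumption, as the disclosure does not prescribe one:

```python
import numpy as np

def feature_point_difference(points_a, points_b):
    """Mean Euclidean distance between corresponding feature points
    (e.g., facial landmarks) in the two images."""
    a, b = np.asarray(points_a, float), np.asarray(points_b, float)
    return float(np.mean(np.linalg.norm(a - b, axis=1)))

def feature_area_difference(img_a, img_b, region):
    """Mean absolute pixel difference over a specified feature area
    given as (x, y, w, h), e.g., an eye or mouth region."""
    x, y, w, h = region
    patch_a = img_a[y:y + h, x:x + w].astype(float)
    patch_b = img_b[y:y + h, x:x + w].astype(float)
    return float(np.mean(np.abs(patch_a - patch_b)))
```

Either quantity (or both) can then be compared against its difference degree threshold when deciding whether training is complete.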
- image fusion is performed according to the second target object image and the to-be-processed image to obtain a target image.
- the target object may be a key part in the image to be processed, that is, a relatively significant part.
- the first target object image in the image to be processed is processed separately, and then the processed second target object image with higher resolution is image-fused with the image to be processed. This can make the target object in the obtained target image clearer and mitigate the lack of clarity of the target object that results from directly processing the entire image to be processed.
- the image to be processed may be an image frame in a video, and each image frame in the video may be processed separately to obtain a video file with higher resolution and higher definition.
- the first target object image extracted from the image to be processed is input into the target object image processing model to obtain a second target object image with higher resolution; the first target object image in the image to be processed is thus processed individually, and the processed second target object image with higher resolution is then image-fused with the image to be processed to obtain a target image, which makes the target object in the obtained target image clearer and its details more realistic.
- when the model is trained through the difference information of feature points or the difference information of the specified feature area, the entire image does not need to be compared, so model training is faster; alternatively, the target difference information can include both of these.
- in that case, the difference information considered is more comprehensive, so the difference between the high-resolution image and the original training sample image can be characterized more accurately according to the target difference information.
- Training the model according to the target difference information can make the difference between the high-resolution image generated by the generator and the original training sample image smaller, that is, the image is more accurate.
- performing image fusion according to the second target object image and the image to be processed to obtain the target image may include: performing resolution enhancement processing on the image to be processed to obtain the target image to be processed; and performing image fusion according to the second target object image and the target image to be processed to obtain the target image.
- the image to be processed can be input into an image processing model to obtain the target image to be processed output by the image processing model.
- the image processing model may be different from the above-mentioned target object image processing model, and the image processing model may be any network model for performing full image processing on the image to be processed, that is, improving the resolution of the image to be processed.
- the to-be-processed image may be up-sampled to obtain a target to-be-processed image with a higher resolution.
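As a minimal stand-in for that upsampling step (a real system would use bicubic interpolation, e.g., `cv2.resize` with `INTER_CUBIC`, or a learned model rather than nearest neighbour; this only illustrates the operation):

```python
import numpy as np

def upsample(image, factor):
    """Nearest-neighbour upsampling of an H x W (x C) image by an
    integer factor along both spatial axes."""
    return np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)

low = np.arange(4, dtype=np.uint8).reshape(2, 2)
high = upsample(low, 2)   # the target to-be-processed image, at 2x resolution
```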
- the second target object image processed by the target object image processing model can be image-fused with the target image to be processed to obtain a target image with higher resolution, in which the details of the target object are more realistic.
- Poisson fusion may be used for image fusion, and the edge transition of the fusion part may be made more natural by using Poisson fusion.
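The fusion step can be pictured with a plain masked paste; actual Poisson fusion (for example OpenCV's `seamlessClone`) additionally solves a Poisson equation over the pasted region so the edge transition looks natural, which this sketch does not attempt:

```python
import numpy as np

def paste_region(full_image, patch, top_left):
    """Write `patch` into a copy of `full_image` at (row, col) `top_left`.
    A stand-in for fusing the enhanced target object back into the frame."""
    out = full_image.copy()
    r, c = top_left
    h, w = patch.shape[:2]
    out[r:r + h, c:c + w] = patch
    return out

target = paste_region(np.zeros((6, 6), dtype=np.uint8),
                      np.full((2, 2), 255, dtype=np.uint8), (2, 2))
```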
- FIG. 3 is a block diagram of an image processing apparatus according to an exemplary embodiment. As shown in FIG. 3 , the image processing apparatus 300 may include:
- the extraction module 301 is used for extracting the first target object image from the image to be processed
- the input module 302 is configured to input the first target object image into the target object image processing model to obtain a second target object image output by the target object image processing model, wherein the resolution of the second target object image is higher than that of the first target object image;
- An image fusion module 303 configured to perform image fusion according to the second target object image and the to-be-processed image to obtain a target image
- the target object image processing model is a generative adversarial network model including a generator
- the target object image processing model may be obtained by training with the training device 400 of the target object image processing model shown in FIG. 4.
- the training device 400 of the target object image processing model may include:
- the image obtaining module 401 is configured to use the low-resolution image of the original training sample image as the input of the generator, and obtain the high-resolution image output by the generator after processing the low-resolution image;
- a determining module 402 configured to determine whether the model is trained according to target difference information between the high-resolution image and the original training sample image, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image;
- the model obtaining module 403 is configured to obtain the target object image processing model in response to the completion of model training.
- the first target object image extracted from the image to be processed is input into the target object image processing model to obtain a second target object image with higher resolution; the first target object image in the image to be processed is thus processed individually, and the processed second target object image with higher resolution is then image-fused with the image to be processed to obtain a target image, which makes the target object in the obtained target image clearer and its details more realistic.
- when the model is trained through the difference information of feature points or the difference information of the specified feature area, the entire image does not need to be compared, so model training is faster; alternatively, the target difference information can include both of these.
- in that case, the difference information considered is more comprehensive, so the difference between the high-resolution image and the original training sample image can be characterized more accurately according to the target difference information.
- Training the model according to the target difference information can make the difference between the high-resolution image generated by the generator and the original training sample image smaller, that is, the image is more accurate.
- the image fusion module 303 may include: a resolution enhancement processing sub-module, configured to perform resolution enhancement processing on the image to be processed to obtain the target image to be processed; and an image fusion sub-module, configured to perform image fusion according to the second target object image and the target image to be processed to obtain the target image.
- FIG. 4 is a block diagram of an apparatus for training a target object image processing model according to an exemplary embodiment, where the target object image processing model is a generative adversarial network model including a generator, and the apparatus 400 may include:
- the image obtaining module 401 is used for taking the low-resolution image of the original training sample image as the input of the generator, and obtaining the high-resolution image output by the generator after processing the low-resolution image;
- a determining module 402 configured to determine whether the model is trained according to target difference information between the high-resolution image and the original training sample image, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image;
- the model obtaining module 403 is configured to obtain the target object image processing model in response to the completion of model training.
- the target difference information further includes overall difference information between the high-resolution image adjusted to the preset resolution and the original training sample image adjusted to the preset resolution.
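A sketch of this overall difference information, assuming block averaging as the resolution adjustment and mean squared error as the whole-image comparison (both are illustrative choices not fixed by the disclosure):

```python
import numpy as np

def to_preset_resolution(image, factor):
    """Adjust a 2-D image to a lower preset resolution via block averaging
    by an integer factor (an illustrative resize)."""
    h, w = image.shape
    return image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def overall_difference(img_a, img_b, factor):
    """Overall difference between two images after both are adjusted to the
    same preset resolution, measured here as mean squared error."""
    a = to_preset_resolution(img_a.astype(float), factor)
    b = to_preset_resolution(img_b.astype(float), factor)
    return float(np.mean((a - b) ** 2))
```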
- the target object image processing model further includes a discriminator; the determining module 402 may include: a first determining sub-module, configured to determine that the model training is completed in a case where the degree of difference represented by each type of difference information included in the target difference information is smaller than the corresponding difference degree threshold and the discriminator's true-or-false determination result for the high-resolution image is true.
- the target object image processing model further includes a discriminator;
- the determining module 402 may include: a fusion sub-module, configured to fuse each type of difference information included in the target difference information to obtain fusion difference information; and a second determining sub-module, configured to determine that the model training is completed in a case where the degree of difference represented by the fusion difference information is less than a preset fusion difference degree threshold and the discriminator's true-or-false determination result for the high-resolution image is true.
- Terminal devices in the embodiments of the present disclosure may include, but are not limited to, such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), vehicle-mounted terminals (eg, mobile terminals such as in-vehicle navigation terminals), etc., and stationary terminals such as digital TVs, desktop computers, and the like.
- the electronic device shown in FIG. 5 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
- an electronic device 500 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 501 that may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage device 508 into a random access memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the electronic device 500 are also stored.
- the processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504.
- An input/output (I/O) interface 505 is also connected to the bus 504.
- the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a liquid crystal display (LCD), speakers, vibrators, etc.; and storage devices 508 including, for example, a magnetic tape, a hard disk, etc.
- Communication means 509 may allow electronic device 500 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 5 shows electronic device 500 having various means, it should be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
- embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
- the computer program may be downloaded and installed from the network via the communication device 509, or from the storage device 508, or from the ROM 502.
- when the computer program is executed by the processing device 501, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
- the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
- the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, portable compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
- a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
- a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
- the client and server may communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network).
- Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
- the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
- the computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: extract a first target object image from an image to be processed; input the first target object image into a target object image processing model to obtain a second target object image output by the target object image processing model, wherein the resolution of the second target object image is higher than that of the first target object image; and perform image fusion according to the second target object image and the image to be processed to obtain a target image; wherein the target object image processing model is a generative adversarial network model including a generator, and the target object image processing model is obtained by training in the following manner: taking the low-resolution image of the original training sample image as the input of the generator, and obtaining the high-resolution image output by the generator after processing the low-resolution image; determining whether the model is trained according to target difference information between the high-resolution image and the original training sample image, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image; and obtaining the target object image processing model in response to the completion of model training.
- alternatively, the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: take the low-resolution image of the original training sample image as the input of the generator, and obtain the high-resolution image output by the generator after processing the low-resolution image; determine whether the model is trained according to target difference information between the high-resolution image and the original training sample image, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image; and obtain the target object image processing model in response to the completion of model training.
- Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
- the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
- the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
- each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
- the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
- each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
- the modules involved in the embodiments of the present disclosure may be implemented in software or hardware. The name of a module does not, under certain circumstances, constitute a limitation of the module itself; for example, the extraction module may also be described as a "target object image extraction module".
- exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logical Devices (CPLDs) and more.
- a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
- the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
- Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
- machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
- Example 1 provides an image processing method, the method comprising: extracting a first target object image from an image to be processed; inputting the first target object image into a target object image processing model to obtain a second target object image output by the target object image processing model, wherein the resolution of the second target object image is higher than that of the first target object image; and performing image fusion according to the second target object image and the image to be processed to obtain a target image;
- the target object image processing model is a generative adversarial network model including a generator, and the target object image processing model is obtained by training in the following manner:
- the low-resolution image of the original training sample image is used as the input of the generator to obtain the high-resolution image output after the generator processes the low-resolution image; whether the model is trained is determined according to target difference information between the high-resolution image and the original training sample image, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image; and the target object image processing model is obtained in response to the completion of model training.
- Example 2 provides the method of Example 1, wherein the target difference information further includes overall difference information between the high-resolution image adjusted to a preset resolution and the original training sample image adjusted to the preset resolution.
- Example 3 provides the method of Example 1, wherein the target object image processing model further includes a discriminator; and determining whether the model is trained according to the target difference information between the high-resolution image and the original training sample image includes: in a case where the degree of difference represented by each type of difference information included in the target difference information is smaller than the corresponding difference degree threshold, and the discriminator's true-or-false determination result for the high-resolution image is true, determining that the model training is completed.
- Example 4 provides the method of Example 1, wherein the target object image processing model further includes a discriminator; and determining whether the model is trained according to the target difference information between the high-resolution image and the original training sample image includes: fusing each kind of difference information included in the target difference information to obtain fusion difference information; and in a case where the degree of difference represented by the fusion difference information is less than the preset fusion difference degree threshold, and the discriminator's true-or-false determination result for the high-resolution image is true, determining that the model training is completed.
- Example 5 provides the method of any one of Examples 1-4, wherein performing image fusion according to the second target object image and the image to be processed to obtain a target image comprises: performing resolution enhancement processing on the image to be processed to obtain a target image to be processed; and performing image fusion according to the second target object image and the target image to be processed to obtain the target image.
- Example 6 provides a method for training a target object image processing model, where the target object image processing model is a generative adversarial network model including a generator, and the method includes: taking the low-resolution image of the original training sample image as the input of the generator, and obtaining the high-resolution image output by the generator after processing the low-resolution image; determining whether the model is trained according to target difference information between the high-resolution image and the original training sample image, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image; and obtaining the target object image processing model in response to the completion of model training.
- Example 7 provides the method of Example 6, wherein the target difference information further includes overall difference information between the high-resolution image adjusted to a preset resolution and the original training sample image adjusted to the preset resolution.
- Example 8 provides the method of Example 6, wherein the target object image processing model further includes a discriminator; and determining whether the model is trained according to the target difference information between the high-resolution image and the original training sample image includes: in a case where the degree of difference represented by each type of difference information included in the target difference information is smaller than the corresponding difference degree threshold, and the discriminator's true-or-false determination result for the high-resolution image is true, determining that the model training is completed.
- Example 9 provides the method of Example 6, wherein the target object image processing model further includes a discriminator; and determining whether the model is trained according to the target difference information between the high-resolution image and the original training sample image includes: fusing each kind of difference information included in the target difference information to obtain fusion difference information; and in a case where the degree of difference represented by the fusion difference information is less than the preset fusion difference degree threshold, and the discriminator's true-or-false determination result for the high-resolution image is true, determining that the model training is completed.
- Example 10 provides an image processing apparatus, the apparatus comprising: an extraction module for extracting a first target object image from an image to be processed; an input module for inputting the first target object image into a target object image processing model to obtain a second target object image output by the target object image processing model, wherein the resolution of the second target object image is higher than that of the first target object image; and an image fusion module for performing image fusion according to the second target object image and the image to be processed to obtain a target image;
- the target object image processing model is a generative adversarial network model including a generator;
- the target object image processing model is obtained through training by a training device for the target object image processing model;
- the training device for the target object image processing model includes: an image obtaining module for taking the low-resolution image of the original training sample image as the input of the generator and obtaining the high-resolution image output by the generator after processing the low-resolution image; a determining module for determining whether the model is trained according to target difference information between the high-resolution image and the original training sample image, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature area difference information between the high-resolution image and the original training sample image; and a model obtaining module for obtaining the target object image processing model in response to the completion of model training.
- Example 11 provides an apparatus for training a target object image processing model, where the target object image processing model is a generative adversarial network model including a generator, and the apparatus includes: an image obtaining module for taking a low-resolution image of an original training sample image as the input of the generator, to obtain a high-resolution image output after the low-resolution image is processed by the generator; a determining module for determining, according to target difference information between the high-resolution image and the original training sample image, whether model training is completed, wherein the target difference information includes at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature region difference information in the high-resolution image and the original training sample image; and a model obtaining module for obtaining the target object image processing model in response to completion of model training.
- Example 12 provides a computer-readable medium having stored thereon a computer program that, when executed by a processing apparatus, implements the steps of the method described in any one of Examples 1-5.
- Example 13 provides a computer-readable medium having stored thereon a computer program that, when executed by a processing apparatus, implements the steps of the method described in any one of Examples 6-9.
- Example 14 provides an electronic device, comprising: a storage device on which a computer program is stored; and a processing device for executing the computer program in the storage device to implement the steps of the method described in any one of Examples 1-5.
- Example 15 provides an electronic device, comprising: a storage device on which a computer program is stored; and a processing device for executing the computer program in the storage device to implement the steps of the method described in any one of Examples 6-9.
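The training-completion checks described above (per-threshold comparison as in Example 8/claim 9, and the fused-difference comparison as in Example 9/claim 10) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the difference values, fusion weights (a weighted sum is one plausible fusion), and thresholds are invented placeholders.

```python
def training_done_per_type(diffs, thresholds, discriminator_says_real):
    """Per-type check: every kind of difference information must fall below
    its own threshold, and the discriminator must judge the image as real."""
    below = all(diffs[k] < thresholds[k] for k in diffs)
    return below and discriminator_says_real


def training_done_fused(diffs, weights, fused_threshold, discriminator_says_real):
    """Fused check: combine the difference types (here: a weighted sum, an
    assumed fusion rule) and compare against one preset fusion threshold."""
    fused = sum(weights[k] * diffs[k] for k in diffs)
    return fused < fused_threshold and discriminator_says_real


# Placeholder difference values for the three kinds of target difference
# information named in the claims: feature points, specified feature region,
# and overall difference at a preset resolution.
diffs = {"feature_points": 0.02, "feature_region": 0.05, "overall": 0.04}
thresholds = {"feature_points": 0.03, "feature_region": 0.06, "overall": 0.05}
weights = {"feature_points": 0.5, "feature_region": 0.3, "overall": 0.2}

print(training_done_per_type(diffs, thresholds, True))
print(training_done_fused(diffs, weights, 0.05, True))
```

In a real training loop these booleans would gate the stopping condition, with `discriminator_says_real` coming from the discriminator's judgment on the generator's high-resolution output.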
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- Multimedia (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Biodiversity & Conservation Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (17)
- An image processing method, the method comprising: extracting a first target object image from an image to be processed; inputting the first target object image into a target object image processing model to obtain a second target object image output by the target object image processing model, wherein the resolution of the second target object image is higher than that of the first target object image; and performing image fusion according to the second target object image and the image to be processed to obtain a target image.
- The method according to claim 1, wherein the target object image processing model is a generative adversarial network model comprising a generator, and the target object image processing model is obtained by training as follows: taking a low-resolution image of an original training sample image as the input of the generator to obtain a high-resolution image output by the generator after processing the low-resolution image; determining, according to target difference information between the high-resolution image and the original training sample image, whether model training is completed, wherein the target difference information comprises at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature region difference information in the high-resolution image and the original training sample image; and obtaining the target object image processing model in response to completion of model training.
- The method according to claim 1 or 2, wherein the target difference information further comprises overall difference information between the high-resolution image adjusted to a preset resolution and the original training sample image adjusted to the preset resolution.
- The method according to claim 1 or 2, wherein the target object image processing model further comprises a discriminator, and determining, according to the target difference information between the high-resolution image and the original training sample image, whether model training is completed comprises: determining that model training is completed in a case where the degree of difference represented by each kind of difference information included in the target difference information is less than its corresponding difference threshold and the discriminator judges the high-resolution image to be real.
- The method according to claim 1 or 2, wherein the target object image processing model further comprises a discriminator, and determining, according to the target difference information between the high-resolution image and the original training sample image, whether model training is completed comprises: fusing each kind of difference information included in the target difference information to obtain fusion difference information; and determining that model training is completed in a case where the degree of difference represented by the fusion difference information is less than a preset fusion difference threshold and the discriminator judges the high-resolution image to be real.
- The method according to any one of claims 1-5, wherein performing image fusion according to the second target object image and the image to be processed to obtain the target image comprises: performing resolution enhancement on the image to be processed to obtain a target image to be processed; and performing image fusion according to the second target object image and the target image to be processed to obtain the target image.
- A method for training a target object image processing model, the method comprising: taking a low-resolution image of an original training sample image as the input of a generator to obtain a high-resolution image output by the generator after processing the low-resolution image, wherein the target object image processing model is a generative adversarial network model comprising the generator; determining, according to target difference information between the high-resolution image and the original training sample image, whether model training is completed, wherein the target difference information comprises at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature region difference information in the high-resolution image and the original training sample image; and obtaining the target object image processing model in response to completion of model training.
- The method according to claim 7, wherein the target difference information further comprises overall difference information between the high-resolution image adjusted to a preset resolution and the original training sample image adjusted to the preset resolution.
- The method according to claim 7, wherein the target object image processing model further comprises a discriminator, and determining, according to the target difference information between the high-resolution image and the original training sample image, whether model training is completed comprises: determining that model training is completed in a case where the degree of difference represented by each kind of difference information included in the target difference information is less than its corresponding difference threshold and the discriminator judges the high-resolution image to be real.
- The method according to claim 7, wherein the target object image processing model further comprises a discriminator, and determining, according to the target difference information between the high-resolution image and the original training sample image, whether model training is completed comprises: fusing each kind of difference information included in the target difference information to obtain fusion difference information; and determining that model training is completed in a case where the degree of difference represented by the fusion difference information is less than a preset fusion difference threshold and the discriminator judges the high-resolution image to be real.
- An image processing apparatus, the apparatus comprising: an extraction module configured to extract a first target object image from an image to be processed; an input module configured to input the first target object image into a target object image processing model to obtain a second target object image output by the target object image processing model, wherein the resolution of the second target object image is higher than that of the first target object image; and an image fusion module configured to perform image fusion according to the second target object image and the image to be processed to obtain a target image.
- The image processing apparatus according to claim 11, wherein the target object image processing model is a generative adversarial network model comprising a generator, and the target object image processing model is configured to be trained by a training apparatus for the target object image processing model, the training apparatus comprising: an image obtaining module configured to take a low-resolution image of an original training sample image as the input of the generator to obtain a high-resolution image output by the generator after processing the low-resolution image; a determining module configured to determine, according to target difference information between the high-resolution image and the original training sample image, whether model training is completed, wherein the target difference information comprises at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature region difference information in the high-resolution image and the original training sample image; and a model obtaining module configured to obtain the target object image processing model in response to completion of model training.
- An apparatus for training a target object image processing model, the apparatus comprising: an image obtaining module configured to take a low-resolution image of an original training sample image as the input of a generator to obtain a high-resolution image output by the generator after processing the low-resolution image, wherein the target object image processing model is a generative adversarial network model comprising the generator; a determining module configured to determine, according to target difference information between the high-resolution image and the original training sample image, whether model training is completed, wherein the target difference information comprises at least one of the following: feature point difference information between the high-resolution image and the original training sample image, and specified feature region difference information in the high-resolution image and the original training sample image; and a model obtaining module configured to obtain the target object image processing model in response to completion of model training.
- A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processing apparatus, implements the steps of the method according to any one of claims 1-6.
- A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processing apparatus, implements the steps of the method according to any one of claims 7-10.
- An electronic device, comprising: a storage device having a computer program stored thereon; and a processing device configured to execute the computer program in the storage device to implement the steps of the method according to any one of claims 1-6.
- An electronic device, comprising: a storage device having a computer program stored thereon; and a processing device configured to execute the computer program in the storage device to implement the steps of the method according to any one of claims 7-10.
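The claimed processing flow (claims 1 and 6) — extract the target-object image, super-resolve it with the trained model, optionally resolution-enhance the full image, then fuse the two — can be sketched as follows. All components here are stand-ins I introduced for illustration: a fixed crop replaces the target-object extraction, and nearest-neighbour upsampling replaces both the generator and the resolution-enhancement step.

```python
def extract_target(image, box):
    """Crop the first target object image (e.g. a detected face region)
    from the image to be processed. box = (top, left, height, width)."""
    top, left, h, w = box
    return [row[left:left + w] for row in image[top:top + h]]


def upscale(tile, factor):
    """Placeholder for the generator / resolution-enhancement model:
    nearest-neighbour upsampling by an integer factor."""
    return [[px for px in row for _ in range(factor)]
            for row in tile for _ in range(factor)]


def fuse(base, tile, box, factor):
    """Image fusion: paste the super-resolved target object back into
    the (enhanced) base image at the scaled target location."""
    top, left = box[0] * factor, box[1] * factor
    out = [row[:] for row in base]
    for i, row in enumerate(tile):
        out[top + i][left:left + len(row)] = row
    return out


# Toy 4x4 grayscale "image to be processed" with a 2x2 target region.
image = [[0, 0, 0, 0],
         [0, 1, 2, 0],
         [0, 3, 4, 0],
         [0, 0, 0, 0]]
box = (1, 1, 2, 2)                             # location of the target object
tile = upscale(extract_target(image, box), 2)  # "second target object image"
base = upscale(image, 2)                       # claim 6: enhance the whole image
target_image = fuse(base, tile, box, 2)        # fused 8x8 target image
print(len(target_image), len(target_image[0]))
```

In the patented method the `upscale` applied to the crop would be the trained generator of the generative adversarial network, which is why the second target object image can carry detail the plain enhancement of the base image lacks.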
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/252,979 US20240013359A1 (en) | 2020-11-18 | 2021-11-17 | Image processing method, model training method, apparatus, medium and device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011298847.4A CN112381717A (zh) | 2020-11-18 | 2020-11-18 | Image processing method, model training method, apparatus, medium and device |
CN202011298847.4 | 2020-11-18 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/252,979 Continuation US20240013359A1 (en) | 2020-11-18 | 2021-11-17 | Image processing method, model training method, apparatus, medium and device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022105779A1 true WO2022105779A1 (zh) | 2022-05-27 |
Family
ID=74584392
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/131155 WO2022105779A1 (zh) | Image processing method, model training method, apparatus, medium and device | 2020-11-18 | 2021-11-17 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20240013359A1 (zh) |
CN (1) | CN112381717A (zh) |
WO (1) | WO2022105779A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116362972A (zh) * | 2023-05-22 | 2023-06-30 | Feihu Information Technology (Tianjin) Co., Ltd. | Image processing method and apparatus, electronic device and storage medium |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112381717A (zh) * | 2020-11-18 | 2021-02-19 | Beijing Bytedance Network Technology Co., Ltd. | Image processing method, model training method, apparatus, medium and device |
CN113344776B (zh) * | 2021-06-30 | 2023-06-27 | Beijing Zitiao Network Technology Co., Ltd. | Image processing method, model training method, apparatus, electronic device and medium |
CN113688832B (zh) * | 2021-08-27 | 2023-02-03 | Beijing Sankuai Online Technology Co., Ltd. | Model training and image processing method and apparatus |
CN117196957B (zh) * | 2023-11-03 | 2024-03-22 | Guangdong Planning and Designing Institute of Telecommunications Co., Ltd. | Artificial-intelligence-based image resolution conversion method and apparatus |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108921782A (zh) * | 2018-05-17 | 2018-11-30 | Tencent Technology (Shenzhen) Co., Ltd. | Image processing method, apparatus and storage medium |
CN110310229A (zh) * | 2019-06-28 | 2019-10-08 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Image processing method, image processing apparatus, terminal device and readable storage medium |
CN110428366A (zh) * | 2019-07-26 | 2019-11-08 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Image processing method and apparatus, electronic device, and computer-readable storage medium |
CN111353929A (zh) * | 2018-12-21 | 2020-06-30 | Beijing Bytedance Network Technology Co., Ltd. | Image processing method, apparatus and electronic device |
CN111626932A (zh) * | 2020-05-07 | 2020-09-04 | TCL China Star Optoelectronics Technology Co., Ltd. | Image super-resolution reconstruction method and apparatus |
CN112381717A (zh) * | 2020-11-18 | 2021-02-19 | Beijing Bytedance Network Technology Co., Ltd. | Image processing method, model training method, apparatus, medium and device |
- 2020
  - 2020-11-18 CN CN202011298847.4A patent/CN112381717A/zh active Pending
- 2021
  - 2021-11-17 WO PCT/CN2021/131155 patent/WO2022105779A1/zh active Application Filing
  - 2021-11-17 US US18/252,979 patent/US20240013359A1/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116362972A (zh) * | 2023-05-22 | 2023-06-30 | Feihu Information Technology (Tianjin) Co., Ltd. | Image processing method and apparatus, electronic device and storage medium |
CN116362972B (zh) * | 2023-05-22 | 2023-08-08 | Feihu Information Technology (Tianjin) Co., Ltd. | Image processing method and apparatus, electronic device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112381717A (zh) | 2021-02-19 |
US20240013359A1 (en) | 2024-01-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022105779A1 (zh) | Image processing method, model training method, apparatus, medium and device | |
US20230394671A1 (en) | Image segmentation method and apparatus, and device, and storage medium | |
WO2022252881A1 (zh) | Image processing method and apparatus, readable medium and electronic device | |
CN111325704B (zh) | Image inpainting method and apparatus, electronic device and computer-readable storage medium | |
CN111784712B (zh) | Image processing method, apparatus, device and computer-readable medium | |
WO2022171036A1 (zh) | Video target tracking method and apparatus, storage medium and electronic device | |
CN110991373A (zh) | Image processing method and apparatus, electronic device and medium | |
CN112800276B (zh) | Video cover determination method, apparatus, medium and device | |
WO2021197161A1 (zh) | Icon updating method and apparatus, and electronic device | |
WO2023143222A1 (zh) | Image processing method, apparatus, device and storage medium | |
WO2022105622A1 (zh) | Image segmentation method and apparatus, readable medium and electronic device | |
CN112418249A (zh) | Mask image generation method and apparatus, electronic device and computer-readable medium | |
WO2022116990A1 (zh) | Video cropping method and apparatus, storage medium and electronic device | |
CN113038176B (zh) | Video frame extraction method and apparatus, and electronic device | |
CN112257598B (zh) | Method and apparatus for recognizing quadrilaterals in an image, readable medium and electronic device | |
CN111783632B (zh) | Face detection method and apparatus for video streams, electronic device and storage medium | |
CN111311609B (zh) | Image segmentation method and apparatus, electronic device and storage medium | |
CN116664849B (zh) | Data processing method and apparatus, electronic device and computer-readable medium | |
WO2023098576A1 (zh) | Image processing method, apparatus, device and medium | |
WO2023138441A1 (zh) | Video generation method, apparatus, device and storage medium | |
WO2023016290A1 (zh) | Video classification method and apparatus, readable medium and electronic device | |
CN116596748A (zh) | Image stylization processing method, apparatus, device, storage medium and program product | |
WO2022052889A1 (zh) | Image recognition method and apparatus, electronic device and computer-readable medium | |
CN113033552B (zh) | Text recognition method and apparatus, and electronic device | |
CN111737575B (zh) | Content distribution method and apparatus, readable medium and electronic device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21893926 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 18252979 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.09.2023) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21893926 Country of ref document: EP Kind code of ref document: A1 |