WO2017161710A1 - Deep learning-based super-resolution method and system - Google Patents
Deep learning-based super-resolution method and system
- Publication number
- WO2017161710A1 PCT/CN2016/086494 CN2016086494W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- resolution image
- low
- feature information
- face feature
- image
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440263—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/4076—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution using the original low-resolution images to iteratively correct the high-resolution images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/162—Detection; Localisation; Normalisation using pixel segmentation or colour matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
Definitions
- the present disclosure relates to the field of image processing in television display, and more particularly to a deep-learning-based method and system for super-resolution of faces and images.
- Super-resolution addresses the problem that the resolution of current video sources is often lower than the resolution that an HDTV can display.
- Super-resolution technology enhances visual clarity by stretching, comparing, and correcting original images to output images better suited for display on Full HD (Full High Definition) LCD TVs. Compared with simply stretching an SD signal onto a high-definition screen, as ordinary LCD TVs do, super-resolution technology shows noticeably more detail, changing the impression that HDTV looks no better than low-resolution TV.
- The resolution of an image (also called definition or resolving power) refers to how many pixels can be displayed on the display; the more pixels on the display, the finer the picture.
- High-resolution images contain high pixel densities, provide rich detail information, and more accurate and detailed descriptions of objective scenes. High-resolution images are in great demand in the information age, such as satellite remote sensing images, video security surveillance, military surveillance aerial photography, medical digital imaging and video standard conversion.
- Face hallucination is a domain-specific super-resolution technique that produces a high-resolution face output from a low-resolution input.
- Low-resolution images are obtained by down-sampling and linear convolution processing of high-resolution images.
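The patent does not give code for this degradation step; the following is an illustrative sketch (not part of the disclosure) of producing a low-resolution image by linear convolution (here an assumed 5x5 Gaussian blur) followed by decimation:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """2-D Gaussian kernel used as the linear convolution (blur) filter."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def downsample(hr, factor=2, kernel=None):
    """Blur the high-resolution image with a linear convolution, then decimate."""
    if kernel is None:
        kernel = gaussian_kernel()
    pad = kernel.shape[0] // 2
    padded = np.pad(hr, pad, mode="edge")
    blurred = np.zeros_like(hr, dtype=float)
    h, w = hr.shape
    for i in range(h):
        for j in range(w):
            blurred[i, j] = np.sum(padded[i:i + 2 * pad + 1, j:j + 2 * pad + 1] * kernel)
    return blurred[::factor, ::factor]

hr_image = np.random.rand(16, 16)
lr_image = downsample(hr_image, factor=2)   # 16x16 -> 8x8
```

The kernel size, sigma, and scaling factor are assumed values; any linear low-pass filter would serve the same role.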
- the hallucination technique can be understood as reconstructing high-frequency details. Among current super-resolution technologies, very little work addresses face hallucination.
- a super-resolution method comprising the steps of: a. building a sample library using an original high resolution image set; b. training a convolutional structure network using the sample library; c. processing a low resolution input signal with the trained convolutional structure network to obtain a high resolution output signal.
- a super-resolution system comprising: a sample library building device configured to build a sample library using an original high resolution image set; a training device configured to train a convolutional structure network using the sample library; and an output device configured to process a low resolution input signal with the trained convolutional structure network to obtain a high resolution output signal.
- the method of the present disclosure adds similarity information of the face feature parts while using the information of the original input image to enlarge the image, enriching the facial details in the enlarged image so that sharpness is noticeably improved.
- the super-resolution method and system according to the present disclosure can handle expanded data by simply expanding the hardware, without requiring large changes to the algorithm; the complex algorithm is deployed in a parallel design, so different servers can work independently; and thanks to the modular design, individual functional modules can be changed through later optimization.
- Figure 1 shows a general method of face resolution techniques.
- FIG. 2 illustrates another general method of face resolution techniques.
- FIG. 3 illustrates a flow chart of a super-resolution method in accordance with an embodiment of the present disclosure.
- FIG. 4 illustrates a specific implementation flow of the resolution method of FIG. 3 in accordance with at least one embodiment of the present disclosure.
- FIG. 5 shows a specific implementation process of the training process S405 of FIG. 4.
- FIG. 6 illustrates a specific implementation flow of the resolution method of FIG. 3 in accordance with at least one embodiment of the present disclosure.
- FIG. 7 shows a specific implementation process of the second training process S607 of FIG. 6.
- FIG. 8 shows a block diagram of a super-resolution system in accordance with an embodiment of the present disclosure.
- FIG. 9 illustrates a block diagram of a particular implementation of the resolution system of FIG. 8 in accordance with at least one embodiment of the present disclosure.
- FIG. 10 illustrates a block diagram of a particular implementation of the resolution system of FIG. 8 in accordance with at least one embodiment of the present disclosure.
- Figure 1 shows a general method of face resolution techniques.
- face recognition is performed using a PCA (Principal Component Analysis) algorithm.
- the low-resolution image is mapped to a high-resolution image, the constraint is used for face reconstruction, and finally the high-resolution image is output.
- FIG. 2 illustrates another general method of face resolution techniques.
- Feature mapping is performed on the input face image, and the high-frequency details with high similarity are added to complete the reconstruction of the face.
- the high-resolution face image is then output.
- A problem with this approach is that the detail filling in the face feature reconstruction is based on the reconstructed image, so the output appears more unnatural and unreal.
- a deep neural network is used to construct the training model from a database of high- and low-resolution faces. Once the model fits well, the sample library and its size can be changed at any time; it is only necessary to retrain the whole model to obtain new feature filtering parameters.
- the method adds similarity information of the face feature parts while using the information of the original input image for scaling, enriching the facial details in the enlarged image so that clarity is noticeably improved.
- FIG. 3 illustrates a flow chart of a super-resolution method in accordance with an embodiment of the present disclosure.
- step S301 a sample library is created using the original high resolution image set.
- step S302 the convolutional structure network is trained using the sample library.
- step S303 the low resolution input signal is processed using the trained convolutional structure network to obtain a high resolution output signal.
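Steps S301-S303 can be sketched as a minimal pipeline. This is an illustrative toy, not the patented implementation: the `ConvStructureNetwork` class, its single "filter parameter", and the nearest-neighbor upscaling are all assumptions standing in for the real convolutional structure network.

```python
import numpy as np

class ConvStructureNetwork:
    """Toy stand-in for the patent's convolutional structure network."""
    def __init__(self):
        self.scale = 1.0  # the only "filter parameter" in this sketch

    def update(self, lr, hr):
        # One crude training step: move toward matching mean brightness.
        self.scale = 0.9 * self.scale + 0.1 * (hr.mean() / max(lr.mean(), 1e-8))

    def forward(self, lr):
        # Upscale by pixel repetition, then apply the learned scale.
        return np.repeat(np.repeat(lr, 2, axis=0), 2, axis=1) * self.scale

def downsample(img, f=2):
    return img[::f, ::f]

def build_sample_library(hr_images):          # step S301
    return [(downsample(img), img) for img in hr_images]

def train(net, library):                      # step S302
    for lr, hr in library:
        net.update(lr, hr)
    return net

def super_resolve(net, lr):                   # step S303
    return net.forward(lr)

net = train(ConvStructureNetwork(), build_sample_library([np.ones((8, 8))]))
out = super_resolve(net, np.ones((4, 4)))     # 4x4 input -> 8x8 output
```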
- FIG. 4 illustrates a specific implementation flow of the resolution method of FIG. 3 in accordance with at least one embodiment of the present disclosure.
- step S301 in FIG. 3 further includes steps S401-S404.
- step S401 the original high resolution image set is downsampled to obtain a low resolution image set.
- the downsampling process may take, for example, an existing or future process capable of achieving the same function, such as a linear convolution process.
- step S402 face feature information of the low resolution image is extracted using the face feature extraction method.
- the face feature extraction method may be an existing or future method capable of achieving the same function such as an edge detection algorithm.
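As one concrete instance of such an edge detection algorithm (an illustrative assumption; the patent names no specific operator), a Sobel gradient magnitude can serve as the extracted feature information:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def conv2d(img, kernel):
    """Same-size 2-D convolution with edge padding (3x3 kernels only)."""
    pad = kernel.shape[0] // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

def edge_features(img):
    """Gradient magnitude as a simple stand-in for face feature information."""
    gx, gy = conv2d(img, SOBEL_X), conv2d(img, SOBEL_Y)
    return np.hypot(gx, gy)
```

A flat image yields a zero feature map; contours and facial component boundaries produce strong responses.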
- step S403 face feature point marking is performed on the high resolution image to obtain face feature information of the high resolution image.
- the structure of face images mainly consists of facial components, contours, and smooth regions. Landmark detection is used for the local facial components and contours.
- step S404 the face feature information of the low resolution image and that of the high resolution image are used to establish a face feature sample library containing pairs of face feature information of the low resolution image and the associated high resolution image.
- step S302 in FIG. 3 further includes step S405.
- step S405 the face feature information pairs of the low resolution image and the high resolution image in the face feature sample library are trained to obtain the first filter parameter.
- the first filtering parameter is, for example, a classifier filtering parameter for the convolutional structure network.
- step S303 in FIG. 3 further includes steps S406-S408.
- step S406 face feature information of the low resolution image is input as an input signal.
- step S407 the face feature information of the input low resolution image is processed using the convolutional structure network based on the adjusted first filter parameter obtained in step S405.
- step S408 the face feature information of the high resolution image processed by the convolutional structure network is output as an output signal.
- FIG. 5 shows a specific implementation process of the training process S405 of FIG.
- step S501 correlation analysis is performed on the pairs of face feature information of the low resolution images and the associated high resolution images in the face feature sample library to obtain the first filter parameter of the convolutional structure network.
- steps S502 and S503 high-pass filtering and low-pass filtering are respectively performed on the face feature information of the high-resolution image, yielding high-frequency face feature information as the high-pass filtered face result and low-frequency face feature information as the low-pass filtered face result.
- high-pass filtering the facial feature information yields high-frequency features, for example facial structure contour information; low-pass filtering it yields detail information, such as details of facial skin texture/roughness.
- step S504 the high-pass filtered face result and the low-pass filtered face result are superimposed to obtain the superimposed result, that is, the superposition of the extracted high-frequency and low-frequency information (feature contour and detail texture).
- step S505 feature classification is performed on the superimposed result, and the detail template of the face feature information of the high resolution image is obtained as the feedback signal of the convolutional structure network. For example, different features a, b, c, etc. are each marked as a category, yielding detail templates of different categories.
- the prediction result signal obtained after the processing is substantially the same as the feedback signal. That is, the difference between the prediction result signal and the feedback signal is smaller than the first threshold, and the first threshold may be set according to actual conditions, for example, the first threshold may be less than or equal to 0.01.
- the facial feature information of the low resolution image is processed by the convolution structure network to obtain the facial feature information of the high resolution image.
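The high-pass/low-pass split and superposition of steps S502-S504 can be sketched as follows. This is illustrative only: a box blur is assumed as the low-pass filter (the patent does not specify one), and the high-pass band is taken as the complement, so their superposition reproduces the original feature map before feature classification is applied.

```python
import numpy as np

def box_blur(img, size=3):
    """Simple low-pass filter (box mean); a Gaussian would serve equally well."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + size, j:j + size].mean()
    return out

def split_and_superimpose(hr_features):
    low = box_blur(hr_features)     # low-pass result: detail/texture band
    high = hr_features - low        # high-pass result: structure/contour band
    return high + low               # superposition used to build the feedback signal
```

With complementary filters like these the superposition is exact; in the patented flow the superimposed result is then passed through feature classification (S505) to produce the detail templates.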
- the convolutional structure network is formed by alternately connecting a plurality of convolution layers and excitation layers.
- the number of convolutional layers and excitation layers can be set according to actual conditions, for example, 2 or more.
- each layer of the convolutional structure network takes the output of the previous layer as input and passes the output of this layer to the next layer.
- Each convolutional layer can include a plurality of filtering units having adjustable filtering parameters.
- the number of filtering units included in each convolutional layer may be the same or different.
- the convolutional layer extracts the features from the input signal or the feature map of the previous layer by a convolution operation to obtain a convolved facial feature map.
- the excitation layer is used to remove features with low sensitivity to the human eye.
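The alternating convolution/excitation structure described above can be sketched as follows. The averaging of feature maps between layers and the ReLU-style excitation are assumptions for illustration; the patent fixes neither the number of filters nor the excitation function.

```python
import numpy as np

def conv_layer(x, filters):
    """Each filter is a 3x3 kernel with adjustable weights; one feature map per filter."""
    maps = []
    for k in filters:
        padded = np.pad(x, 1, mode="edge")
        fm = np.zeros_like(x, dtype=float)
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                fm[i, j] = np.sum(padded[i:i + 3, j:j + 3] * k)
        maps.append(fm)
    return np.mean(maps, axis=0)  # collapse maps so layers chain simply

def excitation_layer(x):
    """ReLU-style excitation: suppress features with low response."""
    return np.maximum(x, 0.0)

def forward(x, layer_filters):
    """Alternate convolution and excitation; each layer feeds the next."""
    for filters in layer_filters:
        x = excitation_layer(conv_layer(x, filters))
    return x

# Two conv+excitation pairs, each with a single averaging kernel (assumed weights).
layers = [[np.ones((3, 3)) / 9.0], [np.ones((3, 3)) / 9.0]]
result = forward(np.ones((5, 5)), layers)
```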
- When the convolutional structure network is used to obtain the prediction result signal ŷ_i (the feature value computed from FM_i, the feature map output by the last excitation layer, for each of the m image sets in the face feature sample library), it is compared with the feedback signal y_i using a squared-error cost function of the form J(W,b) = (1/2m) Σ_{i=1}^{m} ||h_{W,b}(ŷ_i) − y_i||², where J(W,b) is the error rate.
- the partial derivative of the error rate with respect to each filter parameter is then calculated, and the filter parameters are adjusted according to the partial derivative (gradient).
- h_{W,b} is an empirical weight coefficient with default value 1; its size is adjusted according to experience depending on the complexity of the network.
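The cost and gradient step above can be sketched for the simplest possible model. This is an illustrative reduction (a single scalar filter parameter `w` acting linearly), not the patented network; the learning rate is an assumed value.

```python
import numpy as np

def error_rate(pred, feedback, m, h=1.0):
    """J(W,b) = (h / 2m) * sum of squared differences; h defaults to 1."""
    return (h / (2 * m)) * np.sum((pred - feedback) ** 2)

def gradient_step(w, pred, feedback, x, lr=0.01):
    """For a linear model pred = w * x, dJ/dw = mean((pred - feedback) * x);
    the parameter is adjusted against the gradient."""
    grad = np.mean((pred - feedback) * x)
    return w - lr * grad

x = np.ones(4)
feedback = 2 * x
w = 0.0
w = gradient_step(w, w * x, feedback, x)  # moves w toward the target value 2
```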
- FIG. 6 illustrates a specific implementation flow of the resolution method of FIG. 3 in accordance with at least one embodiment of the present disclosure.
- step S301 in FIG. 3 further includes steps S601-S605.
- step S601 the original high resolution image set is downsampled to obtain a low resolution image set.
- the downsampling process may take, for example, an existing or future process capable of achieving the same function, such as a linear convolution process.
- step S602 the face feature information of the low resolution image is extracted using the face feature extraction method.
- the face feature extraction method may be an existing or future method capable of achieving the same function such as an edge detection algorithm.
- step S603 the face feature point mark is performed on the high resolution image to obtain face feature information of the high resolution image.
- step S604 the face feature information of the low resolution image and that of the high resolution image are used to establish a face feature sample library containing pairs of face feature information of the low resolution image and the associated high resolution image.
- a library of image samples containing the low resolution image and the associated high resolution image pair is created using the low resolution image and the high resolution image.
- step S302 in FIG. 3 further includes steps S606 and S607.
- step S606 the face feature information pairs of the low resolution image and the high resolution image in the face feature sample library are trained to obtain the first filter parameter.
- step S607 the low resolution image and the high resolution image pair in the image sample library are trained to obtain the second filter parameter.
- step S303 in FIG. 3 further includes steps S608-S610.
- step S608 low resolution information is input as an input signal.
- step S609 the input signal is processed by the convolutional structure network based on the adjusted first filter parameter obtained in step S606 and the adjusted second filter parameter obtained in step S607.
- step S610 high-resolution information processed by the convolutional structure network is output as an output signal.
- the first training process S606 in FIG. 6 is the same as the training process S405 in FIG. 4 and is not repeated here.
- FIG. 7 shows a specific implementation process of the second training process S607 of FIG. 6.
- step S701 a correlation analysis is performed on the low resolution image and the associated high resolution image pair in the image sample library to obtain a second filter parameter of the convolutional structure network.
- steps S702 and S703 high-pass filtering and low-pass filtering are respectively performed on the high-resolution image to obtain image high-frequency information as the high-pass filtered image result and image low-frequency information as the low-pass filtered image result.
- the high-pass filtered image can obtain the high-frequency information of the image, which is the relatively prominent feature in the image; and the low-pass filtered image can obtain the low-frequency information of the image, that is, the detailed texture feature in the image.
- step S704 the high-pass filtered image result and the low-pass filtered image result are superimposed to obtain the superimposed result, that is, the superposition of the extracted high-frequency and low-frequency information (feature contour and detail texture).
- step S705 feature classification is performed on the superimposed result, and the detail template of the high-resolution image is obtained as the feedback signal of the convolutional structure network.
- different features a, b, c, etc. are each marked as a category, yielding detail templates of different categories.
- the second filtering parameter in the convolutional structure network is adjusted using the low resolution image as the input signal of the convolutional structure network, so that the prediction result signal obtained by processing the input signal with the adjusted second filtering parameter is substantially the same as the feedback signal; that is, the difference between the prediction result signal and the feedback signal is smaller than the first threshold, which may be set according to actual conditions, for example less than or equal to 0.01.
- the low resolution image is processed using a convolutional structure network to obtain a high resolution image.
- The specific training process of the second training process S607 is similar to the training process S405 in FIG. 5. The difference is that the pairs of face feature information of the low-resolution images and the associated high-resolution images in the face feature sample library are replaced with the low resolution image and high resolution image pairs in the image sample library. Therefore, it is not described again here.
- FIG. 8 shows a block diagram of a super-resolution system in accordance with an embodiment of the present disclosure.
- the resolution system includes a sample library construction device 801, a training device 802, and an output device 803.
- a sample library building device 801 is configured to build a sample library using the original high resolution image set.
- Training device 802 is configured to train the convolutional structure network using the sample library.
- Output device 803 is configured to process the low resolution input signal using the trained convolutional structure network to obtain a high resolution output signal.
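The three-device decomposition of FIG. 8 can be sketched as separate modules, echoing the modular and parallel design claimed earlier. The class names and the `ToyNetwork` stand-in are illustrative assumptions; each device could run on a different server with no shared state beyond the library and the trained network.

```python
import numpy as np

class ToyNetwork:
    """Minimal stand-in; real filter parameters are learned in S302."""
    def __init__(self):
        self.gain = 1.0

    def update(self, lr, hr):
        self.gain = hr.mean() / max(lr.mean(), 1e-8)

    def forward(self, lr):
        return np.repeat(np.repeat(lr, 2, axis=0), 2, axis=1) * self.gain

class SampleLibraryBuilder:          # device 801
    def build(self, hr_images):
        return [(img[::2, ::2], img) for img in hr_images]

class Trainer:                       # device 802
    def train(self, network, library):
        for lr, hr in library:
            network.update(lr, hr)
        return network

class OutputDevice:                  # device 803
    def run(self, network, lr_input):
        return network.forward(lr_input)

library = SampleLibraryBuilder().build([np.ones((8, 8))])
net = Trainer().train(ToyNetwork(), library)
hi = OutputDevice().run(net, np.ones((4, 4)))   # 4x4 -> 8x8
```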
- FIG. 9 illustrates a block diagram of a particular implementation of the resolution system of FIG. 8 in accordance with at least one embodiment of the present disclosure.
- the sample library construction apparatus 801 in FIG. 8 further includes a downsampling unit 901, a face parsing unit 902, a feature point marking unit 903, and a face feature sample library establishing unit 904.
- the training device 802 of FIG. 8 further includes a training unit 905.
- the output device 803 in FIG. 8 further includes an input unit 906, a convolutional structure network 907, and an output unit 908.
- the downsampling unit 901 is configured to downsample the original high resolution image set to obtain a low resolution image set.
- the downsampling process may take, for example, an existing or future process capable of achieving the same function, such as a linear convolution process.
- the face parsing unit 902 is configured to extract the face feature information of the low resolution image using the face feature extraction method.
- the face feature extraction method may be an existing or future method capable of achieving the same function such as an edge detection algorithm.
- the feature point marking unit 903 is configured to perform face feature point marking on the high resolution image to obtain face feature information of the high resolution image.
- the structure of face images mainly consists of facial components, contours, and smooth regions. Landmark detection is used for the local facial components and contours.
- the face feature sample library establishing unit 904 is configured to use the face feature information of the low resolution image and that of the high resolution image to establish a face feature sample library containing pairs of face feature information of the low resolution image and the associated high resolution image.
- the training unit 905 is configured to train the face feature information pairs of the low resolution image and the high resolution image in the face feature sample library to obtain the first filter parameter.
- the first filter parameter is, for example, a classifier filter parameter for the convolutional structure network.
- the input unit 906 is configured to input face feature information of the low resolution image as an input signal.
- the convolutional structure network 907 is configured to process the face feature information of the input low resolution image based on the adjusted first filtering parameters.
- the output unit 908 is configured to output face feature information of the high resolution image processed by the convolutional structure network as an output signal.
- FIG. 10 illustrates a block diagram of a particular implementation of the resolution system of FIG. 8 in accordance with at least one embodiment of the present disclosure.
- the sample library construction apparatus 801 in FIG. 8 further includes: a downsampling unit 1001, a face parsing unit 1002, a feature point marking unit 1003, a face feature sample library establishing unit 1004, and an image sample library establishing unit 1005.
- the training device 802 of FIG. 8 further includes a first training unit 1006 and a second training unit 1007.
- the output device 803 in FIG. 8 further includes an input unit 1008, a convolutional structure network 1009, and an output unit 1010.
- the downsampling unit 1001 is configured to downsample the original high resolution image set to obtain a low resolution image set.
- the downsampling process may take, for example, an existing or future process capable of achieving the same function, such as a linear convolution process.
- the face parsing unit 1002 is configured to extract the face feature information of the low resolution image using the face feature extraction method.
- the face feature extraction method may be an existing or future method capable of achieving the same function such as an edge detection algorithm.
- the feature point marking unit 1003 is configured to perform face feature point marking on the high resolution image to obtain face feature information of the high resolution image.
- the face feature sample library establishing unit 1004 is configured to use the face feature information of the low resolution image and that of the high resolution image to establish a face feature sample library containing pairs of face feature information of the low resolution image and the associated high resolution image.
- the image sample library building unit 1005 is configured to create a library of image samples containing low resolution images and associated high resolution image pairs using low resolution images and high resolution images.
- the first training unit 1006 is configured to train on the pairs of face feature information of the low resolution image and face feature information of the high resolution image in the face feature sample library to obtain a first filtering parameter.
- the second training unit 1007 is configured to train on the pairs of low resolution images and associated high resolution images in the image sample library to obtain a second filtering parameter.
- the input unit 1008 is configured to input face feature information of a low resolution image and/or a low resolution image as an input signal.
- the convolutional structure network 1009 is configured to process the input face feature information and/or image of the low resolution image based on the adjusted first and/or second filtering parameters.
- the output unit 1010 is configured to output the face feature information and/or image of the high resolution image processed by the convolutional structure network as an output signal.
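The convolutional structure network described elsewhere in the publication alternates convolutional layers (filtering units computing F(x) = Wx + b) with excitation layers. The following is a minimal sketch of such a forward pass, not the patented implementation; the kernel size, random initialization, and ReLU excitation are assumptions for illustration:

```python
import numpy as np

class FilterUnit:
    """One filtering unit computing F(x) = W*x + b as a sliding-window
    convolution, with adjustable filtering parameters W and b."""
    def __init__(self, kernel_size=3, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(kernel_size, kernel_size))
        self.b = 0.0

    def __call__(self, x):
        k = self.W.shape[0]
        h, w = x.shape
        out = np.zeros((h - k + 1, w - k + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = (x[i:i + k, j:j + k] * self.W).sum() + self.b
        return out

def forward(x, units):
    """Alternate each convolutional filtering unit with an excitation
    (here ReLU) layer, as in a convolutional structure network."""
    for unit in units:
        x = np.maximum(unit(x), 0.0)  # excitation layer
    return x
```

Running an 8x8 input through two such units yields a 4x4 non-negative feature map; a practical network would stack many more units and tune W and b during training.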
- The deep-learning-based resolution system includes a training model and a resolution model with a parallelized and hierarchical design.
- Although terms such as first, second, third, etc. may be used herein to describe various elements, components and/or portions, these elements, components and/or portions are not limited by these terms; the terms are only used to distinguish one element, component or portion from another. Thus, a first element, component or portion discussed below could be termed a second element, component or portion without departing from the teachings of the present disclosure.
- the computer program instructions may also be stored in a computer readable memory that can direct a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer readable memory implement the functions specified in the flowchart and/or block diagram block.
- the computer program instructions can also be loaded onto a computer or other programmable data processing apparatus, causing a series of operational steps to be performed on the computer or other programmable apparatus to produce computer-implemented processing, such that the instructions executed on the computer or other programmable apparatus implement the steps of the specified functions/actions in the flowchart and/or block diagram block.
- Each block may represent a module, segment, or portion of code that includes one or more executable instructions for implementing the specified logical function.
- the functions noted in the blocks may not occur in the order noted. For example, two blocks shown in succession may be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Human Computer Interaction (AREA)
- Databases & Information Systems (AREA)
- Biodiversity & Conservation Biology (AREA)
- Medical Informatics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
Description
Claims (20)
- A resolution method, comprising: establishing a sample library using an original high resolution image set; training a convolutional structure network using the sample library; and processing a low resolution input signal using the trained convolutional structure network to obtain a high resolution output signal.
- The resolution method according to claim 1, wherein the sample library comprises a face feature sample library, and establishing the sample library using the original high resolution image set further comprises: downsampling the original high resolution image set to obtain a low resolution image set; extracting face feature information of the low resolution images using a face feature extraction method; performing face feature point marking on the high resolution images to obtain face feature information of the high resolution images; and establishing, using the face feature information of the low resolution images and the face feature information of the high resolution images, a face feature sample library containing pairs of face feature information of low resolution images and face feature information of the associated high resolution images.
- The resolution method according to claim 2, wherein training the convolutional structure network using the sample library further comprises: performing correlation analysis on the pairs of face feature information of low resolution images and face feature information of the associated high resolution images in the face feature sample library to obtain first filtering parameters of the convolutional structure network; performing high-pass filtering and low-pass filtering respectively on the face feature information of the high resolution images to obtain a high-pass filtered face result and a low-pass filtered face result; superimposing the high-pass filtered face result and the low-pass filtered face result and performing feature classification to obtain a detail template of the face feature information of the high resolution image as a feedback signal of the convolutional structure network; and using the face feature information of the low resolution image as an input signal of the convolutional structure network, adjusting the first filtering parameters in the convolutional structure network, and processing the input signal with the convolutional structure network using the adjusted first filtering parameters to obtain a predicted result signal identical to the feedback signal.
- The resolution method according to claim 3, wherein processing the low resolution input signal using the trained convolutional structure network to obtain the high resolution output signal further comprises: inputting face feature information of a low resolution image; processing the input face feature information of the low resolution image with the convolutional structure network based on the adjusted first filtering parameters; and outputting face feature information of a high resolution image processed by the convolutional structure network.
- The resolution method according to any one of claims 2-4, wherein the sample library comprises an image sample library, and establishing the sample library using the original high resolution image set further comprises: establishing, using the low resolution image set and the high resolution image set, an image sample library containing pairs of low resolution images and associated high resolution images.
- The resolution method according to claim 5, wherein training the convolutional structure network using the sample library further comprises: performing correlation analysis on the pairs of low resolution images and associated high resolution images to obtain second filtering parameters of the convolutional structure network; performing high-pass filtering and low-pass filtering respectively on the high resolution images to obtain a high-pass filtering result and a low-pass filtering result; superimposing the high-pass filtering result and the low-pass filtering result and performing feature classification to obtain a detail template of the high resolution image as a feedback signal of the convolutional structure network; and using the low resolution image as an input signal of the convolutional structure network, adjusting the second filtering parameters in the convolutional structure network, and processing the input signal with the convolutional structure network using the adjusted second filtering parameters to obtain a predicted result signal identical to the feedback signal.
- The resolution method according to claim 6, wherein processing the low resolution input signal using the trained convolutional structure network to obtain the high resolution output signal further comprises: inputting a low resolution image; processing the input low resolution image with the convolutional structure network based on the adjusted second filtering parameters; and outputting a high resolution image processed by the convolutional structure network.
- The resolution method according to claim 7, wherein the convolutional structure network is formed by alternately connecting a plurality of convolutional layers and excitation layers, and each convolutional layer may include a plurality of filtering units with adjustable filtering parameters, wherein each filtering unit performs a convolution operation using the formula F(x)=Wx+b, where W and b are filtering parameters, x is the input, and F(x) is the output.
- The resolution method according to claim 9, wherein when the predicted result signal is not identical to the feedback signal, for each filtering parameter, a partial derivative of J(W, b) is calculated, and the first filtering parameter or the second filtering parameter is adjusted according to the partial derivative.
- The resolution method according to claim 10, wherein the first filtering parameters are classifier filtering parameters for the convolutional structure network.
- A resolution system, comprising: a sample library construction apparatus configured to establish a sample library using an original high resolution image set; a training apparatus configured to train a convolutional structure network using the sample library; and an output apparatus configured to process a low resolution input signal using the trained convolutional structure network to obtain a high resolution output signal.
- The resolution system according to claim 12, wherein the sample library comprises a face feature sample library, and the sample library construction apparatus further comprises: a downsampling unit configured to downsample the original high resolution image set to obtain a low resolution image set; a face parsing unit configured to extract face feature information of the low resolution images using a face feature extraction method; a feature point marking unit configured to perform face feature point marking on the high resolution images to obtain face feature information of the high resolution images; and a face feature sample library establishing unit configured to establish, using the face feature information of the low resolution images and the face feature information of the high resolution images, a face feature sample library containing pairs of face feature information of low resolution images and face feature information of the associated high resolution images.
- The resolution system according to claim 13, wherein the training apparatus further comprises: a first training unit configured to perform correlation analysis on the pairs of face feature information of low resolution images and face feature information of the associated high resolution images in the face feature sample library to obtain first filtering parameters of the convolutional structure network; perform high-pass filtering and low-pass filtering respectively on the face feature information of the high resolution images to obtain a high-pass filtered face result and a low-pass filtered face result; superimpose the high-pass filtered face result and the low-pass filtered face result and perform feature classification to obtain a detail template of the face feature information of the high resolution image as a feedback signal of the convolutional structure network; and use the face feature information of the low resolution image as an input signal of the convolutional structure network, adjust the first filtering parameters in the convolutional structure network, and process the input signal with the convolutional structure network using the adjusted first filtering parameters to obtain a predicted result signal identical to the feedback signal.
- The resolution system according to any one of claims 13-14, wherein the sample library comprises an image sample library, and the sample library construction apparatus further comprises: an image sample library establishing unit configured to establish, using the low resolution images and the high resolution images, an image sample library containing pairs of low resolution images and associated high resolution images.
- The resolution system according to claim 15, wherein the training apparatus further comprises: a second training unit configured to perform correlation analysis on the pairs of low resolution images and associated high resolution images to obtain second filtering parameters of the convolutional structure network; perform high-pass filtering and low-pass filtering respectively on the high resolution images to obtain a high-pass filtering result and a low-pass filtering result; superimpose the high-pass filtering result and the low-pass filtering result and perform feature classification to obtain a detail template of the high resolution image as a feedback signal of the convolutional structure network; and use the low resolution image as an input signal of the convolutional structure network, adjust the second filtering parameters in the convolutional structure network, and process the input signal with the convolutional structure network using the adjusted second filtering parameters to obtain a predicted result signal identical to the feedback signal.
- The resolution system according to claim 16, wherein the output apparatus further comprises: an input unit further configured to input low resolution face feature information and/or a low resolution image; the convolutional structure network, further configured to process the input low resolution face feature information and/or image based on the adjusted first and/or second filtering parameters; and an output unit further configured to output high resolution face feature information and/or a high resolution image processed by the convolutional structure network.
- The resolution system according to claim 17, wherein the convolutional structure network is formed by alternately connecting a plurality of convolutional layers and excitation layers, and each convolutional layer may include a plurality of filtering units with adjustable filtering parameters, wherein each filtering unit performs a convolution operation using the formula F(x)=Wx+b, where W and b are filtering parameters, x is the input, and F(x) is the output.
- The resolution system according to claim 19, wherein when the predicted result signal is not identical to the feedback signal, for each filtering parameter, a partial derivative of J(W, b) is calculated, and the first filtering parameter or the second filtering parameter is adjusted according to the partial derivative.
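The claims above adjust filtering parameters via partial derivatives of a cost J(W, b) whenever the predicted result signal differs from the feedback signal. The claims do not spell out the form of J; the sketch below assumes a squared-error cost over a single scalar filtering unit F(x) = Wx + b, purely for illustration:

```python
import numpy as np

def adjust_filter_params(x, feedback, W, b, lr=0.1, steps=2000):
    """Gradient descent on an assumed cost
    J(W, b) = 0.5 * mean((W*x + b - feedback)^2).

    Each step computes the partial derivatives dJ/dW and dJ/db and
    adjusts the filtering parameters accordingly, as the claims
    describe for the case where prediction != feedback.
    """
    for _ in range(steps):
        pred = W * x + b          # F(x) = Wx + b
        err = pred - feedback     # mismatch with the feedback signal
        dW = np.mean(err * x)     # partial derivative of J w.r.t. W
        db = np.mean(err)         # partial derivative of J w.r.t. b
        W -= lr * dW
        b -= lr * db
    return W, b

x = np.array([0.0, 1.0, 2.0, 3.0])
feedback = 2.0 * x + 1.0          # feedback generated with W=2, b=1
W, b = adjust_filter_params(x, feedback, W=0.0, b=0.0)
```

After enough iterations the recovered parameters approach the values that generated the feedback signal, at which point the predicted result signal matches the feedback signal and training stops.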
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/537,677 US10769758B2 (en) | 2016-03-21 | 2016-06-21 | Resolving method and system based on deep learning |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610161589.2 | 2016-03-21 | ||
CN201610161589.2A CN105847968B (zh) | 2016-03-21 | 2016-03-21 | 基于深度学习的解像方法和系统 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017161710A1 true WO2017161710A1 (zh) | 2017-09-28 |
Family
ID=56588161
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2016/086494 WO2017161710A1 (zh) | 2016-03-21 | 2016-06-21 | 基于深度学习的解像方法和系统 |
Country Status (3)
Country | Link |
---|---|
US (1) | US10769758B2 (zh) |
CN (1) | CN105847968B (zh) |
WO (1) | WO2017161710A1 (zh) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018138603A1 (en) * | 2017-01-26 | 2018-08-02 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor device and electronic device including the semiconductor device |
US10489887B2 (en) * | 2017-04-10 | 2019-11-26 | Samsung Electronics Co., Ltd. | System and method for deep learning image super resolution |
CN107633218B (zh) | 2017-09-08 | 2021-06-08 | 百度在线网络技术(北京)有限公司 | 用于生成图像的方法和装置 |
US11768979B2 (en) * | 2018-03-23 | 2023-09-26 | Sony Corporation | Information processing device and information processing method |
CN108875904A (zh) * | 2018-04-04 | 2018-11-23 | 北京迈格威科技有限公司 | 图像处理方法、图像处理装置和计算机可读存储介质 |
CN109977963B (zh) * | 2019-04-10 | 2021-10-15 | 京东方科技集团股份有限公司 | 图像处理方法、设备、装置以及计算机可读介质 |
CN112215761A (zh) * | 2019-07-12 | 2021-01-12 | 华为技术有限公司 | 图像处理方法、装置及设备 |
CN110543815B (zh) * | 2019-07-22 | 2024-03-08 | 平安科技(深圳)有限公司 | 人脸识别模型的训练方法、人脸识别方法、装置、设备及存储介质 |
CN111899252B (zh) * | 2020-08-06 | 2023-10-27 | 腾讯科技(深圳)有限公司 | 基于人工智能的病理图像处理方法和装置 |
CN112580617B (zh) * | 2021-03-01 | 2021-06-18 | 中国科学院自动化研究所 | 自然场景下的表情识别方法和装置 |
CN113658040A (zh) * | 2021-07-14 | 2021-11-16 | 西安理工大学 | 一种基于先验信息和注意力融合机制的人脸超分辨方法 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101216889A (zh) * | 2008-01-14 | 2008-07-09 | 浙江大学 | 一种融合全局特征与局部细节信息的人脸图像超分辨率方法 |
CN101299235A (zh) * | 2008-06-18 | 2008-11-05 | 中山大学 | 一种基于核主成分分析的人脸超分辨率重构方法 |
CN101719270A (zh) * | 2009-12-25 | 2010-06-02 | 武汉大学 | 一种基于非负矩阵分解的人脸超分辨率处理方法 |
CN103020940A (zh) * | 2012-12-26 | 2013-04-03 | 武汉大学 | 一种基于局部特征转换的人脸超分辨率重建方法 |
US20130241633A1 (en) * | 2011-09-13 | 2013-09-19 | Jeol Ltd. | Method and Apparatus for Signal Processing |
CN105120130A (zh) * | 2015-09-17 | 2015-12-02 | 京东方科技集团股份有限公司 | 一种图像升频系统、其训练方法及图像升频方法 |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101477684B (zh) * | 2008-12-11 | 2010-11-10 | 西安交通大学 | 一种利用位置图像块重建的人脸图像超分辨率方法 |
KR20110065997A (ko) * | 2009-12-10 | 2011-06-16 | 삼성전자주식회사 | 영상처리장치 및 영상처리방법 |
JP5706177B2 (ja) | 2010-02-09 | 2015-04-22 | パナソニック インテレクチュアル プロパティ コーポレーション オブアメリカPanasonic Intellectual Property Corporation of America | 超解像処理装置及び超解像処理方法 |
CN101950415B (zh) * | 2010-09-14 | 2011-11-16 | 武汉大学 | 一种基于形状语义模型约束的人脸超分辨率处理方法 |
US8743119B2 (en) * | 2011-05-24 | 2014-06-03 | Seiko Epson Corporation | Model-based face image super-resolution |
US9734558B2 (en) * | 2014-03-20 | 2017-08-15 | Mitsubishi Electric Research Laboratories, Inc. | Method for generating high-resolution images using regression patterns |
CN106462549B (zh) * | 2014-04-09 | 2020-02-21 | 尹度普有限公司 | 使用从显微变化中进行的机器学习来鉴定实体对象 |
CN105960657B (zh) * | 2014-06-17 | 2019-08-30 | 北京旷视科技有限公司 | 使用卷积神经网络的面部超分辨率 |
CN104899830B (zh) * | 2015-05-29 | 2017-09-29 | 清华大学深圳研究生院 | 一种图像超分辨方法 |
CN204948182U (zh) * | 2015-09-17 | 2016-01-06 | 京东方科技集团股份有限公司 | 一种图像升频系统及显示装置 |
2016
- 2016-03-21 CN CN201610161589.2A patent/CN105847968B/zh active Active
- 2016-06-21 US US15/537,677 patent/US10769758B2/en active Active
- 2016-06-21 WO PCT/CN2016/086494 patent/WO2017161710A1/zh active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101216889A (zh) * | 2008-01-14 | 2008-07-09 | 浙江大学 | 一种融合全局特征与局部细节信息的人脸图像超分辨率方法 |
CN101299235A (zh) * | 2008-06-18 | 2008-11-05 | 中山大学 | 一种基于核主成分分析的人脸超分辨率重构方法 |
CN101719270A (zh) * | 2009-12-25 | 2010-06-02 | 武汉大学 | 一种基于非负矩阵分解的人脸超分辨率处理方法 |
US20130241633A1 (en) * | 2011-09-13 | 2013-09-19 | Jeol Ltd. | Method and Apparatus for Signal Processing |
CN103020940A (zh) * | 2012-12-26 | 2013-04-03 | 武汉大学 | 一种基于局部特征转换的人脸超分辨率重建方法 |
CN105120130A (zh) * | 2015-09-17 | 2015-12-02 | 京东方科技集团股份有限公司 | 一种图像升频系统、其训练方法及图像升频方法 |
Also Published As
Publication number | Publication date |
---|---|
US20180089803A1 (en) | 2018-03-29 |
CN105847968A (zh) | 2016-08-10 |
US10769758B2 (en) | 2020-09-08 |
CN105847968B (zh) | 2018-12-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017161710A1 (zh) | 基于深度学习的解像方法和系统 | |
Liu et al. | Video super-resolution based on deep learning: a comprehensive survey | |
WO2017036092A1 (zh) | 超解像方法和系统、服务器、用户设备及其方法 | |
US20220222776A1 (en) | Multi-Stage Multi-Reference Bootstrapping for Video Super-Resolution | |
DE102020214863A1 (de) | Selbstüberwachtes verfahren und system zur tiefenschätzung | |
CN107204010A (zh) | 一种单目图像深度估计方法与系统 | |
US8538200B2 (en) | Systems and methods for resolution-invariant image representation | |
CN106327422A (zh) | 一种图像风格化重建方法及装置 | |
CN105488759B (zh) | 一种基于局部回归模型的图像超分辨率重建方法 | |
JP2017527011A (ja) | イメージをアップスケーリングする方法及び装置 | |
CN106530231B (zh) | 一种基于深层协作表达的超分辨率图像的重建方法及系统 | |
Huang et al. | Fast blind image super resolution using matrix-variable optimization | |
Luvizon et al. | Adaptive multiplane image generation from a single internet picture | |
CN104376544B (zh) | 一种基于多区域尺度放缩补偿的非局部超分辨率重建方法 | |
Liu et al. | Asflow: Unsupervised optical flow learning with adaptive pyramid sampling | |
Zhang et al. | Remote-sensing image superresolution based on visual saliency analysis and unequal reconstruction networks | |
CN117593702B (zh) | 远程监控方法、装置、设备及存储介质 | |
CN117541629B (zh) | 基于可穿戴头盔的红外图像和可见光图像配准融合方法 | |
CN109766938A (zh) | 基于场景标签约束深度网络的遥感影像多类目标检测方法 | |
CN114494786A (zh) | 一种基于多层协调卷积神经网络的细粒度图像分类方法 | |
WO2021213188A1 (zh) | 图像处理模型的训练方法、图像处理方法及对应的装置 | |
Niu et al. | Learning from multi-perception features for real-word image super-resolution | |
CN112184555A (zh) | 一种基于深度交互学习的立体图像超分辨率重建方法 | |
CN116152710A (zh) | 一种基于跨帧实例关联的视频实例分割方法 | |
US20230005104A1 (en) | Method and electronic device for performing ai based zoom of image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 15537677 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 16895062 Country of ref document: EP Kind code of ref document: A1 |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 16895062 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.05.2019) |