WO2022133944A1 - Image processing method and image processing apparatus - Google Patents

Image processing method and image processing apparatus

Info

Publication number
WO2022133944A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
neural network
network model
feature information
image
Prior art date
Application number
PCT/CN2020/139145
Other languages
French (fr)
Chinese (zh)
Inventor
郑凯
李选富
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to PCT/CN2020/139145
Priority to CN202080107407.8A
Publication of WO2022133944A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics

Definitions

  • the present application relates to the field of image technology, and in particular, to an image processing method and an image processing device.
  • VR: virtual reality
  • 3D: three-dimensional
  • a method commonly used in the industry is to sample the five-dimensional (5D) coordinates of spatial points along camera rays with a neural radiance field (NeRF), synthesize the density and color of the image at those coordinates, and obtain the final image using classical volume rendering techniques.
  • because this method computes pixel by pixel, it ignores the correlation between pixels and cannot adequately restore the color and details of the image.
  • the present application provides an image processing method and an image processing device, which improve the ability to restore the color and details of the image, thereby effectively improving the image quality.
  • an image processing method, comprising: acquiring spatial position information and viewing angle information of a picture in a camera; inputting the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, where the spatial light field feature information includes spatial three-dimensional feature information and color feature information, and the first neural network model is used to process the three-dimensional information of the image; and inputting the spatial light field feature information into a second neural network model to obtain a target image, where the second neural network model is used to restore the color information and detail information of the image.
  • the target image under the current viewing angle is generated by combining two trained neural network models: the first neural network model effectively reconstructs the three-dimensional light field information and color feature information of the target image, and the second neural network model effectively restores the color information and detail information of the image. This decouples the task of optimizing color details from the task of generating the spatial light field, thereby improving the ability to restore image color and detail and effectively improving image quality.
  • the spatial position information refers to the position of a light ray in three-dimensional space, which may be represented by the three-dimensional coordinates (x, y, z).
  • the viewing angle information refers to the direction in three-dimensional space of the ray emitted from that spatial position, which can be represented by the two parameters (θ, φ).
  • the spatial position information and viewing angle information can also be collectively referred to as 5D coordinates, expressed as (x, y, z, θ, φ).
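  • As an illustration of this 5D parameterization, the sketch below converts the two viewing angle parameters into a unit direction vector. The spherical-coordinate convention (θ as polar angle, φ as azimuth) is an assumption, since the application does not fix one.

```python
import numpy as np

def ray_direction(theta: float, phi: float) -> np.ndarray:
    """Unit view direction from the two viewing angle parameters.

    Assumes theta is the polar angle from the z-axis and phi the
    azimuth in the x-y plane (a convention not fixed by the application).
    """
    return np.array([
        np.sin(theta) * np.cos(phi),
        np.sin(theta) * np.sin(phi),
        np.cos(theta),
    ])

# A 5D sample: spatial position (x, y, z) plus viewing angles (theta, phi).
x, y, z, theta, phi = 0.0, 0.0, 1.5, np.pi / 4, np.pi / 2
sample_5d = np.array([x, y, z, theta, phi])
direction = ray_direction(theta, phi)
```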
  • the first neural network model may also be referred to as a three-dimensional representation network model or a light field reconstruction network model, which is not limited in this embodiment of the present application.
  • the second neural network model may be a neural network model for processing image color information and detail information, such as a convolutional neural network (CNN) model.
  • CNN: convolutional neural network
  • the second neural network model includes an encoding network and a decoding network; inputting the spatial light field feature information into the second neural network model to obtain a target image includes: inputting the spatial light field feature information into the decoding network to obtain the target image.
  • the spatial light field feature information includes spatial three-dimensional feature information and color feature information; through the first neural network model and the decoding network of the second neural network model, the VR device improves its ability to restore the color information and detail information of the target image, which improves image quality.
  • before the spatial position information and the viewing angle information are input into the first neural network model, the method further includes: inputting the color information of a sample image into the encoding network to obtain the color feature information and detail feature information of the sample image; and training the first neural network model using the color feature information and detail feature information of the sample image and the spatial position information corresponding to the sample image.
  • the VR device no longer trains the first neural network model with an image as the ground-truth reference; instead, it uses the first intermediate representation generated by the encoding network as the ground truth and the spatial position information and viewing angle information of the corresponding image as input. The first neural network model can thus learn the first intermediate representation, making the second intermediate representation it outputs more accurate. The VR device can then pass the second intermediate representation through the decoding network to output higher-quality images.
  • training the first neural network model includes: taking the color feature information and detail feature information of the sample image as the ground truth and the spatial position information corresponding to the sample image as the input, and training the first neural network model.
  • the embodiment of the present application adopts staged training: the second neural network model can be trained first to generate the above first intermediate representation (high-dimensional feature information containing color details), and the first intermediate representation is then used as the ground truth to train the first neural network model, so that the first neural network model learns the implicit representation of the light field and outputs a more accurate intermediate representation, namely the above second intermediate representation.
  • compared with end-to-end training of the three-dimensional light field representation and decoding networks, the staged training method of the embodiment of the present application converges more easily and trains more efficiently.
  • an image processing apparatus, comprising an acquisition module and a processing module, where the acquisition module is used to: acquire spatial position information and viewing angle information of a picture in a camera; input the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, where the spatial light field feature information includes spatial three-dimensional feature information and color feature information, and the first neural network model is used to process the three-dimensional information of the image; and input the spatial light field feature information into a second neural network model to obtain a target image, where the second neural network model is used to restore the color information and detail information of the image.
  • the second neural network model includes an encoding network and a decoding network; the processing module is specifically configured to: input the spatial light field feature information into the decoding network to obtain the target image.
  • the processing module is specifically configured to: before inputting the spatial position information and the viewing angle information into the first neural network model, input the color information of a sample image into the encoding network to obtain the color feature information and detail feature information of the sample image; and train the first neural network model using the color feature information and detail feature information of the sample image and the spatial position information corresponding to the sample image.
  • the processing module is specifically configured to: take the color feature information and detail feature information of the sample image as the ground truth and the spatial position information corresponding to the sample image as the input, and train the first neural network model.
  • another image processing apparatus, comprising a processor coupled to a memory and configured to execute instructions in the memory, so as to implement the method in any possible implementation of the first aspect.
  • the apparatus further includes a memory.
  • the apparatus further includes a communication interface to which the processor is coupled.
  • a processor including: an input circuit, an output circuit, and a processing circuit.
  • the processing circuit is configured to receive the signal through the input circuit and transmit the signal through the output circuit, so that the processor executes the method in any one of the possible implementation manners of the above first aspect.
  • in a specific implementation, the processor may be a chip, the input circuit may be an input pin, the output circuit may be an output pin, and the processing circuit may be transistors, gate circuits, flip-flops, and various logic circuits.
  • the input signal received by the input circuit may be received and input by, for example but not limited to, a receiver; the signal output by the output circuit may be, for example but not limited to, output to and transmitted by a transmitter; and the input circuit and the output circuit may be the same circuit, used as the input circuit and the output circuit at different times.
  • the embodiments of the present application do not limit the specific implementation manners of the processor and various circuits.
  • a processing apparatus including a processor and a memory.
  • the processor is configured to read the instructions stored in the memory, so as to execute the method in any one of the possible implementation manners of the first aspect.
  • optionally, there are one or more processors and one or more memories.
  • the memory may be integrated with the processor, or the memory may be provided separately from the processor.
  • the memory may be a non-transitory memory, such as a read-only memory (ROM), and may be integrated with the processor on the same chip or placed on separate chips; the embodiment of the present application does not limit the type of the memory or the arrangement of the memory and the processor.
  • ROM: read-only memory
  • the processing apparatus in the fifth aspect may be a chip, and the processor may be implemented by hardware or by software. When implemented by hardware, the processor may be a logic circuit, an integrated circuit, or the like; when implemented by software, the processor may be a general-purpose processor realized by reading software code stored in a memory, and the memory may be integrated in the processor or located outside the processor and exist independently.
  • a computer program product, comprising a computer program (also referred to as code or instructions) which, when run, causes a computer to execute the method in any possible implementation of the first aspect.
  • a computer-readable storage medium, which stores a computer program (also referred to as code or instructions) that, when run on a computer, causes the computer to execute the method in any possible implementation of the first aspect.
  • FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an image processing process provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a training process of a first neural network model provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a training process of a second neural network model provided by an embodiment of the present application.
  • FIG. 5 is a schematic block diagram of an image processing apparatus provided by an embodiment of the present application.
  • FIG. 6 is a schematic block diagram of another image processing apparatus provided by an embodiment of the present application.
  • VR: virtual reality
  • VR, also known as a virtual environment, spiritual environment, or artificial environment, refers to the use of computers to generate a virtual world that can directly impart visual, auditory, and tactile sensations to participants and allow them to observe and interact with it.
  • the application prospects of VR technology are very broad. At present, VR equipment is developing rapidly and becoming simple, easy to use, and popular. Unlike the rapid development of VR equipment, however, high-quality VR digital content is very limited. Different from traditionally displayed 2D digital content, in order to enhance the immersive experience (for example, having the displayed content change with the user's movement), the VR device needs to acquire the 3D light field content of the scene, and capturing the 3D light field content of a scene requires very complicated hardware, which limits the flexibility of 3D light field content acquisition.
  • the VR device can rely on image-based rendering (IBR) technology, that is, the ability to generate images from different viewing angles or different coordinates.
  • IBR: image-based rendering
  • VR devices can obtain information about the entire scene through IBR and generate images from any viewing angle in real time.
  • however, IBR presents two major challenges.
  • first, IBR needs to reconstruct a three-dimensional (3D) model, and the reconstructed 3D model must be detailed enough to show the occlusion relationships of objects in the scene.
  • second, the surface color and material generated from the 3D model rely on the representational power of the input images, but enlarging the input data set reduces the speed and performance of the model. This method therefore places certain requirements on the performance of the VR device and cannot adequately restore the color, detail, and other information of the image.
  • the VR device can use a neural radiance field (NeRF) to synthesize a representation of a complex scene from a sparse image data set, sampling the 5D coordinates of spatial points on the camera rays (e.g., the spatial position (x, y, z) and viewing direction (θ, φ)) to synthesize the density and color at the corresponding viewing angle. The VR device can then apply classical volume rendering to the density and color of the new viewing angle to obtain the image corresponding to the 5D coordinates, thereby continuously representing new viewing angles of the entire scene.
  • this method uses a fully connected deep learning network to perform pixel-by-pixel computation on the data set and does not use the correlation between pixels; the pixels are isolated from each other, and the ability to restore details in some scenes is insufficient. A sketch of the volume rendering step is given below.
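  • For context, the classical volume rendering step that such NeRF-style methods apply to the sampled densities and colors can be sketched as follows; the alpha-compositing quadrature is the standard formulation, and all names and shapes are illustrative rather than taken from the application.

```python
import numpy as np

def composite_ray(densities: np.ndarray, colors: np.ndarray,
                  deltas: np.ndarray) -> np.ndarray:
    """Classical volume rendering along one camera ray.

    densities: (N,) non-negative density at each sample
    colors:    (N, 3) RGB color at each sample
    deltas:    (N,) distance between consecutive samples
    """
    # Per-sample opacity: alpha_i = 1 - exp(-sigma_i * delta_i).
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: product of (1 - alpha) over all earlier samples.
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = transmittance * alphas
    # Pixel color is the weighted sum of the sampled colors.
    return (weights[:, None] * colors).sum(axis=0)
```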
  • the present application provides an image processing method and an image processing device that generate the target image under the current viewing angle by combining two trained neural network models: the first neural network model is used to effectively reconstruct the three-dimensional light field information and color feature information of the target image, and the second neural network model is used to effectively restore the color information and detail information of the image, thereby improving the ability to restore image color and detail and effectively improving image quality.
  • the terms "first", "second", and other ordinal numbers are only for convenience of description and are not intended to limit the scope of the embodiments of the present application; for example, they distinguish different neural network models such as the first neural network model and the second neural network model.
  • "at least one" means one or more, and "a plurality of" means two or more.
  • "and/or" describes the association relationship of associated objects and indicates that three relationships can exist; for example, "A and/or B" can indicate: A alone, both A and B, and B alone, where A and B can be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects.
  • "at least one of the following items" or similar expressions refer to any combination of these items, including any combination of a single item or multiple items. For example, "at least one of a, b, and c" may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where each of a, b, and c may be single or multiple.
  • the method in the embodiment of the present application may be performed by a VR device provided with a camera, and the VR device may be, for example, VR glasses, a VR headset, etc., which is not limited in the embodiment of the present application.
  • FIG. 1 is a schematic flowchart of an image processing method 100 in an embodiment of the present application. As shown in FIG. 1, the method 100 may include the following steps:
  • S101: Acquire the spatial position information and viewing angle information of a picture in the camera.
  • the spatial position information refers to the position of a light ray in three-dimensional space, which may be represented by the three-dimensional coordinates (x, y, z).
  • the viewing angle information refers to the direction in three-dimensional space of the ray emitted from that spatial position, which can be represented by the two parameters (θ, φ).
  • the spatial position information and viewing angle information can also be collectively referred to as 5D coordinates, expressed as (x, y, z, θ, φ).
  • S102: Input the spatial position information and the viewing angle information into the first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, where the spatial light field feature information includes spatial three-dimensional feature information and color feature information, and the first neural network model is used to process the three-dimensional information of the image.
  • the first neural network model may also be referred to as a three-dimensional representation network model or a light field reconstruction network model, which is not limited in this embodiment of the present application.
  • the color feature information included in the spatial light field feature information may be in the three-primary-color format (red-green-blue, RGB) or the YUV format, where "Y" represents brightness (luminance or luma) and "U" and "V" represent chrominance (chrominance or chroma), which describes the color and saturation of the image and specifies the color of a pixel.
  • the VR device may adopt different processing methods for the spatial light field feature information in different formats.
  • for example, when the color feature information included in the spatial light field feature information is in RGB format, the VR device may input the spatial light field feature information directly into the second neural network model to obtain the target image.
  • when the color feature information included in the spatial light field feature information is in YUV format, the VR device can convert the color feature information from YUV format to RGB format, and then input the spatial light field feature information containing the converted color feature information into the second neural network model to obtain the target image.
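  • A minimal sketch of this YUV-to-RGB conversion step is given below. The application does not specify which YUV variant or conversion matrix is used, so the BT.601-style coefficients here are an assumption.

```python
import numpy as np

# BT.601-style YUV -> RGB matrix (U and V centered on 0); the exact
# variant is an assumption, as the application does not specify one.
_YUV_TO_RGB = np.array([
    [1.0,  0.0,      1.13983],   # R = Y + 1.13983 * V
    [1.0, -0.39465, -0.58060],   # G = Y - 0.39465 * U - 0.58060 * V
    [1.0,  2.03211,  0.0],       # B = Y + 2.03211 * U
])

def yuv_to_rgb(yuv: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) YUV image to RGB, clipped to [0, 1]."""
    return np.clip(yuv @ _YUV_TO_RGB.T, 0.0, 1.0)
```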
  • the second neural network model may be a neural network model for processing image color information and detail information, such as a convolutional neural network (CNN) model.
  • CNN: convolutional neural network
  • the target image under the current viewing angle is generated by combining two trained neural network models: the first neural network model effectively reconstructs the three-dimensional light field information and color feature information of the target image, and the second neural network model effectively restores the color information and detail information of the image. This decouples the task of optimizing color details from the task of generating the spatial light field, thereby improving the ability to restore image color and detail and effectively improving image quality.
  • compared with the second implementation manner in the above prior art, the method of the embodiment of the present application improves the peak signal-to-noise ratio (PSNR) of the target image from 32 to 34, improving image quality.
  • PSNR: peak signal-to-noise ratio
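  • The PSNR metric cited above can be computed as follows for images scaled to a known peak value; this is the standard definition rather than code from the application.

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images in the range [0, peak]."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```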
  • for example, the second neural network model is an RGB network specializing in color and detail information, and the first neural network model is a NeRF network specializing in 3D information.
  • optionally, the second neural network model includes an encoding network and a decoding network; inputting the spatial light field feature information into the second neural network model to obtain a target image includes: inputting the spatial light field feature information into the decoding network to obtain the target image.
  • FIG. 2 shows a processing process of the image processing method provided by the embodiment of the present application.
  • the VR device can input the spatial position information and viewing angle information of the image into the first neural network model to obtain the spatial light field feature information, and then input the spatial light field feature information into the decoding network of the second neural network model to generate the target image.
  • the spatial light field feature information includes spatial three-dimensional feature information and color feature information; through the first neural network model and the decoding network of the second neural network model, the VR device improves its ability to restore the color information and detail information of the target image, which improves image quality.
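  • The inference pipeline of FIG. 2 can be sketched as follows. The layer sizes, feature dimension, and module structure are illustrative assumptions; the application does not disclose a concrete architecture.

```python
import torch
import torch.nn as nn

class LightFieldMLP(nn.Module):
    """First neural network model: 5D coordinates -> spatial light field features."""
    def __init__(self, feature_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(5, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, feature_dim),
        )

    def forward(self, coords_5d: torch.Tensor) -> torch.Tensor:
        return self.net(coords_5d)

class Decoder(nn.Module):
    """Decoding network of the second model: feature map -> RGB image."""
    def __init__(self, feature_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(feature_dim, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features)

# Inference: per-pixel 5D coordinates -> features -> decoded target image.
H, W = 32, 32
coords = torch.rand(H * W, 5)                  # (x, y, z, theta, phi) per pixel
features = LightFieldMLP()(coords)             # second intermediate representation
feature_map = features.T.reshape(1, 64, H, W)  # lay features out on the image grid
target_image = Decoder()(feature_map)          # shape (1, 3, H, W)
```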
  • the use of the neural network models provided by the embodiments of the present application is described in detail above with reference to FIG. 1 and FIG. 2; the training process of the neural network models is described in detail below with reference to FIG. 3 and FIG. 4.
  • the training process includes training of the first neural network model and training of the second neural network model.
  • the training of the first neural network model may include: inputting the color information of a sample image into the encoding network of the second neural network model to obtain the color feature information and detail feature information of the sample image; and training the first neural network model using the color feature information and detail feature information of the sample image and the spatial position information corresponding to the sample image.
  • the first intermediate representation contains information such as the color, detail, neighborhood, and correlation of the image: for example, the color, texture details, and position information of things in the image, as well as the relationships between the colors, details, and positions of different things.
  • the VR device may map the color feature information of the image to a high-dimensional feature space through the encoding network to obtain the first intermediate representation.
  • the VR device can use the first intermediate representation and the spatial position information corresponding to the sample image to train the first neural network model and obtain the second intermediate representation.
  • the VR device inputs the second intermediate representation into the above-mentioned decoding network to obtain the training result of the sample image.
  • the VR device may use the first intermediate representation and the spatial position information corresponding to the sample image to train the first neural network model, so that the first neural network model can learn the parameters in the first intermediate representation.
  • training the first neural network model includes: taking the color feature information and detail feature information of the sample image as the ground truth and the spatial position information corresponding to the sample image as the input, and training the first neural network model.
  • FIG. 3 shows the training process of the first neural network model provided by the embodiment of the present application.
  • the VR device can input the color information of the sample image into the encoding network of the second neural network model to generate the color feature information and detail feature information of the sample image (i.e., the first intermediate representation described above). The VR device then inputs the spatial position information and viewing angle information of the sample image into the first neural network model and uses the color feature information and detail feature information as the ground truth to train the first neural network model, so that the first neural network model can learn the image color, detail, neighborhood, correlation, and other information included in the first intermediate representation and its output can approach the ground truth, thereby completing the training of the first neural network model.
  • the VR device no longer trains the first neural network model with an image as the ground-truth reference; instead, it uses the first intermediate representation generated by the encoding network as the ground truth and the spatial position information and viewing angle information of the corresponding image as input to train the first neural network model, so that the first neural network model learns the first intermediate representation and the second intermediate representation it outputs is more accurate. The VR device can then pass the second intermediate representation through the decoding network to output higher-quality images. A sketch of this training stage follows.
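  • The hedged sketch below shows this second training stage: the frozen encoding network supplies the first intermediate representation as the ground truth for the first neural network model. The Adam optimizer, L2 loss, and tensor shapes are assumptions, not taken from the application.

```python
import torch
import torch.nn as nn

def train_first_model(first_model: nn.Module, encoder: nn.Module,
                      coords_5d: torch.Tensor, sample_colors: torch.Tensor,
                      steps: int = 1000, lr: float = 1e-4) -> None:
    """Fit the first neural network model to the first intermediate
    representation produced by the frozen encoding network.

    Assumes encoder(sample_colors) and first_model(coords_5d) emit
    feature tensors of matching shape.
    """
    encoder.eval()
    with torch.no_grad():
        # First intermediate representation, used as the ground truth.
        target = encoder(sample_colors)
    optimizer = torch.optim.Adam(first_model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        optimizer.zero_grad()
        # Second intermediate representation output by the first model.
        predicted = first_model(coords_5d)
        loss = loss_fn(predicted, target)
        loss.backward()
        optimizer.step()
```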
  • the VR device can also train the second neural network model.
  • FIG. 4 shows the training process of the second neural network model provided by the embodiment of the present application.
  • the VR device can input the color information of the sample image into the encoding network of the second neural network model, generate the color feature information and detail feature information of the sample image through the encoding network, and then input the obtained color feature information and detail feature information into the decoding network of the second neural network model to obtain the decoded image. This stage is sketched below.
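  • This first training stage amounts to training the encoding and decoding networks as an image autoencoder, which can be sketched as follows; the L1 reconstruction loss and Adam optimizer are assumptions rather than details from the application.

```python
import torch
import torch.nn as nn

def train_second_model(encoder: nn.Module, decoder: nn.Module,
                       sample_images: torch.Tensor,
                       steps: int = 1000, lr: float = 1e-4) -> None:
    """Train the encoding and decoding networks to reconstruct the
    sample images (an assumed autoencoder-style objective)."""
    params = list(encoder.parameters()) + list(decoder.parameters())
    optimizer = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.L1Loss()
    for _ in range(steps):
        optimizer.zero_grad()
        features = encoder(sample_images)   # color/detail feature information
        decoded = decoder(features)         # decoded image
        loss = loss_fn(decoded, sample_images)
        loss.backward()
        optimizer.step()
```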
  • the embodiment of the present application adopts staged training: the second neural network model can be trained first to generate the above first intermediate representation (high-dimensional feature information containing color details), and the first intermediate representation is then used as the ground truth to train the first neural network model, so that the first neural network model can learn the implicit representation of the light field and output a more accurate intermediate representation, namely the above second intermediate representation.
  • compared with end-to-end training of the three-dimensional light field representation and decoding networks, the staged training method of the embodiment of the present application converges more easily and trains more efficiently.
  • FIG. 5 shows an image processing apparatus 500 provided by an embodiment of the present application.
  • the apparatus 500 includes: an acquisition module 501 and a processing module 502 .
  • the acquisition module 501 is used to acquire the spatial position information and viewing angle information of the picture in the camera;
  • the processing module 502 is used to: input the spatial position information and the viewing angle information into the first neural network model to obtain the spatial light field feature information corresponding to the spatial position information and the viewing angle information, where the spatial light field feature information includes spatial three-dimensional feature information and color feature information, and the first neural network model is used to process the three-dimensional information of the image; and input the spatial light field feature information into a second neural network model to obtain a target image, where the second neural network model is used to restore the color information and detail information of the image.
  • the above-mentioned second neural network model includes an encoding network and a decoding network; the processing module 502 is configured to input the above-mentioned spatial light field feature information into the above-mentioned decoding network to obtain the above-mentioned target image.
  • the processing module 502 is used to: before inputting the spatial position information and the viewing angle information into the first neural network model, input the color information of the sample image into the encoding network to obtain the color feature information and detail feature information of the sample image; and use the color feature information and detail feature information of the sample image and the spatial position information corresponding to the sample image to train the first neural network model.
  • the processing module 502 is configured to use the color feature information and detail feature information of the sample image as true values, and use the spatial position information corresponding to the sample image as input to train the first neural network model.
  • the apparatus 500 here is embodied in the form of functional modules.
  • the term "module" as used herein may refer to an application-specific integrated circuit (ASIC), an electronic circuit, a processor (e.g., a shared processor, a dedicated processor, or a group of processors) and memory for executing one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functions.
  • ASIC: application-specific integrated circuit
  • the apparatus 500 may specifically be the VR device in the foregoing embodiments, or the functions of the VR device in the foregoing embodiments may be integrated into the apparatus 500; the apparatus 500 may be used to execute the processes and/or steps corresponding to the VR device in the above method embodiments, which are not repeated here to avoid repetition.
  • the apparatus 500 has the function of implementing the corresponding steps performed by the VR device in the above method; the functions may be implemented by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the apparatus 500 in FIG. 5 may also be a chip or a system of chips, such as a system on chip (system on chip, SoC).
  • SoC: system on chip
  • FIG. 6 shows another image processing apparatus 600 provided by an embodiment of the present application.
  • the apparatus 600 includes a processor 601, a transceiver 602, and a memory 603.
  • the processor 601, the transceiver 602, and the memory 603 communicate with each other through an internal connection path; the memory 603 is used to store instructions, and the processor 601 is used to execute the instructions stored in the memory 603 to control the transceiver 602 to send and/or receive signals.
  • the transceiver 602 is used to obtain the spatial position information and viewing angle information of the picture in the camera; the processor 601 is used to input the spatial position information and the viewing angle information into the first neural network model to obtain the spatial light field feature information corresponding to the spatial position information and the viewing angle information, where the spatial light field feature information includes spatial three-dimensional feature information and color feature information, and the first neural network model is used to process the three-dimensional information of the image; and the spatial light field feature information is input into a second neural network model to obtain a target image, where the second neural network model is used to restore the color information and detail information of the image.
  • the above-mentioned second neural network model includes an encoding network and a decoding network; the processor 601 is configured to input the above-mentioned spatial light field feature information into the above-mentioned decoding network to obtain the above-mentioned target image.
  • the processor 601 is used to: before inputting the spatial position information and the viewing angle information into the first neural network model, input the color information of the sample image into the encoding network to obtain the color feature information and detail feature information of the sample image; and use the color feature information and detail feature information of the sample image and the spatial position information corresponding to the sample image to train the first neural network model.
  • the processor 601 is configured to use the color feature information and detail feature information of the sample image as true values, and use the spatial position information corresponding to the sample image as input to train the first neural network model.
  • the apparatus 600 may specifically be the VR device in the above embodiments, or the functions of the VR device in the above embodiments may be integrated into the apparatus 600; the apparatus 600 may be used to execute the steps and/or processes corresponding to the VR device in the above method embodiments.
  • the memory 603 may include read only memory and random access memory and provide instructions and data to the processor. A portion of the memory may also include non-volatile random access memory.
  • the memory may also store device type information.
  • the processor 601 may be configured to execute the instructions stored in the memory, and when the processor executes the instructions, the processor may execute various steps and/or processes corresponding to the VR device in the foregoing method embodiments.
  • the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • CPU: central processing unit
  • DSP: digital signal processor
  • ASIC: application-specific integrated circuit
  • FPGA: field-programmable gate array
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • each step of the above method can be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software.
  • the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the memory, and the processor executes the instructions in the memory and completes the steps of the above method in combination with its hardware; to avoid repetition, detailed description is omitted here.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation, there may be other division methods.
  • for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling, direct coupling, or communication connection may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An image processing method and an image processing apparatus, which improve the capability of recovering image color and detail, thereby effectively improving image quality. The method comprises: acquiring spatial position information and viewing angle information of a picture in a camera (S101); inputting the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, wherein the spatial light field feature information comprises spatial three-dimensional feature information and color feature information, and the first neural network model is used for processing three-dimensional information of an image (S102); and inputting the spatial light field feature information into a second neural network model to obtain a target image, wherein the second neural network model is used for restoring color information and detail information of an image (S103).

Description

Image processing method and image processing device

Technical Field

The present application relates to the field of image technology, and in particular, to an image processing method and an image processing device.

Background

With the development of virtual reality (VR) technology, VR equipment is becoming simpler, easier to use, and more popular. However, unlike the rapid improvement of VR equipment, high-quality VR content is still very limited. The reason is that, unlike traditionally displayed two-dimensional (2D) digital content, in order to enhance the user's immersive experience (for example, having the displayed content change with the user's movement), a VR device needs to obtain the three-dimensional light field content of the scene and render images from any viewing angle from that content, thereby displaying high-quality VR content.

To display such high-quality VR content, a method commonly used in the industry is to sample the five-dimensional (5D) coordinates of spatial points along camera rays with a neural radiance field (NeRF), synthesize the density and color of the image at those coordinates, and obtain the final image using classical volume rendering techniques. When generating an image, this method computes pixel by pixel, ignoring the correlation between pixels, and cannot adequately restore the color and details of the image.
Summary of the Invention

The present application provides an image processing method and an image processing device, which improve the ability to restore image color and detail, thereby effectively improving image quality.

In a first aspect, an image processing method is provided, comprising: acquiring spatial position information and viewing angle information of a picture in a camera; inputting the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, where the spatial light field feature information includes spatial three-dimensional feature information and color feature information, and the first neural network model is used to process the three-dimensional information of the image; and inputting the spatial light field feature information into a second neural network model to obtain a target image, where the second neural network model is used to restore the color information and detail information of the image.

In the image processing method of the embodiment of the present application, the target image under the current viewing angle is generated by combining two trained neural network models: the first neural network model effectively reconstructs the three-dimensional light field information and color feature information of the target image, and the second neural network model effectively restores the color information and detail information of the image. This decouples the task of optimizing color details from the task of generating the spatial light field, thereby improving the ability to restore image color and detail and effectively improving image quality.
It should be understood that the above spatial position information refers to the position of a light ray in three-dimensional space, which may be represented by the three-dimensional coordinates (x, y, z). The viewing angle information refers to the direction in three-dimensional space of the ray emitted from that spatial position, which can be represented by the two parameters (θ, φ). The spatial position information and viewing angle information can also be collectively referred to as 5D coordinates, expressed as (x, y, z, θ, φ).
It should be understood that the first neural network model may also be referred to as a three-dimensional representation network model or a light field reconstruction network model, which is not limited in this embodiment of the present application. The second neural network model may be a neural network model for processing image color information and detail information, such as a convolutional neural network (CNN) model.

With reference to the first aspect, in some implementations of the first aspect, the second neural network model includes an encoding network and a decoding network, and inputting the spatial light field feature information into the second neural network model to obtain the target image includes: inputting the spatial light field feature information into the decoding network to obtain the target image.

In the embodiment of the present application, the spatial light field feature information includes spatial three-dimensional feature information and color feature information; through the first neural network model and the decoding network of the second neural network model, the VR device improves its ability to restore the color information and detail information of the target image, which improves image quality.

With reference to the first aspect, in some implementations of the first aspect, before the spatial position information and the viewing angle information are input into the first neural network model, the method further includes: inputting the color information of a sample image into the encoding network to obtain the color feature information and detail feature information of the sample image; and training the first neural network model using the color feature information and detail feature information of the sample image and the spatial position information corresponding to the sample image.

In the embodiment of the present application, the VR device no longer trains the first neural network model with an image as the ground-truth reference; instead, it uses the first intermediate representation generated by the encoding network as the ground truth and the spatial position information and viewing angle information of the corresponding image as input. The first neural network model can thus learn the first intermediate representation, making the second intermediate representation it outputs more accurate. The VR device can then pass the second intermediate representation through the decoding network to output higher-quality images.

With reference to the first aspect, in some implementations of the first aspect, training the first neural network model includes: taking the color feature information and detail feature information of the sample image as the ground truth and the spatial position information corresponding to the sample image as the input, and training the first neural network model.

It should be understood that when the color feature information and detail feature information generated by the encoding network of the second neural network model can be decoded by the decoding network and accurately restored into a higher-quality image, the training of the second neural network model is complete.

The embodiment of the present application adopts staged training: the second neural network model can be trained first to generate the above first intermediate representation (high-dimensional feature information containing color details), and the first intermediate representation is then used as the ground truth to train the first neural network model, so that the first neural network model learns the implicit representation of the light field and outputs a more accurate intermediate representation, namely the above second intermediate representation. Compared with end-to-end training of the three-dimensional light field representation and decoding networks, the staged training method of the embodiment of the present application converges more easily and trains more efficiently.
In a second aspect, an image processing apparatus is provided, comprising an acquisition module and a processing module, where the acquisition module is used to: acquire spatial position information and viewing angle information of a picture in a camera; input the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, where the spatial light field feature information includes spatial three-dimensional feature information and color feature information, and the first neural network model is used to process the three-dimensional information of the image; and input the spatial light field feature information into a second neural network model to obtain a target image, where the second neural network model is used to restore the color information and detail information of the image.

With reference to the second aspect, in some implementations of the second aspect, the second neural network model includes an encoding network and a decoding network, and the processing module is specifically configured to: input the spatial light field feature information into the decoding network to obtain the target image.

With reference to the second aspect, in some implementations of the second aspect, the processing module is specifically configured to: before the spatial position information and the viewing angle information are input into the first neural network model, input the color information of a sample image into the encoding network to obtain the color feature information and detail feature information of the sample image; and train the first neural network model using the color feature information and detail feature information of the sample image and the spatial position information corresponding to the sample image.

With reference to the second aspect, in some implementations of the second aspect, the processing module is specifically configured to: take the color feature information and detail feature information of the sample image as the ground truth and the spatial position information corresponding to the sample image as the input, and train the first neural network model.
In a third aspect, another image processing apparatus is provided, including a processor. The processor is coupled to a memory and may be configured to execute instructions in the memory to implement the method in any one of the possible implementations of the first aspect. Optionally, the apparatus further includes the memory. Optionally, the apparatus further includes a communication interface, and the processor is coupled to the communication interface.
In a fourth aspect, a processor is provided, including an input circuit, an output circuit, and a processing circuit. The processing circuit is configured to receive a signal through the input circuit and transmit a signal through the output circuit, so that the processor performs the method in any one of the possible implementations of the first aspect.
In a specific implementation process, the processor may be a chip, the input circuit may be an input pin, the output circuit may be an output pin, and the processing circuit may be a transistor, a gate circuit, a flip-flop, various logic circuits, or the like. An input signal received by the input circuit may be, for example but not limited to, received and input by a receiver; a signal output by the output circuit may be, for example but not limited to, output to a transmitter and transmitted by the transmitter; and the input circuit and the output circuit may be the same circuit, which serves as the input circuit and the output circuit at different times. The embodiments of this application do not limit the specific implementations of the processor and the circuits.
In a fifth aspect, a processing apparatus is provided, including a processor and a memory. The processor is configured to read instructions stored in the memory to perform the method in any one of the possible implementations of the first aspect.
Optionally, there are one or more processors and one or more memories.
Optionally, the memory may be integrated with the processor, or the memory and the processor may be disposed separately.
In a specific implementation process, the memory may be a non-transitory memory, for example a read-only memory (ROM); the memory may be integrated with the processor on the same chip, or disposed on a different chip. The embodiments of this application do not limit the type of the memory or the manner in which the memory and the processor are disposed.
The processing apparatus in the fifth aspect may be a chip, and the processor may be implemented by hardware or by software. When implemented by hardware, the processor may be a logic circuit, an integrated circuit, or the like; when implemented by software, the processor may be a general-purpose processor implemented by reading software code stored in a memory, and the memory may be integrated in the processor or located outside the processor and exist independently.
In a sixth aspect, a computer program product is provided. The computer program product includes a computer program (which may also be referred to as code or instructions). When the computer program is run, a computer is caused to perform the method in any one of the possible implementations of the first aspect.
In a seventh aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program (which may also be referred to as code or instructions). When the computer program is run on a computer, the computer is caused to perform the method in any one of the possible implementations of the first aspect.
Description of Drawings
FIG. 1 is a schematic flowchart of an image processing method according to an embodiment of this application;
FIG. 2 is a schematic diagram of an image processing process according to an embodiment of this application;
FIG. 3 is a schematic diagram of a training process of a first neural network model according to an embodiment of this application;
FIG. 4 is a schematic diagram of a training process of a second neural network model according to an embodiment of this application;
FIG. 5 is a schematic block diagram of an image processing apparatus according to an embodiment of this application;
FIG. 6 is a schematic block diagram of another image processing apparatus according to an embodiment of this application.
Description of Embodiments
The technical solutions in this application are described below with reference to the accompanying drawings.
Virtual reality (VR), also known as a virtual environment or an artificial environment, is a technology that uses a computer to generate a virtual world that directly applies visual, auditory, and tactile sensations to a participant and allows the participant to observe and operate it interactively.
VR technology has a very broad prospect. At present, VR devices are developing rapidly and becoming simple, easy to use, and widespread. However, unlike the rapid development of VR devices, high-quality VR digital content is very limited. Different from conventionally displayed 2D digital content, to enhance the immersive experience (for example, the displayed content changes as the viewer moves), a VR device needs to obtain the three-dimensional light field content of a scene, and capturing that content requires very complex hardware, which limits the flexibility of acquiring three-dimensional light field content.
To enable a user to obtain accurate light field information for display on a VR device simply by carrying the VR device around a scene once, the industry usually adopts the following two implementations.
In a first implementation, the VR device may use image-based rendering (IBR) technology, that is, the capability of generating images at different viewing angles or different coordinates. The VR device may obtain information about the entire scene through IBR and generate an image from any viewing angle in real time. However, for most scenes, IBR faces two major challenges. First, IBR needs to reconstruct a three-dimensional (3D) model, and the reconstructed 3D model must be sufficiently detailed and must represent the occlusion relationships of objects in the scene. Second, the surface colors and materials of the objects generated by the 3D model depend on the representation capability of the input images, but enlarging the input data set reduces the speed and performance of the model. Therefore, this method places certain requirements on the performance of the VR device and has insufficient capability to restore information such as the color and details of an image.
In a second implementation, the VR device may use a neural radiance field (NeRF) to synthesize a representation of a complex scene from a sparse set of images, sampling the 5D coordinates of spatial points on each camera ray, that is, the spatial position (x, y, z) and the viewing direction (θ, φ), to synthesize the density and color at the corresponding viewing angle. The VR device may then apply classical volume rendering to the density and color at the new viewing angle to obtain the image corresponding to the 5D coordinates, thereby accomplishing the task of continuously generating new-view images that represent the entire scene. However, this method uses a fully connected deep learning network to compute the data set pixel by pixel; it does not exploit the correlation between pixels, the pixels are isolated from one another, and its capability to restore details in some scenes is insufficient.
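For context, the volume rendering step referred to above is commonly written as the following integral in the NeRF literature; the formula below is background material and is not part of this disclosure:

```latex
% Expected color of camera ray r(t) = o + t*d between near/far bounds t_n, t_f:
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\, \sigma(\mathbf{r}(t))\, \mathbf{c}(\mathbf{r}(t), \mathbf{d})\, dt,
\qquad
T(t) = \exp\left( -\int_{t_n}^{t} \sigma(\mathbf{r}(s))\, ds \right)
```

Here σ is the volume density, c is the view-dependent color, and T(t) is the transmittance accumulated along the ray up to depth t.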
In view of this, this application provides an image processing method and an image processing apparatus that generate a target image at the current viewing angle by combining two trained neural network models: the first neural network model effectively reconstructs the three-dimensional light field information and color feature information of the target image, and the second neural network model effectively restores the color information and detail information of the image, thereby improving the capability to restore image color and details and effectively improving image quality.
Before the method and apparatus provided in the embodiments of this application are described, the following points are noted.
First, in the embodiments shown below, terms and English abbreviations, such as viewing angle information, color information, and spatial position information, are exemplary examples given for ease of description and shall not constitute any limitation on this application. This application does not exclude the possibility that other terms capable of achieving the same or similar functions may be defined in existing or future protocols.
Second, in the embodiments shown below, "first", "second", and various numerals are merely distinctions made for ease of description and are not intended to limit the scope of the embodiments of this application; for example, they distinguish different neural networks, such as the first neural network model and the second neural network model.
Third, "at least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate that only A exists, both A and B exist, or only B exists, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following" or a similar expression means any combination of the listed items, including any combination of a single item or a plurality of items. For example, at least one of a, b, and c may indicate a, b, c, a and b, a and c, b and c, or a, b, and c, where each of a, b, and c may be singular or plural.
To make the objectives and technical solutions of this application clearer and more intuitive, the image processing method and the image processing apparatus provided in this application are described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely intended to explain this application, not to limit it.
It should be understood that the method in the embodiments of this application may be performed by a VR device provided with a camera, for example, VR glasses or a VR headset; this is not limited in the embodiments of this application.
FIG. 1 is a schematic flowchart of an image processing method 100 according to an embodiment of this application. As shown in FIG. 1, the method 100 may include the following steps.
S101: Acquire spatial position information and viewing angle information of a picture in a camera.
It should be understood that the spatial position information refers to the position of a light ray in three-dimensional space and may be represented by three-dimensional coordinates (x, y, z). The viewing angle information refers to the direction, in three-dimensional space, of the ray emitted from that spatial position and may be represented by the two parameters (θ, φ). Together, the spatial position information and the viewing angle information may also be referred to as 5D coordinates, denoted (x, y, z, θ, φ).
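As a minimal illustration of how such 5D coordinates can be assembled for the sample points along one camera ray, consider the sketch below; the function name, the spherical-angle convention, and the use of NumPy are assumptions made for illustration and are not part of the original disclosure:

```python
import numpy as np

def ray_samples_5d(origin, theta, phi, t_vals):
    """Build (x, y, z, theta, phi) samples along one camera ray.

    origin: (3,) ray origin in world space
    theta, phi: viewing direction angles in radians (assumed convention:
                theta measured from the z-axis, phi in the x-y plane)
    t_vals: (N,) sample depths along the ray
    """
    # Unit direction vector derived from the two viewing angles.
    direction = np.array([
        np.sin(theta) * np.cos(phi),
        np.sin(theta) * np.sin(phi),
        np.cos(theta),
    ])
    points = origin[None, :] + t_vals[:, None] * direction[None, :]  # (N, 3)
    angles = np.tile([theta, phi], (len(t_vals), 1))                 # (N, 2)
    return np.concatenate([points, angles], axis=1)                  # (N, 5)

samples = ray_samples_5d(np.zeros(3), 0.3, 1.2, np.linspace(2.0, 6.0, 64))
```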
S102: Input the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, where the spatial light field feature information includes spatial three-dimensional feature information and color feature information, and the first neural network model is used to process three-dimensional information of an image.
It should be understood that the first neural network model may also be referred to as a three-dimensional representation network model or a light field reconstruction network model; this is not limited in the embodiments of this application.
S103: Input the spatial light field feature information into a second neural network model to obtain a target image, where the second neural network model is used to restore color information and detail information of an image.
In the embodiments of this application, the color feature information included in the spatial light field feature information may be in the red-green-blue (RGB) format or the YUV format, where "Y" denotes luminance (luma) and "U" and "V" denote chrominance (chroma), which describe the color and saturation of an image and specify the color of a pixel. The VR device may process the spatial light field feature information differently depending on its format.
In a possible implementation, when the color feature information included in the spatial light field feature information is in the RGB format, the VR device may input the spatial light field feature information including the RGB-format color feature information into the second neural network model to obtain the target image.
In another possible implementation, when the color feature information included in the spatial light field feature information is in the YUV format, the VR device may convert the color feature information from the YUV format to the RGB format, and then input the spatial light field feature information including the converted color feature information into the second neural network model to obtain the target image.
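A minimal sketch of such a YUV-to-RGB conversion is shown below; it assumes full-range BT.601 coefficients with U and V centered at zero, whereas the original text does not specify which standard applies (BT.709, for example, uses different coefficients):

```python
import numpy as np

def yuv_to_rgb(yuv):
    """Convert an (H, W, 3) float array from YUV to RGB.

    Assumes full-range BT.601 with U and V centered at 0; clip keeps
    out-of-gamut values inside [0, 1].
    """
    y, u, v = yuv[..., 0], yuv[..., 1], yuv[..., 2]
    r = y + 1.13983 * v
    g = y - 0.39465 * u - 0.58060 * v
    b = y + 2.03211 * u
    return np.clip(np.stack([r, g, b], axis=-1), 0.0, 1.0)
```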
It should be understood that the second neural network model may be a neural network model that processes image color information and detail information, such as a convolutional neural network (CNN) model.
In the image processing method of the embodiments of this application, the target image at the current viewing angle is generated by combining two trained neural network models: the first neural network model effectively reconstructs the three-dimensional light field information and color feature information of the target image, and the second neural network model effectively restores the color information and detail information of the image, decoupling the task of optimizing color details from the task of generating a spatial light field, thereby improving the capability to restore image color and details and effectively improving image quality.
Assuming that the peak signal-to-noise ratio (PSNR) is used as the measure of image quality, compared with the second implementation of the prior art described above, the method of the embodiments of this application improves the PSNR of the target image from 32 to 34, that is, the image quality is improved.
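For reference, PSNR is a standard metric and can be computed as follows; the sketch assumes floating-point images in [0, 1] and is not specific to this application:

```python
import numpy as np

def psnr(reference, reconstructed, peak=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer images."""
    mse = np.mean((reference - reconstructed) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```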
Optionally, the second neural network model is an RGB network dedicated to processing color detail information, and the first neural network model is a NeRF network dedicated to processing 3D information. By combining the two networks, the embodiments of this application can improve the overall output quality.
As an optional embodiment, the second neural network model includes an encoding network and a decoding network, and inputting the spatial light field feature information into the second neural network model to obtain the target image includes: inputting the spatial light field feature information into the decoding network to obtain the target image.
FIG. 2 shows the processing procedure of the image processing method according to an embodiment of this application. As shown in FIG. 2, the VR device may input the spatial position information and viewing angle information of the picture into the first neural network model to obtain the spatial light field feature information, and then input the spatial light field feature information into the decoding network of the second neural network model to generate the target image.
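A minimal PyTorch-style sketch of this inference pipeline follows; the class names, layer sizes, and the 64-dimensional feature width are illustrative assumptions, since the original text does not specify the network architectures:

```python
import torch
import torch.nn as nn

class LightFieldMLP(nn.Module):
    """First model: maps 5D coordinates to light field features (sketch)."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(5, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, coords_5d):   # (N, 5) -> (N, feat_dim)
        return self.net(coords_5d)

class Decoder(nn.Module):
    """Decoding half of the second model: features -> RGB image (sketch)."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(feat_dim, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, feat_map):    # (B, feat_dim, H, W) -> (B, 3, H, W)
        return self.net(feat_map)

# Inference: 5D coordinates -> spatial light field features -> target image.
H = W = 32
coords = torch.rand(H * W, 5)                 # one coordinate per pixel
features = LightFieldMLP()(coords)            # per-ray features, (H*W, 64)
feat_map = features.t().reshape(1, 64, H, W)  # arrange features as a map
target_image = Decoder()(feat_map)            # (1, 3, H, W)
```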
In the embodiments of this application, the spatial light field feature information includes spatial three-dimensional feature information and color feature information. Through the first neural network model and the decoding network of the second neural network model, the VR device improves its capability to restore the color information and detail information of the target image, thereby improving image quality.
The use of the neural network models provided in the embodiments of this application is described in detail above with reference to FIG. 1 and FIG. 2. The training process of the neural network models is described in detail below with reference to FIG. 3 and FIG. 4. The training process includes the training of the first neural network model and the training of the second neural network model.
The training of the first neural network model may include: inputting color information of a sample image into the encoding network of the second neural network model to obtain color feature information and detail feature information of the sample image; and training the first neural network model by using the color feature information and the detail feature information of the sample image and spatial position information corresponding to the sample image.
It should be understood that the image color information and detail information may be collectively referred to as a first intermediate representation. The first intermediate representation contains information such as the color, details, neighborhood, and correlations of the image, for example, the color, texture details, and position information of objects in the image, as well as the relationships between the colors, details, and positions of different objects.
In a possible implementation, the VR device may map the color feature information of an image to a high-dimensional feature space through the encoding network to obtain the first intermediate representation. The VR device may train the first neural network model by using the first intermediate representation and the spatial position information corresponding to the sample image, and obtain a second intermediate representation. The VR device inputs the second intermediate representation into the decoding network to obtain the training result for the sample image.
In the embodiments of this application, the VR device may train the first neural network model by using the first intermediate representation and the spatial position information corresponding to the sample image, so that the first neural network model learns the parameters in the first intermediate representation.
As an optional embodiment, the training of the first neural network model includes: training the first neural network model by using the color feature information and the detail feature information of the sample image as ground truth and the spatial position information corresponding to the sample image as input.
FIG. 3 shows the training process of the first neural network model according to an embodiment of this application. As shown in FIG. 3, the VR device may input the color information of a sample image into the encoding network of the second neural network model and extract color feature information and detail feature information (that is, the first intermediate representation) from the color information through the encoding network. The VR device then inputs the spatial position information and viewing angle information of the sample image into the first neural network model and trains the first neural network model with the color feature information and the detail feature information as ground truth, so that the first neural network model learns the image color, detail, neighborhood, and correlation information contained in the first intermediate representation and its output approaches the ground truth, thereby completing the training of the first neural network model.
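A minimal sketch of this training step, reusing the LightFieldMLP module assumed in the earlier sketch, might look as follows; the stand-in encoder, the MSE loss, and the data loader are assumptions for illustration, none of which are specified in the original text:

```python
import torch
import torch.nn as nn

first_model = LightFieldMLP(feat_dim=64)           # model being trained
encoder = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1))  # stand-in encoder
encoder.requires_grad_(False)  # the ground-truth provider stays frozen

optimizer = torch.optim.Adam(first_model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

for sample_rgb, coords_5d in loader:  # assumed loader: images + 5D coords
    with torch.no_grad():
        # First intermediate representation from the encoding network.
        target_feat = encoder(sample_rgb)          # (B, 64, H, W)
        target_feat = target_feat.flatten(2).transpose(1, 2).reshape(-1, 64)
    pred_feat = first_model(coords_5d)             # (B*H*W, 64)
    loss = loss_fn(pred_feat, target_feat)         # features as ground truth
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```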
In the embodiments of this application, the VR device no longer uses images as the ground-truth reference when training the first neural network model. Instead, it uses the first intermediate representation generated by the encoding network as ground truth, with the spatial position information and viewing angle information of the corresponding image as input, so that the first neural network model learns the first intermediate representation and the second intermediate representation it outputs is more accurate. The VR device may pass the second intermediate representation through the decoding network to output a higher-quality image.
It should be understood that before the VR device uses the second neural network model, or before the VR device obtains, through the encoding network of the second neural network model, the ground truth required for training the first neural network model, the VR device may also train the second neural network model.
Because the second neural network model includes an encoding network and a decoding network, the second neural network model may be trained with the encoding network and the decoding network together. FIG. 4 shows the training process of the second neural network model according to an embodiment of this application. As shown in FIG. 4, the VR device may input the color information of a sample image into the encoding network of the second neural network model, generate the color feature information and the detail feature information of the sample image through the encoding network, input the obtained color feature information and detail feature information into the decoding network of the second neural network model, and obtain a decoded image.
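This stage is effectively the training of an image autoencoder; a minimal sketch under the same illustrative assumptions (layer shapes, MSE reconstruction loss, and an assumed data loader) is:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(64, 64, 3, padding=1))
decoder = nn.Sequential(nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid())

params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-4)
loss_fn = nn.MSELoss()

for sample_rgb in image_loader:          # assumed loader of (B, 3, H, W)
    features = encoder(sample_rgb)       # first intermediate representation
    decoded = decoder(features)          # reconstructed image
    loss = loss_fn(decoded, sample_rgb)  # encoder and decoder train jointly
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```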
It should be understood that when the color feature information and the detail feature information generated by the encoding network of the second neural network model can be decoded by the decoding network of the second neural network model and accurately restored to a relatively high-quality image, the training of the second neural network model is complete.
The embodiments of this application adopt a staged training method: the second neural network model may be trained first to generate the first intermediate representation (high-dimensional feature information containing color details), and the first intermediate representation is then used as ground truth to train the first neural network model, so that the first neural network model learns an implicit representation of the light field and outputs a more accurate intermediate representation, that is, the second intermediate representation. Compared with end-to-end training of a single model combining the three-dimensional light field representation and the decoding network, the staged training method of the embodiments of this application converges more easily and trains more efficiently.
It should be understood that the sequence numbers of the foregoing processes do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic and shall not constitute any limitation on the implementation of the embodiments of this application.
The image processing method provided in the embodiments of this application is described in detail above with reference to FIG. 1 to FIG. 4. The image processing apparatus provided in the embodiments of this application is described in detail below with reference to FIG. 5 and FIG. 6.
FIG. 5 shows an image processing apparatus 500 according to an embodiment of this application. The apparatus 500 includes an acquisition module 501 and a processing module 502.
The acquisition module 501 is configured to acquire spatial position information and viewing angle information of a picture in a camera. The processing module 502 is configured to: input the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, where the spatial light field feature information includes spatial three-dimensional feature information and color feature information, and the first neural network model is used to process three-dimensional information of an image; and input the spatial light field feature information into a second neural network model to obtain a target image, where the second neural network model is used to restore color information and detail information of an image.
Optionally, the second neural network model includes an encoding network and a decoding network, and the processing module 502 is configured to input the spatial light field feature information into the decoding network to obtain the target image.
Optionally, the processing module 502 is configured to: before the spatial position information and the viewing angle information are input into the first neural network model, input color information of a sample image into the encoding network to obtain color feature information and detail feature information of the sample image; and train the first neural network model by using the color feature information and the detail feature information of the sample image and spatial position information corresponding to the sample image.
Optionally, the processing module 502 is configured to train the first neural network model by using the color feature information and the detail feature information of the sample image as ground truth and the spatial position information corresponding to the sample image as input.
It should be understood that the apparatus 500 here is embodied in the form of functional modules. The term "module" may refer to an application-specific integrated circuit (ASIC), an electronic circuit, a processor (for example, a shared processor, a dedicated processor, or a group processor) and memory that execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functions. In an optional example, a person skilled in the art may understand that the apparatus 500 may be specifically the VR device in the foregoing embodiments, or the functions of the VR device in the foregoing embodiments may be integrated into the apparatus 500, and the apparatus 500 may be configured to perform the processes and/or steps corresponding to the VR device in the foregoing method embodiments; to avoid repetition, details are not described here again.
The apparatus 500 has the functions of implementing the corresponding steps performed by the VR device in the foregoing method. The functions may be implemented by hardware, or by hardware executing corresponding software, where the hardware or software includes one or more modules corresponding to the foregoing functions.
In the embodiments of this application, the apparatus 500 in FIG. 5 may also be a chip or a chip system, for example, a system on chip (SoC).
FIG. 6 shows another image processing apparatus 600 according to an embodiment of this application. The apparatus 600 includes a processor 601, a transceiver 602, and a memory 603. The processor 601, the transceiver 602, and the memory 603 communicate with one another through an internal connection path. The memory 603 is configured to store instructions, and the processor 601 is configured to execute the instructions stored in the memory 603 to control the transceiver 602 to send and/or receive signals.
The transceiver 602 is configured to acquire spatial position information and viewing angle information of a picture in a camera. The processor 601 is configured to: input the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, where the spatial light field feature information includes spatial three-dimensional feature information and color feature information, and the first neural network model is used to process three-dimensional information of an image; and input the spatial light field feature information into a second neural network model to obtain a target image, where the second neural network model is used to restore color information and detail information of an image.
Optionally, the second neural network model includes an encoding network and a decoding network, and the processor 601 is configured to input the spatial light field feature information into the decoding network to obtain the target image.
Optionally, the processor 601 is configured to: before the spatial position information and the viewing angle information are input into the first neural network model, input color information of a sample image into the encoding network to obtain color feature information and detail feature information of the sample image; and train the first neural network model by using the color feature information and the detail feature information of the sample image and spatial position information corresponding to the sample image.
Optionally, the processor 601 is configured to train the first neural network model by using the color feature information and the detail feature information of the sample image as ground truth and the spatial position information corresponding to the sample image as input.
It should be understood that the apparatus 600 may be specifically the VR device in the foregoing embodiments, or the functions of the VR device in the foregoing embodiments may be integrated into the apparatus 600, and the apparatus 600 may be configured to perform the steps and/or processes corresponding to the VR device in the foregoing method embodiments. Optionally, the memory 603 may include a read-only memory and a random access memory and provide instructions and data to the processor. A part of the memory may further include a non-volatile random access memory; for example, the memory may further store device type information. The processor 601 may be configured to execute the instructions stored in the memory, and when the processor executes the instructions, the processor may perform the steps and/or processes corresponding to the VR device in the foregoing method embodiments.
It should be understood that in the embodiments of this application, the processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
During implementation, the steps of the foregoing method may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The steps of the methods disclosed with reference to the embodiments of this application may be directly performed by a hardware processor, or performed by a combination of hardware and software modules in the processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor executes the instructions in the memory and completes the steps of the foregoing method in combination with its hardware. To avoid repetition, details are not described here again.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation shall not be considered beyond the scope of this application.
A person skilled in the art may clearly understand that, for convenience and brevity of description, for the specific working processes of the foregoing systems, apparatuses, and units, reference may be made to the corresponding processes in the foregoing method embodiments; details are not described here again.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. For example, the division into units is merely a logical function division; in actual implementation, there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of this application essentially, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (14)

  1. An image processing method, comprising:
    acquiring spatial position information and viewing angle information of a picture in a camera;
    inputting the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, wherein the spatial light field feature information comprises spatial three-dimensional feature information and color feature information, and the first neural network model is used to process three-dimensional information of an image; and
    inputting the spatial light field feature information into a second neural network model to obtain a target image, wherein the second neural network model is used to restore color information and detail information of an image.
  2. The method according to claim 1, wherein the second neural network model comprises an encoding network and a decoding network; and
    the inputting the spatial light field feature information into a second neural network model to obtain a target image comprises:
    inputting the spatial light field feature information into the decoding network to obtain the target image.
  3. The method according to claim 2, wherein before the inputting the spatial position information and the viewing angle information into a first neural network model, the method further comprises:
    inputting color information of a sample image into the encoding network to obtain color feature information and detail feature information of the sample image; and
    training the first neural network model by using the color feature information and the detail feature information of the sample image and spatial position information corresponding to the sample image.
  4. The method according to claim 3, wherein the training the first neural network model comprises:
    training the first neural network model by using the color feature information and the detail feature information of the sample image as ground truth and the spatial position information corresponding to the sample image as input.
  5. An image processing apparatus, comprising:
    an acquisition module, configured to acquire spatial position information and viewing angle information of a picture in a camera; and
    a processing module, configured to: input the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, wherein the spatial light field feature information comprises spatial three-dimensional feature information and color feature information, and the first neural network model is used to process three-dimensional information of an image; and input the spatial light field feature information into a second neural network model to obtain a target image, wherein the second neural network model is used to restore color information and detail information of an image.
  6. The apparatus according to claim 5, wherein the second neural network model comprises an encoding network and a decoding network; and
    the processing module is specifically configured to:
    input the spatial light field feature information into the decoding network to obtain the target image.
  7. The apparatus according to claim 6, wherein the processing module is specifically configured to:
    before the spatial position information and the viewing angle information are input into the first neural network model, input color information of a sample image into the encoding network to obtain color feature information and detail feature information of the sample image; and
    train the first neural network model by using the color feature information and the detail feature information of the sample image and spatial position information corresponding to the sample image.
  8. The apparatus according to claim 7, wherein the processing module is specifically configured to:
    train the first neural network model by using the color feature information and the detail feature information of the sample image as ground truth and the spatial position information corresponding to the sample image as input.
  9. An image processing apparatus, comprising a processor, wherein the processor is coupled to a memory and is configured to execute instructions stored in the memory to perform the following steps:
    acquiring spatial position information and viewing angle information of a picture in a camera;
    inputting the spatial position information and the viewing angle information into a first neural network model to obtain spatial light field feature information corresponding to the spatial position information and the viewing angle information, wherein the spatial light field feature information comprises spatial three-dimensional feature information and color feature information, and the first neural network model is used to process three-dimensional information of an image; and
    inputting the spatial light field feature information into a second neural network model to obtain a target image, wherein the second neural network model is used to restore color information and detail information of an image.
  10. The apparatus according to claim 9, wherein the second neural network model comprises an encoding network and a decoding network; and
    the processor is specifically configured to:
    input the spatial light field feature information into the decoding network to obtain the target image.
  11. The apparatus according to claim 10, wherein the processor is specifically configured to:
    before the spatial position information and the viewing angle information are input into the first neural network model, input color information of a sample image into the encoding network to obtain color feature information and detail feature information of the sample image; and
    train the first neural network model by using the color feature information and the detail feature information of the sample image and spatial position information corresponding to the sample image.
  12. The apparatus according to claim 11, wherein the processor is specifically configured to:
    train the first neural network model by using the color feature information and the detail feature information of the sample image as ground truth and the spatial position information corresponding to the sample image as input.
  13. A computer-readable storage medium, configured to store a computer program, wherein the computer program comprises instructions for implementing the method according to any one of claims 1 to 4.
  14. A chip system, comprising a processor, configured to call and run a computer program from a memory, so that a communication device on which the chip system is installed performs the method according to any one of claims 1 to 4.
PCT/CN2020/139145 2020-12-24 2020-12-24 Image processing method and image processing apparatus WO2022133944A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/139145 WO2022133944A1 (en) 2020-12-24 2020-12-24 Image processing method and image processing apparatus
CN202080107407.8A CN116569218A (en) 2020-12-24 2020-12-24 Image processing method and image processing apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/139145 WO2022133944A1 (en) 2020-12-24 2020-12-24 Image processing method and image processing apparatus

Publications (1)

Publication Number Publication Date
WO2022133944A1 true WO2022133944A1 (en) 2022-06-30

Family

ID=82157246

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/139145 WO2022133944A1 (en) 2020-12-24 2020-12-24 Image processing method and image processing apparatus

Country Status (2)

Country Link
CN (1) CN116569218A (en)
WO (1) WO2022133944A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272575A (en) * 2022-07-28 2022-11-01 中国电信股份有限公司 Image generation method and device, storage medium and electronic equipment
CN115714888A (en) * 2022-10-09 2023-02-24 名之梦(上海)科技有限公司 Video generation method, device, equipment and computer readable storage medium
CN116071484A (en) * 2023-03-07 2023-05-05 清华大学 Billion pixel-level intelligent reconstruction method and device for large-scene sparse light field

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100189342A1 (en) * 2000-03-08 2010-07-29 Cyberextruder.Com, Inc. System, method, and apparatus for generating a three-dimensional representation from one or more two-dimensional images
CN108510573A (en) * 2018-04-03 2018-09-07 南京大学 A method of the multiple views human face three-dimensional model based on deep learning is rebuild
CN109255843A (en) * 2018-09-26 2019-01-22 联想(北京)有限公司 Three-dimensional rebuilding method, device and augmented reality AR equipment
CN110163953A (en) * 2019-03-11 2019-08-23 腾讯科技(深圳)有限公司 Three-dimensional facial reconstruction method, device, storage medium and electronic device
CN110400337A (en) * 2019-07-10 2019-11-01 北京达佳互联信息技术有限公司 Image processing method, device, electronic equipment and storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272575A (en) * 2022-07-28 2022-11-01 中国电信股份有限公司 Image generation method and device, storage medium and electronic equipment
CN115272575B (en) * 2022-07-28 2024-03-29 中国电信股份有限公司 Image generation method and device, storage medium and electronic equipment
CN115714888A (en) * 2022-10-09 2023-02-24 名之梦(上海)科技有限公司 Video generation method, device, equipment and computer readable storage medium
CN115714888B (en) * 2022-10-09 2023-08-29 名之梦(上海)科技有限公司 Video generation method, device, equipment and computer readable storage medium
CN116071484A (en) * 2023-03-07 2023-05-05 清华大学 Billion pixel-level intelligent reconstruction method and device for large-scene sparse light field
US11908067B1 (en) 2023-03-07 2024-02-20 Tsinghua University Method and device for gigapixel-level light field intelligent reconstruction of large-scale scene

Also Published As

Publication number Publication date
CN116569218A (en) 2023-08-08

Similar Documents

Publication Publication Date Title
WO2022133944A1 (en) Image processing method and image processing apparatus
US10474227B2 (en) Generation of virtual reality with 6 degrees of freedom from limited viewer data
US20230377269A1 (en) Methods and systems for producing content in multiple reality environments
US20230377183A1 (en) Depth-Aware Photo Editing
WO2020192568A1 (en) Facial image generation method and apparatus, device and storage medium
CN112166604B (en) Volume capture of objects with a single RGBD camera
US20220245912A1 (en) Image display method and device
US10444931B2 (en) Vantage generation and interactive playback
JP2023521270A (en) Learning lighting from various portraits
US10521892B2 (en) Image lighting transfer via multi-dimensional histogram matching
WO2023207379A1 (en) Image processing method and apparatus, device and storage medium
WO2023241459A1 (en) Data communication method and system, and electronic device and storage medium
CN110533773A Three-dimensional face reconstruction method, apparatus, and related device
CN109658488B (en) Method for accelerating decoding of camera video stream through programmable GPU in virtual-real fusion system
CN116740261A (en) Image reconstruction method and device and training method and device of image reconstruction model
WO2022179087A1 (en) Video processing method and apparatus
KR20230149093A (en) Image processing method, training method for image processing, and image processing apparatus
WO2021173489A1 (en) Apparatus, method, and system for providing a three-dimensional texture using uv representation
Li et al. Dynamic View Synthesis with Spatio-Temporal Feature Warping from Sparse Views
Bai et al. Local-to-Global Panorama Inpainting for Locale-Aware Indoor Lighting Prediction
Zhang et al. Survey on controllable image synthesis with deep learning
WO2024119997A1 (en) Illumination estimation method and apparatus
Han et al. Learning residual color for novel view synthesis
US20240096041A1 (en) Avatar generation based on driving views
CN116681818B (en) New view angle reconstruction method, training method and device of new view angle reconstruction network

Legal Events

Code Description
121  Ep: the epo has been informed by wipo that ep was designated in this application
     Ref document number: 20966516; Country of ref document: EP; Kind code of ref document: A1
WWE  Wipo information: entry into national phase
     Ref document number: 202080107407.8; Country of ref document: CN
NENP Non-entry into the national phase
     Ref country code: DE
122  Ep: pct application non-entry in european phase
     Ref document number: 20966516; Country of ref document: EP; Kind code of ref document: A1