WO2023231918A1 - Image processing method and apparatus, electronic device and storage medium - Google Patents

Image processing method and apparatus, electronic device and storage medium

Info

Publication number
WO2023231918A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
processed
target
coordinate
model
Prior art date
Application number
PCT/CN2023/096537
Other languages
English (en)
French (fr)
Inventor
廖昀昊
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2023231918A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2621Cameras specially adapted for the electronic generation of special effects during image pickup, e.g. digital cameras, camcorders, video cameras having integrated special effects capability
    • G06T3/04

Definitions

  • Embodiments of the present disclosure relate to the field of image processing technology, and for example, to an image processing method and apparatus, an electronic device and a storage medium.
  • the present disclosure provides an image processing method and apparatus, an electronic device and a storage medium.
  • an embodiment of the present disclosure provides an image processing method, which method includes:
  • the style map, transformation matrix, velocity field map and the image to be processed are processed based on a single rendering pass to obtain a target special effect image corresponding to the image to be processed.
  • embodiments of the present disclosure also provide an image processing device, which includes:
  • An image acquisition module configured to collect an image to be processed including the target object, and determine the style map, transformation matrix and velocity field map corresponding to the image to be processed;
  • An image processing module configured to process the style map, transformation matrix, velocity field map and the image to be processed based on a single rendering pass to obtain a target special effect image corresponding to the image to be processed.
  • embodiments of the present disclosure also provide an electronic device, where the electronic device includes:
  • one or more processors;
  • a storage device configured to store one or more programs
  • when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the image processing method described in any one of the embodiments of the present disclosure.
  • embodiments of the present disclosure also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the image processing method described in any one of the embodiments of the present disclosure.
  • Figure 1 is a schematic flow chart of an image processing method provided by an embodiment of the present disclosure
  • Figure 2 is a schematic diagram of determining pixel attributes corresponding to points to be rendered provided by an embodiment of the present disclosure
  • Figure 3 is a schematic diagram of a rendering effect provided by an embodiment of the present disclosure.
  • Figure 4 is a schematic structural diagram of an image processing device provided by an embodiment of the present disclosure.
  • Figure 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • embodiments of the present disclosure provide an image processing method, device, electronic device, and storage medium.
  • the term “include” and its variations are open-ended, i.e., “including but not limited to”.
  • the term “based on” means “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; and the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • a prompt message is sent to the user to clearly remind the user that the operation requested will require the acquisition and use of the user's personal information. Therefore, users can autonomously choose whether to provide personal information to software or hardware such as electronic devices, applications, servers, or storage media that perform the operations of the embodiments of the present disclosure based on the prompt information.
  • the method of sending prompt information to the user may be, for example, a pop-up window, and the prompt information may be presented in the form of text in the pop-up window.
  • the pop-up window can also contain a selection control for the user to choose “agree” or “disagree” to provide personal information to the electronic device.
  • the embodiments of the present disclosure can perform image rendering.
  • for example, when image rendering needs to be performed in the process of generating a special effect image, the embodiments of the present disclosure can be used for image processing.
  • the process of generating special effect images may occur in short video shooting, a video call, a live video broadcast or a multi-person conversation scenario, in which the embodiments of the present disclosure may be used.
  • image rendering is mainly used for the further processing of images after they are collected.
  • the device for executing the special effects image processing method provided by the embodiment of the present disclosure can be integrated into application software that supports special effects image processing functions, and the software can be installed in an electronic device.
  • the electronic device can be a mobile terminal or a personal computer (PC), etc.
  • the application software can be a type of software for image/video processing, as long as it can realize image/video processing.
  • Figure 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure.
  • the embodiment of the present disclosure can perform image rendering.
  • the method can be executed by an image processing device, and the device can be implemented in the form of software and/or hardware.
  • it is implemented through an electronic device, which may be a mobile terminal, a PC, or a server.
  • the implementation of this embodiment may be executed by the server, may be executed by the client, or may be executed by the client and the server in cooperation.
  • the method includes:
  • S110: Collect the image to be processed including the target object, and determine the style map, transformation matrix and velocity field map corresponding to the image to be processed.
  • a control for triggering the special effects can be developed in advance.
  • when it is detected that the user triggers the control, the special effects triggering operation can be responded to, whereby the image to be processed is collected and processed.
  • the image to be processed may be an image captured by an application, or an image captured by the camera device provided on the terminal device, or each video frame collected during video shooting may be used as an image to be processed. It should be noted that each video frame is processed using the rendering method provided by the embodiment of the present disclosure, and that, after the special effect image corresponding to the first video frame is rendered, this embodiment is repeated for the next video frame to determine the corresponding special effect video frame.
  • the target objects can be users, animals and plants in the frame, etc.
  • the target object may correspond to a user, that is, special effects processing needs to be performed on the user in the image to be processed to obtain a corresponding special effect image.
  • which user in the frame is the target object can be pre-calibrated, or all users can be used as target users. For example, if special effects only need to be rendered for a specific user, a user image corresponding to that specific user can be uploaded in advance and the user's features determined, so that when the corresponding user appears in the display interface, a feature recognition algorithm can be used to determine whether the user is the calibrated specific user; if so, special effects processing is performed; otherwise, it is not.
  • the trigger timing for collecting the image to be processed including the target object includes at least one of the following: detecting that the special effects processing prop is triggered; detecting that the collected audio information triggers the special effects wake-up word; detecting that the target object is included in the frame; or detecting that a body movement of the target object is consistent with a preset body movement.
  • the special effects processing prop can be triggered by a button displayed on the display interface of the application software, and triggering of the button indicates that the current special effects image needs to be determined.
  • if the user triggers this button, it can be considered that special effects processing is required, and the collected image to be processed needs to be processed.
  • the advantage of determining whether to add special effects based on the content of the voice information is that it avoids user interaction with the display page and improves the intelligence of adding special effects.
  • Another implementation method may be to determine, according to the shooting field of view of the mobile terminal, whether the user's facial image is included in the field of view; when the user's facial image is detected, the application software can take the event of detecting the facial image as the operation of collecting the image to be processed.
  • the operation of collecting the image to be processed can also be triggered by detecting that an object in the frame performs a special effects processing action, such as an “OK” gesture.
  • Persons skilled in the art should understand that which event is selected as the trigger condition for the special effect can be set according to the actual situation, and the embodiments of the present disclosure do not specifically limit this.
  • the style map can be understood as a map corresponding to a certain characteristic style, and the style map corresponds to the facial area of the target object in the image to be processed.
  • the velocity field diagram can be understood as a view describing the motion of pixels, and it is a deformation diagram from a macro perspective. This velocity field map mainly corresponds to the motion field map of pixels in the facial area.
  • the velocity field diagram can be understood as consisting of many matrices, and each matrix is used to represent the displacement parameters of the corresponding pixel points.
  • the transformation matrix is used to process the pre-established mesh model (quad mesh) to transform the mesh model into the facial area of the target object.
  • a single rendering pass can be understood to mean that, when the image to be processed is rendered based on a shader, a single rendering pass can be used to process the above images to obtain the corresponding target special effect image.
  • Using a single rendering pass for processing can avoid the generation of multiple intermediate images during the rendering process.
  • determining the style map, transformation matrix and velocity field map corresponding to the image to be processed may include: processing the image to be processed based on a target style map generation model to obtain a style map corresponding to a target area, wherein the target area corresponds to the facial area of the target object; determining a velocity field map corresponding to at least one vertex texture coordinate in the mesh model, wherein the mesh model corresponds to the facial area of the target object; and determining a transformation matrix corresponding to the image to be processed in the rendering pipeline, so that the transformed mesh model corresponds to the facial area of the target object;
  • the texture coordinates of the mesh model are consistent with the texture coordinates of the style map and the velocity field map respectively.
  • the style map, transformation matrix and velocity field map corresponding to each image to be processed are different.
  • when the image to be processed changes, the results obtained after processing it also differ to some extent.
  • here, the processing of one image to be processed is taken as an example for explanation.
  • the target style map generation model may be a pre-generated model used to convert the image to be processed into a corresponding style map.
  • the target style model may be a stylegan model based on a Generative Adversarial Network (GAN).
  • the converted map can be used as the style map.
  • the style features corresponding to the style map can be whatever style features a user requires, and the style features correspond to the training samples. For example, if the training samples are samples corresponding to characteristic style A, then the corresponding target style map generation model is of style A.
  • the obtained style map is then also an image with characteristic style A, and the image at this point can be taken as a GAN image.
  • the velocity field map is a texture image that records two-dimensional (2D) vector information; that is, the velocity field map is an image that records the texture coordinate offset of each vertex in the mesh model.
  • for example, the essence of a flow map (Flowmap) is a texture image that records 2D vector information.
  • the color on the velocity field map (usually the red-green (RG) channels) records the direction of the vector field at that location, allowing a certain point on the model to exhibit the characteristics of quantitative flow.
  • the flow effect is simulated by offsetting the uv in the shader and then sampling the texture; that is, the offset uv is determined from the vector field recorded in the RG channels to simulate the flow effect.
  • the quad mesh model is pre-established.
  • the mesh model is composed of multiple patches, and each patch corresponds to multiple vertex texture coordinates; the vertex texture coordinates in the mesh model can be converted into viewport space (i.e., screen space) based on the determined transformation matrix corresponding to the image to be processed.
  • the viewport space corresponds to the space of the image to be processed.
  • the transformation matrix may be a matrix that transforms the vertex texture coordinates of the mesh model so as to convert the mesh model into viewport space; the viewport space here can be understood as the space corresponding to the display interface.
  • the mesh model, style map and velocity field map all correspond to one another. For example, if the vertex texture coordinates corresponding to the mesh model range from 0 to 1, then the texture coordinates of the style map and velocity field map also range from 0 to 1, and each vertex texture coordinate is in one-to-one correspondence.
  • by determining the above information, the style map and the velocity field map can be obtained, the target special effect to which the image is to be converted can be determined, and the color information of the corresponding pixels can be sampled and rendered based on the rendering pass to obtain the target special effect image.
  • S120: Process the style map, transformation matrix, velocity field map and the image to be processed based on a single rendering pass to obtain a target special effect image corresponding to the image to be processed.
  • a single rendering pass can be understood as a rendering pass that renders the above obtained results to obtain a target special effect image corresponding to the image to be processed.
  • processing the style map, transformation matrix, velocity field map and the image to be processed based on a single rendering pass to obtain a target special effect image corresponding to the image to be processed includes: determining, based on the transformation matrix, the pixel coordinates to be processed, in the image to be processed, of at least one model texture coordinate in the mesh model; determining, based on at least one pixel coordinate to be processed, the at least one model texture coordinate and the velocity field map, the target pixel coordinate of the at least one model texture coordinate in the image to be processed; determining, based on at least one target pixel coordinate and the transformation matrix, the target style texture coordinate of the style map to which the at least one model texture coordinate corresponds; and determining the target special effect image based on the target pixel coordinate corresponding to the same model texture coordinate and the pixel attribute of the target style texture coordinate.
  • the mesh model is composed of multiple patches.
  • Each patch is composed of multiple vertices, for example at least six.
  • Each vertex has corresponding texture coordinates, and interpolation can be performed based on the vertex texture coordinates of each patch to obtain the grid points located on the patch.
  • the texture coordinates corresponding to each grid point can be determined according to the vertex texture coordinates and used as the grid texture coordinates.
  • the mesh model is shown in Figure 2.
  • the top-left model vertex is (0, 0) and the bottom-right model vertex is (1, 1); that is, the model texture coordinate of a certain point in the mesh model is (u, v).
  • the processing method for each model texture coordinate is the same, and the model texture coordinate (u, v) is taken as an example in the following explanation.
  • the model texture coordinates of the mesh model can be converted from model space into viewport space (screen space) based on the transformation matrix, that is, the same space as that of the image to be processed.
  • the coordinate of each model texture coordinate in the image to be processed can thus be obtained and used as the pixel coordinate to be processed; that is, the pixel coordinate to be processed is the coordinate to which each model texture coordinate corresponds after being converted onto the image to be processed.
  • the target texture coordinate is the final pixel corresponding to the grid texture coordinate, which corresponds to a point on the image to be processed.
  • the target style texture coordinate can be understood as the texture coordinate to which the model texture coordinate corresponds after being mapped onto the GAN image, from which the display attribute corresponding to the target style texture coordinate is obtained.
  • the target pixel coordinate and the target style texture coordinate corresponding to each grid texture coordinate can be determined based on the above steps, the display attributes of the target pixel coordinate and the target style texture coordinate can be obtained, and the display attribute corresponding to the grid texture coordinate can be determined.
  • the mesh model is converted onto the corresponding facial area; therefore, the target special effect image can be determined based on the display attributes of the grid texture coordinates in the mesh model and the display attributes corresponding to the areas of the image to be processed outside the mesh model.
  • for a model texture coordinate, the pixel coordinate to be processed of the current model texture coordinate in the image to be processed is determined by left-multiplying the current model texture coordinate by the transformation matrix.
  • the transformation matrix includes a model matrix, a view matrix and a projection matrix.
  • the model matrix is used to convert coordinates to the corresponding coordinates in the world coordinate system.
  • the view matrix is used to transform all vertices from the world coordinate system to the coordinate system of the camera's viewpoint.
  • the coordinate system transformation is essentially a translation and rotation operation; determining the view matrix requires knowing the camera's position and orientation.
  • the projection matrix is mainly used to convert vertex coordinates so that the corresponding x, y and z components fall within [-1, 1]. Hereinafter, the transformation matrix is referred to as the MVP matrix.
  • by left-multiplying the model texture coordinate (u, v) by the MVP matrix, the model texture coordinate is converted into viewport space.
  • by left-multiplying each model texture coordinate by the MVP matrix, the pixel coordinate on the image to be processed corresponding to that model texture coordinate is obtained and used as the pixel coordinate to be processed (x, y).
  • the model texture coordinate (u, v) is left-multiplied by the MVP matrix to obtain the corresponding pixel (x, y) in the image to be processed.
  • determining the target pixel coordinate of the at least one model texture coordinate in the image to be processed based on the at least one pixel coordinate to be processed, the at least one model texture coordinate and the velocity field map includes: for a model texture coordinate, determining the current displacement texture coordinate in the velocity field map to which the current model texture coordinate corresponds; and determining the target pixel coordinate of the current model texture coordinate according to the current displacement texture coordinate and the corresponding pixel coordinate to be processed.
  • the mesh model corresponds to the velocity field map; that is, a coordinate in the velocity field map and the model texture coordinate of the mesh model correspond to the same point.
  • the pixel attribute of each point in the velocity field includes red, green, blue and alpha (RGBA) components, of which R and G can be used as the offsets Δu and Δv corresponding to the model texture coordinate.
  • the pixel coordinates to be processed corresponding to the model texture coordinates (u, v) are (x, y).
  • Flow(u, v) = (r, g), where r and g correspond to the coordinate offsets Δu and Δv respectively.
  • the target pixel coordinate in the image to be processed corresponding to this model texture coordinate is (x+Δu, y+Δv).
  • the above steps can be repeated to obtain the target pixel coordinates corresponding to each model texture coordinate.
  • the pixel attributes corresponding to the target pixel coordinates can be obtained.
  • the pixel attributes can include RGB values and an a value; the a value is mainly used to characterize the transparency value of the alpha channel during the rendering process.
  • the process is: obtain the pixel attribute corresponding to the current displacement texture coordinate, and determine the coordinate offset based on at least two attribute values in the pixel attribute;
  • the coordinate offset is accumulated onto the pixel coordinate to be processed to obtain the target pixel coordinate.
  • the target style texture coordinate is determined so that the pixel attribute of the target pixel can be superimposed on, or blended with, the pixel attribute at the target style texture coordinate to obtain the final pixel attribute.
  • determining the target style texture coordinate of the style map to which the at least one model texture coordinate corresponds, based on at least one target pixel coordinate and the transformation matrix, includes: left-multiplying the target pixel coordinate of the current model texture coordinate by the inverse of the transformation matrix to obtain the target style texture coordinate in the style map to which the current model texture coordinate corresponds.
  • the GAN image, that is, the style map, is processed to determine which coordinate point in the deformed GAN image the current model texture coordinate corresponds to (the target style texture coordinate), whereupon the pixel attribute of that target style texture coordinate is obtained.
  • the pixel attributes include RGB values and a values.
  • the target pixel attribute of the corresponding model texture coordinate can be determined based on the pixel attribute of the target texture coordinate and the pixel attribute of the target pixel coordinate.
  • in this way, not only is the deformation taken into account in determining the pixel attribute of the corresponding pixel, but the pixel attribute of the GAN image is also incorporated, improving the degree to which the rendered target special effect image approaches the corresponding style characteristics.
  • determining the target special effect image based on the target pixel coordinate corresponding to the same model texture coordinate and the pixel attribute of the target style texture coordinate includes: for a model texture coordinate, obtaining the first pixel attribute of the target pixel coordinate in the image to be processed corresponding to the current model texture coordinate, and obtaining the second pixel attribute corresponding to the target style texture coordinate in the style map; determining, based on the first pixel attribute and the second pixel attribute, the target pixel attribute used when rendering the current model texture coordinate; and determining the target special effect image based on the target pixel attribute of at least one model texture coordinate and the image to be processed.
  • the processing method is the same for each model texture coordinate. Based on this, the processing of a model texture coordinate is explained as an example.
  • according to the target pixel coordinate corresponding to the current model texture coordinate, the first pixel attribute of the target pixel coordinate in the image to be processed is obtained.
  • the first pixel attribute includes the RGB values and the a value of the pixel, where a represents the transparency value.
  • the pixel attribute of the target style texture coordinate corresponding to the current model texture coordinate can be used as the second pixel attribute.
  • the second pixel attribute also includes the RGB values and the a value of the pixel.
  • by fusing the first pixel attribute and the second pixel attribute, for example according to a preset fusion function that includes a fusion ratio, the target pixel attribute of the current model texture coordinate, i.e., the target RGB values and a value, is obtained.
  • based on the target RGB values and a value of each model texture coordinate, the facial area can be rendered.
  • accordingly, based on the target pixel attributes of the facial area and the pixel attributes of the other areas of the image to be processed, the target special effect image can be obtained.
  • determining the target special effect image based on the target pixel attributes of at least one model texture coordinate and the image to be processed includes: determining the target special effect image based on the target pixel attributes corresponding to the facial area and the pixel attributes of the image to be processed other than the facial area.
  • a target special effects image corresponding to the image to be processed can be obtained by fusion.
  • the above processing can be performed on the sequentially collected images to be processed to obtain the corresponding target special effects video. Alternatively, special effects processing may be performed on each uploaded image to be processed according to its timestamp to obtain a target special effects video spliced by timestamp.
  • the style features of the style map may correspond to dynasty features or geographical region features.
  • the dynasty characteristics may correspond to the dressing characteristics of a certain dynasty or a certain period.
  • the geographical region features may be the dress-up features associated with each geographical region. For example, if the features of a certain dynasty correspond to dress-up features such as a chubby face, a particular hair bun and special facial makeup, then the GAN image, i.e., the style map, is the facial image obtained after the above dress-up processing is applied to the facial area in the image to be processed, and may at this point also involve the hair bun, etc.
  • the above rendering steps can be performed, and the image to be processed can be processed to obtain the final target special effects image.
  • for example, suppose the style features corresponding to the style map correspond to the makeup and hairstyle features of the Song Dynasty.
  • when the operation provided by the embodiment of the present disclosure is triggered, see Figure 3.
  • the style map, transformation matrix and velocity field map can then be determined based on the embodiment of the present disclosure.
  • based on a single rendering pass, a target special effect image matching the characteristics of the Song Dynasty is obtained, as shown in Figure 3.
  • when the image to be processed including the target object is collected, the style map, transformation matrix and velocity field map corresponding to the image to be processed are determined, and the style map, transformation matrix, velocity field map and the image to be processed are processed based on a single rendering pass to obtain the target special effect image corresponding to the image to be processed. This avoids the situation in which, when multiple rendering passes are used to render data, multiple intermediate transition images must be obtained and then rendered by the next rendering pass; that is, the entire rendering process not only requires multiple rendering passes to participate, but also requires the intermediate images to be stored during rendering, which occupies memory. Data rendering based on a single rendering pass is thus implemented, which not only avoids memory occupation and the use of a large number of rendering passes, but also improves image rendering efficiency.
  • Figure 4 is a schematic structural diagram of an image processing device provided by an embodiment of the present disclosure. As shown in Figure 4, the device includes: an image acquisition module 310 and an image processing module 320.
  • the image acquisition module 310 is configured to determine the style map, transformation matrix and velocity field map corresponding to the image to be processed when the image to be processed including the target object is collected; the image processing module 320 is configured to process the style map, transformation matrix, velocity field map and the image to be processed based on a single rendering pass to obtain a target special effect image corresponding to the image to be processed.
  • the triggering timing of acquiring the image to be processed including the target object includes at least one of the following:
  • the image acquisition module includes:
  • a style map determination unit configured to process the image to be processed based on a target style map generation model to obtain a style map corresponding to a target area, wherein the target area corresponds to the facial area of the target object;
  • a velocity field map determination unit configured to determine a velocity field map corresponding to the image to be processed, wherein the mesh model corresponds to the facial area of the target object;
  • a transformation matrix determination unit configured to determine a transformation matrix corresponding to the image to be processed in the rendering pipeline, so as to transform the vertex texture coordinates of the mesh model based on the transformation matrix such that the transformed mesh model corresponds to the facial area of the target object in the image to be processed;
  • the texture coordinates of the grid model are consistent with the texture coordinates of the style map and the velocity field map respectively.
  • the image processing module includes:
  • a first pixel coordinate determination unit configured to determine, based on the transformation matrix, the pixel coordinates to be processed in the image to be processed according to at least one model texture coordinate in the mesh model;
  • a second pixel coordinate determination unit configured to determine the target pixel coordinate of the at least one model texture coordinate in the image to be processed based on at least one pixel coordinate to be processed, the at least one model texture coordinate, and the velocity field map.
  • a style texture coordinate determination unit configured to determine, based on at least one target pixel coordinate and the transformation matrix, that the at least one model texture coordinate corresponds to the target style texture coordinate of the style map;
  • the special effects image determination unit is configured to determine the target special effects image based on the target pixel coordinates corresponding to the same model texture coordinates and the pixel attributes of the target style texture coordinates.
  • the first pixel coordinate determination unit is configured to, for a model texture coordinate, determine the pixel coordinate to be processed of the current model texture coordinate in the image to be processed by left-multiplying the current model texture coordinate by the transformation matrix;
  • the transformation matrix includes a model matrix, a view matrix and a projection matrix.
  • the second pixel coordinate determination unit includes:
  • the displacement texture coordinate determination subunit is configured to determine, for the model texture coordinates, that the current model texture coordinates correspond to the current displacement texture coordinates in the velocity field map;
  • the target pixel coordinate determination subunit is configured to determine the target pixel coordinate of the current model texture coordinate according to the current displacement texture coordinate and the corresponding pixel coordinate to be processed.
  • the target pixel coordinate determination subunit includes:
  • the coordinate offset determination subunit is configured to obtain the pixel attribute corresponding to the current displacement texture coordinate, determine the coordinate offset based on at least two attribute values in the pixel attribute, and accumulate the coordinate offset onto the pixel coordinate to be processed to obtain the target pixel coordinate.
  • the style texture coordinate determination unit is configured to left-multiply the target pixel coordinate of the current model texture coordinate by the inverse of the transformation matrix to obtain the target style texture coordinate in the style map to which the current model texture coordinate corresponds.
  • the special effect image determination unit includes:
  • the pixel attribute acquisition subunit is configured to obtain, for a model texture coordinate, the first pixel attribute of the target pixel coordinate in the image to be processed corresponding to the current model texture coordinate, and to obtain the second pixel attribute in the style map corresponding to the target style texture coordinate;
  • the target pixel attribute determination subunit is configured to determine the target pixel attribute used when rendering the current model texture coordinates based on the first pixel attribute and the second pixel attribute;
  • the target special effects image determination subunit is configured to determine the target special effects image based on the target pixel attribute of at least one model texture coordinate and the image to be processed.
  • the target special effect image determination subunit is further configured to determine the target special effect image based on the target pixel attributes corresponding to the facial area and the pixel attributes of the image to be processed other than the facial area.
  • the device further includes: a special effects video determination module configured to splice at least one target special effects image to obtain the target special effects video.
  • the style features of the style map correspond to dynasty features or geographical region features.
  • when the image to be processed including the target object is collected, the style map, transformation matrix and velocity field map corresponding to the image to be processed are determined, and the style map, transformation matrix, velocity field map and the image to be processed are processed based on a single rendering pass to obtain the target special effect image corresponding to the image to be processed. This avoids the situation in which, when multiple rendering passes are used to render data, multiple intermediate transition images must be obtained and then rendered by the next rendering pass; that is, the entire rendering process not only requires multiple rendering passes to participate, but also requires the intermediate images to be stored during rendering, which occupies memory. Data rendering based on a single rendering pass is thus implemented, which not only avoids memory occupation and the use of a large number of rendering passes, but also improves image rendering efficiency.
  • the image processing device provided by the embodiments of the present disclosure can execute the image processing method provided by any embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method.
  • FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs) and vehicle-mounted terminals (such as vehicle-mounted navigation terminals), as well as fixed terminals such as digital televisions (TVs) and desktop computers.
  • the electronic device shown in FIG. 5 is only an example and should not impose any limitations on the functions and scope of use of the embodiments of the present disclosure.
  • the electronic device 400 may include a processing device (such as a central processing unit, a graphics processor, etc.) 401, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage device 408 into a random access memory (RAM) 403. The RAM 403 also stores various programs and data required for the operation of the electronic device 400.
  • the processing device 401, ROM 402 and RAM 403 are connected to each other via a bus 404.
  • An input/output (I/O) interface 405 is also connected to the bus 404.
  • the following devices can be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 407 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; storage devices 408 including a magnetic tape, a hard disk, etc.; and a communication device 409.
  • the communication device 409 may allow the electronic device 400 to communicate wirelessly or wiredly with other devices to exchange data.
  • although FIG. 5 illustrates the electronic device 400 with various means, it should be understood that it is not required to implement or have all of the illustrated means; more or fewer means may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via communication device 409, or from storage device 408, or from ROM 402.
  • when the computer program is executed by the processing device 401, the above-mentioned functions defined in the method of the embodiments of the present disclosure are performed.
  • the electronic device provided by the embodiments of the present disclosure and the image processing method provided by the above embodiments belong to the same inventive concept; technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
  • Embodiments of the present disclosure provide a computer storage medium on which a computer program is stored.
  • when the program is executed by a processor, the image processing method provided by the above embodiments is implemented.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any combination thereof.
  • Examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code contained on a computer-readable medium can be transmitted using any appropriate medium, including but not limited to: wires, optical cables, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the above.
  • the client and server can communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication (e.g., a communication network) in any form or medium.
  • Examples of communication networks include local area networks (LANs), wide area networks (WANs), internetworks (e.g., the Internet) and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; it may also exist independently without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs.
  • when the above-mentioned one or more programs are executed by the electronic device, the electronic device is caused to: collect an image to be processed including a target object, and determine the style map, transformation matrix and velocity field map corresponding to the image to be processed; and
  • process the style map, transformation matrix, velocity field map and the image to be processed based on a single rendering pass to obtain a target special effect image corresponding to the image to be processed.
  • the storage medium may be a non-transitory storage medium.
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the “C” language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may actually be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure can be implemented in software or hardware.
  • the name of a unit does not, under certain circumstances, constitute a limitation on the unit itself.
  • for example, the first acquisition unit can also be described as “the unit that acquires at least two Internet Protocol addresses”.
  • exemplary types of hardware logic components that may be used include: field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), etc.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • Examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

Abstract

Embodiments of the present disclosure provide an image processing method and apparatus, an electronic device and a storage medium. The method includes: collecting an image to be processed including a target object, and determining a style map, a transformation matrix and a velocity field map corresponding to the image to be processed; and processing the style map, the transformation matrix, the velocity field map and the image to be processed based on a single rendering pass to obtain a target special effect image corresponding to the image to be processed.

Description

Image processing method and apparatus, electronic device and storage medium
This application claims priority to Chinese patent application No. 202210621895.5, filed with the Chinese Patent Office on June 1, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the field of image processing technology, for example, to an image processing method and apparatus, an electronic device and a storage medium.
Background
More and more users wish to capture images with particular style characteristics through applications, and rendering such images mostly requires multiple rendering passes to complete.
Summary
The present disclosure provides an image processing method and apparatus, an electronic device and a storage medium.
In a first aspect, an embodiment of the present disclosure provides an image processing method, which includes:
collecting an image to be processed including a target object, and determining a style map, a transformation matrix and a velocity field map corresponding to the image to be processed; and
processing the style map, the transformation matrix, the velocity field map and the image to be processed based on a single rendering pass to obtain a target special effect image corresponding to the image to be processed.
In a second aspect, an embodiment of the present disclosure further provides an image processing apparatus, which includes:
an image acquisition module configured to collect an image to be processed including a target object and determine a style map, a transformation matrix and a velocity field map corresponding to the image to be processed; and
an image processing module configured to process the style map, the transformation matrix, the velocity field map and the image to be processed based on a single rendering pass to obtain a target special effect image corresponding to the image to be processed.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, which includes:
one or more processors; and
a storage device configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the image processing method described in any one of the embodiments of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the image processing method described in any one of the embodiments of the present disclosure.
Brief Description of the Drawings
Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that parts and elements are not necessarily drawn to scale.
Figure 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure;
Figure 2 is a schematic diagram of determining the pixel attributes corresponding to a point to be rendered, provided by an embodiment of the present disclosure;
Figure 3 is a schematic diagram of a rendering effect provided by an embodiment of the present disclosure;
Figure 4 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure;
Figure 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
When image rendering is performed based on multiple rendering passes, multiple intermediate images are produced and stored, and the multiple intermediate images are then rendered based on another rendering pass. Storing the intermediate images occupies memory, and multi-pass rendering is required, resulting in low pass utilization and, in turn, low image rendering efficiency.
In view of the above, embodiments of the present disclosure provide an image processing method and apparatus, an electronic device and a storage medium.
Embodiments of the present disclosure will be described below with reference to the drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only and are not intended to limit the scope of protection of the present disclosure.
It should be understood that the steps described in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. In addition, the method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
As used herein, the term “include” and its variations are open-ended, i.e., “including but not limited to”. The term “based on” means “based at least in part on”. The term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
It should be noted that concepts such as “first” and “second” mentioned in the present disclosure are only used to distinguish different apparatuses, modules or units, and are not used to limit the order of, or the interdependence between, the functions performed by these apparatuses, modules or units.
It should be noted that the modifiers “one” and “multiple” mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as “one or more”.
The names of the messages or information exchanged between multiple apparatuses in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
It can be understood that, before the embodiments of the present disclosure are used, the user should be informed, in an appropriate manner and in accordance with relevant laws and regulations, of the type, scope of use and usage scenarios of the personal information involved in the present disclosure, and the user's authorization should be obtained.
For example, in response to receiving an active request from a user, prompt information is sent to the user to explicitly remind the user that the requested operation will require obtaining and using the user's personal information. The user can thus, based on the prompt information, autonomously choose whether to provide personal information to the software or hardware, such as an electronic device, application, server or storage medium, that performs the operations of the embodiments of the present disclosure.
As an implementation, in response to receiving an active request from the user, the prompt information may be sent to the user, for example, in the form of a pop-up window, in which the prompt information may be presented as text. In addition, the pop-up window may also carry a selection control for the user to choose to “agree” or “disagree” to provide personal information to the electronic device.
It can be understood that the above process of notifying the user and obtaining the user's authorization is merely illustrative and does not limit the implementation of the present disclosure; other methods that satisfy relevant laws and regulations may also be applied to the implementation of the present disclosure.
It can be understood that the data involved in this embodiment (including but not limited to the data itself and the acquisition or use of the data) should comply with the requirements of the corresponding laws, regulations and relevant provisions.
Before this embodiment is introduced, the application scenario may first be described by way of example. The embodiments of the present disclosure can perform image rendering. For example, when image rendering needs to be performed in the process of generating a special effect image, the embodiments of the present disclosure can be used for image processing. The process of generating a special effect image may occur in short video shooting, a video call, a live video broadcast or a multi-person conversation scenario, in which the embodiments of the present disclosure may be used. It should also be noted that image rendering is mainly used in the further processing of images after they are collected.
In this embodiment, the apparatus for performing the special effect image processing method provided by the embodiments of the present disclosure may be integrated into application software that supports a special effect image processing function, and the software may be installed in an electronic device; for example, the electronic device may be a mobile terminal or a personal computer (PC), etc. The application software may be a type of software for image/video processing, as long as image/video processing can be implemented.
Figure 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure. The embodiment of the present disclosure can perform image rendering. The method may be executed by an image processing apparatus, which may be implemented in the form of software and/or hardware, for example, by an electronic device, which may be a mobile terminal, a PC, a server, or the like. This embodiment may be executed by a server, by a client, or by a client and a server in cooperation.
As shown in Figure 1, the method includes:
S110: Collect an image to be processed including a target object, and determine a style map, a transformation matrix and a velocity field map corresponding to the image to be processed.
In this embodiment, in application software or an application program that supports the special effect image processing function, a control for triggering the special effect may be developed in advance. When it is detected that the user triggers the control, the special effect triggering operation can be responded to, whereupon the image to be processed is collected and processed.
The image to be processed may be an image captured by the application, or an image captured by a camera device provided on the terminal device, or each video frame collected during video shooting may be used as an image to be processed. It should be noted that each video frame is rendered using the rendering method provided by the embodiments of the present disclosure, and that, after the special effect image corresponding to the first video frame is rendered, this embodiment is repeated for the next video frame to determine the corresponding special effect video frame. The target object may be a user, an animal or a plant, etc., in the frame. For example, the target object may correspond to a user; that is, special effect processing needs to be performed on the user in the image to be processed to obtain the corresponding special effect image. In the embodiments of the present disclosure, which user in the frame is the target object may be calibrated in advance, or all users may be taken as target users. For example, if special effect rendering is needed only for a specific user, a user image corresponding to that specific user may be uploaded in advance and the user's features determined, so that when the corresponding user appears in the display interface, a feature recognition algorithm can be used to determine whether the user is the calibrated specific user; if so, special effect processing is performed; otherwise, it is not.
In the disclosed embodiments, the trigger timing for collecting the image to be processed including the target object includes at least one of the following: detecting that a special effect processing prop is triggered; detecting that the collected audio information triggers a special effect wake-up word; detecting that the target object is included in the frame; or detecting that a body movement of the target object is consistent with a preset body movement.
The special effect processing prop may be triggered by a button displayed on the display interface of the application software, and triggering of the button indicates that the current special effect image needs to be determined. In practical applications, if the user triggers the button, it can be considered that special effect processing is required, and the collected image to be processed needs to be processed.
Alternatively, voice information may be collected by a microphone array deployed on the terminal device and analyzed; if the processing result includes a word for adding a special effect, the special effect adding function is triggered. The advantage of determining whether to add a special effect based on the content of the voice information is that it avoids interaction between the user and the display page and makes adding special effects more intelligent. Another implementation may be to determine, according to the shooting field of view of the mobile terminal, whether the user's facial image is contained within the field of view; when the user's facial image is detected, the application software may take the event of detecting the facial image as the operation of collecting the image to be processed. It is also possible that an object in the frame is detected to have triggered a special effect processing action, for example an “OK” gesture. Those skilled in the art should understand that which event is selected as the trigger condition for the special effect can be set according to the actual situation, and the embodiments of the present disclosure do not specifically limit this.
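As an illustration of the audio trigger described above, here is a minimal sketch in Python; the function name and the wake-word list are assumptions made for this example and are not specified by the disclosure.

```python
def is_effect_triggered(recognized_text: str,
                        wake_words: tuple[str, ...] = ("add effect",)) -> bool:
    """Return True if the text recognized from the collected audio contains
    a special-effect wake-up word. The wake_words vocabulary is hypothetical;
    a real application would configure its own."""
    text = recognized_text.lower()
    return any(word in text for word in wake_words)

# Example: recognized speech triggers the special-effect adding function
# without the user touching the display page.
assert is_effect_triggered("Please add effect to my face")
```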
In this embodiment, the style map can be understood as a map corresponding to a certain characteristic style, and the style map corresponds to the facial area of the target object in the image to be processed. The velocity field map can be understood as a view describing the motion of pixels, which, viewed macroscopically, is a deformation diagram; the velocity field map mainly corresponds to a motion field map of the pixels in the facial area. The velocity field map can be understood as consisting of many matrices, each of which is used to represent the displacement parameter of the corresponding pixel. The transformation matrix is used to process a pre-established mesh model (quad mesh) so as to transform the mesh model onto the facial area of the target object. A single rendering pass can be understood to mean that, when the image to be processed is rendered based on a shader, a single rendering pass can be used to process the above images to obtain the corresponding target special effect image. Using a single rendering pass avoids producing multiple intermediate images during rendering; when image rendering is instead based on intermediate images, memory is occupied and rendering efficiency is low. That is to say, since the embodiments of the present disclosure use single-pass processing, only the corresponding coordinates need to be converted and multiple intermediate images need not be obtained, thereby reducing memory usage.
In the embodiments of the present disclosure, determining the style map, transformation matrix and velocity field map corresponding to the image to be processed may include: processing the image to be processed based on a target style map generation model to obtain a style map corresponding to a target area, where the target area corresponds to the facial area of the target object; determining a velocity field map corresponding to at least one vertex texture coordinate in the mesh model, where the mesh model corresponds to the facial area of the target object; and determining a transformation matrix corresponding to the image to be processed in the rendering pipeline, so as to projectively transform the mesh model based on the transformation matrix such that the transformed mesh model corresponds to the facial area of the target object; where the texture coordinates of the mesh model are consistent with the texture coordinates of the style map and of the velocity field map, respectively.
It should be noted that the style map, transformation matrix and velocity field map corresponding to each image to be processed are different; when the image to be processed changes, the results obtained after processing it also differ to some extent. Here, the processing of one image to be processed is taken as an example.
The target style map generation model may be a pre-generated model used to convert the image to be processed into the corresponding style map. The target style model may be a stylegan model based on a Generative Adversarial Network (GAN). The converted map can be used as the style map. The style features corresponding to the style map may be whatever style features a user requires, and these style features correspond to the training samples; for example, if the training samples are samples corresponding to characteristic style A, then the corresponding target style map generation model is of style A. Accordingly, the obtained style map is also an image with characteristic style A, and the image at this point can be taken as a GAN image. Correspondingly, an appropriate algorithm or model may be used to determine the velocity field map corresponding to the image to be processed. The velocity field map is a texture image that records two-dimensional (2D) vector information; that is, the velocity field map is an image that records the texture coordinate offset of each vertex in the mesh model. For example, the essence of a flow map (Flowmap) is a texture image that records 2D vector information. The color on the velocity field map (usually the red-green (RG) channels) records the direction of the vector field at that location, allowing a certain point on the model to exhibit the characteristics of quantitative flow. The flow effect is simulated by offsetting the uv in the shader and then sampling the texture; that is, the offset uv is determined from the vector field recorded in the RG channels to simulate the flow effect. By determining the velocity field, the deformation displacement corresponding to each pixel can be determined, and the display information of the corresponding pixel can then be obtained and rendered to obtain the special effect image. The quad mesh model is pre-established; it is composed of multiple patches, and each patch corresponds to multiple vertex texture coordinates. The vertex texture coordinates in the mesh model can be converted into viewport space (i.e., screen space) based on the determined transformation matrix corresponding to the image to be processed; the viewport space corresponds to the space of the image to be processed. The transformation matrix may be a matrix that transforms the vertex texture coordinates of the mesh model so as to convert the mesh model into viewport space; the viewport space here can be understood as the space corresponding to the display interface.
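To make the flow-map sampling described above concrete, here is a minimal numpy sketch; the nearest-texel lookup and the [0, 1] to [-1, 1] remapping of the RG channels are common conventions assumed for illustration, not details fixed by the disclosure.

```python
import numpy as np

def sample_flow(flowmap: np.ndarray, u: float, v: float) -> tuple[float, float]:
    """Read the RG channels of a flow map at texture coordinate (u, v).

    flowmap: H x W x 4 float array in [0, 1] (RGBA).
    Returns the 2D offset (du, dv) encoded at that texel, remapped from
    [0, 1] to [-1, 1] (an assumed convention)."""
    h, w = flowmap.shape[:2]
    texel = flowmap[int(v * (h - 1)), int(u * (w - 1))]
    return 2.0 * texel[0] - 1.0, 2.0 * texel[1] - 1.0

def sample_with_flow(texture: np.ndarray, flowmap: np.ndarray,
                     u: float, v: float, strength: float = 1.0) -> np.ndarray:
    """Offset the uv by the flow vector, then sample the texture, which is
    how the flow effect is simulated in a shader."""
    du, dv = sample_flow(flowmap, u, v)
    u2 = float(np.clip(u + strength * du, 0.0, 1.0))
    v2 = float(np.clip(v + strength * dv, 0.0, 1.0))
    h, w = texture.shape[:2]
    return texture[int(v2 * (h - 1)), int(u2 * (w - 1))]
```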
The mesh model, the style map and the velocity field map all correspond to one another. For example, if the vertex texture coordinates corresponding to the mesh model range from 0 to 1, then the texture coordinates of the style map and of the velocity field map also range from 0 to 1, and each vertex texture coordinate is in one-to-one correspondence.
In the embodiments of the present disclosure, by determining the above information, the style map and the velocity field map can be obtained, the target special effect to which the image to be processed is to be converted can be determined, and the color information of the corresponding pixels can be sampled and rendered based on the rendering pass to obtain the target special effect image.
S120: Process the style map, the transformation matrix, the velocity field map and the image to be processed based on a single rendering pass to obtain a target special effect image corresponding to the image to be processed.
A single rendering pass can be understood as one rendering pass that renders the results obtained above to obtain the target special effect image corresponding to the image to be processed.
In this embodiment, processing the style map, transformation matrix, velocity field map and image to be processed based on a single rendering pass to obtain the target special effect image corresponding to the image to be processed includes: determining, based on the transformation matrix, the pixel coordinates to be processed, in the image to be processed, of at least one model texture coordinate in the mesh model; determining, based on at least one pixel coordinate to be processed, the at least one model texture coordinate and the velocity field map, the target pixel coordinate of the at least one model texture coordinate in the image to be processed; determining, based on at least one target pixel coordinate and the transformation matrix, the target style texture coordinate of the style map to which the at least one model texture coordinate corresponds; and determining the target special effect image based on the target pixel coordinate corresponding to the same model texture coordinate and the pixel attribute of the target style texture coordinate.
The mesh model is composed of multiple patches; each patch is composed of multiple vertices, for example at least six, and each vertex has corresponding texture coordinates. Interpolation can be performed based on the vertex texture coordinates of each patch to obtain the grid points located on the patch. At the same time, the texture coordinates corresponding to each grid point can be determined from the vertex texture coordinates and taken as the grid texture coordinates. The mesh model is shown in Figure 2: the top-left model vertex is (0, 0) and the bottom-right model vertex is (1, 1); that is, the model texture coordinate of a certain point in the mesh model is (u, v). The processing for each model texture coordinate is the same, and the model texture coordinate (u, v) is taken as an example below. The model texture coordinates of the mesh model can be converted from model space into viewport space (screen space) based on the transformation matrix, i.e., the same space as that of the image to be processed. The coordinate of each model texture coordinate in the image to be processed can thus be obtained and taken as the pixel coordinate to be processed; that is, the pixel coordinate to be processed is the coordinate to which each model texture coordinate corresponds after being converted onto the image to be processed. The target texture coordinate is the final pixel corresponding to the grid texture coordinate, which corresponds to a point on the image to be processed. The target style texture coordinate can be understood as the texture coordinate to which the model texture coordinate corresponds after being mapped onto the GAN image, from which the display attribute corresponding to the target style texture coordinate is obtained.
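The interpolation from patch vertex texture coordinates to grid points mentioned above can be sketched as a bilinear interpolation; the corner ordering chosen here is an assumption of this illustration.

```python
import numpy as np

def interpolate_patch(corners: np.ndarray, s: float, t: float) -> np.ndarray:
    """Bilinearly interpolate a patch attribute (e.g. texture coordinates).

    corners: 4 x 2 array holding the attribute at the patch corners, ordered
    top-left, top-right, bottom-left, bottom-right. (s, t) in [0, 1] select
    a grid point inside the patch."""
    top = (1 - s) * corners[0] + s * corners[1]
    bottom = (1 - s) * corners[2] + s * corners[3]
    return (1 - t) * top + t * bottom

# Example: the unit quad of Figure 2, with (0, 0) at the top-left vertex and
# (1, 1) at the bottom-right; the central grid point gets grid texture
# coordinate (0.5, 0.5).
quad = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(interpolate_patch(quad, 0.5, 0.5))  # [0.5 0.5]
```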
For example, the target pixel coordinate and the target style texture coordinate corresponding to each grid texture coordinate can be determined based on the above steps, the display attributes of the target pixel coordinate and of the target style texture coordinate can be obtained, and the display attribute corresponding to the grid texture coordinate can be determined. The mesh model is converted onto the corresponding facial area; therefore, the target special effect image can be determined based on the display attributes of the grid texture coordinates in the mesh model and the display attributes corresponding to the areas of the image to be processed outside the mesh model.
In this embodiment, for a model texture coordinate, the pixel coordinate to be processed of the current model texture coordinate in the image to be processed is determined by left-multiplying the current model texture coordinate by the transformation matrix.
The transformation matrix includes a model matrix, a view matrix and a projection matrix. The model matrix is used to convert coordinates into the corresponding coordinates in the world coordinate system. The view matrix is used to transform all vertices from the world coordinate system into the coordinate system of the camera's viewpoint; this coordinate system transformation is essentially a translation and rotation operation, and determining the view matrix requires knowing the camera's position and orientation. The projection matrix is mainly used to convert vertex coordinates so that the corresponding x, y and z components fall within [-1, 1]. Hereinafter, the transformation matrix is referred to as the MVP matrix. By left-multiplying the model texture coordinate (u, v) by the MVP matrix, the model texture coordinate is converted into viewport space.
For example, by left-multiplying each model texture coordinate by the MVP matrix, the pixel coordinate on the image to be processed corresponding to that model texture coordinate is obtained and taken as the pixel coordinate to be processed (x, y). Illustratively, referring to Figure 2, the model texture coordinate (u, v), after being left-multiplied by the MVP matrix, yields the corresponding pixel (x, y) in the image to be processed. By determining the pixel coordinates to be processed in this way, the pixel in the image to be processed corresponding to each point of the mesh model can be determined, and further processing can then be performed based on the style map and the deformation image to obtain the display attribute of each point of the facial area corresponding to the mesh model, thereby obtaining the target special effect image.
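A minimal sketch of this left-multiplication, assuming the usual homogeneous-coordinate and NDC-to-pixel conventions of a graphics pipeline; the actual MVP matrix would come from the rendering pipeline for the image to be processed, and the axis orientation is an assumption.

```python
import numpy as np

def to_screen(u: float, v: float, mvp: np.ndarray,
              width: int, height: int) -> tuple[float, float]:
    """Left-multiply the model texture coordinate (u, v) by the MVP matrix
    and map the result to viewport (screen) pixel coordinates (x, y)."""
    p = mvp @ np.array([u, v, 0.0, 1.0])   # homogeneous coordinate, z = 0
    ndc = p[:3] / p[3]                     # perspective divide into [-1, 1]
    x = (ndc[0] * 0.5 + 0.5) * (width - 1)
    y = (ndc[1] * 0.5 + 0.5) * (height - 1)
    return x, y
```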
On the basis of this embodiment, after the pixel coordinate to be processed corresponding to the model texture coordinate is obtained, the deformation tensor, i.e., the deformation displacement, corresponding to the grid texture coordinate also needs to be determined, so that the corresponding pixel can be determined based on the deformation displacement, and the display attribute corresponding to that pixel can be obtained and rendered.
In one embodiment, determining the target pixel coordinate of the at least one model texture coordinate in the image to be processed based on at least one pixel coordinate to be processed, the at least one model texture coordinate and the velocity field map includes: for a model texture coordinate, determining the current displacement texture coordinate in the velocity field map to which the current model texture coordinate corresponds; and determining the target pixel coordinate of the current model texture coordinate according to the current displacement texture coordinate and the corresponding pixel coordinate to be processed.
The mesh model corresponds to the velocity field map; that is, a coordinate in the velocity field map and the model texture coordinate of the mesh model correspond to the same point. Correspondingly, the pixel attribute of each point in the velocity field includes red, green, blue and alpha (RGBA) components, of which R and G can be taken as the offsets Δu and Δv corresponding to the model texture coordinate. For example, continuing with Figure 2, the pixel coordinate to be processed corresponding to the model texture coordinate (u, v) is (x, y); based on this model texture coordinate, Flow(u, v) = (r, g) is known, where r and g correspond to the coordinate offsets Δu and Δv respectively, so the target pixel coordinate in the image to be processed corresponding to this model texture coordinate is (x+Δu, y+Δv). The above steps can be repeated to obtain the target pixel coordinate corresponding to each model texture coordinate. Based on the target pixel coordinate and the image to be processed, the pixel attribute corresponding to the target pixel coordinate can be obtained; for example, the pixel attribute may include RGB values and an a value, where the a value is mainly used to characterize the transparency value of the alpha channel during rendering.
This can be understood as follows: obtain the pixel attribute corresponding to the current displacement texture coordinate, and determine the coordinate offset based on at least two attribute values in the pixel attribute; then accumulate the coordinate offset onto the pixel coordinate to be processed to obtain the target pixel coordinate.
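Putting the two preceding steps together, here is a sketch of accumulating the velocity-field offset onto the pixel coordinate to be processed; reading Δu and Δv from the R and G channels with a [-1, 1] remapping, and treating them directly as pixel offsets, are assumptions of this illustration.

```python
import numpy as np

def target_pixel(x: float, y: float, flowmap: np.ndarray,
                 u: float, v: float) -> tuple[float, float]:
    """Map the pixel coordinate to be processed (x, y) to the target pixel
    coordinate (x + du, y + dv) using the displacement recorded in the
    velocity field map at texture coordinate (u, v)."""
    h, w = flowmap.shape[:2]
    texel = flowmap[int(v * (h - 1)), int(u * (w - 1))]
    du = 2.0 * texel[0] - 1.0  # R channel -> delta-u
    dv = 2.0 * texel[1] - 1.0  # G channel -> delta-v
    return x + du, y + dv
```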
In one embodiment, after the pixel attribute corresponding to the model texture coordinate is determined in the above way, in order to obtain an image corresponding to a particular style, the target style texture coordinate on the style map to which the target pixel coordinate corresponds can be determined, so that the pixel attribute of the target pixel is superimposed on, or blended with, the pixel attribute at the target style texture coordinate to obtain the final pixel attribute.
In one embodiment, determining the target style texture coordinate of the style map to which the at least one model texture coordinate corresponds, based on at least one target pixel coordinate and the transformation matrix, includes: left-multiplying the target pixel coordinate of the current model texture coordinate by the inverse of the transformation matrix to obtain the target style texture coordinate in the style map to which the current model texture coordinate corresponds.
This can be understood as follows: after the target pixel coordinate corresponding to the current model texture coordinate is determined, in order to obtain the corresponding style feature image, the GAN image, i.e., the style map, also needs to be processed to determine which coordinate point in the deformed GAN image the current model texture coordinate corresponds to (the target style texture coordinate), whereupon the pixel attribute of that target style texture coordinate is obtained.
For example, continuing with Figure 2, the target pixel coordinate (x+Δu, y+Δv) can be left-multiplied by the inverse of the MVP matrix to convert it into model space, yielding the target texture coordinate (u', v') of this target pixel coordinate. The pixel attribute of the target texture coordinate (u', v') is obtained; the pixel attribute includes RGB values and an a value.
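A sketch of the inverse mapping, mirroring the earlier to_screen() sketch with the same assumed pixel-to-NDC convention; with the z component fixed at 0 this is only an illustration of the 2D case.

```python
import numpy as np

def to_style_uv(x: float, y: float, mvp: np.ndarray,
                width: int, height: int) -> tuple[float, float]:
    """Left-multiply the target pixel coordinate by the inverse of the MVP
    matrix to recover the target style texture coordinate (u', v') in the
    style map."""
    ndc_x = (x / (width - 1)) * 2.0 - 1.0
    ndc_y = (y / (height - 1)) * 2.0 - 1.0
    p = np.linalg.inv(mvp) @ np.array([ndc_x, ndc_y, 0.0, 1.0])
    return p[0] / p[3], p[1] / p[3]
```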
在基于上述方式确定各模型纹理坐标所对应的目标纹理坐标和目标像素坐标之后,可以基于目标纹理坐标的像素属性和目标像素坐标的像素属性,确定相应模型纹理坐标的目标像素属性。采用上述方式,不仅结合了形变确定相应像素点的像素属性,还结合了GAN图像的像素属性,提高了渲染出的目标特效图像更接近相应风格特征的效果。
在一实施例中,基于同一模型纹理坐标所对应的目标像素坐标和目标风格纹理坐标的像素属性,确定目标特效图像,包括:对于模型纹理坐标,获取待处理图像中与当前模型纹理坐标所对应的目标像素坐标的第一像素属性,以及获取风格图中与目标风格纹理坐标相对应的第二像素属性;基于第一像素属性以及第二像素属性,确定对当前模型纹理坐标渲染时所采用的目标像素属性;基于至少一个模型纹理坐标的目标像素属性和待处理图像,确定目标特效图像。
For example, each model texture coordinate is processed in the same way, so the processing of one model texture coordinate is taken as an example. According to the target pixel coordinate corresponding to the current model texture coordinate, the first pixel attribute of the target pixel coordinate in the image to be processed is obtained. The first pixel attribute includes the RGB value and the a value of the pixel, where a represents the transparency value. Meanwhile, the pixel attribute of the target style texture coordinate corresponding to the current model texture coordinate can be used as the second pixel attribute, which likewise includes the RGB value and the a value of the pixel. The first pixel attribute and the second pixel attribute are fused, for example according to a preset fusion function that includes a fusion ratio, to obtain the target pixel attribute of the current model texture coordinate, that is, the target RGB value and a value. Based on the target RGB value and a value of each model texture coordinate, the facial region can be rendered. Accordingly, the target special effect image can be obtained based on the target pixel attributes of the facial region and the pixel attributes of the other regions of the image to be processed.
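As a sketch only: the disclosure does not specify the fusion function, so the linear blend and the default ratio of 0.5 below are assumptions made for illustration:

```python
def fuse_pixel_attributes(first_rgba, second_rgba, ratio=0.5):
    """Fuse the first pixel attribute (from the image to be processed) with the
    second pixel attribute (from the style map) using a preset fusion ratio."""
    return tuple(ratio * a + (1.0 - ratio) * b
                 for a, b in zip(first_rgba, second_rgba))
```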
Exemplarily, still referring to Figure 2, the pixel attribute of the target style texture coordinate (u',v') and the pixel attribute of the target pixel coordinate (x+Δu, y+Δv) are jointly assigned to the model texture coordinate (u,v).

On the basis of the above embodiments, determining the target special effect image based on the target pixel attribute of at least one model texture coordinate and the image to be processed includes: determining the target special effect image based on the target pixel attributes corresponding to the facial region and the pixel attributes of the image to be processed outside the facial region.
For example, a target special effect image corresponding to the image to be processed can be obtained by fusing the target pixel attributes corresponding to the facial region and the pixel attributes of the image to be processed outside the facial region.
On the basis of the above embodiments, if a corresponding target special effect video is to be obtained, the above processing can be performed on sequentially acquired images to be processed to obtain the corresponding target special effect video. Alternatively, each uploaded image to be processed may be subjected to special effect processing according to its timestamp, so as to obtain a target special effect video stitched by timestamp.
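A minimal sketch of the timestamp-based stitching follows; the (timestamp, frame) pairs are a hypothetical representation, and actual video encoding is omitted:

```python
def stitch_by_timestamp(processed_frames):
    """Order processed target special effect images by timestamp before stitching.
    `processed_frames` is a list of (timestamp, frame) pairs."""
    return [frame for _, frame in sorted(processed_frames, key=lambda p: p[0])]
```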
On the basis of the above embodiments, the style features of the style map may correspond to dynasty features or geographic region features. The dynasty features may correspond to the dress-up features of a certain dynasty or a certain period, and the geographic region features may correspond to the dress-up features of different geographic regions. For example, if a certain dynasty's features correspond to dress-up features of a plump face, a special hair bun and special facial makeup, the GAN image, that is, the style map, is the facial image obtained after applying the above dress-up processing to the facial region of the image to be processed; the hair bun and the like may also be involved at this point. The above rendering steps can then be executed to process the image to be processed and obtain the final target special effect image.
Exemplarily, suppose the style features corresponding to the style map correspond to the makeup and hairstyle features of the Song dynasty. When the operation provided by the embodiments of the present disclosure is triggered, referring to Figure 3, after the image to be processed including the target object is acquired, the style map, the transformation matrix and the velocity field map can be determined based on the embodiments of the present disclosure, and the target special effect image conforming to the features of the Song dynasty is then obtained based on a single rendering channel, as shown in Figure 3.
In the embodiments of the present disclosure, when the image to be processed including the target object is acquired, the style map, the transformation matrix and the velocity field map corresponding to the image to be processed are determined, and the style map, the transformation matrix, the velocity field map and the image to be processed are processed based on a single rendering channel to obtain the target special effect image corresponding to the image to be processed. This avoids the situation in which, when multiple rendering channels are used to render the data, multiple intermediate transition images need to be obtained and then rendered by the next rendering channel; that is, the whole rendering process not only requires the participation of multiple rendering channels, but also requires the intermediate images to be stored during rendering, which occupies memory. Data rendering based on a single rendering channel is thereby realized, which not only avoids memory occupation and the use of a large number of rendering channels, but also improves image rendering efficiency.
Figure 4 is a schematic structural diagram of an image processing device provided by an embodiment of the present disclosure. As shown in Figure 4, the device includes: an image acquisition module 310 and an image processing module 320.
The image acquisition module 310 is configured to, when an image to be processed including the target object is acquired, determine the style map, transformation matrix and velocity field map corresponding to the image to be processed; the image processing module 320 is configured to process the style map, transformation matrix, velocity field map and the image to be processed based on a single rendering channel to obtain a target special effect image corresponding to the image to be processed.
On the basis of the above embodiments, the trigger timing of acquiring the image to be processed including the target object includes at least one of the following:
detecting that a special effect processing prop is triggered;
detecting that acquired audio information triggers a special effect wake-up word;
detecting that a captured frame includes the target object;
detecting that a body movement of the target object is consistent with a preset body movement.
On the basis of the above embodiments, the image acquisition module includes:
a style map determination unit, configured to process the image to be processed based on a target style map generation model to obtain a style map corresponding to a target region, where the target region corresponds to the facial region of the target object;
a velocity field map determination unit, configured to determine the velocity field map corresponding to the image to be processed, where the mesh model corresponds to the facial region of the target object;
a transformation matrix determination unit, configured to determine the transformation matrix corresponding to the image to be processed in the rendering pipeline, so as to transform the vertex texture coordinates of the mesh model based on the transformation matrix so that the transformed mesh model corresponds to the facial region of the target object in the image to be processed;
where the texture coordinates of the mesh model are consistent with the texture coordinates of the style map and of the velocity field map respectively.
On the basis of the above embodiments, the image processing module includes:
a first pixel coordinate determination unit, configured to determine, based on the transformation matrix and according to at least one model texture coordinate of the mesh model, the pixel coordinate to be processed in the image to be processed;
a second pixel coordinate determination unit, configured to determine, based on at least one pixel coordinate to be processed, the at least one model texture coordinate and the velocity field map, the target pixel coordinate of the at least one model texture coordinate in the image to be processed;
a style texture coordinate determination unit, configured to determine, based on at least one target pixel coordinate and the transformation matrix, the target style texture coordinate of the at least one model texture coordinate corresponding to the style map;
a special effect image determination unit, configured to determine the target special effect image based on the pixel attributes of the target pixel coordinate and the target style texture coordinate corresponding to the same model texture coordinate.
On the basis of the above embodiments, the first pixel coordinate determination unit is configured to, for a model texture coordinate, left-multiply the current model texture coordinate by the transformation matrix to determine the pixel coordinate to be processed of the current model texture coordinate in the image to be processed, where the transformation matrix includes a model matrix, a view matrix and a projection matrix.
On the basis of the above embodiments, the second pixel coordinate determination unit includes:
a displacement texture coordinate determination subunit, configured to, for a model texture coordinate, determine the current displacement texture coordinate of the current model texture coordinate in the velocity field map;
a target pixel coordinate determination subunit, configured to determine the target pixel coordinate of the current model texture coordinate according to the current displacement texture coordinate and the corresponding pixel coordinate to be processed.
On the basis of the above embodiments, the target pixel coordinate determination subunit includes:
a coordinate offset determination subunit, configured to obtain the pixel attribute corresponding to the current displacement texture coordinate, determine the coordinate offset based on at least two attribute values in the pixel attribute, and accumulate the coordinate offset onto the pixel coordinate to be processed to obtain the target pixel coordinate.
On the basis of the above embodiments, the style texture coordinate determination unit is configured to left-multiply the target pixel coordinate of the current model texture coordinate by the inverse of the transformation matrix to obtain the target style texture coordinate of the current model texture coordinate in the style map.
On the basis of the above embodiments, the special effect image determination unit includes:
a pixel attribute obtaining subunit, configured to, for a model texture coordinate, obtain the first pixel attribute of the target pixel coordinate corresponding to the current model texture coordinate in the image to be processed, and obtain the second pixel attribute corresponding to the target style texture coordinate in the style map;
a target pixel attribute determination subunit, configured to determine, based on the first pixel attribute and the second pixel attribute, the target pixel attribute used when rendering the current model texture coordinate;
a target special effect image determination subunit, configured to determine the target special effect image based on the target pixel attribute of at least one model texture coordinate and the image to be processed.
On the basis of the above embodiments, the target special effect image determination subunit is further configured to determine the target special effect image based on the target pixel attributes corresponding to the facial region and the pixel attributes of the image to be processed outside the facial region.
On the basis of the above embodiments, the device further includes: a special effect video determination module, configured to stitch at least one target special effect image to obtain a target special effect video.
On the basis of the above embodiments, the style features of the style map correspond to dynasty features or geographic region features.
In the embodiments of the present disclosure, when the image to be processed including the target object is acquired, the style map, the transformation matrix and the velocity field map corresponding to the image to be processed are determined, and the style map, the transformation matrix, the velocity field map and the image to be processed are processed based on a single rendering channel to obtain the target special effect image corresponding to the image to be processed. This avoids the situation in which, when multiple rendering channels are used to render the data, multiple intermediate transition images need to be obtained and then rendered by the next rendering channel; that is, the whole rendering process not only requires the participation of multiple rendering channels, but also requires the intermediate images to be stored during rendering, which occupies memory. Data rendering based on a single rendering channel is thereby realized, which not only avoids memory occupation and the use of a large number of rendering channels, but also improves image rendering efficiency.
The image processing device provided by the embodiments of the present disclosure can execute the image processing method provided by any embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the executed method.
It is worth noting that the units and modules included in the above device are divided only according to functional logic, but are not limited to the above division, as long as the corresponding functions can be realized; in addition, the specific names of the functional units are only for the convenience of distinguishing them from one another and are not used to limit the protection scope of the embodiments of the present disclosure.
Figure 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure. Referring to Figure 5, it shows a schematic structural diagram of an electronic device 400 (for example, the terminal device or server in Figure 5) suitable for implementing the embodiments of the present disclosure. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, Personal Digital Assistants (PDAs), tablet computers (PADs), Portable Media Players (PMPs) and vehicle-mounted terminals (for example, vehicle-mounted navigation terminals), as well as fixed terminals such as digital televisions (TVs) and desktop computers. The electronic device shown in Figure 5 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Figure 5, the electronic device 400 may include a processing device (for example, a central processing unit, a graphics processing unit, etc.) 401, which can perform various appropriate actions and processing according to a program stored in a Read-Only Memory (ROM) 402 or a program loaded from a storage device 408 into a Random Access Memory (RAM) 403. The RAM 403 also stores various programs and data required for the operation of the electronic device 400. The processing device 401, the ROM 402 and the RAM 403 are connected to one another through a bus 404. An Input/Output (I/O) interface 405 is also connected to the bus 404.
Generally, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer and gyroscope; output devices 407 including, for example, a Liquid Crystal Display (LCD), speaker and vibrator; storage devices 408 including, for example, a magnetic tape and hard disk; and a communication device 409. The communication device 409 may allow the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data. Although Figure 5 shows the electronic device 400 with various devices, it should be understood that it is not required to implement or possess all the devices shown; more or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a non-transitory computer-readable medium, where the computer program contains program code for executing the method shown in the flowchart. In such embodiments, the computer program may be downloaded and installed from a network through the communication device 409, or installed from the storage device 408, or installed from the ROM 402. When the computer program is executed by the processing device 401, the above functions defined in the methods of the embodiments of the present disclosure are executed.
The names of the messages or information exchanged between the multiple devices in the embodiments of the present disclosure are for illustrative purposes only and are not used to limit the scope of these messages or information.
The electronic device provided by the embodiments of the present disclosure belongs to the same inventive concept as the image processing method provided by the above embodiments; technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
The embodiments of the present disclosure provide a computer storage medium on which a computer program is stored, and when the program is executed by a processor, the image processing method provided by the above embodiments is implemented.
It should be noted that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two. The computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, device or component, or any combination of the above. Examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM) or flash memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, the computer-readable storage medium may be any tangible medium containing or storing a program, which may be used by or in combination with an instruction execution system, device or component. In the present disclosure, the computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take multiple forms, including but not limited to an electromagnetic signal, an optical signal or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium, and the computer-readable signal medium can send, propagate or transmit a program for use by or in combination with an instruction execution system, device or component. The program code contained in the computer-readable medium may be transmitted by any appropriate medium, including but not limited to: a wire, an optical cable, Radio Frequency (RF), etc., or any suitable combination of the above.
In some embodiments, the client and the server may communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication (for example, a communication network) in any form or medium. Examples of communication networks include a Local Area Network (LAN), a Wide Area Network (WAN), an internet (for example, the Internet) and a peer-to-peer network (for example, an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The above computer-readable medium may be contained in the above electronic device, or may exist alone without being assembled into the electronic device.
The above computer-readable medium carries one or more programs, and when the above one or more programs are executed by the electronic device, the electronic device is caused to:
when an image to be processed including the target object is acquired, determine the style map, transformation matrix and velocity field map corresponding to the image to be processed;
process the style map, transformation matrix, velocity field map and the image to be processed based on a single rendering channel to obtain a target special effect image corresponding to the image to be processed.
The storage medium may be a non-transitory storage medium.
Computer program code for executing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, where the programming languages include, but are not limited to, object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions and operations of the systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment or part of code, and the module, program segment or part of code contains one or more executable instructions for realizing the specified logical functions. It should also be noted that in some alternative implementations, the functions marked in the blocks may also occur in an order different from that marked in the drawings. For example, two blocks represented in succession may actually be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented with a dedicated hardware-based system that executes the specified functions or operations, or may be implemented with a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by software or hardware, where the name of a unit does not constitute a limitation of the unit itself in a certain case. For example, the first obtaining unit may also be described as "a unit that obtains at least two Internet Protocol addresses".
The functions described above herein may be executed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that can be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a System on Chip (SOC), a Complex Programmable Logic Device (CPLD) and the like.
In the context of the present disclosure, the machine-readable medium may be a tangible medium, which may contain or store a program for use by or in combination with an instruction execution system, device or equipment. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, device or equipment, or any suitable combination of the above. Examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM) or flash memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to the embodiments formed by the specific combinations of the above technical features, but should also cover other embodiments formed by any combination of the above technical features or their equivalent features without departing from the above disclosed concept, for example, embodiments formed by replacing the above features with the technical features with similar functions disclosed in (but not limited to) the present disclosure.
In addition, although the operations are depicted in a particular order, this should not be understood as requiring that these operations be executed in the particular order shown or in a sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Similarly, although several specific implementation details are contained in the above discussion, these should not be interpreted as limitations on the scope of the present disclosure. Certain features described in the context of separate embodiments may also be implemented in combination in a single embodiment; conversely, various features described in the context of a single embodiment may also be implemented in multiple embodiments separately or in any suitable sub-combination.
Although the subject matter has been described in language specific to structural features and/or methodological logical actions, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or actions described above. On the contrary, the specific features and actions described above are merely exemplary forms of implementing the claims.

Claims (16)

  1. An image processing method, comprising:
    acquiring an image to be processed including a target object, and determining a style map, a transformation matrix and a velocity field map corresponding to the image to be processed;
    processing the style map, the transformation matrix, the velocity field map and the image to be processed based on a single rendering channel to obtain a target special effect image corresponding to the image to be processed.
  2. The method according to claim 1, wherein a trigger timing of acquiring the image to be processed including the target object comprises at least one of the following:
    detecting that a special effect processing prop is triggered;
    detecting that acquired audio information triggers a special effect wake-up word;
    detecting that a captured frame includes the target object;
    detecting that a body movement of the target object is consistent with a preset body movement.
  3. The method according to claim 1, wherein the style map is determined as follows:
    processing the image to be processed based on a target style map generation model to obtain a style map corresponding to a target region, wherein the target region corresponds to a facial region of the target object.
  4. The method according to claim 1, wherein the transformation matrix is determined as follows:
    determining the transformation matrix corresponding to the image to be processed in a rendering pipeline, so as to transform vertex texture coordinates of a mesh model based on the transformation matrix, so that the transformed mesh model corresponds to a facial region of the target object in the image to be processed, wherein the mesh model corresponds to the facial region;
    wherein model texture coordinates of the mesh model correspond to texture coordinates of the style map and texture coordinates of the velocity field map respectively, and the model texture coordinates comprise the vertex texture coordinates.
  5. The method according to claim 1, wherein processing the style map, the transformation matrix, the velocity field map and the image to be processed based on the single rendering channel to obtain the target special effect image corresponding to the image to be processed comprises:
    determining, based on the transformation matrix and according to at least one model texture coordinate of a mesh model, at least one pixel coordinate to be processed in the image to be processed;
    determining, based on the at least one pixel coordinate to be processed, the at least one model texture coordinate and the velocity field map, at least one target pixel coordinate of the at least one model texture coordinate in the image to be processed;
    determining, based on the at least one target pixel coordinate and the transformation matrix, a target style texture coordinate of the at least one model texture coordinate corresponding to the style map;
    determining the target special effect image based on pixel attributes of a target pixel coordinate and the target style texture coordinate corresponding to a same model texture coordinate.
  6. The method according to claim 5, wherein determining, based on the transformation matrix and according to the at least one model texture coordinate of the mesh model, the at least one pixel coordinate to be processed in the image to be processed comprises:
    left-multiplying the at least one model texture coordinate by the transformation matrix to determine the at least one pixel coordinate to be processed of the at least one model texture coordinate in the image to be processed;
    wherein the transformation matrix comprises a model matrix, a view matrix and a projection matrix.
  7. The method according to claim 5, wherein determining, based on the at least one pixel coordinate to be processed, the at least one model texture coordinate and the velocity field map, the at least one target pixel coordinate of the at least one model texture coordinate in the image to be processed comprises:
    for a model texture coordinate, determining a current displacement texture coordinate of a current model texture coordinate in the velocity field map;
    determining a target pixel coordinate of the current model texture coordinate according to the current displacement texture coordinate and a corresponding pixel coordinate to be processed.
  8. The method according to claim 7, wherein determining the target pixel coordinate of the current model texture coordinate according to the current displacement texture coordinate and the corresponding pixel coordinate to be processed comprises:
    obtaining a pixel attribute corresponding to the current displacement texture coordinate, and determining a coordinate offset based on at least two attribute values in the pixel attribute;
    accumulating the coordinate offset onto the pixel coordinate to be processed to obtain the target pixel coordinate.
  9. The method according to claim 5, wherein determining, based on the at least one target pixel coordinate and the transformation matrix, the target style texture coordinate of the at least one model texture coordinate corresponding to the style map comprises:
    left-multiplying the target pixel coordinate of the current model texture coordinate by an inverse matrix of the transformation matrix to obtain the target style texture coordinate of the current model texture coordinate in the style map.
  10. The method according to claim 5, wherein determining the target special effect image based on the pixel attributes of the target pixel coordinate and the target style texture coordinate corresponding to the same model texture coordinate comprises:
    for a model texture coordinate, obtaining a first pixel attribute of the target pixel coordinate corresponding to the current model texture coordinate in the image to be processed, and obtaining a second pixel attribute corresponding to the target style texture coordinate in the style map;
    determining, based on the first pixel attribute and the second pixel attribute, a target pixel attribute used when rendering the current model texture coordinate;
    determining the target special effect image based on the target pixel attribute of at least one model texture coordinate and the image to be processed.
  11. The method according to claim 10, wherein determining the target special effect image based on the target pixel attribute of the at least one model texture coordinate and the image to be processed comprises:
    determining the target special effect image based on target pixel attributes corresponding to a facial region and pixel attributes of the image to be processed other than the facial region.
  12. The method according to claim 1, further comprising:
    stitching at least one target special effect image to obtain a target special effect video.
  13. The method according to any one of claims 1-12, wherein style features of the style map correspond to dynasty features or geographic region features.
  14. An image processing device, comprising:
    an image acquisition module, configured to acquire an image to be processed including a target object and determine a style map, a transformation matrix and a velocity field map corresponding to the image to be processed;
    an image processing module, configured to process the style map, the transformation matrix, the velocity field map and the image to be processed based on a single rendering channel to obtain a target special effect image corresponding to the image to be processed.
  15. An electronic device, comprising:
    one or more processors;
    a storage device, configured to store one or more programs,
    wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the image processing method according to any one of claims 1-13.
  16. A storage medium containing computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, are used to execute the image processing method according to any one of claims 1-13.
PCT/CN2023/096537 2022-06-01 2023-05-26 图像处理方法、装置、电子设备及存储介质 WO2023231918A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210621895.5A CN114866706A (zh) 2022-06-01 2022-06-01 图像处理方法、装置、电子设备及存储介质
CN202210621895.5 2022-06-01

Publications (1)

Publication Number Publication Date
WO2023231918A1 true WO2023231918A1 (zh) 2023-12-07

Family

ID=82641277

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/096537 WO2023231918A1 (zh) 2022-06-01 2023-05-26 图像处理方法、装置、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN114866706A (zh)
WO (1) WO2023231918A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114866706A (zh) * 2022-06-01 2022-08-05 北京字跳网络技术有限公司 图像处理方法、装置、电子设备及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111652830A (zh) * 2020-06-28 2020-09-11 Oppo广东移动通信有限公司 图像处理方法及装置、计算机可读介质及终端设备
CN112419477A (zh) * 2020-11-04 2021-02-26 中国科学院深圳先进技术研究院 一种面部图像风格转换方法、装置、存储介质和电子设备
CN112819945A (zh) * 2021-01-26 2021-05-18 北京航空航天大学 一种基于稀疏视点视频的流体重建方法
CN114331820A (zh) * 2021-12-29 2022-04-12 北京字跳网络技术有限公司 图像处理方法、装置、电子设备及存储介质
CN114387158A (zh) * 2022-01-10 2022-04-22 北京字跳网络技术有限公司 特效图像生成方法、装置、电子设备及存储介质
CN114866706A (zh) * 2022-06-01 2022-08-05 北京字跳网络技术有限公司 图像处理方法、装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN114866706A (zh) 2022-08-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23815101

Country of ref document: EP

Kind code of ref document: A1