WO2023138441A1 - Video generation method, apparatus, device, and storage medium - Google Patents

Video generation method, apparatus, device, and storage medium Download PDF

Info

Publication number
WO2023138441A1
Authority
WO
WIPO (PCT)
Prior art keywords
coloring
video
order
image regions
image
Prior art date
Application number
PCT/CN2023/071620
Other languages
English (en)
French (fr)
Inventor
卢智雄
温思敬
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2023138441A1 publication Critical patent/WO2023138441A1/zh

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/64Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/001Texturing; Colouring; Generation of texture or colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/61Control of cameras or camera modules based on recognised objects
    • H04N23/611Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80Camera processing pipelines; Components thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Definitions

  • Embodiments of the present disclosure relate to the technical field of image processing, and for example to a video generation method, apparatus, device, and storage medium.
  • In recent years, smart terminal devices have become an indispensable tool for users. Users can capture images and record videos with smart terminals to document their lives in the form of videos and photos. In addition, users can reprocess short videos on their terminal devices to express themselves in richer forms, such as beautification, stylization, and expression editing.
  • Embodiments of the present disclosure provide a video generation method, apparatus, device, and storage medium.
  • In a first aspect, an embodiment of the present disclosure provides a video generation method, including: performing grayscale processing on a video to be processed to obtain a grayscale video; performing panoramic semantic segmentation on video frames in the grayscale video to obtain multiple image regions; determining a coloring order of the multiple image regions; and coloring the multiple image regions sequentially according to the coloring order to obtain a target video.
  • In a second aspect, an embodiment of the present disclosure further provides a video generation apparatus, including:
  • a grayscale video acquisition module configured to perform grayscale processing on a video to be processed to obtain a grayscale video;
  • a segmentation module configured to perform panoramic semantic segmentation on video frames in the grayscale video to obtain multiple image regions;
  • a coloring order determination module configured to determine a coloring order of the multiple image regions; and
  • a coloring module configured to color the multiple image regions sequentially according to the coloring order to obtain a target video.
  • In a third aspect, an embodiment of the present disclosure further provides an electronic device, including:
  • one or more processing devices; and
  • a storage device configured to store one or more programs,
  • where the one or more programs, when executed by the one or more processing devices, cause the one or more processing devices to implement the video generation method described in the embodiments of the present disclosure.
  • In a fourth aspect, the embodiments of the present disclosure further provide a computer-readable medium storing a computer program which, when executed by a processing device, implements the video generation method described in the embodiments of the present disclosure.
  • FIG. 1 is a flowchart of a video generation method in Embodiment 1 of the present disclosure;
  • FIG. 2 is an example diagram of determining a coloring order based on a main object in an embodiment of the present disclosure;
  • FIG. 3 is an example diagram of determining a coloring order based on body movements in an embodiment of the present disclosure;
  • FIG. 4 is a schematic structural diagram of a video generation apparatus in an embodiment of the present disclosure;
  • FIG. 5 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
  • the term “comprise” and its variations are open-ended, i.e., “including but not limited to”.
  • the term “based on” is “based at least in part on”.
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one further embodiment”; the term “some embodiments” means “at least some embodiments.” Relevant definitions of other terms will be given in the description below.
  • FIG. 1 is a flowchart of a video generation method provided by Embodiment 1 of the present disclosure. This embodiment can generate a special-effects video.
  • the method can be executed by a video generation device.
  • the device can be composed of hardware and/or software, and can generally be integrated into a device with a video generation function.
  • The device may be an electronic device such as a server, a mobile terminal, or a server cluster. As shown in FIG. 1, the method includes the following steps:
  • Step 110: perform grayscale processing on the video to be processed to obtain a grayscale video.
  • the video to be processed may be collected by the user using a camera device, or may be obtained by merging and encoding static pictures.
  • Performing grayscale processing on the video to be processed can be understood as performing grayscale processing on each video frame in the video to be processed.
  • The principle of grayscale processing may be to adjust the color values (Red Green Blue, RGB) of each pixel to the same value, i.e., R=G=B; the average, maximum, or minimum of the three RGB values may be used as the final gray value, which is not limited here.
  • Step 120: perform region segmentation on the video frames in the grayscale video to obtain multiple image regions.
  • The region segmentation may simply divide the video frame into multiple sub-regions, for example into four sub-regions (top, bottom, left, and right); or it may be semantic segmentation, instance segmentation, or panoramic semantic segmentation.
  • Semantic segmentation can be understood as: classifying all pixels on the video frame.
  • Instance segmentation can be understood as a combination of object detection and semantic segmentation.
  • Panoramic semantic segmentation can be understood as detecting and segmenting all objects in the video frame, including the background.
  • this embodiment uses panoramic semantic segmentation to segment the video frame, so as to obtain multiple image regions in the video frame.
  • the panoramic semantic segmentation of video frames may be processed by using an existing panoramic semantic segmentation model, which is not limited here.
  • Step 130: determine the coloring order of the multiple image regions.
  • the coloring order can be understood as the order in which each image area changes from grayscale to color during video playback.
  • For example, if the regions segmented from the video include a sky area, a human body area, a plant area, and a ground area, coloring may proceed in the following order during video playback: plant area - human body area - sky area - ground area.
  • the manner of determining the coloring sequence of the multiple image regions may be: acquiring depth information of the multiple image regions; and determining the coloring sequence according to the depth information.
  • the depth information of the image area is represented by the depth information of the pixels in the image area.
  • the manner of obtaining the depth information of multiple image regions may be: for each image region, obtain the depth information of the pixels in the image region; determine the average value of the depth information of the pixels as the depth information of the image region; or determine the depth information of the image region by the depth information of the central point of the image region.
  • An existing depth estimation algorithm may be used to determine the depth information of the pixels. That is, the average of the depth information of all pixels may be used as the depth information of the current image region, the depth information of the central pixel may be used, or the depth information of pixels randomly selected from the current image region may be used. This can improve the efficiency of determining the depth information of an image region.
  • The coloring order may be set from far to near or from near to far in depth information, so that the video to be processed presents a far-to-near or near-to-far coloring effect.
  • the manner of determining the coloring sequence of the multiple image regions may be: acquiring distance information between the multiple image regions and the frame boundary; and determining the coloring sequence according to the distance information.
  • The frame boundary includes a left boundary, a right boundary, an upper boundary, or a lower boundary. The distance information may be the distance from the central point of an image region to the boundary, or the average distance from the region's pixels to the boundary; ordering the regions by increasing or decreasing distance yields a left-to-right, right-to-left, top-to-bottom, or bottom-to-top coloring effect during playback.
  • The coloring order of the multiple image regions may also be determined by identifying the main object in the grayscale video and determining the coloring order based on the main object.
  • a salient object segmentation algorithm may be used to identify the main object for each video frame in the grayscale video.
  • determining the coloring order based on the main object may be: determining a clockwise order or a counterclockwise order around the main object as the coloring order.
  • FIG. 2 is an example diagram of determining the coloring order based on the main object in this embodiment. As shown in FIG. 2, during video playback the main object may be colored first, and then the image areas around the main object may be colored sequentially in a clockwise order around it, so that the video to be processed presents a coloring effect that revolves around the main object.
  • the manner of determining the coloring order of the multiple image regions may be: if the grayscale video contains a human body, recognizing body movements of the human body; and determining the coloring order based on the body movements.
  • the body movement includes a gesture movement or a foot movement.
  • an existing body movement recognition algorithm may be used to recognize the body movement of the human body in the video frame.
  • determining the coloring order based on body movements can be understood as determining the image area to be colored in the current frame according to the body movements of the human body in the current video frame, for example coloring the image area pointed at by a finger.
  • FIG. 3 is an example diagram of determining a coloring order based on body movements in an embodiment of the present disclosure. As shown in FIG. 3, in the first image the finger points to the left area, so the left area is colored; in the second image the finger points to the right area, so the right area is colored; in the third image both hands point to the sky, so the sky area is colored. The video to be processed thus presents a coloring effect that follows the human body's movements.
  • the manner of determining the coloring order of the plurality of image regions may be: receiving a coloring path drawn by a user; and determining the coloring order according to the coloring path.
  • the coloring path may be a path sequentially passing through image regions in the video.
  • determining the coloring order according to the coloring path can be understood as determining the order in which the coloring path passes through the image areas as the coloring order; that is, during video playback the image areas are colored in the order in which the coloring path passes through them, so that the video to be processed is colored along the path set by the user.
  • Step 140: color the multiple image regions sequentially according to the coloring order to obtain the target video.
  • The process of coloring the multiple image regions sequentially according to the coloring order may be: assuming the coloring order is first, second, ..., N-th, within the first m seconds the first-ranked image region is colored in all video frames; from second m to second m+n1, the second-ranked image region is colored in all video frames after second m; from second m+n1 to second m+n1+n2, the third-ranked image region is colored in all video frames after second m+n1; and so on until playback finishes, thereby achieving the special effect of the image regions being colored one after another.
  • the original color of each image area in the video to be processed can be used for coloring, or the image area can be colored by using a set texture.
  • the manner of sequentially coloring the multiple image areas according to the coloring order may be: acquiring the original colors of the multiple image areas in the video to be processed; sequentially coloring the multiple image areas into the original colors according to the coloring order.
  • the original color is the RGB value of the pixels in the image area in the video frame to be processed.
  • When the current image region's turn comes in the coloring order, the gray value of each pixel in the region is replaced with its original RGB value, thereby coloring the current image region.
  • the manner of sequentially coloring the multiple image regions according to the coloring order may be: acquiring a setting texture; and superimposing the setting texture on the corresponding image region.
  • the set texture may be a texture selected by the user.
  • The set texture may be superimposed onto the corresponding image area by keeping the color of the texture's pixels that fall inside the current image area and making the pixels that fall outside it transparent. This can increase the diversity of video coloring.
  • the manner of sequentially coloring the multiple image areas according to the coloring order may be: coloring all pixels in an image area at the same time.
  • the method of sequentially coloring the plurality of image regions according to the coloring order may be: for each image region, coloring is performed according to a set method.
  • the setting method includes coloring direction and coloring speed.
  • the coloring direction may be from left to right, from right to left, from top to bottom, from bottom to top or spread outward from the center point, which is not limited here.
  • The coloring speed may be a coloring step size. For example, if the coloring direction is from left to right, the coloring speed may be a step size of N columns of pixels, with N greater than or equal to 1. Coloring an image area at a certain speed and in a certain direction can make the coloring of the video to be processed more interesting.
  • the method of sequentially coloring multiple image regions according to the coloring order may be: determining the background music of the video to be processed; performing accent recognition on the background music to obtain the accent point; and coloring the image regions arranged in the coloring order at the moment corresponding to the accent point.
  • the background music may be music selected by the user.
  • An existing accent detection algorithm can be used to identify the accents of the background music. The moment of an accent point can be taken as the moment at which coloring of the next queued image area starts; that is, the coloring of that image area is completed within the interval between two adjacent accent points. This makes the generated target video more rhythmic as the image areas are gradually colored.
  • In the embodiments of the present disclosure, grayscale processing is performed on the video to be processed to obtain a grayscale video; panoramic semantic segmentation is performed on video frames in the grayscale video to obtain multiple image regions; the coloring order of the multiple image regions is determined; and the multiple image regions are colored sequentially according to the coloring order to obtain the target video.
  • The video generation method provided by the embodiments of the present disclosure colors the multiple grayscale image regions produced by panoramic semantic segmentation according to the coloring order, which can realize a "segment and retain color" special effect and can improve the interest of the video, the richness of the video display, and the user experience.
  • FIG. 4 is a schematic structural diagram of a video generation apparatus provided by an embodiment of the present disclosure. As shown in FIG. 4, the apparatus includes:
  • a grayscale video acquisition module 210 configured to perform grayscale processing on the video to be processed to obtain a grayscale video;
  • a segmentation module 220 configured to perform region segmentation on the video frames in the grayscale video to obtain multiple image regions;
  • a coloring order determination module 230 configured to determine the coloring order of the multiple image regions; and
  • a coloring module 240 configured to color the multiple image regions sequentially according to the coloring order to obtain the target video.
  • In an embodiment, the coloring order determination module 230 is further configured to: acquire depth information of the multiple image regions and determine the coloring order according to the depth information; or acquire distance information between the multiple image regions and a frame boundary, where the frame boundary includes a left boundary, a right boundary, an upper boundary, or a lower boundary, and determine the coloring order according to the distance information; or identify the main object in the grayscale video and determine the coloring order based on the main object; or, in response to the grayscale video containing a human body, identify body movements of the human body, where the body movements include gesture movements or foot movements, and determine the coloring order based on the body movements; or receive a coloring path drawn by the user and determine the coloring order according to the coloring path.
  • In an embodiment, the coloring module 240 is further configured to: acquire the original colors of the multiple image regions in the video to be processed and color the multiple image regions to the original colors sequentially according to the coloring order; or acquire a set texture and superimpose the set texture onto the corresponding image region; or, for each image region, color in a set manner, where the set manner includes a coloring direction and a coloring speed; or determine the background music of the video to be processed, perform accent recognition on the background music to obtain accent points, and color the image region that is next in the coloring order at the moment corresponding to each accent point.
  • The above apparatus can execute the methods provided by all the foregoing embodiments of the present disclosure and has the corresponding functional modules and beneficial effects for executing those methods. For technical details not described in detail in this embodiment, refer to the methods provided by the foregoing embodiments of the present disclosure.
  • Referring now to FIG. 5, which shows a schematic structural diagram of an electronic device 300 suitable for implementing the embodiments of the present disclosure.
  • the electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (Personal Digital Assistant, PDA), tablet computers (PAD), portable multimedia players (Portable Media Player, PMP), vehicle-mounted terminals (such as vehicle-mounted navigation terminals), and fixed terminals such as digital televisions (Television, TV), desktop computers, etc., or various forms of servers, such as independent servers or server clusters.
  • the electronic device shown in FIG. 5 is only an example, and should not limit the functions and scope of use of the embodiments of the present disclosure.
  • The electronic device 300 may include a processing device (e.g., a central processing unit, a graphics processing unit, etc.) 301, which may perform various appropriate actions and processes according to a program stored in a read-only memory (Read-Only Memory, ROM) 302 or a program loaded from a storage device 308 into a random access memory (Random Access Memory, RAM) 303. The RAM 303 also stores various programs and data required for the operation of the electronic device 300.
  • the processing device 301, ROM 302, and RAM 303 are connected to each other through a bus 304.
  • An input/output (Input/Output, I/O) interface 305 is also connected to the bus 304.
  • the following devices can be connected to the I/O interface 305: an input device 306 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 307 including, for example, a liquid crystal display (Liquid Crystal Display, LCD), a speaker, a vibrator, etc.; a storage device 308 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 309.
  • the communication means 309 may allow the electronic device 300 to perform wireless or wired communication with other devices to exchange data. While FIG. 5 shows electronic device 300 having various means, it should be understood that implementing or having all of the means shown is not a requirement. More or fewer means may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from a network via communication means 309, or from storage means 308, or from ROM 302.
  • When the computer program is executed by the processing device 301, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • Examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), Erasable Programmable Read-Only Memory (EPROM) or flash memory, fiber optics, portable Compact Disc Read-Only Memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the program code contained on the computer readable medium can be transmitted by any appropriate medium, including but not limited to: electric wire, optical cable, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the above.
  • the client and the server can communicate using any currently known or future-developed network protocols such as HTTP (Hyper Text Transfer Protocol, Hypertext Transfer Protocol), and can be interconnected with any form or medium of digital data communication (for example, a communication network).
  • Examples of communication networks include local area networks (Local Area Networks, LANs), wide area networks (Wide Area Networks, WANs), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device: performs grayscale processing on the video to be processed to obtain a grayscale video; performs region segmentation on a video frame in the grayscale video to obtain multiple image regions; determines a coloring sequence of the multiple image regions; sequentially colorizes the multiple image regions according to the coloring sequence to obtain a target video.
  • the storage medium may be a non-transitory storage medium.
  • Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • each block in the flowchart or block diagram may represent a module, program segment, or portion of code that includes one or more executable instructions for implementing specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • the units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. Wherein, the name of a unit does not constitute a limitation of the unit itself under certain circumstances.
  • exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGA), Application Specific Integrated Circuits (ASIC), Application Specific Standard Products (ASSP), Systems on Chip (SOC), Complex Programmable Logic Devices (CPLD), and so on.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • Examples of a machine-readable storage medium would include one or more wire-based electrical connections, a portable computer disk, a hard disk, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM) or flash memory, optical fiber, compact disc read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • According to one or more embodiments of the present disclosure, the embodiments of the present disclosure disclose a video generation method, including: performing grayscale processing on a video to be processed to obtain a grayscale video; performing region segmentation on video frames in the grayscale video to obtain multiple image regions; determining a coloring order of the multiple image regions; and coloring the multiple image regions sequentially according to the coloring order to obtain a target video.
  • In an embodiment, determining the coloring order of the multiple image regions includes: acquiring depth information of the multiple image regions; and determining the coloring order according to the depth information.
  • In an embodiment, acquiring the depth information of the multiple image regions includes: for each image region, acquiring depth information of pixels in the image region, and determining the average of the pixels' depth information as the depth information of the image region, or determining the depth information of the image region from the depth information of the region's central point; and determining the coloring order according to the depth information includes: setting the coloring order from far to near or from near to far in the depth information.
  • In an embodiment, determining the coloring order of the multiple image regions includes: acquiring distance information between the multiple image regions and a frame boundary, where the frame boundary includes a left boundary, a right boundary, an upper boundary, or a lower boundary; and determining the coloring order according to the distance information.
  • In an embodiment, determining the coloring order of the multiple image regions includes: identifying the main object in the grayscale video; and determining the coloring order based on the main object.
  • In an embodiment, determining the coloring order of the multiple image regions includes: in response to the grayscale video containing a human body, identifying body movements of the human body, where the body movements include gesture movements or foot movements; and determining the coloring order based on the body movements.
  • In an embodiment, determining the coloring order of the multiple image regions includes: receiving a coloring path drawn by a user; and determining the coloring order according to the coloring path.
  • In an embodiment, sequentially coloring the multiple image regions according to the coloring order includes: acquiring original colors of the multiple image regions in the video to be processed; and coloring the multiple image regions to the original colors sequentially according to the coloring order.
  • In an embodiment, sequentially coloring the multiple image regions according to the coloring order includes: acquiring a set texture; and superimposing the set texture onto the corresponding image region.
  • In an embodiment, sequentially coloring the multiple image regions according to the coloring order includes: for each image region, coloring in a set manner, where the set manner includes a coloring direction and a coloring speed.
  • In an embodiment, sequentially coloring the multiple image regions according to the coloring order includes: determining background music of the video to be processed; performing accent recognition on the background music to obtain accent points; and coloring the image region that is next in the coloring order at the moment corresponding to each accent point.

Abstract

Embodiments of the present disclosure disclose a video generation method, apparatus, device, and storage medium. Grayscale processing is performed on a video to be processed to obtain a grayscale video; region segmentation is performed on video frames in the grayscale video to obtain multiple image regions; a coloring order of the multiple image regions is determined; and the multiple image regions are colored sequentially according to the coloring order to obtain a target video.

Description

Video generation method, apparatus, device, and storage medium
This application claims priority to Chinese Patent Application No. 202210060992.1, filed with the Chinese Patent Office on January 19, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the technical field of image processing, and for example to a video generation method, apparatus, device, and storage medium.
Background
In recent years, smart terminal devices have become an indispensable tool for users. Users can capture images and record videos with smart terminals to document their lives in the form of videos and photos. In addition, users can reprocess short videos on their terminal devices to express themselves in richer forms, such as beautification, stylization, and expression editing.
Summary
Embodiments of the present disclosure provide a video generation method, apparatus, device, and storage medium.
In a first aspect, an embodiment of the present disclosure provides a video generation method, including:
performing grayscale processing on a video to be processed to obtain a grayscale video;
performing panoramic semantic segmentation on video frames in the grayscale video to obtain multiple image regions;
determining a coloring order of the multiple image regions; and
coloring the multiple image regions sequentially according to the coloring order to obtain a target video.
In a second aspect, an embodiment of the present disclosure further provides a video generation apparatus, including:
a grayscale video acquisition module configured to perform grayscale processing on a video to be processed to obtain a grayscale video;
a segmentation module configured to perform panoramic semantic segmentation on video frames in the grayscale video to obtain multiple image regions;
a coloring order determination module configured to determine a coloring order of the multiple image regions; and
a coloring module configured to color the multiple image regions sequentially according to the coloring order to obtain a target video.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, including:
one or more processing devices; and
a storage device configured to store one or more programs,
where the one or more programs, when executed by the one or more processing devices, cause the one or more processing devices to implement the video generation method described in the embodiments of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable medium storing a computer program which, when executed by a processing device, implements the video generation method described in the embodiments of the present disclosure.
Brief Description of the Drawings
FIG. 1 is a flowchart of a video generation method in Embodiment 1 of the present disclosure;
FIG. 2 is an example diagram of determining a coloring order based on a main object in an embodiment of the present disclosure;
FIG. 3 is an example diagram of determining a coloring order based on body movements in an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a video generation apparatus in an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
It should be understood that the steps described in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. Furthermore, method embodiments may include additional steps and/or omit the steps shown. The scope of the present disclosure is not limited in this respect.
As used herein, the term "include" and its variations are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one further embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.
Note that the concepts "first", "second", and the like mentioned in the present disclosure are only used to distinguish different apparatuses, modules, or units, and are not intended to limit the order or interdependence of the functions performed by these apparatuses, modules, or units.
Note that the modifiers "a/an" and "multiple" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
The names of messages or information exchanged between multiple apparatuses in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
FIG. 1 is a flowchart of a video generation method provided by Embodiment 1 of the present disclosure. This embodiment can generate a special-effects video. The method may be executed by a video generation apparatus, which may be composed of hardware and/or software and may generally be integrated into a device with a video generation function; the device may be an electronic device such as a server, a mobile terminal, or a server cluster. As shown in FIG. 1, the method includes the following steps:
Step 110: perform grayscale processing on the video to be processed to obtain a grayscale video.
The video to be processed may be captured by the user with a camera device, or may be obtained by merging and encoding static pictures. Performing grayscale processing on the video to be processed can be understood as performing grayscale processing on each video frame in the video to be processed. The principle of grayscale processing may be to adjust the color values (Red Green Blue, RGB) of each pixel to the same value, i.e., R=G=B. In this embodiment, the average, maximum, or minimum of the three RGB values may be used as the final gray value, which is not limited here.
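The patent does not prescribe an implementation; as a minimal sketch of the averaging variant, in Python with OpenCV and NumPy (the file name and helper names are our own):

```python
import cv2
import numpy as np

def to_grayscale_video(frames: list[np.ndarray]) -> list[np.ndarray]:
    """Average the R, G and B channels of every frame so that R=G=B."""
    gray_frames = []
    for frame in frames:  # frame: H x W x 3, BGR order as read by OpenCV
        gray = frame.mean(axis=2).astype(np.uint8)       # average of the 3 channels
        gray_frames.append(np.stack([gray] * 3, axis=2))  # keep 3 channels for later coloring
    return gray_frames

# Usage: read frames from a video file (placeholder path).
cap = cv2.VideoCapture("input.mp4")
frames = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(frame)
gray_video = to_grayscale_video(frames)
```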
Step 120: perform region segmentation on the video frames in the grayscale video to obtain multiple image regions.
The region segmentation may simply divide a video frame into multiple sub-regions, for example into four sub-regions (top, bottom, left, and right); or it may be semantic segmentation, instance segmentation, or panoramic semantic segmentation. Semantic segmentation can be understood as classifying all pixels in the video frame. Instance segmentation can be understood as a combination of object detection and semantic segmentation. Panoramic semantic segmentation can be understood as detecting and segmenting all objects in the video frame, including the background. In one example, this embodiment uses panoramic semantic segmentation to segment the video frames, thereby obtaining multiple image regions in each video frame. In this embodiment, an existing panoramic semantic segmentation model may be used for the panoramic semantic segmentation of video frames, which is not limited here.
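The specific panoptic model is left open above; the sketch below assumes some model returns a per-pixel region-ID map (the run_panoptic_model call is hypothetical) and shows how per-region boolean masks could be derived from it:

```python
import numpy as np

def regions_from_label_map(label_map: np.ndarray) -> dict[int, np.ndarray]:
    """Split a per-pixel region-ID map (H x W, int) into one boolean mask per region."""
    return {int(rid): label_map == rid for rid in np.unique(label_map)}

# `run_panoptic_model` stands in for any panoptic segmentation model that
# assigns every pixel (background included) a region ID; it is hypothetical here.
# label_map = run_panoptic_model(gray_frame)
label_map = np.zeros((4, 6), dtype=int); label_map[2:, 3:] = 1  # toy 2-region map
masks = regions_from_label_map(label_map)
```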
Step 130: determine the coloring order of the multiple image regions.
The coloring order can be understood as the order in which each image region changes from grayscale to color during video playback. For example, if the regions segmented from the video include a sky area, a human body area, a plant area, and a ground area, coloring during video playback may proceed in the following order: plant area - human body area - sky area - ground area. In this embodiment, the coloring order of the image regions is not limited.
In one example, the coloring order of the multiple image regions may be determined by acquiring depth information of the multiple image regions and determining the coloring order according to the depth information.
The depth information of an image region is represented by the depth information of the pixels in that region.
For example, the depth information of the multiple image regions may be obtained as follows: for each image region, obtain the depth information of the pixels in the region, and determine the average of the pixels' depth information as the region's depth information; or determine the region's depth information from the depth information of the region's central point.
An existing depth estimation algorithm may be used to determine the depth information of the pixels. That is, the average of the depth information of all pixels may be used as the depth information of the current image region, the depth information of the central pixel may be used, or the depth information of pixels randomly selected from the current image region may be used. This can improve the efficiency of determining the depth information of an image region.
For example, after the depth information of each image region is determined, the coloring order may be set from far to near or from near to far in depth, so that the video to be processed presents a far-to-near or near-to-far coloring effect.
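A minimal sketch of depth-based ordering, assuming per-pixel depths from any monocular depth estimator (the estimator itself is not shown and is an assumption of ours):

```python
import numpy as np

def coloring_order_by_depth(masks: dict[int, np.ndarray],
                            depth_map: np.ndarray,
                            far_to_near: bool = True) -> list[int]:
    """Order region IDs by the mean estimated depth of their pixels."""
    mean_depth = {rid: float(depth_map[mask].mean()) for rid, mask in masks.items()}
    return sorted(mean_depth, key=mean_depth.get, reverse=far_to_near)

# Usage (depth_map would come from any monocular depth estimator):
# order = coloring_order_by_depth(masks, depth_map, far_to_near=True)
```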
In one example, the coloring order of the multiple image regions may be determined by acquiring distance information between the multiple image regions and a frame boundary, and determining the coloring order according to the distance information.
The frame boundary includes a left boundary, a right boundary, an upper boundary, or a lower boundary. The distance information between an image region and the frame boundary may be the distance between the region's central point and the boundary, or the average of the distances between each pixel in the region and the boundary. Determining the coloring order according to the distance information can be understood as ordering the regions by distance from largest to smallest or from smallest to largest. That is, during video playback, the regions can be colored from left to right, right to left, top to bottom, or bottom to top across the two-dimensional (2D) picture, so that the video to be processed presents a left-to-right, right-to-left, top-to-bottom, or bottom-to-top coloring effect.
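A sketch of boundary-distance ordering using region centroids; the helper name and the choice of centroid distance are ours:

```python
import numpy as np

def coloring_order_by_boundary(masks: dict[int, np.ndarray],
                               boundary: str = "left") -> list[int]:
    """Order region IDs by centroid distance to one frame boundary (nearest first)."""
    h, w = next(iter(masks.values())).shape
    dist = {}
    for rid, mask in masks.items():
        ys, xs = np.nonzero(mask)
        cy, cx = ys.mean(), xs.mean()                 # region centroid
        dist[rid] = {"left": cx, "right": w - 1 - cx,
                     "top": cy, "bottom": h - 1 - cy}[boundary]
    return sorted(dist, key=dist.get)
```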
In one example, the coloring order of the multiple image regions may be determined by identifying the main object in the grayscale video and determining the coloring order based on the main object.
A salient object segmentation algorithm may be used to identify the main object in each video frame of the grayscale video. For example, determining the coloring order based on the main object may be: determining a clockwise order or a counterclockwise order around the main object as the coloring order. Illustratively, FIG. 2 is an example diagram of determining the coloring order based on the main object in this embodiment. As shown in FIG. 2, during video playback, the main object may be colored first, and then the image regions around the main object may be colored sequentially in clockwise order around it, so that the video to be processed presents a coloring effect that revolves around the main object.
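One way such a clockwise order could be computed is by sorting the other regions by the angle of their centroids around the main object; this geometric construction is our own illustration, not mandated by the text:

```python
import numpy as np

def clockwise_order_around(masks: dict[int, np.ndarray], main_id: int) -> list[int]:
    """Color the main object first, then the remaining regions clockwise around it."""
    centroids = {rid: np.array(np.nonzero(m)).mean(axis=1) for rid, m in masks.items()}
    cy, cx = centroids[main_id]

    def angle(rid: int) -> float:
        dy, dx = centroids[rid][0] - cy, centroids[rid][1] - cx
        # With image y pointing down, increasing atan2(dy, dx) runs clockwise on screen.
        return float(np.arctan2(dy, dx))

    return [main_id] + sorted((r for r in masks if r != main_id), key=angle)
```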
In one example, the coloring order of the multiple image regions may be determined as follows: if the grayscale video contains a human body, body movements of the human body are recognized, and the coloring order is determined based on the body movements.
The body movements include gesture movements or foot movements. In this embodiment, an existing body movement recognition algorithm may be used to recognize the body movements of the human body in the video frames. For example, determining the coloring order based on body movements can be understood as determining the image region to be colored in the current frame according to the body movements of the human body in the current video frame, e.g., coloring the image region pointed at by a finger. Illustratively, FIG. 3 is an example diagram of determining the coloring order based on body movements in an embodiment of the present disclosure. As shown in FIG. 3, in the first image the finger points to the left region, so the left region is colored; in the second image the finger points to the right region, so the right region is colored; in the third image both hands point to the sky, so the sky region is colored. The video to be processed thus presents a coloring effect that follows the human body's movements.
In one example, the coloring order of the multiple image regions may be determined by receiving a coloring path drawn by the user and determining the coloring order according to the coloring path.
The coloring path may be a path passing through the image regions in the video in sequence. For example, determining the coloring order according to the coloring path can be understood as determining the order in which the coloring path passes through the image regions as the coloring order; that is, during video playback the image regions are colored in the order in which the coloring path passes through them, so that the video to be processed is colored along the path set by the user.
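A sketch of turning a drawn path into a region order: walk the path's sampled (y, x) points over the label map and record each region the first time the path enters it (helper names are ours):

```python
import numpy as np

def coloring_order_by_path(label_map: np.ndarray,
                           path: list[tuple[int, int]]) -> list[int]:
    """Order regions by the first time the user-drawn path ((y, x) samples) enters them."""
    order: list[int] = []
    for y, x in path:
        rid = int(label_map[y, x])
        if rid not in order:
            order.append(rid)
    return order

label_map = np.zeros((4, 6), dtype=int); label_map[2:, 3:] = 1  # toy 2-region map
assert coloring_order_by_path(label_map, [(0, 0), (3, 5)]) == [0, 1]
```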
Step 140: color the multiple image regions sequentially according to the coloring order to obtain the target video.
The process of coloring the multiple image regions sequentially according to the coloring order may be: assuming the coloring order is first, second, ..., N-th, within the first m seconds the first-ranked image region is colored in all video frames; from second m to second m+n1, the second-ranked image region is colored in all video frames after second m; from second m+n1 to second m+n1+n2, the third-ranked image region is colored in all video frames after second m+n1; and so on until playback finishes, thereby achieving the special effect of the image regions being colored one after another.
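The m/n1/n2 schedule above can be captured by a small helper that maps a playback timestamp to the number of regions already fully colored; the stage durations below are illustrative:

```python
def regions_colored_at(t: float, durations: list[float]) -> int:
    """How many regions of the ordered list are fully colored at time t (seconds).

    durations[i] is how long the i-th region's coloring stage lasts,
    i.e. [m, n1, n2, ...] from the description above."""
    elapsed, count = 0.0, 0
    for d in durations:
        elapsed += d
        if t >= elapsed:
            count += 1
        else:
            break
    return count

# With stages [2.0, 1.5, 1.5] s, at t = 3.6 s the first two stages have finished.
assert regions_colored_at(3.6, [2.0, 1.5, 1.5]) == 2
```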
In this embodiment, each image region may be colored with its original color in the video to be processed, or with a set texture.
In one example, coloring the multiple image regions sequentially according to the coloring order may be: acquiring the original colors of the multiple image regions in the video to be processed, and coloring the multiple image regions to their original colors sequentially according to the coloring order.
The original color is the RGB value of the region's pixels in the original video frame. When the current image region's turn comes in the coloring order, the gray value of each pixel in the current region is replaced with its original RGB value, thereby coloring the current image region.
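A sketch of restoring the original color inside one region's mask:

```python
import numpy as np

def color_region(gray_frame: np.ndarray,
                 original_frame: np.ndarray,
                 mask: np.ndarray) -> np.ndarray:
    """Restore the original RGB values of the pixels inside `mask`."""
    out = gray_frame.copy()
    out[mask] = original_frame[mask]  # only this region regains its color
    return out
```

The set-texture variant described next is the same masked assignment with a texture image in place of original_frame, which realizes the "keep pixels inside the region, make the rest transparent" behavior.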
In one example, coloring the multiple image regions sequentially according to the coloring order may be: acquiring a set texture and superimposing the set texture onto the corresponding image region.
The set texture may be a texture selected by the user. Superimposing the set texture onto the corresponding image region may be done by keeping the color of the texture's pixels that fall inside the current image region and making the pixels that fall outside it transparent, thereby superimposing the texture onto the corresponding region. This can increase the diversity of video coloring.
In one example, coloring the multiple image regions sequentially according to the coloring order may be: coloring all pixels in an image region at the same time.
In one example, coloring the multiple image regions sequentially according to the coloring order may be: for each image region, coloring in a set manner.
The set manner includes a coloring direction and a coloring speed. The coloring direction may be from left to right, right to left, top to bottom, bottom to top, or spreading outward from the center point, which is not limited here. The coloring speed may be a coloring step size; for example, if the coloring direction is from left to right, the coloring speed may be a step size of N columns of pixels, with N greater than or equal to 1. Coloring an image region at a certain speed and in a certain direction can make the coloring of the video to be processed more interesting.
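A sketch of a left-to-right sweep within one region, advancing a step of N columns per frame (step size and helper names are illustrative; columns are measured from the left edge of the frame):

```python
import numpy as np

def sweep_color_region(gray_frame: np.ndarray,
                       original_frame: np.ndarray,
                       mask: np.ndarray,
                       frame_idx: int,
                       step_cols: int = 4) -> np.ndarray:
    """Color the region left to right, `step_cols` more columns each frame."""
    out = gray_frame.copy()
    limit = (frame_idx + 1) * step_cols          # rightmost colored column so far
    cols = np.arange(mask.shape[1]) < limit      # True for columns already reached
    sweep_mask = mask & cols[np.newaxis, :]      # restrict the sweep to this region
    out[sweep_mask] = original_frame[sweep_mask]
    return out
```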
In one example, coloring the multiple image regions sequentially according to the coloring order may be: determining the background music of the video to be processed; performing accent recognition on the background music to obtain accent points; and coloring the image region that is next in the coloring order at the moment corresponding to each accent point.
The background music may be music selected by the user. An existing accent detection algorithm may be used to recognize the accents of the background music. The moment of an accent point may be taken as the moment at which coloring of the next queued image region starts; that is, the coloring of the queued image region is completed within the interval between two adjacent accent points. This makes the generated target video more rhythmic as the image regions are gradually colored.
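As a sketch, accent points could be approximated with onset detection in librosa; this library choice is ours, since the patent only requires some existing accent detection algorithm:

```python
import librosa

# Load the user-chosen background music (placeholder path) and detect onset
# times in seconds; each onset serves as an "accent point".
y, sr = librosa.load("background_music.mp3")
accent_times = librosa.onset.onset_detect(y=y, sr=sr, units="time")

# Start coloring the k-th region of the coloring order at the k-th accent:
# schedule = list(zip(accent_times, order))
```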
In the embodiments of the present disclosure, grayscale processing is performed on the video to be processed to obtain a grayscale video; panoramic semantic segmentation is performed on the video frames in the grayscale video to obtain multiple image regions; the coloring order of the multiple image regions is determined; and the multiple image regions are colored sequentially according to the coloring order to obtain the target video. The video generation method provided by the embodiments of the present disclosure colors the multiple grayscale image regions produced by panoramic semantic segmentation according to the coloring order, which can realize a "segment and retain color" special effect and can improve the interest of the video, the richness of the video display, and the user experience.
FIG. 4 is a schematic structural diagram of a video generation apparatus provided by an embodiment of the present disclosure. As shown in FIG. 4, the apparatus includes:
a grayscale video acquisition module 210 configured to perform grayscale processing on the video to be processed to obtain a grayscale video;
a segmentation module 220 configured to perform region segmentation on the video frames in the grayscale video to obtain multiple image regions;
a coloring order determination module 230 configured to determine the coloring order of the multiple image regions; and
a coloring module 240 configured to color the multiple image regions sequentially according to the coloring order to obtain the target video.
In an embodiment, the coloring order determination module 230 is further configured to:
acquire depth information of the multiple image regions; and
determine the coloring order according to the depth information.
In an embodiment, the coloring order determination module 230 is further configured to:
for each image region, acquire depth information of the pixels in the image region; and
determine the average of the pixels' depth information as the depth information of the image region; or
determine the depth information of the image region from the depth information of the region's central point;
where determining the coloring order according to the depth information includes:
setting the coloring order from far to near or from near to far in the depth information.
In an embodiment, the coloring order determination module 230 is further configured to:
acquire distance information between the multiple image regions and a frame boundary, where the frame boundary includes a left boundary, a right boundary, an upper boundary, or a lower boundary; and
determine the coloring order according to the distance information.
In an embodiment, the coloring order determination module 230 is further configured to:
identify the main object in the grayscale video; and
determine the coloring order based on the main object.
In an embodiment, the coloring order determination module 230 is further configured to:
in response to the grayscale video containing a human body, identify body movements of the human body, where the body movements include gesture movements or foot movements; and
determine the coloring order based on the body movements.
In an embodiment, the coloring order determination module 230 is further configured to:
receive a coloring path drawn by the user; and
determine the coloring order according to the coloring path.
In an embodiment, the coloring module 240 is further configured to:
acquire the original colors of the multiple image regions in the video to be processed; and
color the multiple image regions to the original colors sequentially according to the coloring order.
In an embodiment, the coloring module 240 is further configured to:
acquire a set texture; and
superimpose the set texture onto the corresponding image region.
In an embodiment, the coloring module 240 is further configured to:
for each image region, color in a set manner, where the set manner includes a coloring direction and a coloring speed.
In an embodiment, the coloring module 240 is further configured to:
determine the background music of the video to be processed;
perform accent recognition on the background music to obtain accent points; and
color the image region that is next in the coloring order at the moment corresponding to each accent point.
The above apparatus can execute the methods provided by all the foregoing embodiments of the present disclosure and has the corresponding functional modules and beneficial effects for executing those methods. For technical details not described in detail in this embodiment, refer to the methods provided by all the foregoing embodiments of the present disclosure.
Referring now to FIG. 5, which shows a schematic structural diagram of an electronic device 300 suitable for implementing the embodiments of the present disclosure. The electronic device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (Personal Digital Assistant, PDA), tablet computers (PAD), portable multimedia players (Portable Media Player, PMP), and vehicle-mounted terminals (e.g., vehicle navigation terminals), fixed terminals such as digital televisions (Television, TV) and desktop computers, as well as various forms of servers, such as standalone servers or server clusters. The electronic device shown in FIG. 5 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 5, the electronic device 300 may include a processing device (e.g., a central processing unit, a graphics processing unit, etc.) 301, which can perform various appropriate actions and processes according to a program stored in a read-only memory (Read-Only Memory, ROM) 302 or a program loaded from a storage device 308 into a random access memory (Random Access Memory, RAM) 303. The RAM 303 also stores various programs and data required for the operation of the electronic device 300. The processing device 301, the ROM 302, and the RAM 303 are connected to one another through a bus 304. An input/output (Input/Output, I/O) interface 305 is also connected to the bus 304.
Generally, the following devices may be connected to the I/O interface 305: an input device 306 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 307 including, for example, a liquid crystal display (Liquid Crystal Display, LCD), a speaker, a vibrator, etc.; a storage device 308 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 309. The communication device 309 may allow the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 5 shows the electronic device 300 with various devices, it should be understood that it is not required to implement or have all of the devices shown; more or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, where the computer program contains program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 309, installed from the storage device 308, or installed from the ROM 302. When the computer program is executed by the processing device 301, the above functions defined in the methods of the embodiments of the present disclosure are performed.
Note that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. Examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable signal medium can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to: an electric wire, an optical cable, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the above.
In some embodiments, the client and the server can communicate using any currently known or future-developed network protocol such as HTTP (Hyper Text Transfer Protocol) and can be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network (Local Area Network, LAN), a wide area network (Wide Area Network, WAN), an internetwork (e.g., the Internet), and a peer-to-peer network (e.g., an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The computer-readable medium may be included in the electronic device described above, or it may exist separately without being assembled into the electronic device.
The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: perform grayscale processing on a video to be processed to obtain a grayscale video; perform region segmentation on video frames in the grayscale video to obtain multiple image regions; determine a coloring order of the multiple image regions; and color the multiple image regions sequentially according to the coloring order to obtain a target video.
The storage medium may be a non-transitory storage medium.
Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (e.g., through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware. The name of a unit does not constitute a limitation of the unit itself under certain circumstances.
The functions described above may be performed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that can be used include: Field Programmable Gate Arrays (FPGA), Application Specific Integrated Circuits (ASIC), Application Specific Standard Products (ASSP), Systems on Chip (SOC), Complex Programmable Logic Devices (CPLD), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in combination with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the above. Examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
According to one or more embodiments of the present disclosure, the embodiments of the present disclosure disclose a video generation method, including:
performing grayscale processing on a video to be processed to obtain a grayscale video;
performing region segmentation on video frames in the grayscale video to obtain multiple image regions;
determining a coloring order of the multiple image regions; and
coloring the multiple image regions sequentially according to the coloring order to obtain a target video.
In an embodiment, determining the coloring order of the multiple image regions includes:
acquiring depth information of the multiple image regions; and
determining the coloring order according to the depth information.
In an embodiment, acquiring the depth information of the multiple image regions includes:
for each image region, acquiring depth information of the pixels in the image region; and
determining the average of the pixels' depth information as the depth information of the image region; or
determining the depth information of the image region from the depth information of the region's central point;
and determining the coloring order according to the depth information includes:
setting the coloring order from far to near or from near to far in the depth information.
In an embodiment, determining the coloring order of the multiple image regions includes:
acquiring distance information between the multiple image regions and a frame boundary, where the frame boundary includes a left boundary, a right boundary, an upper boundary, or a lower boundary; and
determining the coloring order according to the distance information.
In an embodiment, determining the coloring order of the multiple image regions includes:
identifying the main object in the grayscale video; and
determining the coloring order based on the main object.
In an embodiment, determining the coloring order of the multiple image regions includes:
in response to the grayscale video containing a human body, identifying body movements of the human body, where the body movements include gesture movements or foot movements; and
determining the coloring order based on the body movements.
In an embodiment, determining the coloring order of the multiple image regions includes:
receiving a coloring path drawn by a user; and
determining the coloring order according to the coloring path.
In an embodiment, coloring the multiple image regions sequentially according to the coloring order includes:
acquiring original colors of the multiple image regions in the video to be processed; and
coloring the multiple image regions to the original colors sequentially according to the coloring order.
In an embodiment, coloring the multiple image regions sequentially according to the coloring order includes:
acquiring a set texture; and
superimposing the set texture onto the corresponding image region.
In an embodiment, coloring the multiple image regions sequentially according to the coloring order includes:
for each image region, coloring in a set manner, where the set manner includes a coloring direction and a coloring speed.
In an embodiment, coloring the multiple image regions sequentially according to the coloring order includes:
determining background music of the video to be processed;
performing accent recognition on the background music to obtain accent points; and
coloring the image region that is next in the coloring order at the moment corresponding to each accent point.
Those skilled in the art will understand that the present disclosure is not limited to the particular embodiments described here, and that various changes, readjustments, and substitutions can be made by those skilled in the art without departing from the scope of protection of the present disclosure. Therefore, although the present disclosure has been described through the above embodiments, the present disclosure is not limited to the above embodiments and may include more other equivalent embodiments without departing from the concept of the present disclosure; the scope of the present disclosure is determined by the scope of the appended claims.

Claims (14)

  1. A video generation method, comprising:
    performing grayscale processing on a video to be processed to obtain a grayscale video;
    performing region segmentation on video frames in the grayscale video to obtain multiple image regions;
    determining a coloring order of the multiple image regions; and
    coloring the multiple image regions sequentially according to the coloring order to obtain a target video.
  2. The method according to claim 1, wherein determining the coloring order of the multiple image regions comprises:
    acquiring depth information of the multiple image regions; and
    determining the coloring order according to the depth information.
  3. The method according to claim 2, wherein acquiring the depth information of the multiple image regions comprises:
    for each image region, acquiring depth information of pixels in the image region; and
    determining the average of the pixels' depth information as the depth information of the image region; or
    determining the depth information of the image region from the depth information of the central point of the image region;
    and wherein determining the coloring order according to the depth information comprises:
    setting the coloring order from far to near or from near to far in the depth information.
  4. The method according to claim 1, wherein determining the coloring order of the multiple image regions comprises:
    acquiring distance information between each of the multiple image regions and a frame boundary, wherein the frame boundary comprises a left boundary, a right boundary, an upper boundary, or a lower boundary; and
    determining the coloring order according to the distance information.
  5. The method according to claim 1, wherein determining the coloring order of the multiple image regions comprises:
    identifying a main object in the grayscale video; and
    determining the coloring order based on the main object.
  6. The method according to claim 1, wherein determining the coloring order of the multiple image regions comprises:
    in response to the grayscale video containing a human body, identifying body movements of the human body, wherein the body movements comprise gesture movements or foot movements; and
    determining the coloring order based on the body movements.
  7. The method according to claim 1, wherein determining the coloring order of the multiple image regions comprises:
    receiving a coloring path drawn by a user; and
    determining the coloring order according to the coloring path.
  8. The method according to claim 1, wherein coloring the multiple image regions sequentially according to the coloring order comprises:
    acquiring original colors of the multiple image regions in the video to be processed; and
    coloring the multiple image regions to the original colors sequentially according to the coloring order.
  9. The method according to claim 1, wherein coloring the multiple image regions sequentially according to the coloring order comprises:
    acquiring a set texture; and
    superimposing the set texture onto the corresponding image region.
  10. The method according to claim 1, wherein coloring the multiple image regions sequentially according to the coloring order comprises:
    for each image region, coloring in a set manner, wherein the set manner comprises a coloring direction and a coloring speed.
  11. The method according to claim 1, wherein coloring the multiple image regions sequentially according to the coloring order comprises:
    determining background music of the video to be processed;
    performing accent recognition on the background music to obtain accent points; and
    coloring the image region that is next in the coloring order at the moment corresponding to the accent point.
  12. A video generation apparatus, comprising:
    a grayscale video acquisition module configured to perform grayscale processing on a video to be processed to obtain a grayscale video;
    a segmentation module configured to perform region segmentation on video frames in the grayscale video to obtain multiple image regions;
    a coloring order determination module configured to determine a coloring order of the multiple image regions; and
    a coloring module configured to color the multiple image regions sequentially according to the coloring order to obtain a target video.
  13. An electronic device, comprising:
    one or more processing devices; and
    a storage device configured to store one or more programs,
    wherein the one or more programs, when executed by the one or more processing devices, cause the one or more processing devices to implement the video generation method according to any one of claims 1-11.
  14. A computer-readable medium storing a computer program which, when executed by a processing device, implements the video generation method according to any one of claims 1-11.
PCT/CN2023/071620 2022-01-19 2023-01-10 视频生成方法、装置、设备及存储介质 WO2023138441A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210060992.1 2022-01-19
CN202210060992.1A CN114422698B (zh) 2022-01-19 2022-01-19 视频生成方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2023138441A1 true WO2023138441A1 (zh) 2023-07-27

Family

ID=81274992

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/071620 WO2023138441A1 (zh) 2022-01-19 2023-01-10 视频生成方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN114422698B (zh)
WO (1) WO2023138441A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114422698B (zh) * 2022-01-19 2023-09-26 北京字跳网络技术有限公司 视频生成方法、装置、设备及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011118825A (ja) * 2009-12-07 2011-06-16 Yamazaki Co Ltd モノクロ動画の着色装置及びモノクロ動画の着色方法
CN108492348A (zh) * 2018-03-30 2018-09-04 北京金山安全软件有限公司 图像处理方法、装置、电子设备及存储介质
CN109754444A (zh) * 2018-02-07 2019-05-14 京东方科技集团股份有限公司 图像着色方法和装置
CN111815733A (zh) * 2020-08-07 2020-10-23 深兰科技(上海)有限公司 一种视频着色的方法及系统
CN113822951A (zh) * 2021-06-25 2021-12-21 腾讯科技(深圳)有限公司 图像处理方法、装置、电子设备及存储介质
CN114422698A (zh) * 2022-01-19 2022-04-29 北京字跳网络技术有限公司 视频生成方法、装置、设备及存储介质

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2760672B2 (ja) * 1991-06-03 1998-06-04 シャープ株式会社 画像処理装置
WO2015189343A1 (en) * 2014-06-12 2015-12-17 Thomson Licensing Methods and systems for color processing of digital images
CN110515452B (zh) * 2018-05-22 2022-02-22 腾讯科技(深圳)有限公司 图像处理方法、装置、存储介质和计算机设备
CN111340921A (zh) * 2018-12-18 2020-06-26 北京京东尚科信息技术有限公司 染色方法、装置和计算机系统及介质
CN110276840B (zh) * 2019-06-21 2022-12-02 腾讯科技(深圳)有限公司 多虚拟角色的控制方法、装置、设备及存储介质
US11410347B2 (en) * 2020-04-13 2022-08-09 Sony Group Corporation Node-based image colorization on image/video editing applications
CN113411550B (zh) * 2020-10-29 2022-07-19 腾讯科技(深圳)有限公司 视频上色方法、装置、设备及存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011118825A (ja) * 2009-12-07 2011-06-16 Yamazaki Co Ltd モノクロ動画の着色装置及びモノクロ動画の着色方法
CN109754444A (zh) * 2018-02-07 2019-05-14 京东方科技集团股份有限公司 图像着色方法和装置
CN108492348A (zh) * 2018-03-30 2018-09-04 北京金山安全软件有限公司 图像处理方法、装置、电子设备及存储介质
CN111815733A (zh) * 2020-08-07 2020-10-23 深兰科技(上海)有限公司 一种视频着色的方法及系统
CN113822951A (zh) * 2021-06-25 2021-12-21 腾讯科技(深圳)有限公司 图像处理方法、装置、电子设备及存储介质
CN114422698A (zh) * 2022-01-19 2022-04-29 北京字跳网络技术有限公司 视频生成方法、装置、设备及存储介质

Also Published As

Publication number Publication date
CN114422698A (zh) 2022-04-29
CN114422698B (zh) 2023-09-26

Similar Documents

Publication Publication Date Title
CN109618222B (zh) 一种拼接视频生成方法、装置、终端设备及存储介质
WO2022083383A1 (zh) 图像处理方法、装置、电子设备及计算机可读存储介质
WO2023125374A1 (zh) 图像处理方法、装置、电子设备及存储介质
JP7199527B2 (ja) 画像処理方法、装置、ハードウェア装置
WO2021254502A1 (zh) 目标对象显示方法、装置及电子设备
US20230421716A1 (en) Video processing method and apparatus, electronic device and storage medium
CN110796664B (zh) 图像处理方法、装置、电子设备及计算机可读存储介质
US11893770B2 (en) Method for converting a picture into a video, device, and storage medium
WO2022233223A1 (zh) 图像拼接方法、装置、设备及介质
US20240119082A1 (en) Method, apparatus, device, readable storage medium and product for media content processing
WO2023071707A1 (zh) 视频图像处理方法、装置、电子设备及存储介质
WO2023078284A1 (zh) 图片渲染方法、装置、设备、存储介质和程序产品
WO2023138441A1 (zh) 视频生成方法、装置、设备及存储介质
WO2021227953A1 (zh) 图像特效配置方法、图像识别方法、装置及电子设备
CN112785669B (zh) 一种虚拟形象合成方法、装置、设备及存储介质
WO2024037556A1 (zh) 图像处理方法、装置、设备及存储介质
WO2023138468A1 (zh) 虚拟物体的生成方法、装置、设备及存储介质
WO2023078281A1 (zh) 图片处理方法、装置、设备、存储介质和程序产品
WO2022237435A1 (zh) 更换画面中的背景的方法、设备、存储介质及程序产品
US11810336B2 (en) Object display method and apparatus, electronic device, and computer readable storage medium
WO2022237460A1 (zh) 图像处理方法、设备、存储介质及程序产品
WO2022262473A1 (zh) 图像处理方法、装置、设备及存储介质
CN113905177B (zh) 视频生成方法、装置、设备及存储介质
US11805219B2 (en) Image special effect processing method and apparatus, electronic device and computer-readable storage medium
CN114399696A (zh) 一种目标检测方法、装置、存储介质及电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23742756

Country of ref document: EP

Kind code of ref document: A1