CN112995537B - Video construction method and system - Google Patents
- Publication number
- CN112995537B (application CN202110175132.8A)
- Authority
- CN
- China
- Prior art keywords
- video
- representation
- input information
- information
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H04N 5/262: Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; cameras specially adapted for the electronic generation of special effects
- G06F 18/22: Pattern recognition; matching criteria, e.g. proximity measures
- G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T 7/194: Image analysis; segmentation or edge detection involving foreground-background segmentation
- G06T 2207/10016: Image acquisition modality; video or image sequence
- G06T 2207/20221: Special algorithmic details; image fusion or image merging
Abstract
The invention discloses a video construction method and system. The method first performs feature conversion on multiple kinds of input information that all describe the same video, obtaining feature representation information for each input; it then derives a representation abstract model view and a representation video for each input in turn, fuses all representation videos into a fused image set, and finally harmonizes that set and outputs the harmonized fused image set as the constructed video for all inputs, yielding a smooth video work. The method generates video for different styles and scenes, fuses and harmonizes the generated videos into a smooth final work, and accelerates the parallelizable operations, which reduces computation and memory occupation and lightens the workload of edge devices so that they can construct video quickly.
Description
Technical Field
The invention relates to the technical field of video animation, in particular to a video construction method and a video construction system.
Background
Deep-learning-based intelligent perception algorithms give electronic devices accurate semantic understanding, such as semantic recognition from text, from voice and from images, and provide a sound methodological basis for describing and characterizing the environment and the user's intent. Constructing video from semantic information has likewise achieved good results in predicting and generating character videos, and the ability to generate video from voice, text and images improves design efficiency in industries such as animation, media, education and architecture.
Current intelligent algorithms can generate video of human poses, expressions, mouth shapes, gestures and scenes from a simple planar composition, and suitably trained models can also generate images in particular painting styles. However, the intelligent methods in use each predict only one type of video, whereas real-time applications need to predict multiple types and to fuse the differently predicted videos together.
In addition, current video construction methods involve a huge amount of computation, and their running time on terminal devices can hardly meet user requirements.
Disclosure of Invention
To overcome these shortcomings, the invention provides a video construction method and system that can generate videos of different styles and scenes from different input types, fuse and harmonize the generated videos, and finally construct a smooth video work.
The invention is realized by the following technical scheme:
The video construction method provided by this scheme comprises the following steps (a control-flow sketch follows the list):
S1, performing feature conversion on each of multiple kinds of input information describing the same video to obtain feature representation information for each input;
S2, matching the feature representation information of each input against an abstract model library to generate a representation abstract model view based on each input;
S3, feeding the representation abstract model view of each input into a video generation algorithm model to generate a corresponding representation video;
S4, performing imaging processing on each representation video to obtain a fused image set;
S5, performing harmonization processing on the fused image set to generate a harmonized fused image set, and outputting the harmonized fused image set as the constructed video for all input information.
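Read as control flow, S1 to S5 form a linear pipeline. The following minimal sketch is an editorial illustration of that flow, not the claimed implementation; every callable name is a hypothetical placeholder for the corresponding module:

```python
from typing import Any, Callable, Iterable, List

def construct_video(
    inputs: Iterable[Any],
    extract_features: Callable[[Any], Any],        # S1: modality-specific feature conversion
    match_abstract_model: Callable[[Any], Any],    # S2: look-up in the abstract model library
    generate_video: Callable[[Any], List[Any]],    # S3: video generation algorithm model
    fuse_videos: Callable[[List[Any]], List[Any]], # S4: imaging + foreground/background fusion
    harmonize: Callable[[Any], Any],               # S5: per-frame image harmonization
) -> List[Any]:
    # One representation video per input description (sketch, voice, text or image).
    representation_videos = []
    for info in inputs:
        features = extract_features(info)
        model_view = match_abstract_model(features)
        representation_videos.append(generate_video(model_view))
    # Fuse all representation videos into one image set, then harmonize every frame.
    fused_image_set = fuse_videos(representation_videos)
    return [harmonize(frame) for frame in fused_image_set]
```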
The working principle of this scheme is as follows: the method can take different pieces of description information that all describe the same video as simultaneous inputs and construct one complete video from them. Current intelligent methods can generate video of human postures, expressions, mouth shapes, gestures and scenes from a simple planar composition, or can be trained to generate images in particular painting styles; however, they can only construct videos of a single format at a time, while real-time applications must predict multiple video types and fuse the differently predicted videos. The method of this scheme accepts video description information in several formats at once (for example, voice description as a first input, and image description, or image plus voice description, as a second input), processes the multiple inputs synchronously to obtain a representation video for each, fuses all representation videos into a fused image set, and finally obtains a complete video containing the characteristics of all inputs. This video construction method therefore achieves video generation for different styles and scenes with synchronous processing of multiple inputs, and effectively improves video construction efficiency.
In a further refinement, the input information in S1 is one or more of hand-drawn abstract sketches, voice, text and images.
In a further refinement, the feature representation information in S1 includes: text representation information, semantic representation information and feature-map representation information.
In a further refinement, the abstract model library in S2 includes, but is not limited to, abstract model information for poses, gestures, mouth shapes, expressions or scenes, and its representation forms include, but are not limited to, vector data, coordinate sets and point cloud data.
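The patent fixes only these representation forms, not a schema. As one hedged illustration, a library entry could look like the following; the dataclass layout itself is an assumption:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class AbstractModelEntry:
    """Illustrative library entry; only the representation forms below are
    named by the method, the surrounding schema is an assumption."""
    category: str  # e.g. "pose", "gesture", "mouth shape", "expression", "scene"
    vector: Optional[List[float]] = None                            # vector-data form
    coordinates: Optional[List[Tuple[float, float]]] = None         # coordinate-set form
    point_cloud: Optional[List[Tuple[float, float, float]]] = None  # point-cloud form
```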
In a further refinement, the imaging processing of the representation video of each input in S4 comprises the following steps (see the sketch after these steps):
S41, performing image segmentation on every frame of the representation video of each input, so that each frame yields several segmented image blocks;
S42, adding a foreground or background mark to each segmented image block, so that each input yields a scene image set;
S43, fusing the foregrounds and backgrounds of all scene image sets frame by frame to generate a fused image set.
In a further refinement, the foreground-background fusion method includes, but is not limited to: spatial-domain fusion, transform-domain fusion and neural-network-based image fusion.
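Of these, spatial-domain fusion is the simplest; a minimal sketch, assuming two aligned frames and a fixed blending weight:

```python
import numpy as np

def spatial_domain_fuse(frame_a, frame_b, alpha=0.5):
    """Per-pixel weighted averaging of two aligned HxWx3 uint8 frames; the
    transform-domain and neural variants would replace this single formula."""
    a = frame_a.astype(np.float32)
    b = frame_b.astype(np.float32)
    return (alpha * a + (1.0 - alpha) * b).clip(0, 255).astype(np.uint8)
```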
In a further refinement, the method for marking the foreground or background includes, but is not limited to, image-segmentation-based marking and image-semantic-recognition-based marking.
In a further refinement, the harmonization processing in S5 is a neural-network-based image harmonization method.
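One hedged sketch of neural harmonization inference: the network receives the composite frame concatenated with its foreground mask and predicts a color-consistent frame. The 4-channel-in / 3-channel-out convention is an assumption common to harmonization models, not something the patent specifies:

```python
import torch

def harmonize_frame(model, composite, fg_mask):
    """composite: HxWx3 uint8 fused frame; fg_mask: HxW {0,1} array marking the
    pasted foreground. `model` is any trained harmonization network that accepts
    a 4-channel input and returns a 3-channel image in [0, 1] (assumed I/O)."""
    x = torch.from_numpy(composite).float().permute(2, 0, 1) / 255.0  # HWC -> CHW
    m = torch.from_numpy(fg_mask).float().unsqueeze(0)                # 1 x H x W
    inp = torch.cat([x, m], dim=0).unsqueeze(0)                       # 1 x 4 x H x W
    with torch.no_grad():
        out = model(inp).squeeze(0).clamp(0.0, 1.0)
    return (out.permute(1, 2, 0) * 255.0).byte().numpy()              # back to HWC uint8
```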
Based on the video construction method above, the invention also provides a video construction system, which comprises: a first feature extraction device, an abstract model matching device, a representation video generation device, an imaging processing device, a fusion device and a computing device;
the first feature extraction device performs feature conversion on each piece of input information to obtain its feature representation information;
the abstract model matching device matches the feature representation information of each input against an abstract model library to generate a representation abstract model view for each input;
the representation video generation device feeds the representation abstract model view of each input into a video generation algorithm model to generate a corresponding representation video;
the imaging processing device performs imaging processing on each representation video to obtain a scene image set for each representation video;
the fusion device fuses the scene image sets of the representation videos to generate a fused image set;
the computing device runs a harmonization algorithm on the fused image set to generate a harmonized fused image set, which is output as the constructed video for all input information.
In a further refinement, the input information is one or more of hand-drawn abstract sketches, voice, text and images.
The video construction system further comprises a memory module, a table look-up matching module, a parallel operation acceleration module and a main control processor module;
the memory module comprises a storage medium and is used to transfer and temporarily store external input information, as well as the intelligent-algorithm weights, the feature representation information, the abstract feature library, the abstract model views, the representation videos, the image segmentation results, the scene image sets, the fused image set and the constructed video;
the table look-up matching module contains a matching operation unit and an address mapping unit, searches the abstract feature library in memory for a match to the feature representation information, and outputs the abstract model view for that feature representation information (a matching sketch follows this list);
the parallel operation acceleration module contains at least one parallel operation processing unit and accelerates the information feature extraction, the video generation operations, the image segmentation processing and the harmonization operations on the fused image set;
the main control processor module contains control logic and a processor, controls data transfer among the memory module, the table look-up matching module and the parallel operation acceleration module, executes the foreground-background fusion computation, and performs the non-parallel operations of the video construction method.
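A minimal sketch of what the matching operation unit and address mapping unit could compute, assuming the abstract feature library is held in memory as a matrix of feature vectors; the cosine-similarity search and the index-to-view address map are illustrative assumptions:

```python
import numpy as np

def lookup_match(feature, library_vectors, address_map):
    """feature: 1-D query vector; library_vectors: N x D abstract feature
    library; address_map: index -> stored abstract model view."""
    lib = np.asarray(library_vectors, dtype=np.float32)
    q = np.asarray(feature, dtype=np.float32)
    sims = lib @ q / (np.linalg.norm(lib, axis=1) * np.linalg.norm(q) + 1e-8)
    best = int(np.argmax(sims))   # matching operation unit: best library hit
    return address_map[best]      # address mapping unit: index -> model view
```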
Compared with the prior art, the invention has the following advantages and beneficial effects:
the invention provides a video construction method and a video construction system, which can realize video generation aiming at different styles and scenes, simultaneously carry out fusion and harmonious processing on the generated videos, and finally construct smooth video works; the video construction system formed by connecting the provided video construction device and the interactive equipment can construct the video works with the styles and descriptions required by the users according to the input various information.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention.
FIG. 1 is a flow chart of a video construction method of the present invention;
FIG. 2 is a schematic diagram of the video construction system of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples and accompanying drawings, and the exemplary embodiments and descriptions thereof are only used for explaining the present invention and are not meant to limit the present invention.
Example 1
As shown in FIG. 1, this embodiment provides a video construction method comprising the following steps:
step 1, performing feature conversion on the first input information to obtain first feature representation information for the first input information;
step 2, matching the first feature representation information obtained in step 1 against an abstract model library to generate a first representation abstract model view based on the first input information;
step 3, having the video generation algorithm model execute a video generation operation on the first abstract model view generated in step 2 to produce a first representation video for the first input information;
step 4, performing feature conversion on the second input information to obtain second feature representation information for the second input information; matching the second feature representation information against the abstract model library to generate a second representation abstract model view based on the second input information; and feeding the second abstract model view into the video generation algorithm model to generate a second representation video for the second input information;
step 5, performing image segmentation on every frame of the first representation video from step 3 and of the second representation video from step 4 to obtain a segmentation result for every frame of each video;
step 6, adding foreground or background marks to the segmented image blocks in each frame from step 5 to obtain a first scene image set and a second scene image set;
step 7, performing frame-by-frame foreground-background fusion on the marked first and second scene image sets to generate a fused image set;
step 8, performing harmonization on every frame of the fused image set generated in step 7 to produce a harmonized fused image set and obtain the constructed video.
It should be noted that the input information includes, but is not limited to, hand-drawn abstract sketches, voice, text and image input;
the feature representation information includes, but is not limited to, text, semantic and feature-map information;
the abstract feature library includes, but is not limited to, abstract information for poses, gestures, mouth shapes, expressions and scenes, and its representation forms include, but are not limited to, vector data, coordinate sets and point cloud data;
the input information includes, but is not limited to, first, second and third input information, generating at least one representation video;
the foreground/background marking method in step 6 includes, but is not limited to, image-segmentation-based marking and image-semantic-recognition-based marking;
the foreground-background image fusion method in step 7 includes, but is not limited to, spatial-domain fusion, transform-domain fusion and neural-network-based image fusion;
the image harmonization method in step 8 includes, but is not limited to, neural-network-based image harmonization.
Example 2
This embodiment is further explained using scene-video construction in an animation task.
It takes the construction of a "character walking on a grassland" scene in an animation video as an example: an initial image and a text description serve as the input information; an abstract model library in the wash-ink painting style is used; a generative adversarial network performs the video generation; an encoder-decoder neural network performs the image segmentation for foreground/background marking as well as the image harmonization; and a deep-neural-network image fusion method fuses each frame of the videos (a toy wiring sketch follows the steps).
s1, performing feature conversion on an original hand-drawn character picture and its descriptive text as input information to obtain feature representation information for the character information;
s2, matching the character feature representation information obtained in step s1 against the wash-ink-style abstract model library to generate a representation abstract model view based on the character information;
s3, having the generative adversarial network model execute a video generation operation on the character representation abstract model view to generate a first representation video for the character information;
s4, performing feature conversion on the hand-drawn grassland sketch and its descriptive text to obtain feature representation information for the grassland information; matching the grassland representation information against the wash-ink-style abstract model library to generate a representation abstract model view based on the grassland information; and feeding the grassland abstract model view into the video generation algorithm model to generate a second representation video of the wash-ink-style grassland;
s5, performing image segmentation on every frame of the character first representation video from step s3 and of the grassland second representation video from step s4 to obtain a segmentation result for every frame of each video;
s6, adding foreground or background marks to the segmented image blocks in each frame from step s5 to obtain a character scene image set and a grassland scene image set;
s7, performing frame-by-frame foreground-background fusion on the marked character and grassland scene image sets to generate a fused image set;
s8, performing harmonization on every frame of the fused image set generated in step s7 to produce a harmonized fused image set and obtain the wash-ink-style video of the character walking on the grassland.
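As a toy, runnable wiring of this embodiment onto the `construct_video` sketch from the method description: every component is a dummy stand-in (random frames instead of the wash-ink GAN, elementwise maximum instead of deep fusion), used only to show how the two inputs flow through the pipeline; none of these stubs implement the patent's networks.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, W = 4, 32, 32  # tiny clip and frame sizes for the demo

character_input = ("character sketch", "a character walking")
grassland_input = ("grassland sketch", "a wash-ink grassland")

video = construct_video(
    inputs=[character_input, grassland_input],
    extract_features=lambda info: float(len(info[1])),             # S1 stub
    match_abstract_model=lambda f: {"style": "wash-ink", "k": f},  # S2 stub
    generate_video=lambda view: [rng.integers(0, 255, (H, W, 3), dtype=np.uint8)
                                 for _ in range(T)],               # S3 stub
    fuse_videos=lambda vids: [np.maximum(a, b) for a, b in zip(*vids)],  # S4 stub
    harmonize=lambda frame: frame,                                 # S5 stub
)
assert len(video) == T and video[0].shape == (H, W, 3)
```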
Example 3
As shown in FIG. 2, the video construction system provided in this embodiment includes: a first feature extraction device, an abstract model matching device, a representation video generation device, an imaging processing device, a fusion device and a computing device;
the first feature extraction device performs feature conversion on each piece of input information to obtain its feature representation information;
the abstract model matching device matches the feature representation information of each input against an abstract model library to generate a representation abstract model view for each input;
the representation video generation device feeds the representation abstract model view of each input into a video generation algorithm model to generate a corresponding representation video;
the imaging processing device performs imaging processing on each representation video to obtain a scene image set for each representation video;
the fusion device fuses the scene image sets of the representation videos to generate a fused image set;
the computing device runs a harmonization algorithm on the fused image set to generate a harmonized fused image set, which is output as the constructed video for all input information.
The system further comprises a memory module, a table look-up matching module, a parallel operation acceleration module and a main control processor module;
the memory module comprises a storage medium and is used to transfer and temporarily store external input information, as well as the intelligent-algorithm weights, the feature representation information, the abstract feature library, the abstract model views, the representation videos, the image segmentation results, the scene image sets, the fused image set and the constructed video;
the table look-up matching module contains a matching operation unit and an address mapping unit, searches the abstract feature library in memory for a match to the feature representation information, and outputs the abstract model view for that feature representation information;
the parallel operation acceleration module contains at least one parallel operation processing unit and accelerates the information feature extraction, the video generation operations, the image segmentation processing and the harmonization operations on the fused image set;
the main control processor module contains control logic and a processor, controls data transfer among the memory module, the table look-up matching module and the parallel operation acceleration module, executes the foreground-background fusion computation, and performs the non-parallel operations of the video construction method.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (9)
1. A video construction method, comprising the steps of: S1, performing feature conversion on each of multiple kinds of input information describing the same video to obtain feature representation information for each input; S2, matching the feature representation information of each input against an abstract model library to generate a representation abstract model view based on each input; S3, feeding the representation abstract model view of each input into a video generation algorithm model to generate a corresponding representation video; S4, performing imaging processing on each representation video to obtain a fused image set, wherein the imaging processing of the representation video of each input in S4 comprises: S41, performing image segmentation on every frame of the representation video of each input, each frame yielding several segmented image blocks; S42, adding a foreground or background mark to each segmented image block, each input yielding a scene image set; S43, fusing the foregrounds and backgrounds of all scene image sets frame by frame to generate a fused image set; and S5, performing harmonization on the fused image set to generate a harmonized fused image set, and outputting the harmonized fused image set as the constructed video for all input information.
2. The video construction method according to claim 1, wherein the input information in S1 is one or more of hand-drawn abstract sketches, voice, text and images.
3. The video construction method according to claim 1, wherein the feature representation information in S1 includes: text representation information, semantic representation information and feature-map representation information.
4. The method of claim 1, wherein the abstract model library in S2 includes, but is not limited to, abstract model information for poses, gestures, mouth shapes, expressions or scenes, and its representation forms include, but are not limited to, vector data, coordinate sets and point cloud data.
5. The video construction method according to claim 1, wherein the foreground-background fusion method includes, but is not limited to: spatial-domain fusion, transform-domain fusion and neural-network-based image fusion.
6. The video construction method according to claim 1, wherein the method for marking the foreground or background includes, but is not limited to, image-segmentation-based marking and image-semantic-recognition-based marking.
7. The video construction method according to claim 1, wherein the harmonization processing in S5 is a neural-network-based image harmonization method.
8. A video construction system for use with the video construction method of any one of claims 1 to 7, comprising: a first feature extraction device, an abstract model matching device, a representation video generation device, an imaging processing device, a fusion device and a computing device; the first feature extraction device performs feature conversion on each piece of input information to obtain its feature representation information; the abstract model matching device matches the feature representation information of each input against an abstract model library to generate a representation abstract model view based on each input; the representation video generation device feeds the representation abstract model view of each input into a video generation algorithm model to generate a corresponding representation video; the imaging processing device performs imaging processing on each representation video to obtain a scene image set for each representation video; the fusion device fuses the scene image sets of the representation videos to generate a fused image set; and the computing device runs a harmonization algorithm on the fused image set to generate a harmonized fused image set, which is output as the constructed video for all input information.
9. The video construction system according to claim 8, wherein the input information is one or more of hand-drawn abstract sketches, voice, text and images.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202110175132.8A (CN112995537B) | 2021-02-09 | 2021-02-09 | Video construction method and system
Publications (2)
Publication Number | Publication Date
---|---
CN112995537A | 2021-06-18
CN112995537B | 2023-02-24
Family
ID=76347959
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202110175132.8A (CN112995537B, Active) | Video construction method and system | 2021-02-09 | 2021-02-09
Country Status (1)
Country | Link
---|---
CN | CN112995537B
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN108986186A * | 2018-08-14 | 2018-12-11 | Shandong Normal University | Method and system for converting text to video
CN109522908A * | 2018-11-16 | 2019-03-26 | Dong Jing | Image saliency detection method based on region label fusion
CN110717054A * | 2019-09-16 | 2020-01-21 | Tsinghua University | Method and system for cross-modal character-to-video generation based on dual learning
CN111669515A * | 2020-05-30 | 2020-09-15 | Huawei Technologies Co., Ltd. | Video generation method and related device
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
US7512537B2 * | 2005-03-22 | 2009-03-31 | Microsoft Corporation | NLP tool to dynamically create movies/animated scenes
US20120130717A1 * | 2010-11-19 | 2012-05-24 | Microsoft Corporation | Real-time animation for an expressive avatar
EP3161829B1 * | 2014-06-30 | 2019-12-04 | Mario Amura | Audio/video editing device, movie production method starting from still images and audio tracks and associated computer program
JP6546611B2 * | 2017-02-03 | 2019-07-17 | Nippon Telegraph and Telephone Corp. | Image processing apparatus, image processing method and image processing program
CN107948519B * | 2017-11-30 | 2020-03-27 | OPPO Guangdong Mobile Telecommunications Corp., Ltd. | Image processing method, device and equipment
2021-02-09: Application CN202110175132.8A filed; patent granted as CN112995537B; status: Active.
Also Published As
Publication number | Publication date
---|---
CN112995537A | 2021-06-18
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant