WO2024104477A1 - Image generation method and apparatus, electronic device, and storage medium


Publication number
WO2024104477A1
WO2024104477A1 PCT/CN2023/132440 CN2023132440W WO2024104477A1 WO 2024104477 A1 WO2024104477 A1 WO 2024104477A1 CN 2023132440 W CN2023132440 W CN 2023132440W WO 2024104477 A1 WO2024104477 A1 WO 2024104477A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
original
target image
style
Application number
PCT/CN2023/132440
Inventor
王晶
苗旺
徐雨旸
徐丁丁
刘松伟
Original Assignee
北京字跳网络技术有限公司
Application filed by 北京字跳网络技术有限公司

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

Embodiments of the present disclosure provide an image generation method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring at least two original images; performing image style transfer on each of the at least two original images to obtain corresponding target image frames having a target image style; and combining the at least two target image frames according to a target spliced image layout to generate a target spliced image, wherein the target spliced image layout is determined on the basis of the image content of the at least two original images in the initial image data. Target image frames having a specific image style are obtained by performing style transfer on a plurality of original images in the initial image data, and the target image frames are then combined to obtain a target spliced image that matches the content of the plurality of original images, so that the effective information in the plurality of original images is fully displayed and visual expressiveness is improved.

Description

Image generation method, apparatus, electronic device, and storage medium
This application claims priority to Chinese invention patent application No. 202211449865.7, entitled "Image generation method, apparatus, electronic device, and storage medium" and filed on November 18, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the field of image generation technology, and in particular to an image generation method, an apparatus, an electronic device, and a storage medium.
Background
Currently, taking video content creation as an example application scenario, a user needs to generate an image from video data to serve as a video cover, so that the video content can be previewed and displayed. In the prior art, the video cover is usually generated by extracting a single video frame that the user selects manually.
However, images generated from video data in the prior art carry little image information and have poor visual expressiveness.
Summary
Embodiments of the present disclosure provide an image generation method, an apparatus, an electronic device, and a storage medium, to overcome the problem that generated images carry little image information and have poor visual expressiveness.
In a first aspect, an embodiment of the present disclosure provides an image generation method, including:
acquiring at least two original images, and performing image style transfer on each of the at least two original images to obtain corresponding target image frames having a target image style; determining a target puzzle layout according to the image content of the at least two original images; and combining the at least two target image frames according to the target puzzle layout to generate a target puzzle.
In a second aspect, an embodiment of the present disclosure provides an image generation apparatus, including:
an acquisition module, configured to acquire at least two original images;
a transfer module, configured to perform image style transfer on each of the at least two original images to obtain corresponding target image frames having a target image style; and
a combination module, configured to determine a target puzzle layout according to the image content of the at least two original images, and to combine the at least two target image frames according to the target puzzle layout to generate a target puzzle.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including:
a processor, and a memory communicatively connected to the processor;
wherein the memory stores computer-executable instructions; and
the processor executes the computer-executable instructions stored in the memory to implement the image generation method described in the first aspect and the various possible designs of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the image generation method described in the first aspect and the various possible designs of the first aspect.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product, including a computer program which, when executed by a processor, implements the image generation method described in the first aspect and the various possible designs of the first aspect.
With the image generation method, apparatus, electronic device, and storage medium provided by this embodiment, at least two original images are acquired, and image style transfer is performed on each of them to obtain corresponding target image frames having a target image style; a target puzzle layout is determined according to the image content of the at least two original images; and the at least two target image frames are combined according to the target puzzle layout to generate a target puzzle. Style transfer on at least two original images yields target image frames with a specific image style, and arranging and combining those frames yields a target puzzle whose layout matches the content of the original images. The target puzzle therefore not only displays multiple stylized frames, but also shows, through its layout, how those frames are related in content, so that the effective information in the multiple original images is fully displayed and visual expressiveness is improved.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present disclosure, and a person of ordinary skill in the art may obtain other drawings from them without creative effort.
FIG. 1 is a diagram of an application scenario of the image generation method provided by an embodiment of the present disclosure;
FIG. 2 is a first schematic flowchart of the image generation method provided by an embodiment of the present disclosure;
FIG. 3 is a flowchart of a specific implementation of step S101 in the embodiment shown in FIG. 2;
FIG. 4 is a flowchart of a specific implementation of step S1012 in the embodiment shown in FIG. 3;
FIG. 5 is a schematic diagram of determining a first image frame provided by an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a target puzzle layout of a target puzzle provided by an embodiment of the present disclosure;
FIG. 7 is a second schematic flowchart of the image generation method provided by an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of cropping an image provided by an embodiment of the present disclosure;
FIG. 9 is a flowchart of a specific implementation of step S206 in the embodiment shown in FIG. 2;
FIG. 10 is a flowchart of a specific implementation of step S208 in the embodiment shown in FIG. 2;
FIG. 11 is a structural block diagram of an image generation apparatus provided by an embodiment of the present disclosure;
FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure;
FIG. 13 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present disclosure without creative effort fall within the scope of protection of the present disclosure.
The application scenarios of the embodiments of the present disclosure are explained below.
FIG. 1 is a diagram of an application scenario of the image generation method provided by an embodiment of the present disclosure. The method can be applied to video and image editing scenarios such as generating a video cover or converting a video into a picture set. Specifically, taking video cover generation as an example, the method can be applied to a terminal device or a server. Taking a terminal device as an example, a video editing application (APP) runs on the terminal device. As shown in FIG. 1, the user obtains a video to be processed by shooting it, downloading it from a server, or receiving it from another terminal device, and then loads it into the video editing application (shown as "App" in the figure) that applies the image generation method provided by the embodiments of the present application. The terminal device then uses the video editing application to process the video to be processed and, based on its video content, generates a target picture that can represent that content, which is used as the video cover of the video to be processed.
In the prior art, taking the production of a video cover during video content creation as an example, the user operates a terminal device to generate an image from the video data to serve as the video cover, so that the video content can be previewed and displayed. This image is usually generated by extracting a single video frame that the user selects manually. However, video data contains many frames whose image content differs from frame to frame. Taking a single frame as the video cover cannot fully represent the content of the video data, so the cover carries little image information. In addition, compared with continuous playback of the video content, using an original, unprocessed image frame from the video data as the cover fails to highlight the key points of the video content, which impairs visual expressiveness and the ability to convey information.
Other image processing scenarios to which the method of this embodiment applies, such as converting a video into one or more pictures or generating a puzzle from a picture set, face the same problems.
Embodiments of the present disclosure provide an image generation method to solve the above problems.
Referring to FIG. 2, FIG. 2 is a first schematic flowchart of the image generation method provided by an embodiment of the present disclosure. The method of this embodiment can be applied in a terminal device, and the image generation method includes:
Step S101: acquire at least two original images.
Exemplarily, an original image is an image used as material for the target puzzle, and can be obtained by extracting frames from material data. The material data may be a video, a picture set, or a combination of the two. Taking video material data as an example and referring to the application scenario shown in FIG. 1, the material data corresponds to the video to be processed in the embodiment shown in FIG. 1. The material data may be captured by the user with an image acquisition unit of the terminal device, such as a camera; it may also be downloaded from a server or received from another terminal device. This can be configured as needed and is not described further here.
The material data includes at least two material images. Taking the case where the material data is video data as an example, the material data (video data) consists of multiple video frames (material images), and decoding the material data (referred to elsewhere in this description as the initial image data) yields each of those video frames. The video frames (material images) in the material data are then screened according to a preset rule to obtain the images that satisfy the rule, namely the original images; for example, the key frames (I-frames) in the material data are used as the original images. Methods for determining and retrieving key frames in video data are prior art and are not described here.
In another possible implementation, frames can be extracted from the material data based on its content, to obtain the original images that serve as material for the subsequently generated target puzzle. Exemplarily, as shown in FIG. 3, a specific implementation of step S101 includes:
Step S1011: acquire material data, where the material data includes a video and/or a picture set;
Step S1012: extract frames from the material data according to the image content of the material images in the material data to obtain at least two original images, where a material image is a video frame in the video and/or a picture in the picture set.
Exemplarily, after the material data is obtained, image recognition is first performed on each material image in the material data to obtain its image content. The image content can be represented in several ways: for example, as a feature matrix describing the image content, as a pixel matrix describing the image content, or as an identifier that represents the specific content of the image. More specifically, for example, when the image includes a portrait, the corresponding identifier (image content) is #001; when the image includes a landscape, the corresponding identifier (image content) is #002.
The content can be subdivided further on this basis to obtain finer-grained identifiers. For example, when the image includes one portrait, the corresponding identifier (image content) is #001_1; when the image includes two portraits, the corresponding identifier (image content) is #001_2. The specific representation of the identifiers, and the mapping between identifiers and the specific content of an image, can be set according to specific needs and are not enumerated here.
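The identifier scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `content_identifier` and the category keys `"portrait"` and `"landscape"` are hypothetical, and only the identifier strings (#001, #002, #001_1, #001_2) come from the examples in the text.

```python
from typing import Optional

# Category codes taken from the examples in the text; the dictionary keys
# ("portrait", "landscape") are hypothetical names chosen for illustration.
CATEGORY_CODES = {"portrait": "#001", "landscape": "#002"}


def content_identifier(category: str, count: Optional[int] = None) -> str:
    """Return the identifier for a recognized content category.

    When `count` is given (for example, the number of portraits in the
    image), it is appended to form the finer-grained identifier.
    """
    code = CATEGORY_CODES[category]
    if count is None:
        return code
    return f"{code}_{count}"


if __name__ == "__main__":
    print(content_identifier("portrait"))
    print(content_identifier("portrait", 2))
```

A real system would derive `category` and `count` from an image recognition step, which is outside the scope of this sketch.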
Further, after the image content corresponding to each material image is obtained, the material images are screened according to their image content to determine at least two key image frames that better represent the important content of the material data, namely the original images, which are used to generate the target puzzle. In a possible implementation, as shown in FIG. 4, a specific implementation of step S1012 includes:
Step S1012A: based on the image content of a material image, obtain the posture similarity corresponding to the material image, where the posture similarity represents the similarity between the posture of the human elements in the image content and a target posture.
Step S1012B: determine at least two original images according to the posture similarity corresponding to each material image.
Exemplarily, identifying each material image yields its image content. In this embodiment, the image content includes human elements, that is, image elements related to a portrait, such as the head, torso, limbs, and hands of the portrait, or the portrait as a whole. The human elements in the image content of a material image present different postures. Comparing the posture of the human elements in the image content with a preset target posture yields the posture similarity: the more closely the posture of the human elements in the image content of the material image matches the target posture, the higher the posture similarity; otherwise, the lower the posture similarity. The posture similarity can be computed using an image consistency algorithm, which is prior art known to those skilled in the art and is not described here. The posture of a human element includes, for example, one or more of facial expressions, movements of the limbs and torso, and hand movements.
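The text leaves the similarity computation to known image consistency algorithms. Purely as a stand-in for illustration, the sketch below scores a posture by cosine similarity between flattened keypoint vectors and clamps the result to the range [0, 1]. The keypoint-vector representation, the function names, and the choice of cosine similarity are all assumptions and are not taken from the disclosure.

```python
import math
from typing import Sequence


def cosine_posture_similarity(pose: Sequence[float],
                              target: Sequence[float]) -> float:
    """Cosine similarity between two posture vectors, clamped to [0, 1].

    Both vectors are assumed to hold the same keypoints in the same order,
    for example coordinates centered on the torso (an assumption).
    """
    if not pose or len(pose) != len(target):
        raise ValueError("posture vectors must be non-empty and equal in length")
    dot = sum(a * b for a, b in zip(pose, target))
    norm = math.sqrt(sum(a * a for a in pose)) * math.sqrt(sum(b * b for b in target))
    if norm == 0.0:
        return 0.0
    # Negative cosine values are treated as "completely inconsistent".
    return min(1.0, max(0.0, dot / norm))


def posture_similarity(pose: Sequence[float],
                       target_postures: Sequence[Sequence[float]]) -> float:
    """Best similarity of `pose` against several preset target postures."""
    return max(cosine_posture_similarity(pose, t) for t in target_postures)
```

Taking the maximum over several target postures reflects the statement that the target posture may comprise multiple preset postures; other aggregation rules are equally possible.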
In this embodiment, the posture similarity of each material image is obtained in order to assess whether the material image can represent the important content of the material data and should therefore be selected as an original image to serve as material for the subsequently generated target puzzle. The target posture includes multiple preset postures of human elements, for example a smiling facial expression, a laughing facial expression, or the limb movements of waving. These serve as rules for judging whether the posture of a human element conveys effective information (for example, whether it expresses happiness or anger) and whether it conforms to aesthetic characteristics. Therefore, by comparing the posture of the human elements in the image content of each material image with the target posture, the posture similarity of each material image is obtained. Based on this posture similarity, the material images whose human postures are more meaningful and better conform to aesthetic characteristics are selected as original images, so that the subsequently generated target image frames and target puzzle convey richer effective information and the portraits in them look better. Furthermore, in a possible implementation, determining at least two original images according to the posture similarity corresponding to each material image includes:
determining a material image whose posture similarity is greater than a first similarity threshold as a first image frame (that is, an original image), and/or determining a material image whose posture similarity is less than a second similarity threshold as a first image frame, wherein the first similarity threshold is greater than the second similarity threshold.
FIG. 5 is a schematic diagram of determining original images provided by an embodiment of the present disclosure. The process of determining at least two original images in the above step is described below with reference to FIG. 5. As shown in FIG. 5, the posture similarity, the first similarity threshold, and the second similarity threshold are all normalized values: a posture similarity of 1 indicates complete consistency, and a posture similarity of 0 indicates complete inconsistency. The first similarity threshold is greater than the second similarity threshold; for example, the first similarity threshold (shown as p1 in the figure) is 0.9, and the second similarity threshold (shown as p2 in the figure) is 0.2. Based on the target posture, the image contents of material image A, material image B, and material image C are processed to obtain the posture similarity gesture_evl_A = 0.95 for material image A, gesture_evl_B = 0.7 for material image B, and gesture_evl_C = 0.1 for material image C. On the one hand, gesture_evl_A = 0.95 is greater than the first similarity threshold (gesture_evl_A > 0.9); that is, the posture of the human elements in material image A is very close to the target posture. That posture is therefore treated as the target posture, and material image A is determined as an original image. On the other hand, gesture_evl_C = 0.1 is less than the second similarity threshold (gesture_evl_C < 0.2); that is, the posture of the human elements in material image C differs greatly from the target posture. In this case, the posture of the portrait in the material image is considered to be one the user designed deliberately. Although it may not conform to aesthetic characteristics (because it differs greatly from the target posture), it carries a large amount of information, so material image C is also determined as an original image. The posture similarity of material image B, gesture_evl_B = 0.7, is neither less than the second similarity threshold nor greater than the first similarity threshold (0.2 < gesture_evl_B < 0.9). In this case, the posture corresponding to material image B is regarded as a degraded target posture that fails to meet the aesthetic requirements; at the same time, because it is fairly close to the target posture, it does not carry enough information either. Material image B is therefore excluded and is not used as an original image.
In this step, the posture similarity of each material image is obtained, and the material images whose posture similarity is greater than the first similarity threshold and/or less than the second similarity threshold are determined as original images. This ensures that the original images (first image frames) carry more information while also taking aesthetic characteristics into account, so that the content of the material data is fully displayed and the information content and visual quality of the subsequently generated target image frames and target puzzle are improved.
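The two-threshold screening described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the function name `select_original_images` and the dictionary input are assumptions, while the default thresholds (p1 = 0.9, p2 = 0.2) and the scores for material images A and C are the example values from the text.

```python
from typing import Dict, List


def select_original_images(similarities: Dict[str, float],
                           p1: float = 0.9,
                           p2: float = 0.2) -> List[str]:
    """Keep material images whose posture similarity is above p1 or below p2.

    `similarities` maps a material image name to its normalized posture
    similarity in [0, 1]. Images in the middle band (p2 <= s <= p1) are
    excluded, as in the material image B example.
    """
    if not p1 > p2:
        raise ValueError("the first threshold must be greater than the second")
    return [name for name, s in similarities.items() if s > p1 or s < p2]


if __name__ == "__main__":
    scores = {"A": 0.95, "B": 0.7, "C": 0.1}
    print(select_original_images(scores))
```

With the example scores, material images A and C are kept and material image B is dropped. The sketch applies both conditions together; the text also allows using only one of them.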
Step S102: perform image style transfer on each of the at least two original images to obtain corresponding target image frames having the target image style.
Exemplarily, after at least two original images are obtained, image style transfer is performed on each of them. For example, if the material data includes 30 key frames in total, the 30 key frames are used as original images and image style transfer is performed on each, yielding 30 corresponding image frames that share the same image style (the target image style), namely the target image frames. Image style transfer means adding an image style effect to an image so that the processed image takes on a particular painting style in its colors and lines, for example an oil painting style, a comic style, or a sketch style. In a specific implementation, for example, each original image is processed by a pre-trained style transfer model capable of producing the target image style, yielding an image with the target image style, namely a target image frame. The training and use of style transfer models are prior art known to those skilled in the art and are not described here.
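As a rough sketch of step S102, the snippet below applies one style transfer model to every original image independently. The model is represented by an arbitrary callable, `style_model`, which is a hypothetical placeholder; the disclosure does not specify a model interface, and no real library API is implied.

```python
from typing import Callable, List, TypeVar

Image = TypeVar("Image")


def stylize_frames(original_images: List[Image],
                   style_model: Callable[[Image], Image]) -> List[Image]:
    """Apply the same style model to each original image, preserving order.

    Using one model for all inputs is what gives every target image frame
    the same target image style.
    """
    if len(original_images) < 2:
        raise ValueError("at least two original images are required")
    return [style_model(image) for image in original_images]
```

Because each image is processed independently, the list comprehension could be replaced by batched or parallel inference without changing the result.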
Step S103: determine a target puzzle layout according to the image content of the at least two original images.
Step S104: combine the at least two target image frames according to the target puzzle layout to generate a target puzzle.
Exemplarily, after at least two target image frames are obtained, they are spliced together into a single image that follows a certain layout rule, namely the target puzzle. Exemplarily, the target puzzle includes at least two puzzle areas, each of which displays one corresponding target image frame, and the target puzzle layout represents the size and/or position of the puzzle areas in the target puzzle. FIG. 6 is a schematic diagram of a target puzzle layout of a target puzzle provided by an embodiment of the present disclosure. As shown in FIG. 6, the target puzzle consists of four target image frames: target image frame A, target image frame B, target image frame C, and target image frame D, each corresponding to one puzzle area. The puzzle area of target image frame A is relatively large and is located on the left side of the target puzzle; the puzzle areas of target image frames B, C, and D are relatively small and are located on the right side of the target puzzle.
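A layout of the kind shown in FIG. 6 can be represented as a list of rectangles, one per puzzle area. The sketch below is an assumption-laden illustration: the `PuzzleArea` type, the function name, and the 60% width given to the main area are hypothetical choices, since the text only states that the area for frame A is larger and on the left while the other three are smaller and on the right.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class PuzzleArea:
    frame_id: str  # which target image frame is shown in this area
    x: int         # left edge, in pixels
    y: int         # top edge, in pixels
    width: int
    height: int


def one_large_three_small(canvas_width: int, canvas_height: int,
                          frame_ids: Sequence[str],
                          main_percent: int = 60) -> List[PuzzleArea]:
    """Place the first frame in a large left area and three frames on the right."""
    if len(frame_ids) != 4:
        raise ValueError("this layout expects exactly four target image frames")
    main_width = canvas_width * main_percent // 100
    side_width = canvas_width - main_width
    side_height = canvas_height // 3
    areas = [PuzzleArea(frame_ids[0], 0, 0, main_width, canvas_height)]
    for i, frame_id in enumerate(frame_ids[1:]):
        # The last area absorbs any rounding remainder of the height.
        height = canvas_height - 2 * side_height if i == 2 else side_height
        areas.append(PuzzleArea(frame_id, main_width, i * side_height,
                                side_width, height))
    return areas
```

Rendering the puzzle would then amount to resizing or cropping each target image frame to its area and pasting it at (x, y) on the canvas.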
其中,该目标拼图的目标拼图布局不是随机生成的,而是基于各目原始图像的图像内容而确定的。一种可能的实现方式中,目标拼图布局的生成步骤包括:The target puzzle layout of the target puzzle is not randomly generated, but is determined based on the image content of each original image. In a possible implementation, the step of generating the target puzzle layout includes:
步骤S103A:根据初始图像数据中的至少两个原始图像对应的图像内容,得到布局信息,布局信息表征目标拼图中各拼图区域的大小和/或位置。Step S103A: obtaining layout information according to the image contents corresponding to at least two original images in the initial image data, wherein the layout information represents the size and/or position of each puzzle area in the target puzzle.
步骤S103B:根据布局信息,生成目标拼图布局。Step S103B: Generate a target puzzle layout according to the layout information.
Exemplarily, the above steps are performed before step S103. Specifically, for example, referring to the target puzzle shown in FIG. 6, the initial image data is a piece of video data introducing clothing outfits. Among the target image frames obtained from the original images in the initial image data, the image content of target image frame A corresponds to the full portrait (the character element, the same below) and is placed in the most prominent main position on the left side of the target puzzle to show the character's overall outfit. The image content of target image frame B corresponds to the front of the portrait, that of target image frame C to the back of the portrait, and that of target image frame D to the side of the portrait. These three frames are placed in secondary positions on the right side of the target puzzle to show the outfit from the front, back, and side. In this way, the target puzzle presents the important content information in the video data (the initial image data), namely the front, back, side, and overall appearance of the outfit, which increases the amount of information the target puzzle conveys.
Furthermore, the method for obtaining the image content of each target image frame has been introduced in the previous steps. In a possible implementation, the target image frames can be sorted according to their posture similarity to the target posture, or according to the prominence of their aesthetic features, so as to determine the area and position of the puzzle area corresponding to each target image frame, and thereby determine the target puzzle layout.
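The ranking-based layout described above can be sketched as follows. This is a minimal illustration only, assuming each target image frame already has a single score (for example its posture similarity or aesthetic prominence). The names `Region` and `build_layout`, the canvas size, and the 2:1 split between the main column and the secondary column are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """One puzzle area: which frame it shows, plus its position and size in pixels."""
    frame_id: str
    x: int
    y: int
    width: int
    height: int


def build_layout(scores: dict[str, float], canvas_w: int = 1200, canvas_h: int = 900) -> list[Region]:
    """Put the highest-scoring frame in a large left area and stack the rest on the right."""
    if len(scores) < 2:
        raise ValueError("at least two target image frames are required")
    # Rank frame ids from highest to lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    main_w = canvas_w * 2 // 3  # assumed 2:1 width split, not specified by the source
    side_w = canvas_w - main_w
    side_h = canvas_h // (len(ranked) - 1)
    regions = [Region(ranked[0], 0, 0, main_w, canvas_h)]
    for i, frame_id in enumerate(ranked[1:]):
        regions.append(Region(frame_id, main_w, i * side_h, side_w, side_h))
    return regions


# Hypothetical scores for the four frames of FIG. 6.
layout = build_layout({"A": 0.9, "B": 0.6, "C": 0.5, "D": 0.4})
for region in layout:
    print(region)
```

With these assumed scores, frame A receives the large left area and frames B, C, and D are stacked on the right, mirroring the arrangement of FIG. 6.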
在本实施例中,通过获取至少两个原始图像,并对至少两个原始图像分别进行图像风格迁移,得到对应的具有目标图像风格的目标图像帧;根据至少两个原始图像对应的图像内容,确定目标拼图布局;将至少两个目标图像帧按照目标拼图布局进行组合,生成目标拼图。由于通过对至少两个原始图像进行风格迁移,得到具有特定图像风格的目标图像帧,再对目标图像帧进行排列组合,得到布局与多个原始图像的内容相匹配的目标拼图,使目标拼图不仅能展示带有风格特效的多帧图像,还能通过拼图布局展示带有风格特效的多帧图像在内容上的关联性,实现了对多帧原始图像中有效信息的充分展示,提高视觉表现力。In this embodiment, at least two original images are obtained, and image style transfer is performed on the at least two original images respectively to obtain corresponding target image frames with target image style; the target puzzle layout is determined according to the image content corresponding to the at least two original images; and the at least two target image frames are combined according to the target puzzle layout to generate a target puzzle. Since the target image frame with a specific image style is obtained by performing style transfer on at least two original images, and the target image frames are arranged and combined to obtain a target puzzle with a layout matching the content of multiple original images, the target puzzle can not only display multiple frames of images with style special effects, but also display the relevance of the content of multiple frames of images with style special effects through the puzzle layout, thereby achieving full display of effective information in multiple frames of original images and improving visual expression.
参考图7,图7为本公开实施例提供的图像生成方法的流程示意图二。本实施例在图2所示实施例的基础上,进一步对步骤S102进行细化,并增加了确定目标拼图布局的步骤,该图像生成方法包括:Referring to FIG. 7 , FIG. 7 is a second flow chart of the image generation method provided by the embodiment of the present disclosure. Based on the embodiment shown in FIG. 2 , this embodiment further refines step S102 and adds a step of determining the target puzzle layout. The image generation method includes:
步骤S201:获取素材数据,素材数据包括视频和/或图片集。Step S201: Acquire material data, where the material data includes a video and/or a picture set.
步骤S202:获取素材数据中各素材图像的图像内容。Step S202: Acquire the image content of each material image in the material data.
本实施例的步骤S201-S202的具体实现方式,在图2所示实施例中已进行详细介绍,此处不再赘述。The specific implementation of steps S201 - S202 of this embodiment has been described in detail in the embodiment shown in FIG. 2 , and will not be repeated here.
步骤S203:获取各素材图像对应的动态清晰度,并基于各素材图像的图像内容和对应的动态清晰度,得到至少两个原始图像。Step S203: acquiring the dynamic definition corresponding to each material image, and obtaining at least two original images based on the image content and the corresponding dynamic definition of each material image.
Exemplarily, once the material images are acquired, definition detection can further be performed on each material image to obtain its dynamic definition. Dynamic definition refers to the picture clarity when a moving image is played, and is reflected in whether the moving picture exhibits phenomena such as "tailing" or "ghosting". The dynamic definition can be obtained by performing correlation analysis on the image. Specifically, for example, the image is divided into several horizontal or vertical regions, and the correlation between adjacent regions is then calculated. If "tailing" or "ghosting" is present, that is, when the dynamic definition is low, the correlation is large; conversely, when the dynamic definition is high, the correlation is small. The dynamic definition is thus obtained from the calculated correlation. Of course, the dynamic definition can be obtained in other possible ways, which are not enumerated here.
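The correlation-based check described above could be sketched as follows. This is a toy illustration under stated assumptions: the image is a grayscale list of pixel rows, the regions are horizontal bands, the correlation is the Pearson coefficient, and the score is `1 - mean correlation` clamped to [0, 1]. None of these specifics come from the disclosure, and the function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def mean_adjacent_correlation(image, bands=4):
    """Split the image into horizontal bands and average the correlation of neighbouring bands."""
    band_h = len(image) // bands
    if bands < 2 or band_h == 0:
        raise ValueError("need at least two bands and at least one row per band")
    flat = [
        [pixel for row in image[i * band_h:(i + 1) * band_h] for pixel in row]
        for i in range(bands)
    ]
    corrs = [pearson(flat[i], flat[i + 1]) for i in range(bands - 1)]
    return sum(corrs) / len(corrs)


def dynamic_definition(image, bands=4):
    """Assumed mapping: higher adjacent-band correlation means lower dynamic definition."""
    return max(0.0, min(1.0, 1.0 - mean_adjacent_correlation(image, bands)))


# Every band repeats the same pattern, standing in for a "ghosted" frame.
ghosted = [[0, 50, 100, 150] for _ in range(8)]
# Neighbouring bands differ, standing in for a frame without ghosting.
varied = [[0, 50, 100, 150] if (r // 2) % 2 == 0 else [150, 100, 50, 0] for r in range(8)]

print(dynamic_definition(ghosted), dynamic_definition(varied))
```

A production detector would likely need motion-aware analysis across frames; the sketch only shows the direction of the relationship stated in the text.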
Further, after the dynamic definition corresponding to each material image is obtained, the material images are filtered based on their image content and, at the same time, further filtered based on their dynamic definition: images with lower dynamic definition are removed, and images with higher dynamic definition are retained as original images. This improves the picture clarity of the original images and the visual effect of the target image frames generated subsequently. The implementation of filtering the material images based on their image content has been introduced in detail in the embodiment shown in FIG. 2 and is not repeated here.
可选地,本实施例中还包括:Optionally, this embodiment also includes:
步骤S203A:基于至少两个原始图像对应的图像内容,确定目标图像风格。Step S203A: Determine the target image style based on the image contents corresponding to at least two original images.
Exemplarily, the target image style refers to a type of image style effect, such as an oil painting style, a comic style, or a sketch style. The target image style can be determined in several ways. For example, the corresponding target image style can be determined from preset configuration information. As another example, the target image style can be determined based on the data content of the initial image data. Here, the image content corresponding to the at least two original images refers to the image content of each of the at least two original images and the correlation between those image contents. In a possible implementation, the image content corresponding to the at least two original images can be determined from the content of the material data to which the original images belong. The content theme and type expressed by the image content corresponding to the at least two original images can be represented by a specific content identifier. For example, content identifier #1 indicates that the at least two original images come from a user's selfie video; content identifier #2 indicates that they come from a short video; content identifier #3 indicates that they come from a movie. The specific implementation and expression of the content identifier can be set as needed and are not enumerated here. Furthermore, there is a preset mapping relationship between the image content corresponding to the at least two original images and the target image style. For example, when the image content corresponding to the at least two original images is a portrait selfie video, the corresponding target image style is the comic style; when the image content is a short video, the corresponding target image style is the sketch style. In this embodiment, the target image style is determined from the image content corresponding to the at least two original images and the content correlation between those image contents, so that the image style of the generated target image frames matches the image content of the original images, improving the visual expressiveness of the target image frames.
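The preset mapping between content identifiers and target image styles might look like the following minimal sketch. The two mapped entries follow the examples in the text; the fallback style for unmapped identifiers (such as #3, for which the text gives no style) and the rule that a configured style takes precedence are assumptions made only for illustration, as are all names.

```python
# Mapping taken from the examples in the text.
STYLE_BY_CONTENT_ID = {
    "#1": "comic",   # selfie video -> comic style
    "#2": "sketch",  # short video -> sketch style
}

# Assumption: the text does not say which style applies to "#3" (movies),
# so an arbitrary fallback is used here.
DEFAULT_STYLE = "oil_painting"


def select_target_style(content_id, configured_style=None):
    """Return the target image style for a content identifier.

    Assumption: preset configuration, when present, takes precedence over
    the content-based mapping. The text lists both options without ranking them.
    """
    if configured_style is not None:
        return configured_style
    return STYLE_BY_CONTENT_ID.get(content_id, DEFAULT_STYLE)


print(select_target_style("#1"))
print(select_target_style("#2"))
print(select_target_style("#3"))
print(select_target_style("#1", configured_style="sketch"))
```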
步骤S204:基于原始图像的图像内容,确定原始图像中的目标图像元素。Step S204: Determine a target image element in the original image based on the image content of the original image.
步骤S205:围绕目标图像元素进行边缘剪裁,得到包含目标图像元素的剪裁图像,其中,目标图像元素在剪裁图像中的图像区域占比大于目标图像元素在原始图像中的图像区域占比。 Step S205: performing edge cropping around the target image element to obtain a cropped image including the target image element, wherein the image area proportion of the target image element in the cropped image is greater than the image area proportion of the target image element in the original image.
示例性地,在获得原始图像之后,原始图像的画面构图与素材图像相同,将原始图像从素材数据中单独抽出后,由于缺乏前后图像帧的变化表现,会导致无法突出画面重点的问题。For example, after the original image is obtained, the picture composition of the original image is the same as the material image. After the original image is extracted from the material data alone, the problem of failing to highlight the key points of the picture will occur due to the lack of changes in the previous and next image frames.
To solve the above problem, in this embodiment the target image element in the original image is determined based on the image content of the original image. For example, if the image content of the original image is a portrait selfie, the portrait contour is taken as the center and the surrounding area is cropped, cutting away the invalid area in the first image frame to obtain a cropped image containing the portrait contour (the target image element). FIG. 8 is a schematic diagram of a cropped image provided by an embodiment of the present disclosure. As shown in FIG. 8, the original image includes a portrait; after the invalid area outside the portrait is cropped away based on the portrait contour, the resulting image frame containing the portrait is the cropped image. Because the invalid area in the first image frame has been cropped away, the proportion of the cropped image occupied by the portrait contour (the target image element) is higher than the proportion of the original image occupied by the portrait contour. This highlights the focus of the picture and improves the visual expressiveness of the target image frame. In addition, cropping the original image reduces the invalid image area, which can improve the efficiency of the subsequent style transfer.
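The effect of edge cropping on the element's area proportion can be checked with a short sketch. It assumes the target image element is available as an axis-aligned bounding box `(left, top, right, bottom)` and keeps a 10% margin around it; the margin, the example coordinates, and the function names are hypothetical.

```python
def crop_box(element_box, image_size, margin=0.1):
    """Expand the element's bounding box by a relative margin and clamp it to the image."""
    left, top, right, bottom = element_box
    img_w, img_h = image_size
    pad_x = int((right - left) * margin)
    pad_y = int((bottom - top) * margin)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(img_w, right + pad_x),
        min(img_h, bottom + pad_y),
    )


def area(box):
    """Area of a (left, top, right, bottom) box in pixels."""
    left, top, right, bottom = box
    return (right - left) * (bottom - top)


element = (400, 100, 800, 900)  # hypothetical portrait bounding box
image_size = (1920, 1080)       # hypothetical original image size

crop = crop_box(element, image_size)
ratio_before = area(element) / (image_size[0] * image_size[1])
ratio_after = area(element) / area(crop)

print(crop, round(ratio_before, 3), round(ratio_after, 3))
```

Because the cropped image is never larger than the original and always contains the element, the element's area proportion cannot decrease, which is the condition stated in step S205.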
步骤S206:根据至少两个原始图像对应的图像内容,确定目标拼图布局。Step S206: Determine a target puzzle layout according to the image contents corresponding to at least two original images.
进一步地,获取原始图像的图像内容后,可以基于一定的规则,对原始图像的图像内容进行评估,并根据评估结果生成目标拼图布局,使具有较高信息量和/或较高美观度的第一图像帧能够优先展示。具体实现方式例如基于图像内容对应的姿态相似度、美学特征等对第一图像帧进行评估和排序,从而生成目标拼图布局,具体实现方法在图2所示实施例的对应段落已进行介绍,此处不再赘述。Furthermore, after obtaining the image content of the original image, the image content of the original image can be evaluated based on certain rules, and a target puzzle layout can be generated based on the evaluation results, so that the first image frame with higher information content and/or higher aesthetics can be displayed preferentially. The specific implementation method is, for example, evaluating and sorting the first image frame based on the posture similarity, aesthetic features, etc. corresponding to the image content to generate the target puzzle layout. The specific implementation method has been introduced in the corresponding paragraph of the embodiment shown in FIG. 2 and will not be repeated here.
在一种可能的实现方式中,示例性地,如图9所示,步骤S206的具体实现方式包括:In a possible implementation, illustratively, as shown in FIG9 , a specific implementation of step S206 includes:
步骤S2061:根据至少两个原始图像对应的图像内容,得到上下文信息,上下文信息表征至少两个初始图像对应的图像内容之间的上下文关系。Step S2061: Obtain context information based on image contents corresponding to at least two original images, where the context information represents a contextual relationship between image contents corresponding to at least two initial images.
步骤S2062:根据上下文信息,确定目标拼图布局。Step S2062: Determine the target puzzle layout according to the context information.
Exemplarily, the original images are the result of filtering the material images in the initial image data; that is, each original image is a specific material image. Different original images are continuous in content. For example, if the material data corresponding to the original images is a "dance" video, the dance movements in the material images of the video are related in time sequence, and the original images filtered out of the material images share this temporal correlation, namely the contextual relationship. More specifically, for example, 100 original images are obtained in the previous steps. Semantic recognition is then performed on each original image to obtain semantic information representing the dance movement in that image. Context information is generated from the semantic information of the ordered original images; the context information can be a feature matrix representing the correlation between the pieces of semantic information. Then, based on the context information, the original images corresponding to repetitive or unimportant dance movements are filtered out, giving the number of first image frames that represent only important, non-repetitive dance movements (for example, 10) together with an importance evaluation value representing the importance of each dance movement. From these, the number of puzzle areas in the target puzzle layout and the size and position of the corresponding original images, namely the layout information, are determined.
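The screening step described above can be sketched as follows. This is a deliberately simplified illustration: instead of a feature matrix, each original image is assumed to carry a recognized action label and an importance evaluation value, and repeated actions are collapsed to the single most important frame. The labels, scores, threshold, and function name are hypothetical.

```python
def select_key_frames(frames, min_importance=0.5, max_regions=10):
    """Pick at most one frame per action, dropping unimportant actions.

    frames: ordered list of (frame_id, action_label, importance) tuples.
    Returns (frame_id, importance) pairs, most important first; the length
    of the result is the number of puzzle areas.
    """
    best = {}
    for frame_id, action, importance in frames:
        if importance < min_importance:
            continue  # drop frames showing non-important actions
        if action not in best or importance > best[action][1]:
            best[action] = (frame_id, importance)  # keep one frame per repeated action
    ranked = sorted(best.values(), key=lambda item: item[1], reverse=True)
    return ranked[:max_regions]


# Hypothetical recognition results for five ordered original images.
frames = [
    ("f1", "spin", 0.9),
    ("f2", "spin", 0.7),
    ("f3", "walk", 0.2),
    ("f4", "jump", 0.8),
    ("f5", "bow", 0.6),
]

selected = select_key_frames(frames)
print(len(selected), selected)
```

The importance values returned here could then drive the size and position of each puzzle area, for example by giving the first entry the largest area.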
本实施例中,通过获取上下文信息,并基于上下文信息确定目标拼图布局,充分利用了初始图像数据中的信息,使生成的目标拼图的目标拼图布局更加合理,能够更好的体现初始图像数据中的重要信息,提高展示效果。In this embodiment, by obtaining context information and determining the target puzzle layout based on the context information, full use is made of the information in the initial image data, making the target puzzle layout of the generated target puzzle more reasonable, better reflecting the important information in the initial image data, and improving the display effect.
步骤S207:基于目标图像风格对应的风格迁移模型,对各剪裁图像进行风格迁移,得到各剪裁图像对应的目标图像帧。Step S207: Based on the style transfer model corresponding to the target image style, style transfer is performed on each cropped image to obtain a target image frame corresponding to each cropped image.
步骤S208:在目标图像帧中显示特效标识,得到特效目标图像帧,其中,特效标识是基于目标图像帧的图像内容确定的。Step S208: displaying a special effect identifier in the target image frame to obtain a special effect target image frame, wherein the special effect identifier is determined based on the image content of the target image frame.
示例性地,在对剪裁图像进行风格迁移,得到目标图像帧后,还可以进一步的在目标图像帧中添加动态特征标识,例如“烟花贴图特效”、“虚拟饰品特效”等,从而进一步的提高目标拼图的视觉表现力。For example, after performing style transfer on the cropped image to obtain the target image frame, dynamic feature identifiers such as "fireworks sticker effects", "virtual jewelry effects", etc. can be further added to the target image frame to further improve the visual expressiveness of the target puzzle.
示例性地,如图10所示,步骤S208的具体实现步骤包括:Exemplarily, as shown in FIG10 , the specific implementation steps of step S208 include:
步骤S2081:对目标图像帧中的人物元素进行面部特征检测,得到对应的面部表情特征。Step S2081: Perform facial feature detection on the human element in the target image frame to obtain corresponding facial expression features.
步骤S2082:基于面部表情特征,确定对应的目标特效标识,并确定目标特效标识的目标显示位置。Step S2082: Based on the facial expression features, determine the corresponding target special effect mark, and determine the target display position of the target special effect mark.
步骤S2083:基于目标显示位置,在目标图像帧中添加目标特效标识,得到特效目标图像帧。Step S2083: Based on the target display position, a target special effect identifier is added to the target image frame to obtain a special effect target image frame.
Exemplarily, this embodiment is applicable to scenes in which the target image frame contains a character element. Specifically, the elements in the target image frame are first identified to obtain the character element, for example the face of a portrait, and facial feature detection is then performed on the character element to obtain facial expression features, for example happy or sad. Next, the corresponding target special effect identifier is determined based on the facial expression features, and the target display position of the target special effect identifier is determined based on the positions of the elements in the target image frame, so that the target special effect identifier avoids the other image elements and does not occlude them. Finally, the target special effect identifier is loaded at the target display position in the target image frame to obtain the special effect target image frame.
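Steps S2081 to S2083 can be sketched as follows, assuming the facial expression has already been recognized and the other image elements are available as bounding boxes `(left, top, right, bottom)`. The expression-to-effect table, the effect names, the effect size, and the corner-based search for a free position are all hypothetical choices made for illustration.

```python
# Hypothetical mapping from facial expression feature to special effect identifier.
EFFECT_BY_EXPRESSION = {
    "happy": "fireworks_sticker",
    "sad": "rain_cloud_sticker",
}


def overlaps(a, b):
    """True if two (left, top, right, bottom) boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def place_effect(expression, element_boxes, frame_size, effect_size=(100, 100)):
    """Choose an effect for the expression and a corner position that occludes no element.

    Returns (effect, (x, y)), or None if the expression is unmapped or no corner is free.
    """
    effect = EFFECT_BY_EXPRESSION.get(expression)
    if effect is None:
        return None
    frame_w, frame_h = frame_size
    eff_w, eff_h = effect_size
    # Candidate positions: the four corners of the frame, tried in order.
    candidates = [
        (0, 0),
        (frame_w - eff_w, 0),
        (0, frame_h - eff_h),
        (frame_w - eff_w, frame_h - eff_h),
    ]
    for x, y in candidates:
        box = (x, y, x + eff_w, y + eff_h)
        if not any(overlaps(box, other) for other in element_boxes):
            return effect, (x, y)
    return None


# A face occupying the top-left quadrant of a 400x400 frame.
result = place_effect("happy", [(0, 0, 200, 200)], (400, 400))
print(result)
```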
步骤S209:将至少两个特效目标图像帧按照目标拼图布局进行组合,生成目标拼图。Step S209: combining at least two special effect target image frames according to the target puzzle layout to generate a target puzzle.
本实施例的步骤S207、S209的具体实现方式,在图2所示实施例中已进行详细介绍,此处不再赘述。The specific implementation of steps S207 and S209 of this embodiment has been described in detail in the embodiment shown in FIG. 2 , and will not be repeated here.
对应于上文实施例的图像生成方法,图11为本公开实施例提供的图像生成装置的结构框图。为了便于说明,仅示出了与本公开实施例相关的部分。参照图11,图像生成装置3包括:Corresponding to the image generation method of the above embodiment, FIG11 is a structural block diagram of an image generation device provided by an embodiment of the present disclosure. For ease of explanation, only the parts related to the embodiment of the present disclosure are shown. Referring to FIG11 , the image generation device 3 includes:
获取模块31,用于获取至少两个原始图像;An acquisition module 31, used for acquiring at least two original images;
迁移模块32,用于对至少两个原始图像分别进行图像风格迁移,得到对应的具有目标图像风格的目标图像帧;A migration module 32, configured to perform image style migration on at least two original images respectively to obtain corresponding target image frames having a target image style;
组合模块33,用于根据至少两个原始图像对应的图像内容,确定目标拼图布局,并将至少两个目标图像帧按照目标拼图布局进行组合,生成目标拼图。The combining module 33 is used to determine a target puzzle layout according to the image contents corresponding to at least two original images, and to combine at least two target image frames according to the target puzzle layout to generate a target puzzle.
在一种可能的实现方式中,获取模块31,具体用于:获取素材数据,素材数据包括视频和/或图片集;根据素材数据中的素材图像的图像内容,对素材数据进行抽帧,得到至少两个原始图像,素材图像为视频中的视频帧和/或图片集中的图片。In one possible implementation, the acquisition module 31 is specifically used to: acquire material data, the material data including a video and/or a picture set; extract frames of the material data according to the image content of the material images in the material data to obtain at least two original images, the material images being video frames in the video and/or pictures in the picture set.
在一种可能的实现方式中,获取模块31在根据素材数据中的素材图像的图像内容,对素材数据进行抽帧,得到至少两个原始图像时,具体用于:基于素材图像的图像内容,获取素材图像对应的姿态相似度,姿态相似度表征图像内容中的人物元素的姿态与目标姿态的相似度;根据各素材图像对应的姿态相似度,确定至少两个原始图像。In a possible implementation, when the acquisition module 31 extracts frames of the material data according to the image content of the material image in the material data to obtain at least two original images, it is specifically used to: acquire the posture similarity corresponding to the material image based on the image content of the material image, the posture similarity characterizing the similarity between the posture of the character element in the image content and the target posture; determine at least two original images according to the posture similarity corresponding to each material image.
在一种可能的实现方式中,获取模块31在根据各素材图像对应的姿态相似度,确定至少两个原始图像时,具体用于:将姿态相似度大于第一相似度阈值的素材图像,确定为原始图像,和/或,将姿态相似度小于第二相似度阈值的素材图像,确定为原始图像;其中,第一相似度阈值大于第二相似度阈值。 In a possible implementation, when the acquisition module 31 determines at least two original images based on the posture similarities corresponding to each material image, it is specifically used to: determine the material images whose posture similarity is greater than a first similarity threshold as the original images, and/or, determine the material images whose posture similarity is less than a second similarity threshold as the original images; wherein the first similarity threshold is greater than the second similarity threshold.
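The two-threshold selection described above can be illustrated with a short sketch. It assumes posture similarities are already computed as values in [0, 1] and implements the "and/or" case in which both conditions are applied; the threshold values and names are hypothetical.

```python
def select_by_posture_similarity(similarities, first_threshold=0.8, second_threshold=0.2):
    """Keep material images that are very similar or very dissimilar to the target posture.

    similarities: dict mapping material image id -> posture similarity.
    The first threshold must be greater than the second, as stated in the text.
    """
    if first_threshold <= second_threshold:
        raise ValueError("first_threshold must be greater than second_threshold")
    return [
        image_id
        for image_id, similarity in similarities.items()
        if similarity > first_threshold or similarity < second_threshold
    ]


# Hypothetical similarities for three material images.
originals = select_by_posture_similarity({"f1": 0.95, "f2": 0.5, "f3": 0.1})
print(originals)
```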
In a possible implementation, the acquisition module 31 is further used to: acquire the dynamic definition corresponding to each material image. When extracting frames from the material data according to the image content of the material images in the material data to obtain at least two original images, the acquisition module 31 is specifically used to: obtain the at least two original images based on the image content and the corresponding dynamic definition of each material image.
In a possible implementation, the migration module 32 is specifically used to: obtain a style transfer model corresponding to the target image style; and process the original image based on the style transfer model to obtain a target image frame corresponding to the original image.
In a possible implementation, when processing the original image based on the style transfer model to obtain the target image frame corresponding to the original image, the migration module 32 is specifically used to: determine the target image element in the original image based on the image content of the original image; perform edge cropping around the target image element to obtain a cropped image containing the target image element, wherein the proportion of the image area occupied by the target image element in the cropped image is greater than the proportion it occupies in the original image; and perform style transfer on each cropped image based on the style transfer model corresponding to the target image style to obtain the target image frame corresponding to the original image.
在一种可能的实现方式中,在对至少两个原始图像分别进行图像风格迁移,得到对应的具有目标图像风格的目标图像帧之前,迁移模块32,还用于:基于至少两个原始图像对应的图像内容,确定目标图像风格。In a possible implementation, before performing image style migration on at least two original images to obtain corresponding target image frames with target image style, the migration module 32 is further used to: determine the target image style based on image contents corresponding to the at least two original images.
在一种可能的实现方式中,目标拼图包括至少两个拼图区域,每一拼图区域用于显示一个对应的目标图像帧,目标拼图布局表征目标拼图中的拼图区域的大小和/或位置。In a possible implementation, the target puzzle includes at least two puzzle areas, each puzzle area is used to display a corresponding target image frame, and the target puzzle layout represents the size and/or position of the puzzle area in the target puzzle.
在一种可能的实现方式中,组合模块33在根据至少两个原始图像对应的图像内容,确定目标拼图布局时,具体用于:据至少两个原始图像对应的图像内容,得到上下文信息,上下文信息表征至少两个初始图像对应的图像内容之间的上下文关系;根据上下文信息,确定目标拼图布局。In a possible implementation, when determining the target puzzle layout based on the image contents corresponding to at least two original images, the combination module 33 is specifically used to: obtain context information based on the image contents corresponding to at least two original images, wherein the context information represents a contextual relationship between the image contents corresponding to at least two initial images; and determine the target puzzle layout based on the context information.
在一种可能的实现方式中,组合模块33,还用于:在目标图像帧中添加特效标识,得到特效目标图像帧,其中,特效标识是基于目标图像帧的图像内容确定的。In a possible implementation, the combining module 33 is further configured to: add a special effect identifier to the target image frame to obtain a special effect target image frame, wherein the special effect identifier is determined based on the image content of the target image frame.
In a possible implementation, when adding a special effect identifier to the target image frame to obtain the special effect target image frame, the combination module 33 is specifically used to: perform facial feature detection on the character element in the target image frame to obtain the corresponding facial expression features; determine the corresponding target special effect identifier based on the facial expression features; determine the target display position of the target special effect identifier; and add the target special effect identifier to the target image frame based on the target display position to obtain the special effect target image frame.
其中,获取模块31、迁移模块32和组合模块33依次连接。本实施例提供的图像生成装置3可以执行上述方法实施例的技术方案,其实现原理和技术效果类似,本实施例此处不再赘述。The acquisition module 31, the migration module 32 and the combination module 33 are connected in sequence. The image generation device 3 provided in this embodiment can implement the technical solution of the above method embodiment, and its implementation principle and technical effect are similar, which will not be described in detail in this embodiment.
图12为本公开实施例提供的一种电子设备的结构示意图,如图12所示,该电子设备4包括:FIG. 12 is a schematic diagram of the structure of an electronic device provided by an embodiment of the present disclosure. As shown in FIG. 12 , the electronic device 4 includes:
处理器41,以及与处理器41通信连接的存储器42;A processor 41, and a memory 42 communicatively connected to the processor 41;
存储器42存储计算机执行指令;The memory 42 stores computer executable instructions;
处理器41执行存储器42存储的计算机执行指令,以实现如图2-图10所示实施例中的图像生成方法。The processor 41 executes the computer-executable instructions stored in the memory 42 to implement the image generation method in the embodiments shown in FIG. 2 to FIG. 10 .
其中,可选地,处理器41和存储器42通过总线43连接。Optionally, the processor 41 and the memory 42 are connected via a bus 43 .
相关说明可以对应参见图2-图10所对应的实施例中的步骤所对应的相关描述和效果进行理解,此处不做过多赘述。The relevant instructions can be understood by referring to the relevant descriptions and effects corresponding to the steps in the embodiments corresponding to Figures 2 to 10, and no further details will be given here.
参考图13,其示出了适于用来实现本公开实施例的电子设备900的结构示意图,该电子设备900可以为终端设备或服务器。其中,终端设备可以包括但不限于诸如移动电话、笔记本电脑、数字广播接收器、个人数字助理(Personal Digital Assistant,简称PDA)、平板电脑(Portable Android Device,简称PAD)、便携式多媒体播放器(Portable Media Player,简称PMP)、车载终端(例如车载导航终端)等等的移动终端以及诸如数字TV、台式计算机等等的固定终端。图13示出的电子设备仅仅是一个示例,不应对本公开实施例的功能和使用范围带来任何限制。Referring to FIG. 13 , it shows a schematic diagram of the structure of an electronic device 900 suitable for implementing the embodiment of the present disclosure, and the electronic device 900 may be a terminal device or a server. The terminal device may include but is not limited to mobile terminals such as mobile phones, laptop computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (Portable Android Devices, PADs), portable multimedia players (PMPs), vehicle terminals (such as vehicle navigation terminals), etc., and fixed terminals such as digital TVs, desktop computers, etc. The electronic device shown in FIG. 13 is only an example and should not bring any limitation to the functions and scope of use of the embodiment of the present disclosure.
如图13所示,电子设备900可以包括处理装置(例如中央处理器、图形处理器等)901,其可以根据存储在只读存储器(Read Only Memory,简称ROM)902中的程序或者从存储装置908加载到随机访问存储器(Random Access Memory,简称RAM)903中的程序而执行各种适当的动作和处理。在RAM 903中,还存储有电子设备900操作所需的各种程序和数据。处理装置901、ROM 902以及RAM 903通过总线904彼此相连。输入/输出(I/O)接口905也连接至总线904。As shown in FIG. 13 , the electronic device 900 may include a processing device (e.g., a central processing unit, a graphics processing unit, etc.) 901, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage device 908 to a random access memory (RAM) 903. Various programs and data required for the operation of the electronic device 900 are also stored in the RAM 903. The processing device 901, the ROM 902, and the RAM 903 are connected to each other via a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
Typically, the following devices may be connected to the I/O interface 905: an input device 906 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, and the like; an output device 907 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, and the like; a storage device 908 including, for example, a magnetic tape, a hard disk, and the like; and a communication device 909. The communication device 909 may allow the electronic device 900 to communicate with other devices wirelessly or by wire to exchange data. Although FIG. 13 shows an electronic device 900 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowcharts may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 909, installed from the storage device 908, or installed from the ROM 902. When the computer program is executed by the processing device 901, the above functions defined in the method of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to wires, optical cables, RF (radio frequency), or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.
The above computer-readable medium carries one or more programs. When the one or more programs are executed by the electronic device, the electronic device is caused to execute the method shown in the above embodiments.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as "C" or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and any combination of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself. For example, the first acquisition unit may also be described as "a unit for acquiring at least two Internet Protocol addresses".
The functions described above may be performed at least in part by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), and complex programmable logic devices (CPLDs).
In the context of the present disclosure, a machine-readable medium may be a tangible medium that contains or stores a program for use by or in combination with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a first aspect, according to one or more embodiments of the present disclosure, an image generation method is provided, comprising:
acquiring at least two original images, and performing image style transfer on the at least two original images respectively to obtain corresponding target image frames having a target image style; determining a target puzzle layout according to the image contents corresponding to the at least two original images; and combining the at least two target image frames according to the target puzzle layout to generate a target puzzle.
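The three steps of the first aspect can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the disclosed implementation: images are modelled as nested lists of grayscale values, and the names `generate_target_puzzle`, `stylize`, and `choose_layout` are hypothetical placeholders for whatever style transfer model and layout logic an embodiment actually uses.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type aliases; the disclosure does not prescribe any data format.
Image = List[List[int]]             # a tiny grayscale image as rows of pixel values
Region = Tuple[int, int, int, int]  # (x, y, width, height) of one puzzle area


def generate_target_puzzle(
    originals: Sequence[Image],
    stylize: Callable[[Image], Image],
    choose_layout: Callable[[Sequence[Image]], List[Region]],
    canvas_size: Tuple[int, int],
) -> Image:
    """Stylize each original, pick a layout from the originals, then paste the frames."""
    if len(originals) < 2:
        raise ValueError("at least two original images are required")

    # Step 1: style transfer is applied to every original image separately.
    frames = [stylize(img) for img in originals]

    # Step 2: the layout is decided from the content of the ORIGINAL images.
    layout = choose_layout(originals)
    if len(layout) != len(frames):
        raise ValueError("layout must provide one region per target image frame")

    # Step 3: paste each target image frame into its region of a blank canvas.
    width, height = canvas_size
    canvas = [[0] * width for _ in range(height)]
    for frame, (x, y, w, h) in zip(frames, layout):
        for row in range(h):
            for col in range(w):
                # Nearest-neighbour sampling so the frame fills its region.
                src_row = row * len(frame) // h
                src_col = col * len(frame[0]) // w
                canvas[y + row][x + col] = frame[src_row][src_col]
    return canvas
```

The sketch takes the style transfer and layout steps as callables so that either can be replaced without touching the composition loop. It assumes every region lies inside the canvas and does no bounds checking beyond that.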
According to one or more embodiments of the present disclosure, acquiring at least two original images includes: acquiring material data, the material data including a video and/or a picture set; and extracting frames from the material data according to the image contents of the material images in the material data to obtain the at least two original images, where the material images are video frames in the video and/or pictures in the picture set.
According to one or more embodiments of the present disclosure, extracting frames from the material data according to the image contents of the material images in the material data to obtain the at least two original images includes: obtaining, based on the image content of each material image, a posture similarity corresponding to the material image, where the posture similarity represents the similarity between the posture of a person element in the image content and a target posture; and determining the at least two original images according to the posture similarities corresponding to the material images.
According to one or more embodiments of the present disclosure, determining the at least two original images according to the posture similarities corresponding to the material images includes: determining a material image whose posture similarity is greater than a first similarity threshold as an original image, and/or determining a material image whose posture similarity is less than a second similarity threshold as an original image, where the first similarity threshold is greater than the second similarity threshold.
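The two-threshold rule can be sketched as follows. This is only an illustration that applies both branches of the "and/or": it keeps material images whose posture similarity lies above the first threshold or below the second, and discards those in between. The function name and the use of plain floating-point scores are assumptions.

```python
from typing import List, Sequence


def select_original_indices(
    posture_similarities: Sequence[float],
    first_threshold: float,
    second_threshold: float,
) -> List[int]:
    """Return the indices of material images kept as original images."""
    # The disclosure requires the first threshold to exceed the second.
    if first_threshold <= second_threshold:
        raise ValueError("the first threshold must be greater than the second")
    return [
        index
        for index, similarity in enumerate(posture_similarities)
        if similarity > first_threshold or similarity < second_threshold
    ]
```

Because the first threshold is greater than the second, the two kept ranges never overlap: a frame is selected either because it closely matches the target posture or because it differs strongly from it.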
According to one or more embodiments of the present disclosure, the method further includes: obtaining a dynamic clarity corresponding to each material image. Extracting frames from the material data according to the image contents of the material images in the material data to obtain the at least two original images then includes: obtaining the at least two original images based on the image content and the corresponding dynamic clarity of each material image.
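One way to combine image content with a clarity measure is sketched below. This passage does not say how dynamic clarity is computed, so the sketch substitutes a simple stand-in: the mean absolute difference between neighbouring pixels, which drops when a frame is blurred. The content scores, the `min_clarity` cut-off, and both function names are hypothetical.

```python
from typing import List, Sequence

Image = List[List[int]]  # grayscale image as rows of pixel values (an assumption)


def clarity_score(image: Image) -> float:
    """Mean absolute difference between neighbouring pixels (higher means sharper)."""
    total = 0
    count = 0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if c + 1 < len(row):          # horizontal neighbour
                total += abs(value - row[c + 1])
                count += 1
            if r + 1 < len(image):        # vertical neighbour
                total += abs(value - image[r + 1][c])
                count += 1
    return total / count if count else 0.0


def select_clear_frames(
    images: Sequence[Image],
    content_scores: Sequence[float],
    min_clarity: float,
    k: int,
) -> List[int]:
    """Keep frames that are clear enough, then take the k best by content score."""
    candidates = [i for i, img in enumerate(images) if clarity_score(img) >= min_clarity]
    ranked = sorted(candidates, key=lambda i: content_scores[i], reverse=True)
    # Return the chosen indices in their original (for example, temporal) order.
    return sorted(ranked[:k])
```

Filtering on clarity before ranking on content means a blurred frame is never selected, however well its content scores.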
According to one or more embodiments of the present disclosure, performing image style transfer on the at least two original images respectively to obtain corresponding target image frames having the target image style includes: obtaining a style transfer model corresponding to the target image style; and processing each original image based on the style transfer model to obtain the target image frame corresponding to that original image.
According to one or more embodiments of the present disclosure, processing the original image based on the style transfer model to obtain the target image frame corresponding to the original image includes: determining a target image element in the original image based on the image content of the original image; performing edge cropping around the target image element to obtain a cropped image containing the target image element, where the proportion of the image area occupied by the target image element in the cropped image is greater than the proportion it occupies in the original image; and performing style transfer on each cropped image based on the style transfer model corresponding to the target image style to obtain the target image frame corresponding to the original image.
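The cropping condition can be checked with a short sketch. It assumes the target image element has already been located as a bounding box `(left, top, right, bottom)` in pixels; the margin parameter and the function names are illustrative and do not come from the disclosure.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def crop_box_around(element_box: Box, image_size: Tuple[int, int],
                    margin_percent: int = 10) -> Box:
    """Return a crop that contains element_box plus a margin, clamped to the image."""
    left, top, right, bottom = element_box
    width, height = image_size
    pad_x = (right - left) * margin_percent // 100
    pad_y = (bottom - top) * margin_percent // 100
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )


def area_ratio(element_box: Box, frame_box: Box) -> float:
    """Fraction of frame_box's area that is occupied by element_box."""
    e_left, e_top, e_right, e_bottom = element_box
    f_left, f_top, f_right, f_bottom = frame_box
    element_area = (e_right - e_left) * (e_bottom - e_top)
    frame_area = (f_right - f_left) * (f_bottom - f_top)
    return element_area / frame_area
```

Cropping before style transfer makes the target image element occupy more of the frame that the model processes, which is the property the embodiment requires.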
According to one or more embodiments of the present disclosure, before performing image style transfer on the at least two original images respectively to obtain corresponding target image frames having the target image style, the method further includes: determining the target image style based on the image contents corresponding to the at least two original images.
According to one or more embodiments of the present disclosure, the target puzzle includes at least two puzzle areas, each puzzle area is used to display one corresponding target image frame, and the target puzzle layout represents the size and/or position of the puzzle areas in the target puzzle.
According to one or more embodiments of the present disclosure, determining the target puzzle layout according to the image contents corresponding to the at least two original images includes: obtaining context information according to the image contents corresponding to the at least two original images, where the context information represents a contextual relationship between the image contents corresponding to the at least two original images; and determining the target puzzle layout according to the context information.
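A layout that records the size and position of each puzzle area, chosen from context information, might look like the sketch below. This passage does not say how context information is represented, so the sketch assumes it has already been reduced to an optional index of a "key" frame. The `Region` class, the normalized coordinates, and the two layout templates are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    """One puzzle area, in coordinates normalized to the range [0, 1]."""
    x: float
    y: float
    width: float
    height: float


def layout_for_context(frame_count: int,
                       key_frame_index: Optional[int] = None) -> List[Region]:
    """Return one region per frame; region i displays target image frame i."""
    if frame_count < 2:
        raise ValueError("a target puzzle needs at least two puzzle areas")
    if key_frame_index is None:
        # No frame stands out: equal vertical strips, left to right.
        strip = 1.0 / frame_count
        return [Region(i * strip, 0.0, strip, 1.0) for i in range(frame_count)]
    if not 0 <= key_frame_index < frame_count:
        raise ValueError("key_frame_index is out of range")
    # The key frame takes the left half; the others are stacked on the right.
    cell = 1.0 / (frame_count - 1)
    layout: List[Region] = []
    slot = 0
    for i in range(frame_count):
        if i == key_frame_index:
            layout.append(Region(0.0, 0.0, 0.5, 1.0))
        else:
            layout.append(Region(0.5, slot * cell, 0.5, cell))
            slot += 1
    return layout
```

Normalized coordinates keep the layout independent of the output resolution; multiplying by the canvas width and height gives pixel rectangles.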
According to one or more embodiments of the present disclosure, the method further includes: adding a special effect identifier to the target image frame to obtain a special effect target image frame, where the special effect identifier is determined based on the image content of the target image frame.
According to one or more embodiments of the present disclosure, adding a special effect identifier to the target image frame to obtain a special effect target image frame includes: performing facial feature detection on a person element in the target image frame to obtain a corresponding facial expression feature; determining a corresponding target special effect identifier based on the facial expression feature; determining a target display position of the target special effect identifier; and adding the target special effect identifier to the target image frame based on the target display position to obtain the special effect target image frame.
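The selection and placement steps can be sketched as a lookup followed by a position calculation. The sketch assumes that facial feature detection has already produced an expression label and a face bounding box. The labels, the effect names, and the rule of placing the effect at the upper right of the face are illustrative assumptions only.

```python
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels

# Hypothetical mapping from a detected expression to a special effect identifier.
EFFECT_BY_EXPRESSION = {
    "smile": "sparkle",
    "surprise": "exclamation_mark",
    "cry": "teardrop",
}


def plan_effect(
    expression: str,
    face_box: Box,
    frame_size: Tuple[int, int],
    effect_size: Tuple[int, int] = (32, 32),
) -> Optional[Tuple[str, Tuple[int, int]]]:
    """Return (effect identifier, top-left display position), or None if no effect applies."""
    effect = EFFECT_BY_EXPRESSION.get(expression)
    if effect is None:
        return None
    _left, top, right, _bottom = face_box
    frame_width, frame_height = frame_size
    effect_width, effect_height = effect_size
    # Aim for the upper-right corner of the face, then clamp inside the frame.
    x = max(0, min(right, frame_width - effect_width))
    y = max(0, min(top - effect_height, frame_height - effect_height))
    return effect, (x, y)
```

Returning a plan rather than drawing directly keeps the decision (which effect, and where) separate from the rendering step that adds the identifier to the target image frame.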
In a second aspect, according to one or more embodiments of the present disclosure, an image generation apparatus is provided, comprising:
an acquisition module, configured to acquire at least two original images;
a migration module, configured to perform image style transfer on the at least two original images respectively to obtain corresponding target image frames having a target image style;
a combination module, configured to determine a target puzzle layout according to the image contents corresponding to the at least two original images, and to combine the at least two target image frames according to the target puzzle layout to generate a target puzzle.
In a possible implementation, the acquisition module is specifically configured to: acquire material data, the material data including a video and/or a picture set; and extract frames from the material data according to the image contents of the material images in the material data to obtain the at least two original images, where the material images are video frames in the video and/or pictures in the picture set.
In a possible implementation, when extracting frames from the material data according to the image contents of the material images in the material data to obtain the at least two original images, the acquisition module is specifically configured to: obtain, based on the image content of each material image, a posture similarity corresponding to the material image, where the posture similarity represents the similarity between the posture of a person element in the image content and a target posture; and determine the at least two original images according to the posture similarities corresponding to the material images.
In a possible implementation, when determining the at least two original images according to the posture similarities corresponding to the material images, the acquisition module is specifically configured to: determine a material image whose posture similarity is greater than a first similarity threshold as an original image, and/or determine a material image whose posture similarity is less than a second similarity threshold as an original image, where the first similarity threshold is greater than the second similarity threshold.
In a possible implementation, the acquisition module is further configured to obtain a dynamic clarity corresponding to each material image. When extracting frames from the material data according to the image contents of the material images in the material data to obtain the at least two original images, the acquisition module is specifically configured to: obtain the at least two original images based on the image content and the corresponding dynamic clarity of each material image.
In a possible implementation, the migration module is specifically configured to: obtain a style transfer model corresponding to the target image style; and process each original image based on the style transfer model to obtain the target image frame corresponding to that original image.
In a possible implementation, when processing the original image based on the style transfer model to obtain the target image frame corresponding to the original image, the migration module is specifically configured to: determine a target image element in the original image based on the image content of the original image; perform edge cropping around the target image element to obtain a cropped image containing the target image element, where the proportion of the image area occupied by the target image element in the cropped image is greater than the proportion it occupies in the original image; and perform style transfer on each cropped image based on the style transfer model corresponding to the target image style to obtain the target image frame corresponding to the original image.
In a possible implementation, before performing image style transfer on the at least two original images respectively to obtain corresponding target image frames having the target image style, the migration module is further configured to: determine the target image style based on the image contents corresponding to the at least two original images.
In a possible implementation, the target puzzle includes at least two puzzle areas, each puzzle area is used to display one corresponding target image frame, and the target puzzle layout represents the size and/or position of the puzzle areas in the target puzzle.
In a possible implementation, when determining the target puzzle layout according to the image contents corresponding to the at least two original images, the combination module is specifically configured to: obtain context information according to the image contents corresponding to the at least two original images, where the context information represents a contextual relationship between the image contents corresponding to the at least two original images; and determine the target puzzle layout according to the context information.
In a possible implementation, the combination module is further configured to: add a special effect identifier to the target image frame to obtain a special effect target image frame, where the special effect identifier is determined based on the image content of the target image frame.
In a possible implementation, when adding a special effect identifier to the target image frame to obtain a special effect target image frame, the combination module is specifically configured to: perform facial feature detection on a person element in the target image frame to obtain a corresponding facial expression feature; determine a corresponding target special effect identifier based on the facial expression feature; determine a target display position of the target special effect identifier; and add the target special effect identifier to the target image frame based on the target display position to obtain the special effect target image frame.
In a third aspect, according to one or more embodiments of the present disclosure, an electronic device is provided, comprising: a processor, and a memory communicatively connected to the processor;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions stored in the memory to implement the image generation method described in the first aspect and the various possible designs of the first aspect.
In a fourth aspect, according to one or more embodiments of the present disclosure, a computer-readable storage medium is provided. The computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the image generation method described in the first aspect and the various possible designs of the first aspect.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product comprising a computer program which, when executed by a processor, implements the image generation method described in the first aspect and the various possible designs of the first aspect.
The above description is merely a description of preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
In addition, although the operations are depicted in a specific order, this should not be understood as requiring that the operations be performed in the specific order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented in multiple embodiments, either individually or in any suitable sub-combination.
Although the subject matter has been described in language specific to structural features and/or methodological logical acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims (16)

  1. An image generation method, comprising:
    acquiring at least two original images;
    performing image style transfer on the at least two original images respectively to obtain corresponding target image frames having a target image style;
    determining a target puzzle layout according to image contents corresponding to the at least two original images; and
    combining the at least two target image frames according to the target puzzle layout to generate a target puzzle.
  2. The method according to claim 1, wherein acquiring the at least two original images comprises:
    acquiring material data, the material data including a video and/or a picture set; and
    extracting frames from the material data according to image contents of material images in the material data to obtain the at least two original images, wherein the material images are video frames in the video and/or pictures in the picture set.
  3. The method according to claim 2, wherein extracting frames from the material data according to the image contents of the material images in the material data to obtain the at least two original images comprises:
    obtaining, based on the image content of each material image, a posture similarity corresponding to the material image, wherein the posture similarity represents the similarity between the posture of a person element in the image content and a target posture; and
    determining the at least two original images according to the posture similarities corresponding to the material images.
  4. The method according to claim 3, wherein determining the at least two original images according to the posture similarities corresponding to the material images comprises:
    determining a material image whose posture similarity is greater than a first similarity threshold as an original image, and/or,
    determining a material image whose posture similarity is less than a second similarity threshold as an original image;
    wherein the first similarity threshold is greater than the second similarity threshold.
  5. The method according to claim 2, further comprising:
    obtaining a dynamic clarity corresponding to each material image;
    wherein extracting frames from the material data according to the image contents of the material images in the material data to obtain the at least two original images comprises:
    obtaining the at least two original images based on the image content and the corresponding dynamic clarity of each material image.
  6. The method according to claim 1, wherein performing image style transfer on the at least two original images respectively to obtain corresponding target image frames having the target image style comprises:
    obtaining a style transfer model corresponding to the target image style; and
    processing each original image based on the style transfer model to obtain the target image frame corresponding to the original image.
  7. The method according to claim 6, wherein processing the original image based on the style transfer model to obtain the target image frame corresponding to the original image comprises:
    determining a target image element in the original image based on the image content of the original image;
    performing edge cropping around the target image element to obtain a cropped image containing the target image element, wherein the proportion of the image area occupied by the target image element in the cropped image is greater than the proportion it occupies in the original image;
    performing style transfer on each cropped image based on the style transfer model corresponding to the target image style to obtain the target image frame corresponding to the original image.
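The cropping step of claim 7 can be illustrated with plain bounding-box arithmetic. The sketch assumes the target image element has already been located as a `(left, top, right, bottom)` box; the 10% margin is an arbitrary illustrative value, and style transfer would then run on the cropped region.

```python
def crop_around(image_size, element_box, margin=0.1):
    """Return a crop box that keeps the element plus a small margin, clamped to the image."""
    width, height = image_size
    left, top, right, bottom = element_box
    pad_x = int((right - left) * margin)
    pad_y = int((bottom - top) * margin)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )


def area(box):
    left, top, right, bottom = box
    return (right - left) * (bottom - top)


image_size = (1000, 800)
element = (400, 200, 600, 600)  # hypothetical detected subject
crop = crop_around(image_size, element)

ratio_in_original = area(element) / (image_size[0] * image_size[1])
ratio_in_crop = area(element) / area(crop)
print(crop)                               # (380, 160, 620, 640)
print(ratio_in_crop > ratio_in_original)  # True
```

The final comparison checks the condition stated in the claim: the element occupies a larger share of the cropped image than of the original.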
  8. The method according to claim 1, further comprising, before performing image style transfer on the at least two original images respectively to obtain corresponding target image frames having the target image style:
    determining the target image style based on the image content corresponding to the at least two original images.
  9. The method according to claim 1, wherein the target spliced image comprises at least two splicing regions, each splicing region is used to display one corresponding target image frame, and the target spliced image layout represents the size and/or position of the splicing regions in the target spliced image.
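A layout of the kind described in claim 9 can be modeled as a canvas plus a list of rectangular regions, one per target image frame. The class and field names below are hypothetical, chosen only to make the "size and/or position" idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    x: int       # position of the region's top-left corner
    y: int
    width: int   # size of the region
    height: int


@dataclass(frozen=True)
class SplicedLayout:
    canvas_width: int
    canvas_height: int
    regions: tuple

    def __post_init__(self):
        if len(self.regions) < 2:
            raise ValueError("a spliced image needs at least two regions")
        for r in self.regions:
            inside = (r.x >= 0 and r.y >= 0
                      and r.x + r.width <= self.canvas_width
                      and r.y + r.height <= self.canvas_height)
            if not inside:
                raise ValueError(f"{r} does not fit inside the canvas")


def assign_frames(layout, target_frames):
    """Pair each region with the one target image frame it will display."""
    if len(target_frames) != len(layout.regions):
        raise ValueError("need exactly one target image frame per region")
    return list(zip(layout.regions, target_frames))


layout = SplicedLayout(1080, 1920, (Region(0, 0, 1080, 1200), Region(0, 1200, 1080, 720)))
placements = assign_frames(layout, ["frame_0", "frame_1"])
```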
  10. The method according to claim 1, wherein determining the target spliced image layout according to the image content corresponding to the at least two original images comprises:
    obtaining context information according to the image content corresponding to the at least two original images, wherein the context information represents a contextual relationship between the image content corresponding to the at least two original images;
    determining the target spliced image layout according to the context information.
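The claim does not say how context information maps to a layout, so the following is only a toy rule under stated assumptions: images sharing one content label are treated as a "sequence" and placed in equal columns, while anything else is treated as a "contrast" and given one large region plus smaller stacked ones. The labels, the two context categories, and the 0.6/0.4 split are all invented for illustration. Regions are `(x, y, width, height)` fractions of the canvas.

```python
def infer_context(content_labels):
    """Hypothetical context rule: identical labels mean successive moments of one action."""
    return "sequence" if len(set(content_labels)) == 1 else "contrast"


def choose_layout(context, count):
    """Return one fractional (x, y, width, height) region per image."""
    if count < 2:
        raise ValueError("at least two images are required")
    if context == "sequence":
        # Equal columns, read left to right in frame order.
        w = 1.0 / count
        return [(i * w, 0.0, w, 1.0) for i in range(count)]
    # Otherwise: one large region on the left, the rest stacked on the right.
    h = 1.0 / (count - 1)
    return [(0.0, 0.0, 0.6, 1.0)] + [(0.6, i * h, 0.4, h) for i in range(count - 1)]


sequence_layout = choose_layout(infer_context(["jump", "jump"]), 2)
contrast_layout = choose_layout(infer_context(["jump", "smile", "wave"]), 3)
print(sequence_layout)  # [(0.0, 0.0, 0.5, 1.0), (0.5, 0.0, 0.5, 1.0)]
```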
  11. The method according to claim 1, further comprising:
    adding a special effect identifier to the target image frame to obtain a special effect target image frame, wherein the special effect identifier is determined based on the image content of the target image frame.
  12. The method according to claim 11, wherein adding a special effect identifier to the target image frame to obtain a special effect target image frame comprises:
    performing facial feature detection on a person element in the target image frame to obtain a corresponding facial expression feature;
    determining a corresponding target special effect identifier based on the facial expression feature;
    determining a target display position of the target special effect identifier;
    adding the target special effect identifier to the target image frame based on the target display position to obtain the special effect target image frame.
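The four steps of claim 12 can be sketched as a small pipeline. Everything specific here is an assumption: the expression names, the effect names in `EFFECTS`, the rule that places the effect just above the face, and the stub detector standing in for a real facial feature model. A frame is represented as a plain dictionary.

```python
# Hypothetical mapping from a detected facial expression to a special effect identifier.
EFFECTS = {"smile": "sparkles", "surprise": "exclamation_mark", "sad": "rain_cloud"}


def add_special_effect(frame, detect_expression):
    """Detect the expression, pick an effect, place it above the face, attach it."""
    face_box, expression = detect_expression(frame)   # step 1: facial feature detection
    effect = EFFECTS.get(expression)                  # step 2: target effect identifier
    if effect is None:
        return frame                                  # no matching effect: frame unchanged
    left, top, right, bottom = face_box
    # Step 3: display position, centred horizontally and a quarter face-height above the face.
    position = ((left + right) // 2, max(0, top - (bottom - top) // 4))
    # Step 4: return a new frame carrying the effect and its position.
    return {**frame, "effects": frame.get("effects", []) + [(effect, position)]}


def stub_detector(frame):
    """Stand-in for a real detector: returns a fixed face box and expression."""
    return (100, 80, 200, 200), "smile"


styled_frame = {"name": "frame_0"}
effect_frame = add_special_effect(styled_frame, stub_detector)
print(effect_frame["effects"])  # [('sparkles', (150, 50))]
```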
  13. An image generation apparatus, comprising:
    an acquisition module, configured to acquire at least two original images;
    a transfer module, configured to perform image style transfer on the at least two original images respectively to obtain corresponding target image frames having a target image style;
    a combination module, configured to determine a target spliced image layout according to the image content corresponding to the at least two original images, and to combine the at least two target image frames according to the target spliced image layout to generate a target spliced image.
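The three modules of claim 13 compose naturally into a pipeline. The sketch below wires them together as plain callables. The class name and the string-based stand-ins are illustrative only and say nothing about how the actual apparatus is built.

```python
class ImageGenerationApparatus:
    """Chains an acquisition module, a transfer module, and a combination module."""

    def __init__(self, acquire, transfer, combine):
        self.acquire = acquire    # source -> list of original images
        self.transfer = transfer  # one original image -> one target image frame
        self.combine = combine    # (originals, target frames) -> target spliced image

    def run(self, source):
        originals = self.acquire(source)
        if len(originals) < 2:
            raise ValueError("at least two original images are required")
        target_frames = [self.transfer(image) for image in originals]
        # The combination module sees the originals because the layout is
        # determined from their image content.
        return self.combine(originals, target_frames)


apparatus = ImageGenerationApparatus(
    acquire=lambda source: list(source),
    transfer=lambda image: f"styled({image})",
    combine=lambda originals, frames: " | ".join(frames),
)
result = apparatus.run(["a", "b"])
print(result)  # styled(a) | styled(b)
```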
  14. An electronic device, comprising: a processor, and a memory communicatively connected to the processor;
    wherein the memory stores computer-executable instructions;
    and the processor executes the computer-executable instructions stored in the memory to implement the image generation method according to any one of claims 1 to 12.
  15. A computer-readable storage medium having computer-executable instructions stored therein, wherein when a processor executes the computer-executable instructions, the image generation method according to any one of claims 1 to 12 is implemented.
  16. A computer program product, comprising a computer program, wherein when the computer program is executed by a processor, the image generation method according to any one of claims 1 to 12 is implemented.
PCT/CN2023/132440 2022-11-18 2023-11-17 Image generation method and apparatus, electronic device, and storage medium WO2024104477A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211449865.7 2022-11-18
CN202211449865.7A CN118071577A (en) 2022-11-18 2022-11-18 Image generation method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2024104477A1 (en) 2024-05-23

Family

ID=91083871

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/132440 WO2024104477A1 (en) 2022-11-18 2023-11-17 Image generation method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN118071577A (en)
WO (1) WO2024104477A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112215854A (en) * 2020-10-19 2021-01-12 珠海金山网络游戏科技有限公司 Image processing method and device
US20210125372A1 (en) * 2019-10-24 2021-04-29 Microsoft Technology Licensing, Llc Prior informed pose and scale estimation
CN113012082A (en) * 2021-02-09 2021-06-22 北京字跳网络技术有限公司 Image display method, apparatus, device and medium
CN113590854A (en) * 2021-09-29 2021-11-02 腾讯科技(深圳)有限公司 Data processing method, data processing equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN118071577A (en) 2024-05-24

Similar Documents

Publication Publication Date Title
CN108010112B (en) Animation processing method, device and storage medium
CN109688463B (en) Clip video generation method and device, terminal equipment and storage medium
CN109729426B (en) Method and device for generating video cover image
CN110968736B (en) Video generation method and device, electronic equipment and storage medium
CN111696176B (en) Image processing method, image processing device, electronic equipment and computer readable medium
US20220350842A1 (en) Video tag determination method, terminal, and storage medium
US20140361974A1 (en) Karaoke avatar animation based on facial motion data
CN108846886B (en) AR expression generation method, client, terminal and storage medium
EP4243398A1 (en) Video processing method and apparatus, electronic device, and storage medium
CN110070496B (en) Method and device for generating image special effect and hardware device
WO2021254502A1 (en) Target object display method and apparatus and electronic device
WO2019114328A1 (en) Augmented reality-based video processing method and device thereof
JP7209851B2 (en) Image deformation control method, device and hardware device
CN113453040A (en) Short video generation method and device, related equipment and medium
CN112035046B (en) Method and device for displaying list information, electronic equipment and storage medium
CN109600559B (en) Video special effect adding method and device, terminal equipment and storage medium
CN103997687A (en) Techniques for adding interactive features to videos
CN114331820A (en) Image processing method, image processing device, electronic equipment and storage medium
Tolosana et al. An introduction to digital face manipulation
CN112785669B (en) Virtual image synthesis method, device, equipment and storage medium
CN114697759A (en) Virtual image video generation method and system, electronic device and storage medium
CN110619602B (en) Image generation method and device, electronic equipment and storage medium
WO2024104477A1 (en) Image generation method and apparatus, electronic device, and storage medium
CN114697568B (en) Special effect video determining method and device, electronic equipment and storage medium
CN113761281B (en) Virtual resource processing method, device, medium and electronic equipment