KR20230163907A

KR20230163907A - System and method for constructing converting model for cartoonizing image into character image, and image converting method using the converting model

Info

Publication number: KR20230163907A
Application number: KR1020220115390A
Authority: KR
Inventors: 김승권; 백지혜; 안남혁; 이광호; 곽채헌; 김도현
Original assignee: 네이버웹툰 유한회사
Priority date: 2022-05-24
Filing date: 2022-09-14
Publication date: 2023-12-01

Abstract

Provided is a method of constructing a conversion model that converts a target image into a cartoonized character image. The method generates a training image data set composed of pairs of (i) a plurality of images, each including a background and a face, and (ii) character images, corresponding to the plurality of images, that are cartoonized as a specific character, and constructs the conversion model by training on the generated training image data set.

Description

Method and system for constructing a conversion model that cartoonizes an image into a character image, and image conversion method using the conversion model {SYSTEM AND METHOD FOR CONSTRUCTING CONVERTING MODEL FOR CARTOONIZING IMAGE INTO CHARACTER IMAGE, AND IMAGE CONVERTING METHOD USING THE CONVERTING MODEL}

The present disclosure relates to a method and system for constructing a conversion model that cartoonizes an image into a character image, and to a method of converting a target image using the conversion model. More specifically, it relates to a method and system for constructing a conversion model that can convert not only the face but also the background of the target image into a cartoonized character image.

Interest in services that provide image-based content online, such as comics, cartoons, or webtoons, is growing. As interest in such content grows, so does interest in technology for converting a picture, such as a photograph, or a video composed of frames into the drawing style of the content.

With such technology, a live-action picture or video of a user (hereinafter, an image) can be converted into the drawing style of specific webtoon content. This conversion may be called cartoonization of the image. The converted, that is, cartoonized image may have colors or textures similar to those of the webtoon content. The user can thus process his or her own image into a form similar to the desired webtoon content, and can feel as if he or she is creating, or taking part in, the webtoon content.

Converting an image into a cartoonized image requires converting the face included in the image into a character of the content and converting the background of the image into the drawing style of the content's background.

Korean Patent Application Publication No. 10-2009-0065354 (published June 21, 2009) relates to a method of converting a two-dimensional image into various artistic images. It discloses a method that applies computer-based image conversion processing to a two-dimensional image captured with a digital camera or the like, so that the image can be converted using various artistic techniques to provide images such as oil paintings, pen illustrations, cartoons, double pictures, and template mosaics.

The information described above is provided only to aid understanding. It may include content that does not form part of the prior art, and it may not include what the prior art could suggest to a person skilled in the art.

One embodiment may provide a method of constructing a conversion model that converts a target image into a cartoonized character image. The method generates a training image data set composed of pairs of a plurality of images, each including a background and a face, and character images, corresponding to the plurality of images, that are cartoonized as a specific character, and constructs the conversion model by training on the generated training image data set.

One embodiment may provide a method of converting a target image into a character image in which the background and the face are cartoonized, using a conversion model trained on a training image data set composed of pairs of a plurality of images, each including a background and a face, and character images, corresponding to the plurality of images, that are cartoonized as a specific character.

In one aspect, there is provided a method, performed by a computer system, of constructing a conversion model that converts a target image into a cartoonized character image. The method includes: generating a training image data set composed of pairs of a plurality of images, each including a background and a face, and character images, corresponding to the plurality of images, that are cartoonized as a first character; and constructing the conversion model by training on the generated training image data set. Generating the training image data set includes: converting, using a face conversion model and based on a plurality of training face images including the face of the first character, the face of each image of the plurality of images into a face cartoonized as the first character; converting, using a background conversion model, the background of each image into a cartoonized background; and generating a character image corresponding to each image by compositing the cartoonized face and the cartoonized background.
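The data-set generation steps in this aspect can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: images are toy grayscale grids, the face mask is assumed to be given, and `toy_face_model`, `toy_background_model`, `composite`, and `build_paired_dataset` are hypothetical names introduced only for this example.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]   # toy grayscale image: rows of pixel values
Mask = List[List[bool]]   # True where a pixel belongs to the face region


def composite(face_img: Image, bg_img: Image, face_mask: Mask) -> Image:
    """Take face pixels from face_img and every other pixel from bg_img."""
    return [
        [f if m else b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(face_img, bg_img, face_mask)
    ]


def build_paired_dataset(
    photos: List[Image],
    face_masks: List[Mask],
    face_model: Callable[[Image], Image],
    background_model: Callable[[Image], Image],
) -> List[Tuple[Image, Image]]:
    """Return (input image, character image) pairs for training."""
    pairs = []
    for photo, mask in zip(photos, face_masks):
        cartoon_face = face_model(photo)       # face cartoonized as the first character
        cartoon_bg = background_model(photo)   # background cartoonized in the content's style
        pairs.append((photo, composite(cartoon_face, cartoon_bg, mask)))
    return pairs


# Hypothetical stand-ins; real models would be trained networks.
toy_face_model = lambda img: [[p + 100 for p in row] for row in img]
toy_background_model = lambda img: [[p // 2 for p in row] for row in img]

photo = [[10, 20], [30, 40]]
mask = [[True, False], [False, False]]
dataset = build_paired_dataset([photo], [mask], toy_face_model, toy_background_model)
```

Each element of `dataset` pairs an untouched input image with its synthesized character image, which is the form of supervision the conversion model is then trained on.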

The first character may be included in content that includes pictures or video, and generating the training image data set may include obtaining, from the content, the training face images including the face of the first character.

The content may be webtoon content including a plurality of cuts, and obtaining the training face images may include: extracting a predetermined number of cuts from the content; extracting, from the cuts, first face images including the face of at least one character, the at least one character including the first character; obtaining second face images by correcting at least one of the first face images through at least one of alignment, resizing, and resolution change; and obtaining the training face images including the face of the first character by clustering the second face images by character.
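The final clustering step could look like the sketch below. The source does not name a clustering algorithm or a face representation, so this example assumes that each corrected (second) face image has already been reduced to a non-zero embedding vector, and it uses a simple greedy cosine-similarity grouping. `cosine`, `cluster_by_character`, the threshold value, and the sample embeddings are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_by_character(
    embeddings: List[Sequence[float]], threshold: float = 0.9
) -> Dict[int, List[int]]:
    """Greedy clustering: put each face into the first cluster whose
    representative (its first member) is similar enough, else open a new one."""
    clusters: Dict[int, List[int]] = {}
    representatives: List[Sequence[float]] = []
    for idx, emb in enumerate(embeddings):
        for cid, rep in enumerate(representatives):
            if cosine(emb, rep) >= threshold:
                clusters[cid].append(idx)
                break
        else:
            # No existing cluster matched: this face starts a new character cluster.
            representatives.append(emb)
            clusters[len(representatives) - 1] = [idx]
    return clusters


# Hypothetical 2-D embeddings of four corrected face images.
faces = [(1.0, 0.0), (0.99, 0.05), (0.0, 1.0), (0.05, 0.98)]
clusters = cluster_by_character(faces)
first_character_faces = clusters[0]   # indices of training face images for one character
```

In practice the cluster that corresponds to the first character would still have to be identified, for example by manual labeling; the sketch only shows the grouping itself.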

The plurality of images may be a first number of images, and the plurality of training face images may be a second number of images smaller than the first number. The face conversion model may be configured to generate, based on the first number of images and the second number of training face images, the first number of face conversion images corresponding to the first number of images, each face conversion image including the cartoonized face that matches the face of the respective image.

The cartoonized face converted using the face conversion model may match the face of each image in face direction, composition, and the positions of the parts included in the face.

Converting the face of each image into a face cartoonized as the first character may include: defining a latent space of the face conversion model based on the plurality of training face images; generating, based on the latent space, an inversion code for converting each image; and generating, based on the inversion code, a face conversion image corresponding to each image.
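The source does not say how the inversion code is computed. One common approach, assumed here purely for illustration, is optimization-based inversion: search the latent space for a code whose generated image reproduces the input, then decode that same code with a generator for the cartoon domain. The sketch below uses tiny linear "generators" so that it stays runnable; `PHOTO_GEN`, `CARTOON_GEN`, `generate`, and `invert` are hypothetical names, and real generators would be deep networks.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def generate(weights: Matrix, code: Vector) -> Vector:
    """Toy linear generator: image = weights @ code."""
    return [sum(w * c for w, c in zip(row, code)) for row in weights]


def invert(weights: Matrix, image: Vector, steps: int = 200, lr: float = 0.1) -> Vector:
    """Recover a latent code for `image` by gradient descent on the
    squared reconstruction error sum((generate(code) - image)**2)."""
    code = [0.0] * len(weights[0])
    for _ in range(steps):
        residual = [g - x for g, x in zip(generate(weights, code), image)]
        # Gradient with respect to the code: 2 * weights^T @ residual.
        grad = [
            2.0 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(len(code))
        ]
        code = [c - lr * g for c, g in zip(code, grad)]
    return code


# Hypothetical generators that share one 2-D latent space.
PHOTO_GEN: Matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
CARTOON_GEN: Matrix = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]

photo_face = generate(PHOTO_GEN, [0.5, -1.0])          # an input the generator can reproduce
inversion_code = invert(PHOTO_GEN, photo_face)         # code recovered from the image
cartoon_face = generate(CARTOON_GEN, inversion_code)   # same code, cartoon-domain generator
```

Because both generators read the same code, attributes encoded in it carry over from the input face to the cartoonized face, which is one way the matching of direction and composition described above could arise.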

The first character may be included in content that includes pictures or video. The background conversion model may be trained based on images, extracted from the content, that do not include a character, and may be configured to generate a background conversion image by converting the background of each image into a cartoonized background of the content.

The background conversion model may be further trained based on live-action images that include only a background.

Generating the character image corresponding to each image may include compositing the cartoonized face onto the background conversion image.

The method of constructing the conversion model may further include: inputting the target image, which includes a background and a face, to the constructed conversion model; generating, by inference based on the conversion model, a converted image in which the face of the target image is converted into a face cartoonized as the first character and the background of the target image is converted into a cartoonized background of the content that includes the first character; and outputting the converted image.

The first character may be included in content that includes pictures or video, and the content may include a plurality of characters including the first character. The generating step may generate a training image data set composed of pairs of the plurality of images and character images, corresponding to the plurality of images, that are cartoonized as each of the plurality of characters, and the conversion model may be constructed so that it can convert the target image into a character image cartoonized as each of the characters.

The first character may be included in content that includes pictures or video, the content may include a plurality of characters including the first character, and the conversion model may be constructed so that it can convert the target image into a character image cartoonized as each of the characters. The method may further include providing a user interface (UI) for selecting, from among the plurality of characters, the character into which the target image is to be converted, and the first character may be the character selected through the UI.
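A minimal sketch of how a UI selection might drive the conversion follows. It assumes one callable per character, which is only one possible arrangement (the source also allows a single model that handles every character). The character identifiers, the `converters` mapping, and `convert_with_selected_character` are hypothetical.

```python
from typing import Callable, Dict, List

Image = List[List[int]]
Converter = Callable[[Image], Image]


def convert_with_selected_character(
    target: Image, selected: str, converters: Dict[str, Converter]
) -> Image:
    """Convert `target` using the converter for the character chosen in the UI."""
    if selected not in converters:
        raise KeyError(f"unknown character: {selected!r}")
    return converters[selected](target)


# Hypothetical per-character converters standing in for a trained conversion model.
converters: Dict[str, Converter] = {
    "character_a": lambda img: [[p + 1 for p in row] for row in img],
    "character_b": lambda img: [[p * 2 for p in row] for row in img],
}

selectable = sorted(converters)   # what a UI could list for the user
result = convert_with_selected_character([[1, 2]], "character_b", converters)
```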

The target image may be a video including a plurality of frames, and the converted image may be a converted video in which the face and the background included in the video are converted into the cartoonized face and the cartoonized background, respectively.
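For a video target, the simplest reading is frame-by-frame conversion, sketched below. The source does not state whether frames are processed independently or with any temporal smoothing, so independent processing is an assumption here, and `convert_video` and `toy_convert` are hypothetical names.

```python
from typing import Callable, Iterable, Iterator, List

Frame = List[List[int]]


def convert_video(
    frames: Iterable[Frame], convert_frame: Callable[[Frame], Frame]
) -> Iterator[Frame]:
    """Yield converted frames in their original order."""
    for frame in frames:
        yield convert_frame(frame)


# Hypothetical stand-in for the conversion model's per-image inference.
toy_convert = lambda frame: [[255 - p for p in row] for row in frame]

video = [[[0, 10]], [[20, 30]]]   # two one-row frames
converted_video = list(convert_video(video, toy_convert))
```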

In another aspect, there is provided a computer system that constructs a conversion model for converting a target image into a cartoonized character image. The computer system includes at least one processor implemented to execute instructions readable by the computer system. The at least one processor generates a training image data set composed of pairs of a plurality of images, each including a background and a face, and character images, corresponding to the plurality of images, that are cartoonized as a first character, and constructs the conversion model by training on the generated training image data set. In generating the training image data set, the at least one processor converts, using a face conversion model and based on a plurality of training face images including the face of the first character, the face of each image of the plurality of images into a face cartoonized as the first character; converts, using a background conversion model, the background of each image into a cartoonized background; and generates a character image corresponding to each image by compositing the cartoonized face and the cartoonized background.

In yet another aspect, there is provided an image conversion method, performed by a computer system, of converting a target image into a cartoonized character image. The method includes: inputting the target image, which includes a background and a face, to a conversion model constructed by training on a training image data set composed of pairs of a plurality of images, each including a background and a face, and character images, corresponding to the plurality of images, that are cartoonized as a first character; generating, by inference based on the conversion model, a converted image in which the face of the target image is converted into a face cartoonized as the first character and the background of the target image is converted into a cartoonized background of the content that includes the first character; and outputting the converted image. The character image corresponding to each image of the plurality of images constituting the training image data set is generated by compositing (i) the result of converting, using a face conversion model and based on a plurality of training face images including the face of the first character, the face of the image into a face cartoonized as the first character, and (ii) the result of converting, using a background conversion model, the background of the image into a cartoonized background.

In yet another aspect, there is provided a method, performed by a computer system, of generating a training image data set for a conversion model that converts a target image into a cartoonized character image. The method includes: converting, using a face conversion model and based on a plurality of training face images including the face of a first character, the face of each image of a plurality of images, each including a background and a face, into a face cartoonized as the first character; converting, using a background conversion model, the background of each image into a cartoonized background; generating a character image corresponding to each image by compositing the cartoonized face and the cartoonized background; and generating, as the training image data set, pairs of the plurality of images and the character images, corresponding to the plurality of images, that are cartoonized as the first character, each pair consisting of an image and its character image.

Face conversion images cartoonized as a character and corresponding to a large number of live-action images can be generated using the face conversion model, and cartoonized background conversion images can be generated using the background conversion model. Compositing a face conversion image with a background conversion image yields a cartoonized character image, so a training image data set composed of pairs of a live-action image and its corresponding character image can be generated for the conversion model.

By training the conversion model on a training image data set composed of pairs of a live-action image and its corresponding character image, the constructed conversion model can generate a converted image in which both the face and the background of the target image are cartoonized in the drawing style of specific content. A converted image (character image) can therefore be generated in which the background and the face do not look mismatched.

FIG. 1 shows a method of converting a target image into a character image in which the background and the face are cartoonized, using a constructed conversion model, according to an embodiment.
FIG. 2 shows a computer system that constructs a conversion model for converting a target image into a character image in which the background and the face are cartoonized, according to an embodiment.
FIG. 3 is a flowchart showing a method of constructing a conversion model for converting a target image into a character image in which the background and the face are cartoonized, according to an embodiment.
FIG. 4 is a flowchart showing a method of obtaining training face images for constructing a conversion model, according to an example.
FIG. 5 is a flowchart showing a method of generating, using a face conversion model, a face conversion image corresponding to each live-action image in order to generate a training image data set for constructing a conversion model, according to an example.
FIG. 6 is a flowchart showing a method of converting a target image into a character image in which the background and the face are cartoonized, using a constructed conversion model, according to an embodiment.
FIGS. 7 to 10 show a method of constructing a conversion model for converting a target image into a character image in which the background and the face are cartoonized, and of converting the target image into such a character image using the constructed conversion model, according to an example.
FIG. 11 shows a method of generating, for each live-action image, a character image in which the face and the background are cartoonized, in order to generate a training image data set for constructing a conversion model, according to an example.
FIG. 12 shows a method of converting, using a face conversion model and based on a second number of training face images including the face of a character, a first number of live-action images into a corresponding first number of cartoonized face conversion images, according to an example.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 shows a method of converting a target image into a character image in which the background and the face are cartoonized, using a constructed conversion model, according to an embodiment.

The computer system 100 may construct a conversion model for converting the target image 10 into a character image 50 in which the background and the face are cartoonized. When the target image 10 is input to the constructed conversion model, the face 20 of the target image 10 may be converted into the face 60 of the character in the converted image 50, and the background 30 of the target image 10 may be converted into the background 70 of the converted image 50.

The target image 10 may be a live-action image. The term "image" may cover both a picture and a video composed of frames. In other words, the conversion model may convert an input live-action picture or video into a cartoonized converted image 50.

The conversion model may be a machine learning or artificial intelligence based model, for example a model based on deep learning (DNN) or on another artificial neural network (CNN, ANN, RNN, etc.).

The converted image 50 is a cartoonized version of the target image 10 and may be referred to as a "character image" in the detailed description below. In other words, the converted image 50 may be an image in which the face 20 and the background 30 of the target image 10 are cartoonized.

"Cartoonization" of the target image 10 may mean converting the target image 10 into the drawing style of specific content. The content may include pictures or video, and the pictures or video included in the content may include at least one character. The conversion model of the embodiment may generate the converted image 50 by cartoonizing the face 20 of the target image 10 as a character of the content and cartoonizing the background 30 of the target image 10 as a background of the content. The converted image 50 may therefore have colors or textures similar to those of the content. The face 60 of the converted image 50 may be expressed as the features of the face 20 of the target image 10 with the features of a specific character of the content reflected in them. The features of a character may include features of the form, shape, or color of facial parts, such as the shape of the eyes. The background 70 of the converted image 50 may be expressed as the features of the background 30 of the target image 10 with the features of the content's background reflected in them. The features of the background may include the colors or textures of the content.

The content may be cartoon or webtoon content. Alternatively, the content may be video, such as a cartoon, an animation, or a film. A "character" included in the content may be an object corresponding to a person or figure appearing in the content. The content may include a plurality of characters.

In the embodiments, the "background" may refer to the area of an image that remains after excluding the area corresponding to the "face" of a character or person. For example, the background 30 in the target image 10 may be the area excluding the face 20, and the background 70 in the converted image 50 may be the area excluding the face 60.
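This definition makes the background the complement of the face region. The sketch below illustrates that relationship under the assumption that the face region is given as a rectangular box; the source does not specify the region's shape, and `face_and_background_masks` and the `(top, left, bottom, right)` box format are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[bool]]


def face_and_background_masks(
    height: int, width: int, face_box: Tuple[int, int, int, int]
) -> Tuple[Mask, Mask]:
    """Build a face mask from face_box = (top, left, bottom, right), with
    bottom and right exclusive, and a background mask as its complement."""
    top, left, bottom, right = face_box
    face = [
        [top <= r < bottom and left <= c < right for c in range(width)]
        for r in range(height)
    ]
    background = [[not m for m in row] for row in face]
    return face, background


face_mask, background_mask = face_and_background_masks(3, 3, (0, 1, 2, 3))
```

Every pixel falls in exactly one of the two masks, so a face conversion result and a background conversion result can be merged without overlap.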

In the embodiments, the conversion model generates a converted image 50 in which both the face 20 and the background 30 of the target image 10 are cartoonized in the drawing style of specific content, so that a character image can be generated in which the background 70 and the face 60 do not look mismatched.

To construct such a conversion model, the computer system 100 may generate a training image data set composed of pairs of a plurality of images, each including a background and a face, and character images, corresponding to the plurality of images, that are cartoonized as a specific character, and may construct the conversion model by training on the generated training image data set.

The method by which the computer system 100 constructs the conversion model, and the method of converting the target image 10 into the character image 50 using the constructed conversion model, are described in more detail below with reference to FIGS. 2 to 12.

FIG. 2 shows a computer system that constructs a conversion model for converting a target image into a character image in which the background and the face are cartoonized, according to an embodiment.

The computer system 100 may be a computing device for constructing the conversion model that converts the target image 10 described above into a character image 50 in which the background 30 and the face 20 are cartoonized. That is, the computer system 100 may construct the conversion model by generating a training image data set and training a model on the generated training image data set.

The computer system 100 may be configured to include at least one computing device. The computer system 100 may be a server, or part of a server, that generates the training image data set and performs the work of training on it, or it may be a user terminal or part of a user terminal. That is, in the embodiments described below, the computer system 100 may correspond to either a server or a user terminal.

The user terminal may be a smart device such as a smartphone, a personal computer (PC), a notebook computer, a laptop computer, a tablet, an Internet of Things (IoT) device, a wearable computer, or the like.

When the computer system 100 is a server, the server may be a computing device that is distinct from the user terminal and communicates with the user terminal. In response to a request received through the user terminal, the computer system 100 acting as the server may provide the conversion model to the user terminal, and the user terminal may then use the conversion model to convert the target image 10 into the character image 50 in which the background 30 and the face 20 are cartoonized.

컴퓨터 시스템(100)은 도시된 것처럼, 메모리(130), 프로세서(120), 통신부(110) 및 입출력 인터페이스(140)를 포함할 수 있다.As shown, the computer system 100 may include a memory 130, a processor 120, a communication unit 110, and an input/output interface 140.

The memory 130 is a computer-readable recording medium and may include random access memory (RAM), read only memory (ROM), and a permanent mass storage device such as a disk drive. The ROM and the permanent mass storage device may instead be included as a separate permanent storage device distinct from the memory 130. An operating system and at least one program code may be stored in the memory 130. These software components may be loaded from a computer-readable recording medium separate from the memory 130, such as a floppy drive, disk, tape, DVD/CD-ROM drive, or memory card. In another embodiment, the software components may be loaded into the memory 130 through the communication unit 110 rather than from a computer-readable recording medium.

프로세서(120)는 기본적인 산술, 로직 및 입출력 연산을 수행함으로써, 컴퓨터 프로그램의 명령을 처리하도록 구성될 수 있다. 명령은 메모리(130) 또는 통신부(110)에 의해 프로세서(120)로 제공될 수 있다. 예를 들어, 프로세서(120)는 메모리(130)에 로딩된 프로그램 코드에 따라 수신되는 명령을 실행하도록 구성될 수 있다. The processor 120 may be configured to process instructions of a computer program by performing basic arithmetic, logic, and input/output operations. Commands may be provided to the processor 120 by the memory 130 or the communication unit 110. For example, the processor 120 may be configured to execute instructions received according to program code loaded into the memory 130.

The communication unit 110 may be a component that allows the computer system 100 to communicate with other devices (such as a user terminal or another server). In other words, the communication unit 110 may be a hardware module of the computer system 100, such as an antenna, a data bus, a network interface card, a network interface chip, or a networking interface port, or a software module, such as a network device driver or a networking program, that transmits and receives data and/or information to and from other devices.

입출력 인터페이스(140)는 키보드 또는 마우스 등과 같은 입력 장치 및 디스플레이나 스피커와 같은 출력 장치와의 인터페이스를 위한 수단일 수 있다.The input/output interface 140 may be a means for interfacing with an input device such as a keyboard or mouse and an output device such as a display or speaker.

The processor 120 may manage the components of the computer system 100, may execute a program or application that generates the training image data set and trains a model on it in order to build the conversion model described above, and may process the operations needed to execute the program or application and to process data. The processor 120 may be at least one processor (such as a CPU or GPU) of the computer system 100 or at least one core within a processor.

In embodiments, the computer system 100 and the processor 120 may also include more components than those illustrated. For example, the processor 120 may include components that perform functions for building the conversion model and for generating the character image 50 from the target image 10 using the built conversion model. These components may be parts of the processor 120 or functions implemented by the processor 120. The components included in the processor 120 may be representations of different functions that the processor 120 performs in response to control instructions from operating system code or from the code of at least one computer program.

The method of building the conversion model performed by the computer system 100, and the method of converting the target image 10 into the character image 50 using the built conversion model, are described in more detail below with reference to FIGS. 3 to 12.

이상 도 1을 참조하여 전술된 기술적 특징에 대한 설명은, 도 2에 대해서도 그대로 적용될 수 있으므로 중복되는 설명은 생략한다.The description of the technical features described above with reference to FIG. 1 can also be applied to FIG. 2 , so overlapping descriptions will be omitted.

후술될 상세한 설명에서, 컴퓨터 시스템(100) 또는 프로세서(120)나 이들의구성들에 의해 수행되는 동작은 설명의 편의상 컴퓨터 시스템(100)에 의해 수행되는 동작으로 설명될 수 있다.In the detailed description to be described later, operations performed by the computer system 100 or the processor 120 or their components may be described as operations performed by the computer system 100 for convenience of explanation.

도 3은 일 실시예에 따른, 타겟 이미지를 배경과 얼굴이 카툰화된 캐릭터 이미지로 변환하기 위한 변환 모델을 구축하는 방법을 나타내는 흐름도이다. Figure 3 is a flowchart showing a method of building a conversion model for converting a target image into a character image with a cartoonized background and face, according to an embodiment.

도 3을 참조하여, 타겟 이미지(10)를 배경(30)과 얼굴(20)이 카툰화된 캐릭터 이미지(50)로 변환하기 위한 변환 모델을 구축하는 구체적인 방법에 대해 더 자세하게 설명한다. Referring to FIG. 3, a specific method of building a conversion model for converting the target image 10 into a character image 50 in which the background 30 and the face 20 are cartoonized will be described in more detail.

In step 310, the computer system 100 may generate a training image data set for training the conversion model. For example, the computer system 100 may generate a training image data set consisting of pairs of a plurality of images, each including a background and a face, and character images cartoonized as a first character that correspond to the plurality of images.

복수의 이미지들의 각각은 배경과 얼굴을 포함하는 실사 이미지일 수 있다. 복수의 이미지들은 랜덤하게 수집된 이미지들로서 적어도 하나의 인물(즉, 인물의 얼굴)을 포함할 수 있다.Each of the plurality of images may be a real-life image including a background and a face. The plurality of images are randomly collected images and may include at least one person (ie, the person's face).

제1 캐릭터는 변환 모델을 통해 타겟 이미지(10)를 카툰화하고자 하는 콘텐츠에 포함된 캐릭터일 수 있다. 즉, 제1 캐릭터는 화상 또는 영상을 포함하는 콘텐츠에 포함된 것일 수 있고, 일례로, 콘텐츠가 웹툰 콘텐츠인 경우 웹툰 콘텐츠의 등장 인물들 중 하나일 수 있다. The first character may be a character included in content for which the target image 10 is to be cartoonized through a transformation model. That is, the first character may be included in content including an image or video, and for example, if the content is webtoon content, it may be one of the characters in the webtoon content.

The training image data set for training the conversion model may include pairs of the plurality of images, which are real-life images, and the character images cartoonized as the first character that correspond to them. In other words, the training image data set may consist of pairs of the form 'each image (each real-life image) - the character image corresponding to that image'. The character image corresponding to each image is the character image matched to that image, in which the face of the image has been converted into a face cartoonized as the first character and the background of the image has been converted into a cartoonized background.

각 이미지에 매칭되는 캐릭터 이미지를 생성하는 구체적인 방법에 대해서는, 후술될 단계들(312 내지 318)을 참조하여 더 자세하게 설명한다. A specific method of generating a character image matching each image will be described in more detail with reference to steps 312 to 318 to be described later.

단계(320)에서, 컴퓨터 시스템(100)은 단계(310)에서 생성된 학습용 이미지 데이터 셋을 학습함으로써 변환 모델을 구축할 수 있다. 구축된 변환 모델은, 도 1을 참조하여 전술한 것처럼, 변환 모델에 입력되는 타겟 이미지(10)를 배경(30)과 얼굴(20)이 카툰화된 캐릭터 이미지(50)로 변환할 수 있다. In step 320, the computer system 100 may build a transformation model by learning the training image data set generated in step 310. As described above with reference to FIG. 1 , the constructed conversion model can convert the target image 10 input to the conversion model into a character image 50 in which the background 30 and the face 20 are cartoonized.

아래에서는, 단계들(312 내지 318)을 참조하여 학습용 이미지 데이터 셋을 생성하는 방법에 대해 더 자세하게 설명한다. Below, the method for generating a training image data set will be described in more detail with reference to steps 312 to 318.

실시예의 변환 모델을 학습시키기 위한 학습용 이미지 데이터 셋을 생성함에 있어서는, 실사 이미지인 상기 복수의 이미지들에 대응하는 캐릭터 이미지들을 생성하기 위해 얼굴 변환 모델과 배경 변환 모델이 사용될 수 있다. When creating a learning image data set for learning the transformation model of the embodiment, a face transformation model and a background transformation model may be used to generate character images corresponding to the plurality of images that are real images.

얼굴 변환 모델과 배경 변환 모델은 컴퓨터 시스템(100) 상에 미리 구축되어 있거나 컴퓨터 시스템(100)이 접근 가능한 다른 컴퓨터 시스템에 미리 구축되어 있는 모델일 수 있다. The face transformation model and the background transformation model may be pre-built on the computer system 100 or may be models pre-built in another computer system accessible to the computer system 100.

In step 312, the computer system 100 may use the face transformation model to convert the face in each of the plurality of images into a face cartoonized as the first character, based on a plurality of training face images that include the face of the first character. That is, the computer system 100 may generate a face conversion image in which the face of each image is converted into a face cartoonized as the first character.

복수의 학습용 얼굴 이미지들은 콘텐츠의 제1 캐릭터의 얼굴을 포함하는 이미지로서 수집된 것일 수 있다. 복수의 학습용 얼굴 이미지들은 콘텐츠로부터 추출될 수 있다. 복수의 학습용 얼굴 이미지들을 획득하는 구체적인 방법에 대해서는 후술될 도 4 및 도 7을 참조하여 더 자세하게 설명된다.A plurality of face images for learning may be collected as images including the face of the first character of the content. A plurality of face images for learning may be extracted from content. A specific method of acquiring a plurality of facial images for learning will be described in more detail with reference to FIGS. 4 and 7, which will be described later.

얼굴 변환 모델은 적은 수의 학습용 얼굴 이미지들에 기반하여, 더 많은 수의 실사 이미지들에 대응하는 캐릭터 이미지들을 생성하기 위한 모델일 수 있다.A face transformation model may be a model for generating character images corresponding to a larger number of real-life images based on a small number of facial images for learning.

For example, the plurality of images, which are real-life images, may be of a first number, and the plurality of training face images may be of a second number smaller than the first number. In this case, the face transformation model may be configured to generate, based on the first number of images and the second number of training face images, a first number of face conversion images that correspond to the first number of images and that each include a cartoonized face matching the face of the corresponding image. In other words, the face transformation model may generate a face conversion image matched to each of the plurality of real-life images.

The cartoonized face converted using the face transformation model, that is, the cartoonized face included in a face conversion image, may match the face of the corresponding real-life image in the direction of the face, the composition, and the positions of the parts of the face. In other words, the cartoonized face in the face conversion image and the face in the corresponding real-life image may look in the same direction, share the same overall composition (color, shape, and so on), and have the same positions of facial parts (the relative positions of the eyes or pupils, nose, mouth, ears, and so on), face type, and face shape.

As such, by using the face transformation model to generate the training image data set, the embodiment can obtain face conversion images corresponding to a larger number of real-life images while using only a small number of training face images of the first character.

얼굴 변환 모델에 의한 얼굴 변환 이미지의 생성 방법에 대해서는 후술될 도 5와 도 8을 참조하여 더 자세하게 설명된다. The method of generating a face transformation image using a face transformation model will be described in more detail with reference to FIGS. 5 and 8, which will be described later.

또한, 실시예의 변환 모델을 학습시키기 위한 학습용 이미지 데이터 셋을 생성함에 있어서는, 실사 이미지인 상기 복수의 이미지들의 배경을 콘텐츠의 배경으로 카툰화하기 위한 배경 변환 모델이 사용될 수 있다. Additionally, when creating a learning image data set for training the transformation model of the embodiment, a background transformation model may be used to cartoonize the background of the plurality of images, which are real-life images, into the background of the content.

단계(314)에서, 컴퓨터 시스템(100)은, 배경 변환 모델을 사용하여, 실사 이미지인 복수의 이미지들의 각 이미지의 배경을 카툰화된 배경으로 변환할 수 있다. 즉, 컴퓨터 시스템(100)은 상기 각 이미지의 배경이 제1 캐릭터를 포함하는 콘텐츠의 배경으로 카툰화된 배경 변환 이미지를 생성할 수 있다.In step 314, the computer system 100 may convert the background of each image of the plurality of images, which are real-life images, into a cartoonized background using a background conversion model. That is, the computer system 100 may generate a background conversion image in which the background of each image is cartoonized as the background of content including the first character.

배경 변환 모델은, 제1 캐릭터를 포함하는 콘텐츠로부터 추출된 이미지들에 기반하여 학습된 것일 수 있다. 상기 추출된 이미지들은 캐릭터나 기타 오브젝트를 포함하지 않는 이미지들일 수 있다. The background transformation model may be learned based on images extracted from content including the first character. The extracted images may be images that do not include characters or other objects.

배경 변환 모델은, 실사 이미지인 복수의 이미지들의 각 이미지의 배경을 콘텐츠의 카툰화된 배경으로 변환함으로써, 배경 변환 이미지를 생성하도록 구성될 수 있다. The background conversion model may be configured to generate a background conversion image by converting the background of each image of a plurality of images, which are real-life images, into a cartoonized background of the content.

배경 변환 모델에 의한 배경 변환 이미지의 생성 방법에 대해서는 후술될 도 9를 참조하여 더 자세하게 설명된다.The method of generating a background transformation image using a background transformation model will be described in more detail with reference to FIG. 9, which will be described later.

In step 316, the computer system 100 may generate the character image corresponding to each image by compositing the cartoonized face, which results from converting the face of each real-life image using the face transformation model in step 312, with the cartoonized background, which results from converting the background of each real-life image using the background transformation model in step 314.

컴퓨터 시스템(100)은 전술한 얼굴 변환 이미지와 배경 변환 이미지를 합성함으로써 캐릭터 이미지를 생성할 수 있다. 예컨대, 컴퓨터 시스템(100)은 생성된 배경 변환 이미지에 얼굴 변환 이미지에 포함된 카툰화된 얼굴을 합성함으로써, 실사 이미지인 각 이미지에 대응하는 캐릭터 이미지를 생성할 수 있다.The computer system 100 may generate a character image by combining the above-described face conversion image and the background conversion image. For example, the computer system 100 may generate character images corresponding to each image, which is a real-life image, by combining the cartoonized face included in the face conversion image with the generated background conversion image.
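
As a non-limiting illustration, the compositing of the cartoonized face onto the background conversion image may be sketched as follows. The sketch assumes that a binary face mask is available for each image and that images are small grayscale grids; the mask, the `composite_face` function, and the sample values are hypothetical and are not described in the embodiment.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values (illustrative)
Mask = List[List[int]]   # 1 where the face is, 0 elsewhere (assumed input)


def composite_face(background: Image, face: Image, face_mask: Mask) -> Image:
    """Paste the masked face pixels over the cartoonized background."""
    if not (len(background) == len(face) == len(face_mask)):
        raise ValueError("all inputs must have the same height")
    result = []
    for bg_row, face_row, mask_row in zip(background, face, face_mask):
        if not (len(bg_row) == len(face_row) == len(mask_row)):
            raise ValueError("all inputs must have the same width")
        # Take the face pixel where the mask is set, else the background pixel.
        result.append([f if m else b
                       for b, f, m in zip(bg_row, face_row, mask_row)])
    return result


# Toy data standing in for a background conversion image and a face conversion image.
background_converted = [[10, 10, 10], [10, 10, 10]]
face_converted = [[99, 99, 99], [99, 99, 99]]
mask = [[0, 1, 0], [0, 1, 1]]
character_image = composite_face(background_converted, face_converted, mask)
```

A hard mask is used here only for brevity; a soft (alpha) mask could be substituted if smoother boundaries between the face and the background are desired.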

단계(318)에서, 컴퓨터 시스템(100)은 단계(316)에서 생성된 캐릭터 이미지와 거기에 대응하는 실사 이미지인 각각의 이미지의 쌍으로 구성되는 학습용 이미지 데이터 셋을 생성할 수 있다. In step 318, the computer system 100 may generate a training image data set consisting of pairs of the character image generated in step 316 and each image that is a corresponding real-life image.

이로서, 각각의 실사 이미지와, 실사 이미지에 대응하여 배경과 얼굴이 카툰화된 캐릭터 이미지의 쌍이 변환 모델을 위한 최종적인 학습용 이미지 데이터 셋으로서 생성될 수 있다. As a result, a pair of each real-life image and a character image with a cartoonized background and face corresponding to the real-life image can be generated as the final training image data set for the transformation model.
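
As a non-limiting illustration, steps 312 to 318 may be sketched as a single loop that produces the paired data set. The three callables below are placeholders for the face transformation model, the background transformation model, and the compositing operation; their names and the string-based stand-ins are assumptions made for this sketch only.

```python
from typing import Callable, List, Tuple, TypeVar

Img = TypeVar("Img")


def build_training_pairs(
    real_images: List[Img],
    convert_face: Callable[[Img], Img],        # stands in for the face transformation model
    convert_background: Callable[[Img], Img],  # stands in for the background transformation model
    compose: Callable[[Img, Img], Img],        # stands in for the image compositing
) -> List[Tuple[Img, Img]]:
    """Return (real-life image, character image) pairs."""
    pairs = []
    for image in real_images:
        face_converted = convert_face(image)                        # step 312
        background_converted = convert_background(image)            # step 314
        character = compose(background_converted, face_converted)   # step 316
        pairs.append((image, character))                            # step 318
    return pairs


# String stand-ins make the data flow visible without any real model.
pairs = build_training_pairs(
    ["photo_a", "photo_b"],
    convert_face=lambda img: f"face({img})",
    convert_background=lambda img: f"bg({img})",
    compose=lambda bg, face: f"{bg}+{face}",
)
```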

The method of generating the training image data set using the face transformation model, the background transformation model, and image compositing is described in more detail below with reference to FIG. 9.

이상 도 1 및 도 2를 참조하여 전술된 기술적 특징에 대한 설명은, 도 3에 대해서도 그대로 적용될 수 있으므로 중복되는 설명은 생략한다.The description of the technical features described above with reference to FIGS. 1 and 2 can also be applied to FIG. 3 , so overlapping descriptions will be omitted.

도 4는 일 예에 따른, 변환 모델의 구축을 위한 학습용 얼굴 이미지들을 획득하는 방법을 나타내는 흐름도이다. Figure 4 is a flowchart showing a method of acquiring facial images for learning to build a transformation model, according to an example.

With reference to FIG. 4, the method of obtaining the "plurality of training face images including the face of the first character" used in generating the training image data set in step 310 described above is explained in more detail. The plurality of training face images may be the images input to the face transformation model in order to generate face conversion images corresponding to the plurality of real-life images.

이러한, 복수의 학습용 얼굴 이미지들은 제1 캐릭터를 포함하는 화상 또는 영상을 포함하는 콘텐츠로부터 획득될 수 있다. 복수의 학습용 얼굴 이미지들의 각각은 제1 캐릭터의 얼굴을 포함할 수 있다. These plurality of face images for learning may be obtained from content including an image or video including the first character. Each of the plurality of facial images for learning may include the face of the first character.

콘텐츠는 제1 캐릭터를 등장 인물로서 포함하는 웹툰 콘텐츠일 수 있다. 웹툰 콘텐츠는 적어도 하나의 회차를 포함하도록 구성될 수 있다. 웹툰 콘텐츠 또는 웹툰 콘텐츠의 회차는 복수의 컷들을 포함할 수 있다.The content may be webtoon content including the first character as a character. Webtoon content may be configured to include at least one episode. Webtoon content or an episode of webtoon content may include multiple cuts.

아래에서는, 단계들(410 내지 450)을 참조하여, 콘텐츠가 웹툰 콘텐츠인 경우에 있어서, 웹툰 콘텐츠로부터 복수의 학습용 얼굴 이미지들을 획득하는 방법을 더 자세하게 설명한다. Below, with reference to steps 410 to 450, a method of obtaining a plurality of facial images for learning from webtoon content when the content is webtoon content will be described in more detail.

단계(410)에서, 컴퓨터 시스템(100)은 콘텐츠로부터 소정의 개수의 컷들을 추출할 수 있다. At step 410, computer system 100 may extract a predetermined number of cuts from the content.

단계(420)에서, 컴퓨터 시스템(100)은 추출된 컷들로부터, 제1 캐릭터를 포함하는 적어도 하나의 캐릭터의 얼굴을 포함하는 제1 얼굴 이미지들을 추출할 수 있다. 콘텐츠는 복수의 캐릭터들을 포함할 수 있고, 컴퓨터 시스템(100)은 추출된 컷들의 각 컷에 포함된 캐릭터의 얼굴을 포함하는 이미지를 상기 제1 얼굴 이미지로서 추출할 수 있다. 제1 얼굴 이미지는 캐릭터의 얼굴에 해당하는 영역을 포함하도록 컷을 크롭한 이미지이거나, 컷 자체일 수도 있다. In step 420, the computer system 100 may extract first facial images including the face of at least one character including the first character from the extracted cuts. Content may include a plurality of characters, and the computer system 100 may extract an image including the face of a character included in each of the extracted cuts as the first face image. The first face image may be an image cropped from a cut to include an area corresponding to the character's face, or may be the cut itself.

In step 430, the computer system 100 may obtain corrected second face images by processing at least one of the first face images obtained in step 420. For example, the computer system 100 may obtain the corrected second face images by performing at least one of alignment, resizing, and resolution change on at least one of the first face images.

단계(440)에서, 컴퓨터 시스템(100)은 단계(430)에서 획득된 제2 얼굴 이미지들을 캐릭터별로 분류할 수 있다. 예컨대, 컴퓨터 시스템(100)은 제2 얼굴 이미지들을 캐릭터별로 클러스터링할 수 있다.In step 440, the computer system 100 may classify the second facial images obtained in step 430 by character. For example, the computer system 100 may cluster the second facial images by character.

단계(450)에서, 컴퓨터 시스템(100)은 단계(440)에서의 제2 얼굴 이미지들의 분류에 기반하여 제1 캐릭터의 얼굴을 포함하는 학습용 얼굴 이미지들을 획득할 수 있다. In step 450, the computer system 100 may obtain training facial images including the face of the first character based on the classification of the second facial images in step 440.

The number of training face images may be relatively small, for example, 100 or fewer (or between 50 and 100). In the embodiment, using the face transformation model described above, cartoonized face conversion images corresponding to a large number of real-life images can be obtained from this small number of training face images.

아래에서는, 도 7을 참조하여, 제1 캐릭터의 얼굴을 포함하는 학습용 얼굴 이미지들을 생성하는 방법에 대해 좀 더 자세하게 설명한다.Below, with reference to FIG. 7, a method of generating facial images for learning including the face of the first character will be described in more detail.

학습용 얼굴 이미지들은 제1 캐릭터의 '원본 데이터 셋'으로 명명될 수 있다. Facial images for learning may be named 'original data set' of the first character.

도 7은 일 예에 따른, 웹툰 콘텐츠에 포함되는 제1 캐릭터의 원본 데이터 셋을 콘텐츠로부터 획득하는 방법을 나타낸다. Figure 7 shows a method of obtaining the original data set of the first character included in webtoon content from content, according to an example.

In step 710, the computer system 100 may extract images of the cuts contained in the content on a per-content (that is, per-work) basis, where the content is webtoon content. For example, the computer system 100 may download the webtoon content and extract cut-level images from it using the ToonCutter algorithm. The computer system 100 may extract a predetermined number of cuts, for example, 100 cuts. This may correspond to step 410 described above.
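
As a non-limiting illustration, cut extraction may be sketched as splitting a long vertical strip wherever fully blank rows separate the panels. The internals of the ToonCutter algorithm are not described here, so the following is only a simple stand-in under that assumption; the `split_into_cuts` function, the blank value of 255, and the toy strip are hypothetical.

```python
from typing import List, Tuple

Strip = List[List[int]]  # grayscale webtoon strip as rows of pixel values


def split_into_cuts(strip: Strip, blank: int = 255,
                    max_cuts: int = 100) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) ranges of non-blank runs, end exclusive."""
    cuts: List[Tuple[int, int]] = []
    start = None
    for y, row in enumerate(strip):
        is_blank = all(pixel == blank for pixel in row)
        if not is_blank and start is None:
            start = y                      # a new cut begins
        elif is_blank and start is not None:
            cuts.append((start, y))        # the current cut ends at a blank row
            start = None
            if len(cuts) == max_cuts:      # stop at the predetermined number
                break
    if start is not None and len(cuts) < max_cuts:
        cuts.append((start, len(strip)))   # cut that runs to the end of the strip
    return cuts


strip = [[255, 255], [0, 255], [0, 0], [255, 255], [255, 255], [7, 255]]
cut_ranges = split_into_cuts(strip)
```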

단계(720)에서, 컴퓨터 시스템(100)은 추출된 컷들의 각각에서 웹툰 콘텐츠에 포함되는 제1 캐릭터의 얼굴을 탐지하여 추출할 수 있다. 예컨대, 컴퓨터 시스템(100)은 컷으로부터의 캐릭터의 얼굴을 탐지하기 위해 오픈 소스 알고리즘으로서, YoloV4 알고리즘을 사용할 수 있다. 이는 전술한 단계(420)에 대응할 수 있다.In step 720, the computer system 100 may detect and extract the face of the first character included in the webtoon content from each of the extracted cuts. For example, computer system 100 may use the YoloV4 algorithm, an open source algorithm, to detect the face of a character from a cut. This may correspond to step 420 described above.

단계(730)에서, 컴퓨터 시스템(100)은 단계(720)에 따라 탐지된 얼굴에 해당하는 얼굴 이미지들을 추출할 수 있고, 이에 대해 얼라인먼트를 수행할 수 있다. 예컨대, 컴퓨터 시스템(100)은 얼굴 이미지들의 추출 및/또는 얼라인먼트를 위해 오픈 소스 알고리즘으로서, StyleGAN2 및 OpenCV 알고리즘을 사용할 수 있다. In step 730, the computer system 100 may extract facial images corresponding to the face detected in step 720 and perform alignment on them. For example, computer system 100 may use the StyleGAN2 and OpenCV algorithms, which are open source algorithms, for extraction and/or alignment of facial images.

추출된 얼굴 이미지들에 대해 수행되는 얼라인먼트는 얼굴 이미지에 포함된 얼굴의 방향을 정면을 바라보도록 보정하는 것일 수 있고, 얼굴 이미지를 리사이징(resizing)하는 것을 포함할 수 있다. 리사이징되는 얼굴 이미지의 크기는 256X256일 수 있다. Alignment performed on the extracted face images may correct the direction of the face included in the face image to face the front, and may include resizing the face image. The size of the resized face image may be 256X256.
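
As a non-limiting illustration, the cropping of a detected face region and its resizing to 256X256 may be sketched as follows. The sketch uses nearest-neighbor sampling in plain Python in place of an image library and omits the frontal alignment itself; the `crop` and `resize_nearest` functions and the bounding-box format are assumptions made for this sketch.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def crop(image: Image, box: Tuple[int, int, int, int]) -> Image:
    """Crop to (left, top, right, bottom); right and bottom are exclusive."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(image: Image, width: int, height: int) -> Image:
    """Resize by nearest-neighbor sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


cut = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "cut"
face = crop(cut, (1, 1, 3, 3))                           # hypothetical detected face box
face_256 = resize_nearest(face, 256, 256)
```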

단계(740)에서, 컴퓨터 시스템(100)은 얼굴 이미지들에 대해 초해상화 처리를 수행할 수 있다. 예컨대, 얼굴 이미지에 포함된 얼굴의 크기가 너무 작은 경우 컴퓨터 시스템(100)은 오픈 소스 알고리즘으로서, Real-esrgan와 같은 초해상화 알고리즘을 사용하여 해당 얼굴 이미지의 해상도를 높일 수 있다. 초해상화 알고리즘에 따라 리사이징되는 얼굴 이미지의 크기는 1024X1024일 수 있다.At step 740, the computer system 100 may perform super-resolution processing on facial images. For example, if the size of the face included in the face image is too small, the computer system 100 can increase the resolution of the face image by using a super-resolution algorithm such as Real-esrgan, which is an open source algorithm. The size of the face image resized according to the super-resolution algorithm may be 1024X1024.

단계들(730 및 740)은 전술한 단계(430)에 대응할 수 있다. 단계들(730 및 740)은 얼굴 이미지들을 중에서 보정이 필요한 이미지들에 대해서만 수행될 수도 있다. Steps 730 and 740 may correspond to step 430 described above. Steps 730 and 740 may be performed only on facial images that require correction.

In step 750, the computer system 100 may classify the face images processed in steps 730 and 740 into per-character images. For example, the computer system 100 may classify the face images by character using the K-means clustering algorithm, an open source algorithm. Based on the classification result, the computer system 100 may obtain the face images of the first character, which will be used to build the training image data set, as the 'original data set' of the first character. Step 750 may correspond to steps 440 and 450 described above.
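
As a non-limiting illustration, the per-character classification may be sketched with a small K-means implementation. The sketch assumes that each face image has already been reduced to a numeric feature vector; the feature extraction, the deterministic initialization from the first k points, and the sample vectors are assumptions and are not described in the embodiment.

```python
from typing import List, Sequence


def kmeans(points: Sequence[Sequence[float]], k: int,
           iterations: int = 20) -> List[int]:
    """Cluster points into k groups and return one label per point."""
    centroids = [list(p) for p in points[:k]]  # simple deterministic initialization
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy 2-D features standing in for embeddings of face images of two characters.
features = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2)]
labels = kmeans(features, k=2)
# Indices of the images that fall in the same cluster as image 0.
first_character_indices = [i for i, l in enumerate(labels) if l == labels[0]]
```

In practice the number of clusters would correspond to the number of characters appearing in the extracted cuts, and the cluster containing the first character would be selected as the 'original data set'.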

The number of training face images of the first character obtained as the 'original data set' may be relatively small, for example, 100 or fewer (or between 50 and 100).

이상 도 1 내지 도 3을 참조하여 전술된 기술적 특징에 대한 설명은, 도 4 및 도 7에 대해서도 그대로 적용될 수 있으므로 중복되는 설명은 생략한다.The description of the technical features described above with reference to FIGS. 1 to 3 can also be applied to FIGS. 4 and 7 , so overlapping descriptions will be omitted.

도 5는 일 예에 따른, 변환 모델의 구축을 위한 학습용 이미지 데이터 셋을 생성하기 위해, 얼굴 변환 모델을 사용하여 각 실사 이미지에 대응하는 얼굴 변환 이미지를 생성하는 방법을 나타내는 흐름도이다.FIG. 5 is a flowchart illustrating a method of generating a face transformation image corresponding to each real-life image using a face transformation model to generate a training image data set for building a transformation model, according to an example.

후술될 단계들(510 내지 530)에 의해, 얼굴 변환 모델을 사용하여, 실사 이미지인 복수의 이미지들의 각 이미지에 대응하는 얼굴 변환 이미지가 생성될 수 있다.Through steps 510 to 530, which will be described later, a face transformation image corresponding to each image of the plurality of images that are real-life images may be generated using a face transformation model.

단계(510)에서, 컴퓨터 시스템(100)은 전술한 단계(450)에 의해 획득된 복수의 학습용 얼굴 이미지들에 기반하여 얼굴 변환 모델의 잠재 공간(latent space)을 정의할 수 있다. 즉, 컴퓨터 시스템(100)은 획득된 복수의 학습용 얼굴 이미지들을 얼굴 변환 모델의 잠재 공간으로 변환시킬(inversion)수 있다. In step 510, the computer system 100 may define a latent space of a face transformation model based on the plurality of facial images for training acquired in step 450 described above. That is, the computer system 100 can invert the acquired plurality of facial images for learning into the latent space of the face transformation model.

단계(520)에서, 컴퓨터 시스템(100)은 잠재 공간에 기반하여 실사 이미지인 복수의 이미지들의 각 이미지의 변환을 위한 변환 코드(inversion code)를 생성할 수 있다. In step 520, the computer system 100 may generate an inversion code for converting each image of the plurality of images that are real images based on the latent space.

단계(530)에서, 컴퓨터 시스템(100)은 생성된 변환 코드에 기반하여, 얼굴 변환 모델을 통해 각 이미지에 대응하는 얼굴 변환 이미지를 생성할 수 있다.In step 530, the computer system 100 may generate a face transformation image corresponding to each image through a face transformation model based on the generated transformation code.
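
As a non-limiting illustration, steps 520 and 530 may be sketched with toy generators. The sketch assumes that the inversion code is found by gradient descent on a reconstruction error and that the same code is then passed to the face transformation model; the two linear "generators", the finite-difference gradient, and all numeric values are hypothetical stand-ins, not the models of the embodiment.

```python
from typing import Callable, List

Vector = List[float]
Generator = Callable[[Vector], Vector]


def invert(generator: Generator, target: Vector, dim: int,
           steps: int = 500, lr: float = 0.05, eps: float = 1e-4) -> Vector:
    """Find a latent code whose generated output approximates `target`."""

    def loss(code: Vector) -> float:
        return sum((g - t) ** 2 for g, t in zip(generator(code), target))

    code = [0.0] * dim
    for _ in range(steps):
        grad = []
        for i in range(dim):
            # Central finite difference along latent dimension i.
            plus = code[:i] + [code[i] + eps] + code[i + 1:]
            minus = code[:i] + [code[i] - eps] + code[i + 1:]
            grad.append((loss(plus) - loss(minus)) / (2 * eps))
        code = [c - lr * g for c, g in zip(code, grad)]
    return code


def photo_generator(code: Vector) -> Vector:
    """Toy stand-in for a generator of real-life faces (assumption)."""
    return [2.0 * code[0], 3.0 * code[1]]


def face_transformation_model(code: Vector) -> Vector:
    """Toy stand-in for the face transformation model sharing the latent space."""
    return [10.0 + 2.0 * code[0], 10.0 + 3.0 * code[1]]


real_image = [4.0, 9.0]                                            # toy "real-life image"
inversion_code = invert(photo_generator, real_image, dim=2)        # step 520
face_conversion_image = face_transformation_model(inversion_code)  # step 530
```

Because both generators read the same latent code, attributes encoded in the code carry over from the real-life image to the converted output, which is the intuition behind the matching of face direction and part positions.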

With the face transformation model of the embodiment, face-transformed images corresponding to a large number (e.g., 10,000 or more) of real-life images may be generated from a small number (e.g., 100 or fewer) of training face images of the first character. The face in each real-life image and the face in its corresponding face-transformed image may match each other.

In this regard, a method of generating the face transformation model, and a method of using it to generate a face-transformed image corresponding to each real-life image, are described in more detail below with reference to FIG. 8.

FIG. 8 illustrates, according to an example, a method of obtaining face-transformed images corresponding to real-life images using the face transformation model, based on the original data set of the first character obtained through steps 710 to 750 of FIG. 7.

Because each real-life image and its corresponding face-transformed image form a pair, the resulting data may be referred to as the "paired data set for conversion model training" of the first character, that is, a per-character data set.

In step 810, the computer system 100 may generate a face model of the first character based on the plurality of training face images of the first character acquired in step 450 described above (that is, the face images of the first character obtained in step 750 described above). The computer system 100 may generate the face model of the first character using, for example, the StyleGAN2 algorithm. Specifically, the computer system 100 may generate the face model of the first character by fine-tuning on the data set of training face images of the first character, using a StyleGAN2 model pre-trained on Flickr-Faces-HQ (FFHQ) and a StyleGAN2-ADA model.

In step 820, the computer system 100 may generate the face transformation model by blending the fine-tuned model with the FFHQ pre-trained model. By default, the models may be blended at a resolution of 32x32. Depending on the characteristics of the character, however, the models may be blended at a different resolution. Various blending algorithms may be used for the blending.
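
One common way to blend two generators "at a resolution" is to swap layers by resolution, and the Python sketch below illustrates that idea only. The description does not say which blending algorithm is used or which model supplies which layers, so the function `blend_models`, the `low_from_finetuned` flag, and the representation of a model as a mapping from layer resolution to a weight list are all hypothetical.

```python
from typing import Dict, List

# Toy model representation (assumption): layer resolution -> flat weight list.
Weights = Dict[int, List[float]]


def blend_models(base: Weights, finetuned: Weights,
                 swap_resolution: int = 32,
                 low_from_finetuned: bool = True) -> Weights:
    """Build a blended model by taking whole layers from one of two models.

    Layers at or below `swap_resolution` come from one model and layers
    above it from the other; `low_from_finetuned` picks the direction.
    """
    if base.keys() != finetuned.keys():
        raise ValueError("both models must have the same layer resolutions")
    blended: Weights = {}
    for resolution in sorted(base):
        is_low = resolution <= swap_resolution
        source = finetuned if is_low == low_from_finetuned else base
        blended[resolution] = list(source[resolution])  # copy, do not alias
    return blended


resolutions = [4, 8, 16, 32, 64, 128]
ffhq_model = {r: [0.0] for r in resolutions}        # stands in for the FFHQ pre-trained model
character_model = {r: [1.0] for r in resolutions}   # stands in for the fine-tuned model
face_transformation_model = blend_models(ffhq_model, character_model)
```

Changing `swap_resolution` corresponds to blending at a different resolution for characters with different characteristics.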

In step 830, the computer system 100 may invert the plurality of training face images of the first character into the latent space of the face transformation model generated by the model blending of step 820. The computer system 100 may generate, based on the latent space, an inversion code for transforming each of the plurality of real-life images. Based on the generated inversion code, the computer system 100 may generate a face-transformed image by generating, through the face transformation model, a character face texture corresponding to each image. The computer system 100 may use, for example, the StyleGAN2 algorithm to generate the character face texture. These operations may correspond to steps 510 to 530 described above.

In step 840, the computer system 100 may assemble each generated face-transformed image and its corresponding real-life image into a paired training data set.

The plurality of real-life images described above may be randomly collected images. Each of the images may include the face of a person and may also include the faces of several people. The images may be collected using various existing algorithms.

For example, the plurality of images may be real-life images obtained by decoding a first number (e.g., 10,000) of random latent codes with the FFHQ pre-trained model. That is, the plurality of images may be 10,000 real-life images.

The face transformation model may generate the face-transformed images corresponding to the plurality of images by combining each random latent code with a latent code corresponding to the first character and then decoding the result. In other words, performing this decoding through the model blended in step 820 may yield 10,000 face-transformed images corresponding to the 10,000 real-life images.
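
The pairing described above, in which the same random latent codes are decoded once into real-life images and once into face-transformed images, might look like the following Python sketch. The two `toy_*` decoders are hypothetical stand-ins for the FFHQ pre-trained model and the blended face transformation model, and the combination with the character's latent code is folded into `toy_character_decoder` for brevity.

```python
import random
from typing import Callable, List, Tuple

Code = List[float]
Image = List[float]


def sample_codes(n: int, dim: int, seed: int = 0) -> List[Code]:
    """Draw `n` random latent codes from a standard normal distribution."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


def build_pairs(codes: List[Code],
                decode_real: Callable[[Code], Image],
                decode_character: Callable[[Code], Image]
                ) -> List[Tuple[Image, Image]]:
    """Decode every code with both models and pair the two results."""
    return [(decode_real(code), decode_character(code)) for code in codes]


def toy_real_decoder(code: Code) -> Image:
    """Hypothetical stand-in for the FFHQ pre-trained model."""
    return [2.0 * v for v in code]


def toy_character_decoder(code: Code) -> Image:
    """Hypothetical stand-in for the blended face transformation model."""
    return [2.0 * v + 1.0 for v in code]


codes = sample_codes(n=5, dim=4)  # the example in the text uses 10,000 codes
pairs = build_pairs(codes, toy_real_decoder, toy_character_decoder)
```

Since both images in a pair come from the same code, the pair is aligned by construction and no separate matching step is needed.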

By using such a face transformation model, the embodiment can generate, from an original data set consisting of only a small number of images, a large number of face-transformed images that match a large number of easily obtained real-life images, that is, images in which the orientation of the face, the composition of the face, and the positions of the facial parts match.

In this regard, FIG. 12 illustrates, according to an example, a method of using the face transformation model to convert a first number of real-life images into a corresponding first number of cartoonized face-transformed images, based on a second number of training face images that include the face of the character.

FIG. 12 schematically shows a method of converting the first number of real-life images into the first number of face-transformed images using the face transformation model 1200. In the embodiment, obtaining the first number of face-transformed images may require only a second number of training face images, where the second number is much smaller than the first number.

The face transformation model 1200 may encode each real-life image through an encoder, combine the random latent code corresponding to each real-life image with the latent code of the character obtained by learning the training face images, and decode the combined code through a decoder to generate the face-transformed image corresponding to each real-life image.
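
The description does not state how the two latent codes are combined. One plausible reading is per-layer style mixing, sketched below in Python as an assumption: the first layers keep the code of the real-life image (pose and composition) and the remaining layers take the character's code (texture and style). The function `mix_codes` and the parameter `n_content_layers` are hypothetical names.

```python
from typing import List

LayerCode = List[float]


def mix_codes(content_code: List[LayerCode],
              character_code: List[LayerCode],
              n_content_layers: int) -> List[LayerCode]:
    """Combine two per-layer latent codes by taking layers from each.

    Layers [0, n_content_layers) come from `content_code`; the rest
    come from `character_code`.
    """
    if len(content_code) != len(character_code):
        raise ValueError("codes must have the same number of layers")
    if not 0 <= n_content_layers <= len(content_code):
        raise ValueError("n_content_layers is out of range")
    head = [list(layer) for layer in content_code[:n_content_layers]]
    tail = [list(layer) for layer in character_code[n_content_layers:]]
    return head + tail


real_image_code = [[1.0], [2.0], [3.0], [4.0]]      # code of one real-life image
character_code = [[10.0], [20.0], [30.0], [40.0]]   # code learned for the character
mixed = mix_codes(real_image_code, character_code, n_content_layers=2)
```

The mixed code would then be passed to the decoder to produce the face-transformed image.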

The face transformation model 1200 may use, for example, the DualStyleGAN algorithm or the JoJoGAN algorithm.

The technical features described above with reference to FIGS. 1 to 4 and FIG. 7 apply equally to FIGS. 5, 8, and 12, so redundant descriptions are omitted.

FIG. 9 illustrates, according to an example, a method of changing the backgrounds of real-life images into cartoonized backgrounds using a background transformation model, generating the final training image data set, and building the conversion model.

In step 910, the computer system 100 may collect background data for training the background transformation model. The background data may include images extracted from content that includes the first character, and the background transformation model may learn this background data. The extracted images may be images that contain no characters or other objects. That is, each extracted image may contain no face and only natural objects such as trees, mountains, sky, or sea. As an example, the computer system 100 may build the data set for training the background transformation model by collecting 2,000 such images.

Meanwhile, the computer system 100 may additionally collect real-life images that contain only a background, that is, only natural objects, and may include the collected real-life images in the training data set. As an example, the computer system 100 may build the data set for training the background transformation model by collecting a further 2,000 real-life images. In other words, the background transformation model of the embodiment may be trained further based on real-life images that contain only a background.

Including these real-life images in the training data set may improve the performance of the background transformation model.

In step 920, the computer system 100 may generate the background transformation model by learning the training data set built in step 910. The background transformation model may use, for example, the AnimeGAN algorithm.

In step 940, the background transformation model may convert the background of each of the plurality of real-life images into a cartoonized background of the content that includes the first character. The background transformation model may thereby generate a background-transformed image corresponding to each image. Meanwhile, as in step 930, the background transformation model may convert the real-life images collected in step 910 into cartoonized backgrounds of the content that includes the first character. The result of the conversion in step 930 may be used in building the final training image data set in step 960, described below.

In step 950, the computer system 100 may combine the cartoonized face obtained by transforming the face of each real-life image with the face transformation model and the cartoonized background obtained by transforming the background of each real-life image with the background transformation model. That is, the computer system 100 may combine the face-transformed image and the background-transformed image described above to generate a character image for building the final training image data set. For example, the computer system 100 may generate the character image corresponding to each real-life image by compositing the cartoonized face contained in the face-transformed image onto the background-transformed image. This may correspond to step 316 described above.

To combine the face-transformed image and the background-transformed image, the computer system 100 may use, for example, the Ibug face parsing bisenet algorithm.
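
The compositing of step 950 can be sketched as mask-based blending. In the Python sketch below, the face mask is assumed to come from a face parsing model and is simply given as input; the function `composite`, the grayscale nested-list image format, and the soft mask values in [0, 1] are assumptions made to keep the example dependency-free.

```python
from typing import List

Image = List[List[float]]  # H x W grayscale pixels, for brevity


def composite(face_image: Image, background_image: Image,
              face_mask: Image) -> Image:
    """Paste the masked face region onto the background-transformed image.

    Each output pixel is mask * face + (1 - mask) * background, so a mask
    value of 1 keeps the cartoonized face and 0 keeps the cartoonized
    background; values in between soften the seam.
    """
    if not (len(face_image) == len(background_image) == len(face_mask)):
        raise ValueError("all inputs must have the same height")
    result: Image = []
    for face_row, bg_row, mask_row in zip(face_image, background_image, face_mask):
        if not (len(face_row) == len(bg_row) == len(mask_row)):
            raise ValueError("all inputs must have the same width")
        result.append([m * f + (1.0 - m) * b
                       for f, b, m in zip(face_row, bg_row, mask_row)])
    return result


face_transformed = [[1.0, 1.0], [1.0, 1.0]]        # cartoonized face image
background_transformed = [[0.0, 0.0], [0.0, 0.0]]  # cartoonized background image
mask = [[1.0, 0.5], [0.0, 0.0]]                    # hypothetical face parsing output
character_image = composite(face_transformed, background_transformed, mask)
```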

In step 960, the computer system 100 may finally generate the training image data set, which consists of pairs of a character image generated by the compositing in step 950 and its corresponding real-life image. This may correspond to step 318 described above. Meanwhile, the computer system 100 may additionally use the result of the conversion in step 930 to build the final training image data set. For example, the final training image data set may further include the result of the conversion in step 930.

In this regard, FIG. 11 illustrates, according to an example, a method of generating, for each real-life image, a character image in which the face and the background are cartoonized, in order to generate the training image data set for building the conversion model.

As illustrated, the face in a real-life image may be cartoonized into the first character using the face transformation model described above. When the face of a real-life image is transformed using the face transformation model, the background of the real-life image may not be cartoonized in the resulting face-transformed image and may instead be degraded, for example by being blurred. Consequently, if the conversion model is built using pairs of a real-life image and a face-transformed image as its training image data set, the resulting conversion model cannot properly cartoonize the background 30 of the target image 10.

In the embodiment, therefore, the background of the real-life image may be cartoonized using the background transformation model. Compositing the background-transformed image, in which the background is cartoonized into a background of the content, with the face-transformed image may produce a character image in which both the face and the background of the real-life image are cartoonized. In the embodiment, building the conversion model with pairs of such a character image and a real-life image as the training image data set allows the resulting conversion model to cartoonize both the face 20 and the background 30 of the target image 10 effectively.

In step 970, the computer system 100 may generate the conversion model by learning the final training image data set generated in step 960. As described above with reference to FIG. 1, the generated conversion model may convert a target image 10 input to it into a character image 50 in which the background 30 and the face 20 are cartoonized. To generate the conversion model, the computer system 100 may use, for example, the DynamicUnet algorithm.

The technical features described above with reference to FIGS. 1 to 5, 7, 8, and 12 apply equally to FIGS. 9 and 11, so redundant descriptions are omitted.

Meanwhile, the conversion model of the embodiment may be implemented so that the target image 10 can be cartoonized into each of a plurality of characters included in the content. That is, the conversion model of the embodiment may, as a single model, support converting the target image 10 into a converted image 50 cartoonized as any of the plurality of characters included in the content.

Such a single conversion model may be built as follows.

For example, in step 310 described above, the computer system 100 may generate a training image data set consisting of pairs of a plurality of real-life images and the corresponding character images cartoonized as each of the plurality of characters. The computer system 100 may build the conversion model by learning this training image data set of pairs of 'a character image cartoonized as each character' and 'a real-life image'. In building the training image data set for each character, the face images clustered by character in step 440 described above (that is, the training face images) may be used. The resulting conversion model may be able to convert the target image into a character image cartoonized as each character.

In this way, the conversion model may be implemented to support cartoonizing the target image 10 into each of the webtoon's characters, which may have various levels of facial abstraction.

FIG. 6 is a flowchart illustrating, according to an embodiment, a method of converting a target image into a character image in which the background and the face are cartoonized, using the built conversion model.

A method of obtaining the converted image 50 from the target image 10 by inference using the conversion model is described in more detail with reference to FIG. 6.

The computer system that performs steps 610 to 630 or steps 1010 to 1030 in the embodiments described below may be a user terminal that generates the converted image 50 from the target image 10 using the built conversion model. Such a user terminal may be a device separate from the computer system 100 that builds the conversion model, but for convenience of description the computer system 100 is described below as performing steps 610 to 630 or steps 1010 to 1030.

In step 610, the computer system 100 may receive, as input to the built conversion model, a target image 10 that includes a background 30 and a face 20. In other words, the computer system 100 may receive the target image 10 and input it into the conversion model. The target image 10 received by the computer system 100 may be input by a user through a user interface (UI) provided by the computer system 100. The target image 10 may be a video or image selected by the user through the UI, or a video or image captured in real time by a camera included in the computer system 100. As described above, the conversion model may have been built by learning a training image data set consisting of pairs of a plurality of images, each including a background and a face, and the corresponding character images cartoonized as the first character. Here, the character image corresponding to each of the images constituting the training image data set may have been generated by combining (i) the result of transforming, with the face transformation model, the face of that image into a face cartoonized as the first character based on a plurality of training face images including the face of the first character, and (ii) the result of transforming, with the background transformation model, the background of that image into a cartoonized background.

In step 620, the computer system 100 may generate, by inference based on the conversion model, a converted image 50 in which the face 20 of the target image 10 is converted into a face 60 cartoonized as the first character and the background 30 of the target image 10 is converted into a cartoonized background 70 of the content that includes the first character.

In step 630, the computer system 100 may output the converted image 50, for example through a display. The computer system 100 may display the converted image 50 in place of the target image 10, or may display the target image 10 and the converted image 50 together so that the two can be compared. For a target image 10 that is a video or image captured in real time, the computer system 100 may display the cartoonized converted image 50 in real time.

Meanwhile, as described above, when the content that includes the first character includes a plurality of characters, the conversion model may have been built to be able to convert the target image 10 into a character image 50 cartoonized as each of the plurality of characters.

In this case, as in step 615, the computer system 100 may provide a user interface (UI) for selecting, from among the plurality of characters of the content, the character into which the target image 10 is to be converted. The first character may be the character selected through the UI. The computer system 100 may generate the converted image 50 by cartoonizing the target image 10 as the selected first character.
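
The description does not specify how a single conversion model is told which character to use. The Python sketch below assumes one simple possibility, a one-hot condition vector built from the character selected in the UI; the names `one_hot`, `convert`, `toy_model`, and the character identifiers are all hypothetical.

```python
from typing import Callable, List, Sequence

Image = List[float]
ConditionalModel = Callable[[Image, List[float]], Image]


def one_hot(character: str, characters: Sequence[str]) -> List[float]:
    """Encode the selected character as a one-hot condition vector."""
    if character not in characters:
        raise ValueError(f"unknown character: {character!r}")
    return [1.0 if name == character else 0.0 for name in characters]


def convert(target: Image, character: str, characters: Sequence[str],
            model: ConditionalModel) -> Image:
    """Run the single conversion model, conditioned on the selected character."""
    return model(target, one_hot(character, characters))


def toy_model(image: Image, condition: List[float]) -> Image:
    """Hypothetical conditional model: shifts pixels by the selected index."""
    shift = sum(index * flag for index, flag in enumerate(condition))
    return [pixel + shift for pixel in image]


CHARACTERS = ["character_a", "character_b", "character_c"]
# The character name would come from the selection UI of step 615.
converted = convert([0.0, 0.5], "character_c", CHARACTERS, toy_model)
```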

In this way, the conversion model of the embodiment may, as a single model, support converting the target image 10 into a converted image 50 cartoonized as any of the plurality of characters included in the content. For example, a user may select a desired character from among the characters appearing in webtoon content and request that the target image 10 be converted into the drawing style of that character.

Meanwhile, the target image 10 may be a video including a plurality of frames. In this case, the converted image 50 may be a converted video in which the face 20 and the background 30 in the video are converted into the cartoonized face 60 and the cartoonized background 70.

The conversion model may generate the converted image 50 by converting the target image 10 frame by frame. When the target image 10 is a video that is input or captured in real time, the converted image 50 may also be generated in real time.
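
Frame-by-frame conversion maps naturally onto a lazy generator, as in the Python sketch below. The per-frame function `toy_convert_frame` is a hypothetical placeholder for inference with the conversion model; only the streaming structure is the point of the example.

```python
from typing import Callable, Iterable, Iterator, List

Frame = List[List[float]]  # H x W grayscale pixels, for brevity


def convert_video(frames: Iterable[Frame],
                  convert_frame: Callable[[Frame], Frame]) -> Iterator[Frame]:
    """Yield one converted frame per input frame, as soon as it is ready.

    Because this is a generator, frames from a live camera can be converted
    and displayed one at a time without buffering the whole video.
    """
    for frame in frames:
        yield convert_frame(frame)


def toy_convert_frame(frame: Frame) -> Frame:
    """Hypothetical stand-in for conversion model inference: inverts pixel values."""
    return [[1.0 - pixel for pixel in row] for row in frame]


video = [[[0.0, 1.0]], [[0.25, 0.75]]]  # two tiny 1 x 2 frames
converted_video = list(convert_video(video, toy_convert_frame))
```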

In this regard, FIG. 10 illustrates a method of outputting the converted image 50 from the target image 10 by inference using the conversion model.

In step 1010, the target image 10 to be inferred on may be input to the built conversion model. This may correspond to step 610 described above.

In step 1020, the conversion model may convert the target image 10 into the cartoonized converted image 50 through inference. This may correspond to step 620 described above.

In step 1030, the computer system 100 may output the generated converted image 50. This may correspond to step 630 described above.

The technical features described above with reference to FIGS. 1 to 5, 7 to 9, 11, and 12 apply equally to FIGS. 6 and 10, so redundant descriptions are omitted.

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the apparatuses and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described, but a person of ordinary skill in the art will appreciate that a processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, a processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

소프트웨어는 컴퓨터 프로그램(computer program), 코드(code), 명령(instruction), 또는 이들 중 하나 이상의 조합을 포함할 수 있으며, 원하는 대로 동작하도록 처리 장치를 구성하거나 독립적으로 또는 결합적으로(collectively) 처리 장치를 명령할 수 있다. 소프트웨어 및/또는 데이터는, 처리 장치에 의하여 해석되거나 처리 장치에 명령 또는 데이터를 제공하기 위하여, 어떤 유형의 기계, 구성요소(component), 물리적 장치, 가상 장치(virtual equipment), 컴퓨터 저장 매체 또는 장치에서 구체화(embody)될 수 있다. 소프트웨어는 네트워크로 연결된 컴퓨터 시스템 상에 분산되어서, 분산된 방법으로 저장되거나 실행될 수도 있다. 소프트웨어 및 데이터는 하나 이상의 컴퓨터 판독 가능 기록 매체에 저장될 수 있다.Software may include a computer program, code, instructions, or a combination of one or more of these, which may configure a processing unit to operate as desired, or may be processed independently or collectively. You can command the device. Software and/or data may be used on any type of machine, component, physical device, virtual equipment, computer storage medium or device to be interpreted by or to provide instructions or data to a processing device. It can be embodied in . Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

실시예에 따른 방법은 다양한 컴퓨터 수단을 통하여 수행될 수 있는 프로그램 명령 형태로 구현되어 컴퓨터 판독 가능 매체에 기록될 수 있다. 이때, 매체는 컴퓨터로 실행 가능한 프로그램을 계속 저장하거나, 실행 또는 다운로드를 위해 임시 저장하는 것일 수도 있다. 또한, 매체는 단일 또는 수개 하드웨어가 결합된 형태의 다양한 기록수단 또는 저장수단일 수 있는데, 어떤 컴퓨터 시스템에 직접 접속되는 매체에 한정되지 않고, 네트워크 상에 분산 존재하는 것일 수도 있다. 매체의 예시로는, 하드 디스크, 플로피 디스크 및 자기 테이프와 같은 자기 매체, CD-ROM 및 DVD와 같은 광기록 매체, 플롭티컬 디스크(floptical disk)와 같은 자기-광 매체(magneto-optical medium), 및 ROM, RAM, 플래시 메모리 등을 포함하여 프로그램 명령어가 저장되도록 구성된 것이 있을 수 있다. 또한, 다른 매체의 예시로, 애플리케이션을 유통하는 앱 스토어나 기타 다양한 소프트웨어를 공급 내지 유통하는 사이트, 서버 등에서 관리하는 기록매체 내지 저장매체도 들 수 있다.The method according to the embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. At this time, the medium may continuously store a computer-executable program, or temporarily store it for execution or download. In addition, the medium may be a variety of recording or storage means in the form of a single or several pieces of hardware combined. It is not limited to a medium directly connected to a computer system and may be distributed over a network. Examples of media include magnetic media such as hard disks, floppy disks, and magnetic tapes, optical recording media such as CD-ROMs and DVDs, magneto-optical media such as floptical disks, And there may be something configured to store program instructions, including ROM, RAM, flash memory, etc. Additionally, examples of other media include recording or storage media managed by app stores that distribute applications, sites or servers that supply or distribute various other software, etc.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, those of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

A method, performed by a computer system, of building a conversion model for converting a target image into a cartoonized character image, the method comprising:
generating a training image data set consisting of pairs of a plurality of images, each including a background and a face, and character images cartoonized as a first character that correspond to the plurality of images; and
building the conversion model by training on the generated training image data set,
wherein generating the training image data set comprises:
converting, using a face conversion model, the face of each image of the plurality of images into a cartoonized face of the first character based on a plurality of training face images including the face of the first character;
converting the background of each image into a cartoonized background using a background conversion model; and
generating a character image corresponding to each image by compositing the cartoonized face and the cartoonized background.
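
For illustration only, the data set generation steps recited above can be sketched as follows. This is a minimal sketch and not part of the claims. The toy image representation, the per-pixel face mask used for compositing, and the names `composite`, `build_training_pairs`, `brighten`, and `darken` are all assumptions introduced here; the claim does not specify how the cartoonized face and background are combined.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]   # toy grayscale image: rows of pixel values
Mask = List[List[bool]]   # assumed mask: True where a pixel belongs to the face


def composite(face: Image, background: Image, mask: Mask) -> Image:
    """Take face pixels where the mask is True and background pixels elsewhere."""
    return [
        [f if m else b for f, b, m in zip(face_row, bg_row, mask_row)]
        for face_row, bg_row, mask_row in zip(face, background, mask)
    ]


def build_training_pairs(
    images: List[Image],
    face_masks: List[Mask],
    face_model: Callable[[Image], Image],
    background_model: Callable[[Image], Image],
) -> List[Tuple[Image, Image]]:
    """Pair each source image with its composited character image."""
    pairs = []
    for image, mask in zip(images, face_masks):
        toon_face = face_model(image)              # cartoonized face
        toon_background = background_model(image)  # cartoonized background
        pairs.append((image, composite(toon_face, toon_background, mask)))
    return pairs


# Hypothetical stand-in "models" so the sketch runs; real ones would be trained networks.
def brighten(image: Image) -> Image:
    return [[p + 100 for p in row] for row in image]


def darken(image: Image) -> Image:
    return [[p // 2 for p in row] for row in image]
```

Each resulting pair (source image, character image) is one training example for the conversion model, which is why a single model can later convert face and background together.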

The method of claim 1, wherein
the first character is included in content including an image or video, and
generating the training image data set comprises
obtaining, from the content, the training face images including the face of the first character.

The method of claim 2, wherein
the content is webtoon content including a plurality of cuts, and
obtaining the training face images comprises:
extracting a predetermined number of cuts from the content;
extracting, from the cuts, first face images including the face of at least one character including the first character;
obtaining corrected second face images by performing at least one of alignment, resizing, and resolution change on at least one of the first face images; and
obtaining the training face images including the face of the first character by clustering the second face images by character.
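
As an illustration of the final clustering step, the sketch below groups face images by character from per-face feature vectors. It is a minimal sketch under stated assumptions: the claim does not name a clustering algorithm, so the greedy cosine-similarity grouping, the `threshold` value, and the idea that each face image has already been reduced to a non-zero embedding vector are all hypothetical choices made here.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_by_character(
    embeddings: Sequence[Sequence[float]], threshold: float = 0.9
) -> List[List[int]]:
    """Greedily assign each face to the first cluster whose seed it resembles.

    Returns clusters as lists of indices into `embeddings`.
    """
    clusters: List[List[int]] = []
    for index, embedding in enumerate(embeddings):
        for cluster in clusters:
            seed = embeddings[cluster[0]]  # first member represents the cluster
            if cosine(embedding, seed) >= threshold:
                cluster.append(index)
                break
        else:
            clusters.append([index])  # no similar cluster: start a new character
    return clusters
```

The cluster that corresponds to the first character would then supply the training face images; how that cluster is identified is left open here.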

The method of claim 1, wherein
the plurality of images are a first number of images,
the plurality of training face images are a second number of images, the second number being smaller than the first number, and
the face conversion model is configured to generate,
based on the first number of the plurality of images and the second number of the plurality of training face images, the first number of face conversion images that correspond to the first number of the plurality of images and each include the cartoonized face matched to the face in the corresponding image.

The method of claim 4, wherein
the cartoonized face converted using the face conversion model
matches the face of each image in direction, composition, and the location of the parts included in the face.

The method of claim 4, wherein
converting the face of each image into a cartoonized face of the first character comprises:
defining a latent space of the face conversion model based on the plurality of training face images;
generating an inversion code for conversion of each image based on the latent space; and
generating a face conversion image corresponding to each image based on the inversion code.
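
The idea of an inversion code can be illustrated with a toy example. The sketch below is not the claimed face conversion model: it assumes a hypothetical linear generator `generate(weights, code)` and recovers, by plain gradient descent on squared reconstruction error, the latent code whose output best matches a given image vector. The same code is then fed to a second, equally hypothetical generator that stands in for the cartoon domain. The claim does not specify how the inversion code is computed, so the optimizer, step count, and learning rate are assumptions.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def generate(weights: Matrix, code: Vector) -> Vector:
    """Toy linear generator: output = weights @ code."""
    return [sum(w * c for w, c in zip(row, code)) for row in weights]


def invert(weights: Matrix, target: Vector, steps: int = 500, lr: float = 0.1) -> Vector:
    """Find a latent code whose generated output approximates `target`."""
    code = [0.0] * len(weights[0])
    for _ in range(steps):
        residual = [g - t for g, t in zip(generate(weights, code), target)]
        # Gradient of ||W z - x||^2 with respect to z is 2 * W^T (W z - x).
        grad = [
            2.0 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(len(code))
        ]
        code = [c - lr * g for c, g in zip(code, grad)]
    return code


# Hypothetical generators that share one latent space.
photo_generator: Matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toon_generator: Matrix = [[2.0, 0.0], [0.0, 2.0]]

inversion_code = invert(photo_generator, [2.0, -1.0, 1.0])
face_conversion = generate(toon_generator, inversion_code)
```

Because both generators read the same code, attributes captured by the code carry over to the converted output, which is the intuition behind matching the cartoonized face to the source face.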

The method of claim 1, wherein
the first character is included in content including an image or video,
the background conversion model is trained on images, extracted from the content, that do not contain characters, and
the background conversion model is configured to generate a background conversion image by converting the background of each image into a cartoonized background of the content.

The method of claim 7, wherein
the background conversion model is trained on real-life images containing only a background.

The method of claim 7, wherein
generating a character image corresponding to each image comprises
generating the character image corresponding to each image by compositing the cartoonized face with the background conversion image.

The method of claim 1, further comprising:
inputting the target image, which includes a background and a face, into the built conversion model;
generating, by inference based on the conversion model, a converted image in which the face of the target image is converted into a cartoonized face of the first character and the background of the target image is converted into a cartoonized background of content including the first character; and
outputting the converted image.

The method of claim 1, wherein
the first character is included in content including an image or video,
the content includes a plurality of characters including the first character,
generating the training image data set comprises generating a training image data set consisting of pairs of the plurality of images and, for each character of the plurality of characters, character images cartoonized as that character that correspond to the plurality of images, and
the conversion model is built so that the target image can be converted into a cartoonized character image for each character.

The method of claim 10, wherein
the first character is included in content including an image or video,
the content includes a plurality of characters including the first character, and
the conversion model is built so that the target image can be converted into a cartoonized character image for each character,
the method further comprising
providing a user interface (UI) for selecting, from among the plurality of characters, a character into which the target image is to be converted,
wherein the first character is the character selected through the UI.

The method of claim 10, wherein
the target image is a video including a plurality of frames, and
the converted image is a converted video in which the face and the background included in the video are converted into the cartoonized face and the cartoonized background, respectively.
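
Conversion of a video target for a character chosen by the user can be sketched as a per-frame loop. This is a minimal sketch, not the claimed implementation: the `models` mapping from character name to a per-frame conversion function, the function name `convert_video`, and frame-by-frame processing itself are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def convert_video(
    frames: Iterable[Frame],
    models: Dict[str, Callable[[Frame], Frame]],
    character: str,
) -> Iterator[Frame]:
    """Yield each frame converted with the model of the selected character.

    Raises KeyError on the first iteration if `character` has no model.
    """
    convert_frame = models[character]  # character picked through the UI
    for frame in frames:
        yield convert_frame(frame)
```

Processing frames lazily keeps memory use independent of video length; temporal consistency between frames is not addressed by this sketch.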

A program recorded on a computer-readable recording medium for executing the method of claim 1 on the computer system.

A computer system for building a conversion model that converts a target image into a cartoonized character image, the computer system comprising
at least one processor implemented to execute instructions readable by the computer system,
wherein the at least one processor
generates a training image data set consisting of pairs of a plurality of images, each including a background and a face, and character images cartoonized as a first character that correspond to the plurality of images, and
builds the conversion model by training on the generated training image data set, and
wherein, in generating the training image data set, the at least one processor
converts, using a face conversion model, the face of each image of the plurality of images into a cartoonized face of the first character based on a plurality of training face images including the face of the first character, converts the background of each image into a cartoonized background using a background conversion model, and generates a character image corresponding to each image by compositing the cartoonized face and the cartoonized background.

An image conversion method, performed by a computer system, for converting a target image into a cartoonized character image, the method comprising:
inputting the target image, which includes a background and a face, into a conversion model built by training on a training image data set consisting of pairs of a plurality of images, each including a background and a face, and character images cartoonized as a first character that correspond to the plurality of images;
generating, by inference based on the conversion model, a converted image in which the face of the target image is converted into a cartoonized face of the first character and the background of the target image is converted into a cartoonized background of content including the first character; and
outputting the converted image,
wherein the character image corresponding to each image of the plurality of images constituting the training image data set of the conversion model is generated by compositing
a result of converting, using a face conversion model, the face of each image into a cartoonized face of the first character based on a plurality of training face images including the face of the first character, and
a result of converting the background of each image into a cartoonized background using a background conversion model.

A method, performed by a computer system, of generating a training image data set for a conversion model that converts a target image into a cartoonized character image, the method comprising:
converting, using a face conversion model, the face of each image of a plurality of images, each including a background and a face, into a cartoonized face of a first character based on a plurality of training face images including the face of the first character;
converting the background of each image into a cartoonized background using a background conversion model;
generating a character image corresponding to each image by compositing the cartoonized face and the cartoonized background; and
generating, as the training image data set, pairs of the plurality of images and the character images cartoonized as the first character that correspond to the plurality of images, each pair consisting of one of the images and its character image.