WO2023191182A1 - System and method for automatically converting animation into webcomics by one touch - Google Patents

System and method for automatically converting animation into webcomics by one touch

Info

Publication number
WO2023191182A1
Authority
WO
WIPO (PCT)
Prior art keywords
webtoon
animation
onomatopoeia
cut
sound
Prior art date
Application number
PCT/KR2022/007300
Other languages
French (fr)
Korean (ko)
Inventor
김탁훈
최종원
배소연
황진수
Original Assignee
(주)탁툰엔터프라이즈
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by (주)탁툰엔터프라이즈
Publication of WO2023191182A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F 3/0483 Interaction with page-structured environments, e.g. book metaphor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q 50/10 Services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L 25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state

Definitions

  • the present invention relates to a system and method for automatically converting animation into webtoon with one touch.
  • the image editing technology field includes filtering technology that changes the original color of an image to express a desired effect, and image warping technology that changes the entire rectangular area of the image or part of the image into a desired shape.
  • video editing technology includes video compression technology, which handles the format of the video itself; from the perspective of compositing animation, there are keyframe animation techniques, which deform a given object's shape based on several predetermined key shapes, procedural animation generation techniques, which express motion as a mathematical function of time and apply it, and simulation-based animation generation techniques, which apply the laws of motion of particles or higher-order physical laws.
  • the purpose of the present invention is to provide a system and method for automatically converting animation to webtoon with one touch, which can solve conventional problems.
  • a system for automatically converting animation into webtoon with one touch includes: an input unit that receives an animation from a user terminal; an image cut extraction unit that determines the movement of an object, the start and end points of the object's dialogue, and the camera movement based on the frame, sound, and scene camera coordinate information in the animation, and then extracts at least one of a plurality of valid cuts matching the judgment result as a webtoon cut; an onomatopoeia and tone analysis unit that separates the object's onomatopoeia from the sound and then analyzes the object's tone; and a speech balloon and special effect application unit that inserts or changes concentration lines, speed lines, speech balloons, sound effects, and the layout color so that the onomatopoeia and tone are reflected in the valid cut image.
  • a method of operating the system for automatically converting animation to webtoon with one touch includes: receiving an animation at the input unit; determining, at the image cut extraction unit, the movement of an object, the start and end points of the object's dialogue, and the camera movement based on the frame, sound, and scene camera coordinate information in the animation, and then extracting at least one of a plurality of valid cuts matching the judgment result as a webtoon cut; checking the accuracy between the webtoon cut and the ground-truth (GT) cut at an accuracy inspection unit; separating the object's onomatopoeia from the sound and then analyzing the object's tone at the onomatopoeia and tone analysis unit; and inserting concentration lines, speed lines, speech balloons, sound effects, and the layout color at the speech balloon and special effect application unit so that the object's onomatopoeia and tone are reflected in the webtoon cut.
  • GT: ground-truth ("correct answer") cut
  • Figure 1 is a block diagram of a system for automatically converting animation into webtoon with one touch according to an embodiment of the present invention.
  • FIG. 2 is a detailed configuration diagram of the image cut extractor shown in FIG. 1.
  • Figures 3 and 4 are exemplary diagrams for explaining the cropping process.
  • Figure 5 is an example diagram comparing a test result cut and a GT cut for accuracy inspection with the result cut.
  • Figure 6 is an example diagram of a special effect applied by the speech balloon and special effect application unit shown in Figure 1.
  • FIG. 7 is an example diagram illustrating the arrangement of speech balloons applied by the speech balloon and special effect application unit shown in FIG. 1.
  • Figure 8 is an example of a special line inserted into a webtoon cut based on the coordinate values of objects for each frame.
  • Figure 9 is an example diagram of the layout rules of the character cropping algorithm.
  • Figures 10 and 11 are examples of a clustering effectiveness analysis sheet and a speaker classification accuracy analysis sheet.
  • Figure 12 is a flowchart explaining a method of automatically converting animation to webtoon with one touch according to an embodiment of the present invention.
  • FIG. 13 is a detailed flowchart of the S720 process shown in FIG. 12.
  • FIG. 14 is a detailed flowchart of the S740 process shown in FIG. 12.
  • Figure 15 is a diagram illustrating an example computing environment in which one or more embodiments disclosed herein may be implemented.
  • the system 100 for automatically converting animation to webtoon with one touch includes an input unit 110, an image cut extractor 120, a speech balloon and special effect application unit 130, and an output unit 140.
  • the present invention may further include an onomatopoeia and tone analysis unit 150.
  • the system 100 of the present invention can work in conjunction with an OPEN API running on a user terminal (not shown); the OPEN API refers to an application on the terminal and includes, for example, an app running on a mobile terminal such as a smartphone. Apps can be downloaded and installed from an application market, a virtual marketplace where mobile content is freely bought and sold, or run in conjunction with the cloud.
  • the input unit 110 may be configured to receive animation input from a user terminal.
  • the image cut extraction unit 120 may be configured to determine the movement of an object, the start and end points of the object's dialogue, and the camera movement based on the frame, sound, and scene camera coordinate information in the animation, and then to extract at least one of a plurality of valid cuts matching the judgment result as a webtoon cut.
  • the image cut extraction unit 120 includes a source classification unit 121, a background/object tracking unit 122, an amplitude and frequency extraction unit 123, a camera coordinate tracking unit 124, an object movement confirmation unit 125, a dialogue timing point tracking unit 126, a camera technique determination unit 127, a valid cut determination unit 128, and a webtoon cut extraction unit 129.
  • the source classification unit 121 may be configured to classify the sources in the animation transmitted from the input unit 110 into ① frame images, ② sound, and ③ scene camera coordinates.
  • the background/object tracking unit 122 may be configured to track an object or its movement or position change through differences between consecutive frame images using an image distribution-based neural network learning algorithm (Representation Learning).
  • the background/object tracking unit 122 can separate the object and the background when animation is input and analyze the shape and size of the object.
  • by comparing the previous frame with the next frame, a cut consisting only of moving objects is extracted; the object region is then specified by connecting each pixel, other than the pixels with a value of 0 that are judged to be background, to the pixels above, below, to the left and right, and diagonal to it.
  • the amplitude and frequency extraction unit 123 may be configured to extract the amplitude and frequency of the sound of the animation.
  • the camera coordinate tracking unit 124 may be configured to track camera movement based on scene camera coordinates.
  • the object movement confirmation unit 125 is configured to check (select) frames in which the movement or position change of the object tracked by the background/object tracking unit 122 is minimal.
  • the object movement confirmation unit 125 may calculate comparison values by comparing the distances to the objects detected in the next frame, based on the position and size information of the object detected in the previous frame. The closest object within a certain distance can then be set as the next position of the current object, and the coordinate values of the objects can be output for each frame.
  • the dialogue timing point tracking unit 126 may be configured to track frames corresponding to the dialogue timing point from the start point where the object starts the dialogue to the end point where the object ends the dialogue.
  • the valid cut determination unit 128 may be configured to determine the frame confirmed by the object motion confirmation unit 125 and the frame tracked by the dialogue timing point tracking unit 126 as a valid cut.
  • the camera technique determination unit 127 may be configured to determine the frames in which a camera technique (e.g., zoom in/out, tilt, pan) is reflected, based on the camera coordinates tracked by the camera coordinate tracking unit 124.
  • for example, feature points are extracted and matched between two frames (image feature matching), using a saliency-applied difference image with the objects removed from the original image for the previous frame and the original image for the next frame; the camera's up/down/left/right movement is calculated from the x- and y-axis displacements of the matched points, and zoom-in versus zoom-out is determined from the ratio by which the distances between matched feature points change between the two frames.
  • the webtoon cut extraction unit 129 may be configured to select the frames determined as valid cuts by the valid cut determination unit 128 and the frames determined by the camera technique determination unit 127 as webtoon cuts.
  • the onomatopoeia and tone analysis unit 150 may be configured to separate the object onomatopoeia from the sound and then analyze the object tone.
  • the onomatopoeia and tone analysis unit 150 detects the frames in which a voice appears using a pre-trained voice activity detection algorithm; when the object's voice is recognized in the sound but there is no dialogue at that point, the sound is classified as onomatopoeia.
  • the onomatopoeia and tone analysis unit 150 analyzes the decibel level of the onomatopoeia. For example, a class can be assigned by detecting where the voice level at the time of cut extraction is a certain number of decibels louder than the average level of the animation.
  • at cut extraction time, the section in which the dialogue occurs is received as a time or frame range; if the proportion of this section that is louder than a given threshold exceeds 20%, the dialogue may be classified as being a certain number of decibels louder than the average.
  • the speech balloon and special effect application unit 130 may be configured to insert or change concentration lines, speed lines, speech balloons, sound effects, and layout colors so that the object onomatopoeia and tone are reflected in the webtoon cut image.
  • the speech balloon and special effect application unit 130 automatically generates a speech balloon according to the object onomatopoeia and the tone of the object.
  • after tracking the positions of the main object (speaker) and the other object (listener) using a saliency map, if there is no change in the object's position or appearance across two or more cuts, the object is cropped using an object crop algorithm; a speech balloon and the onomatopoeia converted into letters are then placed in the area adjacent to the speaker, and the layout color is changed when the background in the video becomes dark.
  • the speech balloon and special effect application unit 130 may insert effect lines into the webtoon cut along the movement path, in proportion to the magnitude of the object's movement or position change.
  • the speech balloon and special effect application unit 130 can arrange speech balloons according to the location of the object. For example, if the object is located on the left within the webtoon cut, the speech bubble is placed on the right.
  • the speech balloon and special effect application unit 130 may recognize as the speaker the region with the highest value of (sum of the difference-image values over the frames in which dialogue occurs) / (area of the saliency map), as calculated by the background and object extraction unit.
  • the speech balloon and special effect application unit 130 can also insert special lines into the webtoon cut based on the per-frame coordinate values of the objects, using the comparison values computed by the object movement confirmation unit 125 (the distances to the objects detected in the next frame, based on the position and size of the object detected in the previous frame) and the rule that sets the closest object within a certain distance as the current object's next position.
  • the object crop algorithm may be a program that recognizes and clusters cuts with similar positions and compositions of objects as similar cuts, and repeats the layout form of the entire cut and speaker crop in order within each group.
  • the object crop algorithm also recognizes and extracts, as similar cuts, the cuts whose object and character composition changes little compared with the previous webtoon cut, and, when the whole-cut and speaker-crop layout patterns are applied to each group, judges their validity, that is, whether they can actually be used in the webtoon.
  • FIG. 12 is a flowchart explaining a method of automatically converting animation to webtoon with one touch according to an embodiment of the present invention, FIG. 13 is a detailed flowchart of the S720 process shown in FIG. 12, and FIG. 14 is a detailed flowchart of the S740 process shown in FIG. 12.
  • the method S700 for automatically converting animation to webtoon includes receiving an animation at the input unit (S710), then determining, at the image cut extraction unit, the movement of an object, the start and end points of the object's dialogue, and the camera movement based on the frame, sound, and scene camera coordinate information in the animation, and extracting at least one of a plurality of valid cuts matching the judgment result as a webtoon cut (S720).
  • the object onomatopoeia is separated from the sound in the onomatopoeia and tone analysis unit, and then the object tone is analyzed (S730).
  • in process S730, the frames in which a voice appears are detected using a pre-trained voice activity detection algorithm; when the object's voice is recognized in the sound but the script contains no dialogue at that point, a step of classifying it as onomatopoeia may be included.
  • the speech balloon and special effect application unit inserts concentration lines, speed lines, speech balloons, sound effects, and layout colors to reflect the onomatopoeia and tone of the object within the webtoon cut (S740).
  • the S720 process includes a series of steps: classifying the sources in the animation into frame images, sound, and scene camera coordinates; tracking object movement or position changes through the differences between consecutive frame images; extracting the amplitude and frequency of the sound; tracking the camera coordinates based on the scene camera coordinates; identifying the frames with the least movement or position change of the tracked object; tracking the frames corresponding to the dialogue timing points, from the start to the end of each dialogue; judging those frames to be valid cuts; and extracting, among the valid cuts, the valid cuts that reflect a camera technique as webtoon cuts.
  • the process S740 may include automatically generating a speech bubble according to the object onomatopoeia and the tone of the object.
  • after tracking the positions of the main object (speaker) and the other object (listener) using a saliency map, if there is no change in the object's position across two or more cuts, the object is cropped; the process may then include placing a speech balloon and the onomatopoeia converted into letters in the area adjacent to the speaker, and changing the layout color using the camera coordinate information.
  • a step of inserting effect lines into the webtoon cut along the movement path, in proportion to the magnitude of the object's movement or position change, may be further included.
  • the computing device 1100 may be, but is not limited to, a personal computer, a server computer, a handheld or laptop device, a mobile device (mobile phone, PDA, media player, etc.), a multiprocessor system, consumer electronics, a minicomputer, a mainframe computer, or a distributed computing environment including any of the above systems or devices.
  • Computing device 1100 may include at least one processing unit 1110 and memory 1120.
  • the processing unit 1110 may include, for example, a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), etc. and can have multiple cores.
  • memory 1120 may be volatile memory (e.g., RAM), non-volatile memory (e.g., ROM, flash memory), or a combination thereof.
  • computing device 1100 may include additional storage 1130.
  • Storage 1130 includes, but is not limited to, magnetic storage, optical storage, etc.
  • the storage 1130 may store computer-readable instructions for implementing one or more embodiments disclosed in this specification, and other computer-readable instructions for implementing an operating system, application program, etc. may also be stored. Computer-readable instructions stored in storage 1130 may be loaded into memory 1120 for execution by processing unit 1110. Computing device 1100 may also include input device(s) 1140 and output device(s) 1150.
  • the input device(s) 1140 may include, for example, a keyboard, mouse, pen, voice input device, touch input device, infrared camera, video input device, or any other input device, etc.
  • output device(s) 1150 may include, for example, one or more displays, speakers, printers, or any other output devices.
  • the computing device 1100 may use an input device or output device provided in another computing device as the input device(s) 1140 or the output device(s) 1150.
  • computing device 1100 may include communication connection(s) 1160 that allows computing device 1100 to communicate with another device (e.g., computing device 1300).
  • communication connection(s) 1160 may include a modem, a network interface card (NIC), an integrated network interface, a radio frequency transmitter/receiver, an infrared port, a USB connection, or other interfaces for connecting the computing device 1100 to another computing device. Additionally, communication connection(s) 1160 may include a wired connection or a wireless connection. Each component of the computing device 1100 described above may be connected by various interconnects such as buses (e.g., peripheral component interconnect (PCI), USB, FireWire (IEEE 1394), an optical bus structure, etc.) and may be interconnected by a network 1200. As used herein, terms such as "component" and "system" generally refer to computer-related entities: hardware, a combination of hardware and software, software, or software in execution.
  • a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer.
  • both the application running on the controller and the controller can be components.
  • One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer or distributed between two or more computers.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Hospice & Palliative Care (AREA)
  • Strategic Management (AREA)
  • Primary Health Care (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Child & Adolescent Psychology (AREA)
  • General Business, Economics & Management (AREA)
  • Psychiatry (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A system for automatically converting an animation into webcomics by one touch according to an embodiment of the present invention comprises: an input unit for receiving an input of an animation from a user terminal; an image cut extraction unit for determining a motion of an object, speech start and end points of the object, and a motion of a camera on the basis of a frame within the animation, a sound, and scene camera coordinate information, and then extracting at least one of multiple valid cuts matching a result of the determination, as a webcomics cut; an onomatopoeia and tone analysis unit for separating object onomatopoeia from the sound, and then analyzing an object tone; and a speech balloon and special effect application unit for inserting or changing concentration lines, speed lines, a speech balloon, a sound effect, and a layout color so that the object onomatopoeia and tone are reflected to an image of the valid cut.

Description

System and method for automatically converting animation to webtoon with one touch
The present invention relates to a system and method for automatically converting animation into webtoon with one touch.
In the field of image editing technology, there are filtering techniques, which change an image's original colors to express a desired effect, and image warping techniques, which change the entire rectangular area of an image, or a part of it, into a desired shape.
There are also image selection techniques, which select a desired part or object of a given image and separate it from the background, and image blending techniques, which merge the separated image into another image. Video editing technology includes video compression technology, which handles the format of the video itself; viewed from the perspective of compositing animation, there are keyframe animation techniques, which deform a given object's shape based on several predetermined key shapes, procedural animation generation techniques, which express motion as a mathematical function of time and apply it, and simulation-based animation generation techniques, which apply the laws of motion of particles or higher-order physical laws.
[Prior art literature]
[Patent document]
Korean Registered Patent Publication No. 10-2086780
The object of the present invention is to provide a system and method for automatically converting animation to webtoon with one touch that can solve the conventional problems.
To solve the above problem, a system for automatically converting animation into webtoon with one touch according to an embodiment of the present invention includes: an input unit that receives an animation from a user terminal; an image cut extraction unit that determines the movement of an object, the start and end points of the object's dialogue, and the camera movement based on the frame, sound, and scene camera coordinate information in the animation, and then extracts at least one of a plurality of valid cuts matching the judgment result as a webtoon cut; an onomatopoeia and tone analysis unit that separates the object's onomatopoeia from the sound and then analyzes the object's tone; and a speech balloon and special effect application unit that inserts or changes concentration lines, speed lines, speech balloons, sound effects, and the layout color so that the onomatopoeia and tone are reflected in the valid cut image.
To solve the above problem, a method of operating the system for automatically converting animation to webtoon with one touch according to an embodiment of the present invention includes: receiving an animation at an input unit; determining, at an image cut extraction unit, the movement of an object, the start and end points of the object's dialogue, and the camera movement based on the frame, sound, and scene camera coordinate information in the animation, and then extracting at least one of a plurality of valid cuts matching the judgment result as a webtoon cut; checking the accuracy between the webtoon cut and the ground-truth (GT) cut at an accuracy inspection unit; separating the object's onomatopoeia from the sound and then analyzing the object's tone at an onomatopoeia and tone analysis unit; and inserting concentration lines, speed lines, speech balloons, sound effects, and the layout color at a speech balloon and special effect application unit so that the object's onomatopoeia and tone are reflected in the webtoon cut.
Therefore, using the system and method for automatically converting animation to webtoon with one touch according to an embodiment of the present invention minimizes the labor-intensive, repetitive work that arises in the process of converting animation into webtoon, which saves cost and time.
In addition, because anyone can use it easily, accessibility for industry workers is maximized, and because the result is output as separate layers, it is easy to correct and supplement.
Figure 1 is a block diagram of a system for automatically converting animation into webtoon with one touch according to an embodiment of the present invention.
Figure 2 is a detailed configuration diagram of the image cut extraction unit shown in Figure 1.
Figures 3 and 4 are exemplary diagrams for explaining the cropping process.
Figure 5 is an exemplary diagram comparing a test result cut and a GT cut for accuracy inspection with the result cut.
Figure 6 is an exemplary diagram of a special effect applied by the speech balloon and special effect application unit shown in Figure 1.
Figure 7 is an exemplary diagram illustrating the placement of speech balloons applied by the speech balloon and special effect application unit shown in Figure 1.
Figure 8 is an example of special lines inserted into a webtoon cut based on the per-frame coordinate values of the objects.
Figure 9 is an exemplary diagram of the layout rules of the character cropping algorithm.
Figures 10 and 11 are exemplary diagrams of a clustering validity analysis sheet and a speaker classification accuracy analysis sheet.
Figure 12 is a flowchart explaining a method of automatically converting animation to webtoon with one touch according to an embodiment of the present invention.
Figure 13 is a detailed flowchart of process S720 shown in Figure 12.
Figure 14 is a detailed flowchart of process S740 shown in Figure 12.
Figure 15 is a diagram illustrating an example computing environment in which one or more embodiments disclosed herein may be implemented.
Hereinafter, the system and method for automatically converting animation into webtoon according to an embodiment of the present invention will be described in more detail with reference to the attached drawings.
Figure 1 is a block diagram of a system for automatically converting animation to webtoon with one touch according to an embodiment of the present invention; Figure 2 is a detailed configuration diagram of the image cut extraction unit shown in Figure 1; Figures 3 and 4 are exemplary diagrams for explaining the cropping process; Figure 5 is an exemplary diagram comparing a test result cut and a GT cut for accuracy inspection with the result cut; Figure 6 is an exemplary diagram of a special effect applied by the speech balloon and special effect application unit shown in Figure 1; Figure 7 is an exemplary diagram illustrating the placement of speech balloons applied by the speech balloon and special effect application unit shown in Figure 1; Figure 8 is an example of special lines inserted into a webtoon cut based on the per-frame coordinate values of the objects; Figure 9 is an exemplary diagram of the layout rules of the character cropping algorithm; and Figures 10 and 11 are exemplary diagrams of a clustering validity analysis sheet and a speaker classification accuracy analysis sheet.
First, as shown in Figure 1, the system 100 for automatically converting animation to webtoon with one touch according to an embodiment of the present invention includes an input unit 110, an image cut extraction unit 120, a speech balloon and special effect application unit 130, and an output unit 140.
In addition, the present invention may further include an onomatopoeia and tone analysis unit 150.
Meanwhile, the system 100 of the present invention can work in conjunction with an OPEN API running on a user terminal (not shown). Here, the OPEN API refers to an application on the terminal and includes, for example, an app running on a mobile terminal such as a smartphone. Apps can be downloaded and installed from an application market, a virtual marketplace where mobile content is freely bought and sold, or run in conjunction with the cloud.
More specifically, referring to Figure 1, the input unit 110 may be configured to receive an animation from the user terminal.
The image cut extraction unit 120 may be configured to determine the movement of an object, the start and end points of the object's dialogue, and the camera movement based on the frame, sound, and scene camera coordinate information in the animation, and then to extract at least one of a plurality of valid cuts matching the judgment result as a webtoon cut.
Referring to Figure 2, the image cut extraction unit 120 includes a source classification unit 121, a background/object tracking unit 122, an amplitude and frequency extraction unit 123, a camera coordinate tracking unit 124, an object movement confirmation unit 125, a dialogue timing point tracking unit 126, a camera technique determination unit 127, a valid cut determination unit 128, and a webtoon cut extraction unit 129.
The source classification unit 121 may be configured to classify the sources in the animation transmitted from the input unit 110 into ① frame images, ② sound, and ③ scene camera coordinates.
The background/object tracking unit 122 may be configured to track an object, or its movement or position change, through the differences between consecutive frame images using an image-distribution-based neural network learning algorithm (representation learning).
For reference, the background/object tracking unit 122 can separate the object from the background when an animation is input and analyze the object's shape and size.
In addition, it compares the previous frame with the next frame to extract a cut consisting only of moving objects, and then specifies the object region by connecting each pixel, other than the pixels with a value of 0 that are judged to be background, to the pixels above, below, to the left and right, and diagonal to it.
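To illustrate this region-specification step, here is a minimal sketch (the helper name is hypothetical, not from the patent): background pixels are 0 in the frame-difference image, and every other pixel is connected to its eight neighbours to form object regions.

```python
import numpy as np

def extract_object_regions(diff):
    """Group 8-connected non-zero pixels of a frame-difference image into
    object regions; pixels with value 0 are treated as background."""
    h, w = diff.shape
    labels = np.zeros((h, w), dtype=np.int32)
    count = 0
    for sy in range(h):
        for sx in range(w):
            if diff[sy, sx] == 0 or labels[sy, sx]:
                continue
            count += 1                        # start a new object region
            stack = [(sy, sx)]
            labels[sy, sx] = count
            while stack:                      # flood-fill over the 8 neighbours
                y, x = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and diff[ny, nx] != 0 and not labels[ny, nx]):
                            labels[ny, nx] = count
                            stack.append((ny, nx))
    return labels, count

# e.g. labels, n = extract_object_regions(np.abs(next_frame - prev_frame))
```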
The amplitude and frequency extraction unit 123 may be configured to extract the amplitude and frequency of the animation's sound.
The camera coordinate tracking unit 124 may be configured to track the camera movement based on the scene camera coordinates.
The object movement confirmation unit 125 is configured to identify (select) the frames in which the movement or position change of the object tracked by the background/object tracking unit 122 is smallest.
In addition, the object movement confirmation unit 125 may calculate comparison values by comparing the distances to the objects detected in the next frame, based on the position and size information of the object detected in the previous frame. The closest object within a certain distance can then be set as the next position of the current object, and the coordinate values of the objects can be output for each frame.
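A small sketch of this frame-to-frame association, assuming each detected object is given as an (x, y, w, h) box; the function name and the max_dist threshold are illustrative, not from the patent.

```python
import math

def track_objects(prev_objs, next_objs, max_dist=50.0):
    """For each (x, y, w, h) box from the previous frame, compare the
    distances to the boxes detected in the next frame and take the
    closest one within max_dist as that object's next position."""
    matches = {}
    for i, (px, py, pw, ph) in enumerate(prev_objs):
        best, best_d = None, max_dist
        for j, (nx, ny, nw, nh) in enumerate(next_objs):
            d = math.hypot(nx - px, ny - py)      # the comparison value
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            matches[i] = next_objs[best]          # per-frame coordinate output
    return matches
```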
The dialogue timing point tracking unit 126 may be configured to track the frames corresponding to the dialogue timing points, from the start point where the object begins its dialogue to the end point where it finishes.
The valid cut determination unit 128 may be configured to judge the frames identified by the object movement confirmation unit 125 and the frames tracked by the dialogue timing point tracking unit 126 to be valid cuts.
The camera technique determination unit 127 may be configured to determine the frames in which a camera technique (e.g., zoom in/out, tilt, pan) is reflected, based on the camera coordinates tracked by the camera coordinate tracking unit 124.
For example, feature points are extracted and matched between two frames (image feature matching), using a saliency-applied difference image with the objects removed from the original image for the previous frame and the original image for the next frame. The camera's up/down/left/right movement is calculated from the x- and y-axis displacements of the matched points; then, using some of the matched feature points from the previous and next frames, the distances between the points are computed, and zoom-in versus zoom-out is determined from the ratio by which those distances change.
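A rough sketch of how such pan/zoom estimation could look with off-the-shelf ORB feature matching in OpenCV; unlike the patent's description, it matches the raw frames directly rather than a saliency-applied difference image, and the helper name is hypothetical.

```python
import cv2
import numpy as np

def camera_motion(prev_img, next_img):
    """Estimate pan from the median displacement of matched ORB feature
    points, and zoom from how the pairwise distances between matched
    points change between the two frames (ratio > 1 suggests zoom-in)."""
    orb = cv2.ORB_create(500)
    k1, d1 = orb.detectAndCompute(prev_img, None)
    k2, d2 = orb.detectAndCompute(next_img, None)
    if d1 is None or d2 is None:
        return np.zeros(2), 1.0
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    if len(matches) < 2:
        return np.zeros(2), 1.0
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    pan = np.median(p2 - p1, axis=0)              # up/down/left/right movement
    idx = np.random.default_rng(0).choice(len(p1), min(30, len(p1)), replace=False)
    dist1 = np.linalg.norm(p1[idx][:, None] - p1[idx][None], axis=-1)
    dist2 = np.linalg.norm(p2[idx][:, None] - p2[idx][None], axis=-1)
    keep = dist1 > 1e-6                           # ignore coincident points
    zoom = float(np.mean(dist2[keep] / dist1[keep])) if keep.any() else 1.0
    return pan, zoom
```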
The webtoon cut extraction unit 129 may be configured to select, as webtoon cuts, the frames judged to be valid cuts by the valid cut determination unit 128 and the frames determined by the camera technique determination unit 127.
Next, the onomatopoeia and tone analysis unit 150 may be configured to separate the object's onomatopoeia from the sound and then analyze the object's tone.
When the object's onomatopoeia is contained in the sound, the onomatopoeia and tone analysis unit 150 detects the frames in which a voice appears using a pre-trained voice activity detection algorithm; when the object's voice is recognized in the sound but there is no dialogue at that point, the sound is classified as onomatopoeia.
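The patent only states that a pre-trained voice activity detection algorithm is used, so the sketch below substitutes a simple RMS-energy detector as a stand-in; the frame size and decibel threshold are assumptions.

```python
import numpy as np

def voiced_flags(samples, sr, frame_ms=30, threshold_db=-35.0):
    """Energy-based stand-in for a trained voice activity detector:
    mark a frame as voiced when its RMS level exceeds threshold_db."""
    n = max(1, int(sr * frame_ms / 1000))
    flags = []
    for start in range(0, len(samples) - n + 1, n):
        rms = np.sqrt(np.mean(samples[start:start + n] ** 2)) + 1e-12
        flags.append(20 * np.log10(rms) > threshold_db)
    return flags

def onomatopoeia_flags(voiced, script_has_dialogue):
    """A voiced frame with no dialogue in the script at that point
    is classified as onomatopoeia."""
    return [v and not d for v, d in zip(voiced, script_has_dialogue)]
```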
The onomatopoeia and tone analysis unit 150 also analyzes the decibel level of the onomatopoeia. For example, a class can be assigned by detecting where the voice level at the time of cut extraction is a certain number of decibels louder than the average level of the animation.
In addition, at cut extraction time, the section in which the dialogue occurs is received as a time or frame range; if the proportion of this section that is louder than a given threshold exceeds 20%, the dialogue may be classified as being a certain number of decibels louder than the average.
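A sketch of the 20% rule as described; the margin (how many decibels above the animation's average counts as "loud") is an assumed parameter, and the average is approximated by the whole signal's RMS level.

```python
import numpy as np

def louder_than_average(samples, sr, start_s, end_s, margin_db=6.0):
    """Assign the 'loud' class when more than 20% of the dialogue section
    is at least margin_db decibels above the animation's average level."""
    def level_db(x):
        return 20 * np.log10(np.sqrt(np.mean(x ** 2)) + 1e-12)
    average = level_db(samples)                    # whole-animation level
    section = samples[int(start_s * sr):int(end_s * sr)]
    n = max(1, sr // 100)                          # 10 ms windows
    windows = [section[i:i + n] for i in range(0, len(section) - n + 1, n)]
    if not windows:
        return False
    loud = sum(level_db(w) > average + margin_db for w in windows)
    return loud / len(windows) > 0.20
```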
Next, the speech balloon and special effect application unit 130 may be configured to insert or change concentration lines, speed lines, speech balloons, sound effects, and the layout color so that the object's onomatopoeia and tone are reflected in the webtoon cut image.
The speech balloon and special effect application unit 130 automatically generates speech balloons according to the object's onomatopoeia and tone.
In addition, after tracking the positions of the main object (speaker) and the other object (listener) using a saliency map, if there is no change in the object's position or appearance across two or more cuts, the object is cropped using an object crop algorithm; a speech balloon and the onomatopoeia converted into letters are then placed in the area adjacent to the speaker, and the layout color is changed when the background in the video becomes dark.
The speech balloon and special effect application unit 130 can also insert effect lines into the webtoon cut along the movement path, in proportion to the magnitude of the object's movement or position change.
For reference, the speech balloon and special effect application unit 130 can place speech balloons according to the position of the object. For example, if the object is located on the left side of the webtoon cut, the speech balloon is placed on the right.
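This placement rule reduces to a one-line check; a sketch with a hypothetical helper, coordinates in pixels.

```python
def balloon_side(obj_x, obj_w, cut_w):
    """Place the balloon opposite the speaker: an object in the left half
    of the cut gets its balloon on the right, and vice versa."""
    return "right" if obj_x + obj_w / 2 < cut_w / 2 else "left"

# e.g. balloon_side(obj_x=40, obj_w=80, cut_w=800) -> "right"
```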
In addition, the speech balloon and special effect application unit 130 can recognize as the speaker the region with the highest value of (sum of the difference-image values over the frames in which dialogue occurs) / (area of the saliency map), as calculated by the background and object extraction unit.
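A sketch of this scoring, assuming the dialogue-frame difference images and each candidate's saliency region are available as NumPy arrays; the candidate-selection line is illustrative.

```python
import numpy as np

def speaker_score(diff_frames, saliency_mask):
    """(Sum of difference-image values over the dialogue frames) divided
    by (area of the saliency region); the candidate region with the
    highest score is taken to be the speaker."""
    area = float(np.count_nonzero(saliency_mask))
    if area == 0.0:
        return 0.0
    total = sum(float(d[saliency_mask].sum()) for d in diff_frames)
    return total / area

# speaker = max(candidate_masks, key=lambda m: speaker_score(diff_frames, m))
```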
The speech balloon and special effect application unit 130 can also insert special lines into the webtoon cut based on the per-frame coordinate values of the objects, using the comparison values computed by the object movement confirmation unit 125 (the distances to the objects detected in the next frame, based on the position and size of the object detected in the previous frame) and the rule that sets the closest object within a certain distance as the current object's next position.
For reference, the object crop algorithm may be a program that recognizes cuts with similar object positions and compositions as similar cuts, clusters them, and repeats the layout pattern of whole cut and speaker crop, in order, within each group.
In addition, the object crop algorithm recognizes and extracts, as similar cuts, the cuts whose object and character composition changes little compared with the previous webtoon cut, and, when the whole-cut and speaker-crop layout patterns are applied to each group, judges their validity, that is, whether they can actually be used in the webtoon. A greedy sketch of such clustering follows.
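This sketch assumes each cut's composition is summarized as a fixed-length vector of normalized object-center coordinates; the tolerance is an assumed parameter, not from the patent.

```python
import numpy as np

def cluster_similar_cuts(layouts, tol=0.15):
    """Greedy grouping: a cut joins the current group when its layout
    vector barely differs from the previous cut's; otherwise it starts
    a new group (within which whole-cut/speaker-crop layouts repeat)."""
    if not layouts:
        return []
    groups, current = [], [layouts[0]]
    for vec in layouts[1:]:
        if np.linalg.norm(np.asarray(vec) - np.asarray(current[-1])) < tol:
            current.append(vec)
        else:
            groups.append(current)
            current = [vec]
    groups.append(current)
    return groups
```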
Hereinafter, referring to Figures 10 and 11, the analysis sheets recording the clustering validity analysis and the speaker classification accuracy analysis are described.
First, the test animation was episode 1 of "Suspicious Neighbors in Folk Village"; the test achieved a validity of 79%.
For speaker classification accuracy, one cut was extracted for each line of dialogue, based on the script of episode 1 of "Suspicious Neighbors in Folk Village".
Classifying the person with relatively large movement in each scene as the speaker yielded a speaker classification accuracy of 54%.
Therefore, using the system for automatically converting animation to webtoon with one touch according to an embodiment of the present invention minimizes the labor-intensive, repetitive work that arises in the process of converting animation into webtoon, which saves cost and time. In addition, because anyone can use it easily, accessibility for industry workers is maximized, and because the result is output as separate layers, it is easy to correct and supplement.
Figure 12 is a flowchart explaining a method of automatically converting animation to webtoon with one touch according to an embodiment of the present invention, Figure 13 is a detailed flowchart of process S720 shown in Figure 12, and Figure 14 is a detailed flowchart of process S740 shown in Figure 12.
First, referring to Figure 12, the method S700 for automatically converting animation into webtoon according to an embodiment of the present invention includes receiving an animation at the input unit (S710), then determining, at the image cut extraction unit, the movement of an object, the start and end points of the object's dialogue, and the camera movement based on the frame, sound, and scene camera coordinate information in the animation, and extracting at least one of a plurality of valid cuts matching the judgment result as a webtoon cut (S720).
Next, the onomatopoeia and tone analysis unit separates the object's onomatopoeia from the sound and then analyzes the object's tone (S730).
Here, when the object's onomatopoeia is contained in the sound, process S730 may include detecting the frames in which a voice appears using a pre-trained voice activity detection algorithm, and classifying the sound as onomatopoeia when the object's voice is recognized but the script contains no dialogue at that point.
Then, the method includes inserting, at the speech balloon and special effect application unit, concentration lines, speed lines, speech balloons, sound effects, and the layout color so that the object's onomatopoeia and tone are reflected in the webtoon cut (S740).
Process S720 includes a series of steps: classifying the sources in the animation into frame images, sound, and scene camera coordinates; tracking object movement or position changes through the differences between consecutive frame images; extracting the amplitude and frequency of the sound; tracking the camera coordinates based on the scene camera coordinates; identifying the frames with the least movement or position change of the tracked object; tracking the frames corresponding to the dialogue timing points, from the start to the end of each dialogue; judging those frames to be valid cuts; and extracting, among the valid cuts, the valid cuts that reflect a camera technique as webtoon cuts.
상기 S740 과정은 상기 객체 의성어와 객체의 어조에 따른 말풍선을 자동 생성하는 단계를 포함할 수 있다.The process S740 may include automatically generating a speech bubble according to the object onomatopoeia and the tone of the object.
또한, 도 13을 참조, 샐리언시 맵을 이용하여 주요 객체(화자)와 객체(청자)의 위치를 추적한 후, 두 컷 이상에서 객체의 위치에 변화가 없으면, 객체를 크롭한 후, 상기 화자의 인접영역에 말풍선 및 글자로 변환된 의성어를 배치하고, 카메라 좌표 정보를 이용하여 레이아웃 컬러를 변경하는 단계를 포함할 수 있다.Also, referring to FIG. 13, after tracking the positions of the main object (speaker) and object (listener) using the saliency map, if there is no change in the position of the object in two or more cuts, the object is cropped, and then It may include placing speech bubbles and onomatopoeia converted into letters in an area adjacent to the speaker, and changing the layout color using camera coordinate information.
또한, 객체 또는 물체의 움직임 또는 위치변화의 크기에 비례하 효과선을 움직임 동선을 따라서 웹툰 컷에 삽입하는 단계를 더 포함할 수 있다.In addition, the step of inserting an effect line into the webtoon cut along the movement line in proportion to the size of the object or object's movement or change in position may be further included.
따라서, 본 발명의 일 실시예에 따른 원터치로 애니메이션을 웹툰으로 자동 변환하는 시스템 및 방법을 이용하면, 애니메이션을 웹툰화하는 과정에서 발생되는 노동집약적인 반복 업무를 최소화시켜, 비용과 시간을 절약할 수 있다는 이점을 제공한다.Therefore, by using the system and method for automatically converting animation to webtoon with one touch according to an embodiment of the present invention, labor-intensive repetitive work that occurs in the process of converting animation to webtoon can be minimized, saving cost and time. It provides the advantage of being able to
In addition, because anyone can use it easily, accessibility for industry practitioners is maximized, and because the results are output as separated layers, easy correction and security are possible.
FIG. 15 illustrates an exemplary computing environment in which one or more embodiments disclosed herein may be implemented, showing an example of a system 1000 that includes a computing device 1100 configured to implement one or more of the embodiments described above. For example, the computing device 1100 includes, but is not limited to, a personal computer, a server computer, a handheld or laptop device, a mobile device (mobile phone, PDA, media player, etc.), a multiprocessor system, consumer electronics, a minicomputer, a mainframe computer, and a distributed computing environment including any of the above systems or devices.
The computing device 1100 may include at least one processing unit 1110 and a memory 1120. Here, the processing unit 1110 may include, for example, a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), etc., and may have multiple cores. The memory 1120 may be volatile memory (e.g., RAM), non-volatile memory (e.g., ROM, flash memory), or a combination thereof. The computing device 1100 may also include additional storage 1130. The storage 1130 includes, but is not limited to, magnetic storage and optical storage. The storage 1130 may store computer-readable instructions for implementing one or more embodiments disclosed herein, and may also store other computer-readable instructions for implementing an operating system, application programs, and the like. The computer-readable instructions stored in the storage 1130 may be loaded into the memory 1120 for execution by the processing unit 1110. The computing device 1100 may also include input device(s) 1140 and output device(s) 1150.
Here, the input device(s) 1140 may include, for example, a keyboard, mouse, pen, voice input device, touch input device, infrared camera, video input device, or any other input device. The output device(s) 1150 may include, for example, one or more displays, speakers, printers, or any other output device. The computing device 1100 may also use an input device or output device provided in another computing device as its input device(s) 1140 or output device(s) 1150. In addition, the computing device 1100 may include communication connection(s) 1160 that allow the computing device 1100 to communicate with another device (e.g., a computing device 1300).
Here, the communication connection(s) 1160 may include a modem, a network interface card (NIC), an integrated network interface, a radio-frequency transmitter/receiver, an infrared port, a USB connection, or another interface for connecting the computing device 1100 to another computing device. The communication connection(s) 1160 may include a wired connection or a wireless connection. The components of the computing device 1100 described above may be connected by various interconnects such as buses (e.g., Peripheral Component Interconnect (PCI), USB, FireWire (IEEE 1394), an optical bus structure, etc.) and may be interconnected by a network 1200. As used herein, terms such as "component" and "system" generally refer to computer-related entities that are hardware, a combination of hardware and software, software, or software in execution.
For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. For example, both an application running on a controller and the controller itself may be components. One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer or distributed between two or more computers.
The present invention is not limited to the embodiments described above and the accompanying drawings. It will be apparent to those of ordinary skill in the art to which the present invention pertains that the components of the present invention may be substituted, modified, and changed without departing from the technical spirit of the present invention.

Claims (15)

  1. A system for automatically converting an animation into a webtoon with one touch, the system comprising:
    an input unit that receives an animation from a user terminal;
    an image cut extraction unit that determines an object's movement, the start and end points of the object's dialogue, and the camera movement based on the frame, sound, and scene camera coordinate information in the animation, and then extracts at least one of a plurality of valid cuts matching the determination result as a webtoon cut;
    an onomatopoeia and tone analysis unit that separates the object's onomatopoeia from the sound and then analyzes the object's tone; and
    a speech-balloon and special-effect application unit that inserts or changes concentration lines, speed lines, speech balloons, sound effects, and layout colors.
  2. The system of claim 1, wherein the image cut extraction unit classifies the sources in the animation into frame images, sound, and scene camera coordinates.
  3. The system of claim 2, further comprising:
    a background/object tracking unit that tracks an object's movement or position changes through the differences between consecutive frame images;
    an amplitude and frequency extraction unit that extracts the amplitude and frequency of the sound; and
    a camera coordinate tracking unit that tracks camera coordinates based on the scene camera coordinates.
  4. The system of claim 3, further comprising:
    an object movement confirmation unit that identifies the frames in which the movement or position change of the object tracked by the background/object tracking unit is smallest;
    a dialogue timing point tracking unit that tracks the frames corresponding to the dialogue timing interval, from the point where the object starts its dialogue to the point where the dialogue ends; and
    a valid cut determination unit that determines the frames identified by the object movement confirmation unit and the frames tracked by the dialogue timing point tracking unit as valid cuts.
  5. The system of claim 4, further comprising a webtoon cut extraction unit that extracts, from among the valid cuts, the valid cuts reflecting camera techniques as webtoon cuts.
  6. The system of claim 1, wherein the speech-balloon and special-effect application unit automatically generates a speech balloon according to the object's onomatopoeia and tone.
  7. The system of claim 6, wherein the onomatopoeia and tone analysis unit, when an object onomatopoeia is contained in the sound, detects the frames in which a voice occurs using a pre-trained voice activity detection algorithm, and, among the cases where the object's voice is recognized in the sound, classifies it as an onomatopoeia when the script has no dialogue at that point in time.
  8. The system of claim 6, wherein the speech-balloon and special-effect application unit tracks the positions of the main object (the speaker) and the object (the listener) using a saliency map and, if the object's position does not change across two or more cuts, crops the object, places a speech balloon and the onomatopoeia converted into text in the area adjacent to the speaker, and changes the layout color using the camera coordinate information.
  9. The system of claim 8, wherein the speech-balloon and special-effect application unit inserts effect lines into the webtoon cut along the path of movement, in proportion to the magnitude of the movement or position change of an object or physical thing.
  10. A method of automatically converting an animation into a webtoon with one touch, the method comprising:
    receiving an animation at an input unit;
    determining, at an image cut extraction unit, an object's movement, the start and end points of the object's dialogue, and the camera movement based on the frame, sound, and scene camera coordinate information in the animation, and then extracting at least one of a plurality of valid cuts matching the determination result as a webtoon cut;
    separating, at an onomatopoeia and tone analysis unit, the object's onomatopoeia from the sound and then analyzing the object's tone; and
    inserting, at a speech-balloon and special-effect application unit, concentration lines, speed lines, speech balloons, sound effects, and layout colors so that the object's onomatopoeia and tone are reflected in the webtoon cut.
  11. The method of claim 10, wherein the extracting as a webtoon cut comprises:
    classifying the sources in the animation into frame images, sound, and scene camera coordinates;
    tracking object movement or position changes through the differences between consecutive frame images;
    extracting the amplitude and frequency of the sound;
    tracking camera coordinates based on the scene camera coordinates;
    identifying the frames in which the tracked object's movement or position change is smallest;
    tracking the frames corresponding to the dialogue timing interval, from the point where the dialogue starts to the point where it ends;
    determining the frames in which a position change was confirmed and the frames corresponding to the dialogue timing interval as valid cuts; and
    extracting, from among the valid cuts, the valid cuts reflecting camera techniques as webtoon cuts.
  12. The method of claim 11, wherein the inserting of the concentration lines, speed lines, speech balloons, sound effects, and layout colors comprises automatically generating a speech balloon according to the object's onomatopoeia and tone.
  13. The method of claim 12, wherein the analyzing of the object's tone comprises, when an object onomatopoeia is contained in the sound, detecting the frames in which a voice occurs using a pre-trained voice activity detection algorithm, and, among the cases where the object's voice is recognized in the sound, classifying it as an onomatopoeia when the script has no dialogue at that point in time.
  14. The method of claim 13, wherein the inserting of the concentration lines, speed lines, speech balloons, sound effects, and layout colors comprises tracking the positions of the main object (the speaker) and the object (the listener) using a saliency map and, if the object's position does not change across two or more cuts, cropping the object, placing a speech balloon and the onomatopoeia converted into text in the area adjacent to the speaker, and changing the layout color using the camera coordinate information.
  15. The method of claim 14, wherein the inserting of the concentration lines, speed lines, speech balloons, sound effects, and layout colors further comprises inserting effect lines into the webtoon cut along the path of movement, in proportion to the magnitude of the movement or position change of an object or physical thing.
PCT/KR2022/007300 2022-03-31 2022-05-23 System and method for automatically converting animation into webcomics by one touch WO2023191182A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020220040557A KR20230141237A (en) 2022-03-31 2022-03-31 System and method for automatically converting animation into webtoon with one touch
KR10-2022-0040557 2022-03-31

Publications (1)

Publication Number Publication Date
WO2023191182A1 true WO2023191182A1 (en) 2023-10-05

Family

ID=88202989

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/007300 WO2023191182A1 (en) 2022-03-31 2022-05-23 System and method for automatically converting animation into webcomics by one touch

Country Status (2)

Country Link
KR (1) KR20230141237A (en)
WO (1) WO2023191182A1 (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102086780B1 (en) 2018-08-22 2020-03-09 네이버웹툰 주식회사 Method, apparatus and computer program for generating cartoon data

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20060030179A (en) * 2004-10-05 2006-04-10 (주)인터넷엠비씨 Electronic cartoon and manufacturing methode thereof
KR20160014072A (en) * 2016-01-12 2016-03-09 정승묵 Movie and Drama is a make Skill Wed toon Technology
KR20190054721A (en) * 2017-11-14 2019-05-22 한성호 Apparatus and method for generating of cartoon using video

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Master Thesis", 1 February 2015, THE GRADUATE SCHOOL OF CULTURE AND ART CONTENT SEJONG UNIVERSITY, Seoul, Korea, article GO DONG-GYUN: "A Study of Editing Conversion from Published Comics to Webtoon: focusing on structural conversion of panel. ", pages: 1 - 83, XP009549344 *
XIN YANG; ZONGLIANG MA; LETIAN YU; YING CAO; BAOCAI YIN; XIAOPENG WEI; QIANG ZHANG; RYNSON W.H. LAU: "Automatic Comic Generation with Stylistic Multi-page Layouts and Emotion-driven Text Balloon Generation", ARXIV.ORG, 26 January 2021 (2021-01-26), XP081867718 *

Also Published As

Publication number Publication date
KR20230141237A (en) 2023-10-10


Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22935857

Country of ref document: EP

Kind code of ref document: A1