KR20130003167A

KR20130003167A - Video synthesis method including voice and synthesis system for same

Info

Publication number: KR20130003167A
Application number: KR1020110064285A
Authority: KR
Inventors: 이규원
Original assignee: 이규원
Priority date: 2011-06-30
Filing date: 2011-06-30
Publication date: 2013-01-09


Description

Video synthesis method including voice and synthesis system for same

The present invention relates to a video system that processes and provides video according to the needs of the user.

The present invention relates to a method for synthesizing a video including voice and a synthesis system therefor. Specifically, it relates to a method and system for generating a new video in which a person image and its voice, held in the storage of a video synthesis server, are replaced with another person image and that person's voice.

Service methods have been proposed in which digital images taken with digital cameras and camera phones are sent over the wireless Internet to a service server that provides a synthesis service and are synthesized there with other images. However, these are limited in that no synthesis service is provided that replaces character images in movies, dramas, and videos and records the user's voice.

SUMMARY OF THE INVENTION The present invention has been made to solve this problem, and its object is to provide a method by which a user can create a new video, and take part in it, by transmitting a video stored on a computer to a video synthesis server.

In order to achieve the above object, the present invention provides a video synthesis method including voice, and a synthesis system for the same, comprising: a first step of registering, in a video synthesis server, a plurality of person images recorded as electronically created data from a series of still frame images, like four-panel cartoons, taken from electronically produced films, dramas, and video works; a second step of transmitting a video from a computer terminal to the video synthesis server; a third step of synthesizing a new video in which only the voice of a designated person image among the plurality of person images of the first step is removed, the designated person image is replaced with the second person image included in the transmitted video, and the dialogue is entered as subtitles; a fourth step in which the person of the second person image inputs by voice the removed dialogue of the designated person image, and the correct pronunciation is output when the pronunciation of the input dialogue voice is wrong; and a fifth step of erasing the subtitles entered in the video.
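The five steps above can be read as successive changes to one job record. The following is a minimal sketch under stated assumptions, not the implementation of the invention: the `SynthesisJob` record, its field names, and the `step1_register` to `step5_clear_subtitles` functions are hypothetical names introduced only for illustration, and images and voices are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SynthesisJob:
    """Hypothetical record that follows one video through the five steps."""
    cast_dialogue: Dict[str, List[str]] = field(default_factory=dict)
    user_video: Optional[str] = None
    designated: Optional[str] = None
    subtitles: List[str] = field(default_factory=list)
    user_voice: List[str] = field(default_factory=list)


def step1_register(job, cast_dialogue):
    # Step 1: register the person images (here only their names and dialogue).
    job.cast_dialogue = {name: list(lines) for name, lines in cast_dialogue.items()}


def step2_upload(job, user_video):
    # Step 2: the computer terminal transmits the user's video to the server.
    job.user_video = user_video


def step3_synthesize(job, designated):
    # Step 3: remove only the designated person's voice and show it as subtitles.
    if job.user_video is None:
        raise ValueError("a video must be uploaded before synthesis")
    job.designated = designated
    job.subtitles = job.cast_dialogue.pop(designated)


def step4_record(job, spoken_lines):
    # Step 4: the user speaks the removed dialogue, one recording per subtitle.
    if len(spoken_lines) != len(job.subtitles):
        raise ValueError("one spoken line is expected per subtitle")
    job.user_voice = list(spoken_lines)


def step5_clear_subtitles(job):
    # Step 5: erase the subtitles once the voice has been input.
    job.subtitles = []
```

In this sketch the other characters' dialogue stays untouched in `cast_dialogue`, which mirrors the requirement that only the voice of the designated person image is removed.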

The first step may include extracting motion, expression, and voice data of the plurality of person images and storing the data in the video synthesis server.

The third step may include extracting data of the second person image, comparing the data of the second person image with data of a person image designated among the plurality of person images, and, using the comparison result, removing the voice of the designated person image and then synthesizing a new video in which the second person image takes the place of the designated person image.
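The description does not say what is compared or how the comparison result is used. As one possible reading only, the sketch below assumes the comparison yields scale factors that fit the second person image to the size of the designated person image; `PersonData`, `compare`, and `replace_person` are hypothetical names, and image data is reduced to a width and a height.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonData:
    """Hypothetical extracted data for one person image."""
    name: str
    width: int
    height: int


def compare(second, designated):
    # Assumed comparison: scale factors mapping the second image's size
    # onto the designated image's size.
    return designated.width / second.width, designated.height / second.height


def replace_person(cast, voices, second, designated_name):
    """Return a new cast and voice table with the designated person replaced."""
    designated = cast[designated_name]
    scale_x, scale_y = compare(second, designated)
    # Remove only the designated person's voice; all other voices are kept.
    new_voices = {name: v for name, v in voices.items() if name != designated_name}
    # Fit the second person image into the designated person's slot.
    fitted = PersonData(second.name,
                        round(second.width * scale_x),
                        round(second.height * scale_y))
    new_cast = {name: (fitted if name == designated_name else person)
                for name, person in cast.items()}
    return new_cast, new_voices
```

The functions return new dictionaries rather than modifying their inputs, so the registered data on the server stays available for other users.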

In the fourth step, because the person of the second person image inputs by voice the dialogue of the person image designated among the plurality of person images, a video can be synthesized in which the second person image acts and converses with the plurality of persons in the video.

In addition, pronunciation accuracy is checked automatically when the person of the second person image inputs the voice. If the pronunciation is wrong, the correct pronunciation is output, so that the person can correct the voice and input the corrected pronunciation into the video.
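The description does not specify how pronunciation accuracy is measured. The sketch below is an illustration under two assumptions: that a speech recognizer (not shown) has already turned the input voice into text, and that text similarity to the subtitle can stand in for pronunciation accuracy. The function names and the 0.9 threshold are hypothetical; `difflib.SequenceMatcher` is from the Python standard library.

```python
from difflib import SequenceMatcher


def normalize(text):
    # Lowercase and collapse whitespace so formatting does not affect the score.
    return " ".join(text.lower().split())


def check_pronunciation(subtitle, recognized, threshold=0.9):
    """Return (ok, score, feedback) for one spoken line.

    `recognized` is assumed to be the transcript of the user's voice.
    `feedback` is the line to play back as the correct pronunciation,
    or None when the attempt is accepted.
    """
    score = SequenceMatcher(None, normalize(subtitle), normalize(recognized)).ratio()
    ok = score >= threshold
    feedback = None if ok else subtitle
    return ok, score, feedback
```

When `ok` is false, the server would output the reference line in `feedback` so that the user can try again.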

In the fifth step, the entered subtitles may be deleted at the video synthesis server.

In another aspect, the present invention provides a video synthesis method comprising the steps of: electronically creating still frame images from a movie, drama, or video and recording motion, expression, and voice data of the character images on a server; transmitting a second person image, input to or stored on a computer, to the server through the Internet; replacing, at the server, a person image designated among the plurality of person images included in the video data with the second person image transmitted to the server; displaying, by the server, the dialogue of the designated person image as subtitles; inputting voice, by the person of the second person image, while viewing the subtitles of the removed dialogue of the designated person image at the server; automatically checking pronunciation accuracy when the voice is input and outputting the correct pronunciation; and synthesizing a new video by removing the subtitles of the input dialogue at the server.

The present invention also provides a video synthesis system including voice, comprising a video synthesis server that includes: a replacement image storage unit that stores data on the motions, expressions, and voices of a plurality of person images appearing in the movies, dramas, and videos used in video synthesis; a video synthesis unit that generates a new video by exchanging a designated person image among the plurality of person images with the second person image included in a video transmitted from an Internet terminal to the server; a subtitle input unit that displays the dialogue of the designated person image as subtitles together with the motion and expression of that image; a voice input unit through which the person of the second person image inputs voice based on the subtitles; a pronunciation correction unit that automatically checks pronunciation accuracy when the voice is input and outputs the correct pronunciation; and a subtitle removal unit that removes the subtitles after the person has input the voice.

As described above, according to the present invention, a character image and its voice in a movie, drama, or video stored on the video synthesis server can be replaced with a person image and voice from a video stored on the user's computer. This not only gives users a chance to take part directly in dramas, films, and language-learning videos, but also makes it possible to create a profit model through video processing.

FIG. 1 is a block diagram of a video synthesis server.
FIG. 2 is a view illustrating, in order, a process of synthesizing a video according to an embodiment of the present invention.
<Description of Reference Numerals for Main Parts of the Drawings>
1. Video synthesis server
2. Program storage unit
3. Replacement image storage unit
4. Control unit
5. Replacement image generation unit
6. Video synthesis unit
7. Designated person image dialogue voice removal unit
8. Subtitle input unit
9. Dialogue voice input unit for the replaced person image
10. Pronunciation correction unit
11. Subtitle removal unit

Hereinafter, a preferred embodiment of the present invention will be described with reference to the drawings.

The video synthesis server (1) according to an embodiment of the present invention stores the person images, motions, expressions, voices, and the like that appear in movies, dramas, and videos. It synthesizes a video by replacing a person image included in the video with another person image, has the dialogue of the removed person image input as the voice of the replacing person image after the video is synthesized, and transmits the result back to the computer terminal. As shown in FIG. 1, the server is provided with a program storage unit (2), a replacement image storage unit (3), a replacement image generation unit (5), a video synthesis unit (6), a designated person image dialogue voice removal unit (7), a subtitle input unit (8), a dialogue voice input unit (9), a pronunciation correction unit (10), and a subtitle removal unit (11).

The program storage unit 2 stores a program for synthesizing a video according to an embodiment of the present invention and the programs necessary for server operation. The replacement image storage unit 3 stores data of the replacement images that replace a person image included in the video.

The video synthesis process according to an embodiment of the present invention will now be described in sequence.

First, for the embodiment of the present invention, replacement image data on the motions, expressions, and voices of the plurality of person images appearing in movies, dramas, and videos must be stored in the video synthesis server (1). The user connects to the video synthesis server (1) from a computer terminal and moves to the movie, drama, or video synthesis page. The video synthesis unit 6 performs the synthesis operation on each frame constituting the video. To do so, the video synthesis unit 6 distinguishes the motion region and the expression region of the person image. The video synthesis unit 6 captures the plurality of frames constituting the video one by one and synthesizes a new video by converting the person image included in each frame into a replacement image stored in the replacement image storage unit 3.
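The frame-by-frame operation of the video synthesis unit 6 can be sketched as follows. This is a simplified illustration, not the unit's actual processing: each frame is assumed to be a small grid whose cells hold a label saying which person (or background) occupies that cell, whereas a real system would work on pixel data and a person region found by image analysis. The function names and labels are hypothetical.

```python
def replace_in_frame(frame, target, replacement):
    """Return a copy of one frame with every `target` cell swapped for `replacement`."""
    return [[replacement if cell == target else cell for cell in row]
            for row in frame]


def composite_video(frames, target, replacement):
    """Apply the replacement to every frame, one frame at a time, in order."""
    return [replace_in_frame(frame, target, replacement) for frame in frames]


# Usage: two 2x2 frames in which person "A" is replaced by the user's image.
frames = [
    [["bg", "A"], ["B", "A"]],
    [["A", "bg"], ["bg", "bg"]],
]
new_frames = composite_video(frames, "A", "user")
```

The original frames are left unchanged, so the same stored work can be reused for another replacement.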

When the process of changing the person image to the replacement image has been completed for all of the above frames, the video synthesis unit 6 combines the frames into a single file to synthesize a new video. The dialogue voice is removed, and the result is completed as a video with subtitles and transmitted to the user terminal. The person of the second person image then rehearses the subtitled dialogue and reconnects to the video synthesis server to input the dialogue voice. If the pronunciation correction unit finds that the pronunciation is incorrect, the correct pronunciation of the dialogue is output through the user's speaker. This operation is repeated as needed, and after the dialogue voice has been input, the subtitles are removed.
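The repeat-until-correct behaviour described above can be sketched as a small loop. This is an illustrative sketch only: `dub_line`, `finalize`, the `max_tries` limit, and the dictionary layout of `video` are hypothetical, and the pronunciation check is passed in as a function because the description does not define it.

```python
def dub_line(subtitle, attempts, is_correct, max_tries=3):
    """Try the user's attempts in order until one is judged correct.

    Returns (accepted_attempt, prompts). `prompts` lists the reference
    lines that would be played through the user's speaker after each
    rejected attempt. `accepted_attempt` is None if no attempt passed.
    """
    prompts = []
    for attempt in attempts[:max_tries]:
        if is_correct(subtitle, attempt):
            return attempt, prompts
        prompts.append(subtitle)  # output the correct pronunciation
    return None, prompts


def finalize(video):
    """Remove the subtitles only when every subtitled line has a voice."""
    all_dubbed = all(line in video["voices"] for line in video["subtitles"])
    subtitles = [] if all_dubbed else list(video["subtitles"])
    return {"subtitles": subtitles, "voices": dict(video["voices"])}
```

Keeping the subtitles until every line has an accepted voice matches the order in the description, where subtitle removal follows the voice input.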

Claims

A video synthesis system for providing a program to a user by replacing a designated person image, among a plurality of person images in still frame images extracted from a video work, with the user's image, the system comprising:
a first step of registering a plurality of person images in a video synthesis server;
a second step of transmitting a video from a computer to the video synthesis server;
a third step including means for synthesizing a new video by removing only the voice of a designated person image from among the plurality of person images of the first step and replacing that person image in the video synthesis server, and means for displaying the dialogue of the designated person image as subtitles in the video synthesis server;
a fourth step including means for the person of the second person image to input voice while viewing the subtitles of the removed voice of the designated person image in the video server, and means for outputting the correct pronunciation when the pronunciation of the input dialogue voice is wrong; and
a fifth step of removing the subtitles of the input dialogue from the video server.

The method of claim 1,
wherein the third step comprises:
removing, in the video synthesis server, the voice of the designated person image among the plurality of person images for the second person image; and
displaying, in the video synthesis server, the dialogue of the designated person image as subtitles.

The method of claim 1,
wherein the fourth step comprises:
inputting voice while viewing the subtitles in the video synthesis server; and
outputting the correct pronunciation when the input voice is pronounced incorrectly.

The method of claim 1, wherein
the first person image and the plurality of person images are images in which the entire body of an actor appears on the screen of a movie, drama, or video.

The system of claim 1,
wherein only the dialogue voice of the person designated on the screen can be selectively received from the video synthesis server.

The system of claim 1,
wherein the on-screen dialogue can be provided as text from the video synthesis server.