WO2015088196A1

WO2015088196A1 - Subtitle editing apparatus and subtitle editing method

Info

Publication number: WO2015088196A1
Application number: PCT/KR2014/011891
Authority: WO
Inventors: 정재원; 김경중; 정춘선
Original assignee: 넥스트리밍(주)
Priority date: 2013-12-09
Filing date: 2014-12-05
Publication date: 2015-06-18
Also published as: KR101419871B1

Abstract

Disclosed are a subtitle editing apparatus, a subtitle editing method, and a storage medium. The subtitle editing apparatus according to the present invention is a subtitle editing apparatus capable of editing a moving image and a subtitle, and comprises: a display unit that displays a moving image display area displaying at least one moving image preview screen and a clip display area displaying at least one moving image clip corresponding to the displayed preview screen; and a control unit that controls the subtitle editing apparatus to enter an editing mode for the moving image clip in response to a sensed user gesture when the user gesture on the moving image clip displayed in the clip display area is sensed, wherein the control unit is capable of editing a subtitle on at least one moving image clip according to a sensed user gesture when the user gesture on the moving image clip is sensed in the editing mode.

Description

Subtitle editing device and subtitle editing method

The present invention relates to a caption editing apparatus and a caption editing method, and more particularly, to a caption editing apparatus and a caption editing method that allow a user to intuitively input, delete, copy, move, and save captions on a video clip.

Recently, portable terminals such as smartphones and tablets are widely used, and the performance of the portable terminals and the development of wireless communication technology allow a user to shoot, edit, and share a video using the portable terminal.

However, due to the limitations of the LCD screen size and the performance of the hardware, the portable terminal may not edit a video smoothly as in a general PC environment. In order to alleviate such inconvenience, user demand for a video editing method that can be used in a portable terminal is increasing.

As a demand for a video editing method that a user can intuitively use in a mobile terminal increases, there is a need for a method of editing subtitles together with video editing in a mobile terminal.

The conventional subtitle editing method obtains a timeline by playing a video, and then edits and stores the subtitle text in a spreadsheet-like layout along that timeline. Such a timeline-based subtitle editing method is inconvenient because a user cannot easily edit subtitles on a smartphone with a limited LCD screen size.

Accordingly, there is a need for a user-intuitive subtitle editing device and subtitle editing method capable of editing subtitles in a movie clip unit, as well as a method for editing a movie in a movie clip unit.

An object of the present invention is to provide a caption editing apparatus and a caption editing method for editing captions by video clip units.

An object of the present invention is to provide a user-intuitive subtitle editing device and a subtitle editing method.

A subtitle editing apparatus capable of editing a video and a subtitle according to an exemplary embodiment of the present invention includes: a display unit that displays a video display area displaying at least one video preview screen and a clip display area displaying at least one video clip corresponding to the displayed preview screen; and a control unit that, when a user gesture for a video clip displayed on the clip display area is detected, controls the apparatus to enter an edit mode for the video clip in response to the detected user gesture. If the control unit detects a user gesture for the at least one video clip in the edit mode, the control unit may edit the caption clip on the video clip according to the detected gesture.

In this case, the controller may extract start time and play time information from the video clip, reflect the extracted start time and play time information in the caption clip input to the video clip, and convert the caption clip into a caption file based on the start time and play time information.

If the control unit detects a caption input gesture as the user gesture, the control unit may display an input window for caption input, generate a caption clip based on the caption input through the input window, and display the caption clip on the video clip.

When a caption deletion gesture is detected as the user gesture, the control unit may delete the caption clip input on the video clip.

When a caption movement gesture is detected as the user gesture, the control unit may move the caption clip displayed on the original video clip onto the target video clip.

When a caption copy gesture is detected as the user gesture, the control unit may copy the caption clip displayed on the original video clip and paste the copied caption clip onto the target video clip.

A subtitle editing method according to another embodiment of the present invention uses a subtitle editing device capable of editing a video and a subtitle displayed on a display unit, and includes: displaying, on the display unit, a video display area displaying at least one video preview screen and a clip display area displaying at least one video clip corresponding to the displayed preview screen; detecting a user gesture for a video clip displayed on the clip display area; entering an edit mode for the video clip in response to the detected user gesture; and, when a user gesture for the at least one video clip is detected in the edit mode, editing the subtitles on the video clip according to the detected gesture.

According to the present invention, subtitle editing can be easily performed simultaneously with video editing.

According to the present invention, a user can easily perform subtitle editing in a smartphone having a limitation of a liquid crystal screen size.

According to the present invention, subtitles can be edited in units of video clips, enabling user-intuitive subtitle editing.

FIG. 1 is a view showing a user terminal having a display unit;

FIG. 2 is a block diagram showing a configuration of a user terminal;

FIG. 3 is a diagram illustrating a system hierarchy of a user terminal;

FIG. 4 is a block diagram illustrating a configuration of controlling an operation of a display unit using a frame buffer in a user terminal;

FIGS. 5 to 11 are views illustrating a process of inputting a caption into a video clip according to an embodiment of the present invention;

FIGS. 12 to 16 are views illustrating a process of editing a caption input to a video clip according to an embodiment of the present invention;

FIGS. 17 to 20 are views illustrating a process of deleting a caption input to a video clip according to an embodiment of the present invention;

FIGS. 21 to 24 are views illustrating a process of moving a caption input to a video clip to another video clip according to an embodiment of the present invention;

FIGS. 25 to 28 are views illustrating a process of synchronizing a video clip and an input subtitle according to an embodiment of the present invention;

FIGS. 29 to 38 are views illustrating a process of editing a plurality of input subtitles according to an embodiment of the present invention;

FIGS. 39 to 43 are views illustrating a process of clipping a video including a caption according to an embodiment of the present invention.

Hereinafter, various embodiments of the present invention will be described in detail with reference to the accompanying drawings. However, the present invention is not limited or restricted by the embodiments.

Various embodiments of the present invention may be implemented in a user terminal having a display unit, such as a smartphone or a tablet. The subtitle editing apparatus according to an embodiment of the present invention may be implemented by a user terminal having a subtitle editing application. Alternatively, the present invention may be implemented by a user terminal having an image processor and a controller capable of processing video and subtitle data. The user terminal refers to a portable electronic device.

FIG. 1 is a diagram illustrating a user terminal having a display unit.

Referring to FIG. 1, the user terminal 100 may include a display 110. The display 110 may display various icons, contents, browsers, applications, programs, and the like. In addition, the display 110 may include a touch panel capable of detecting a user's gesture. A process of processing a video and a caption in the display 110 will be described with reference to a separate drawing.

Hereinafter, various embodiments of the present disclosure will describe, by way of example, a caption processing process displayed on the display 110. The layout of the caption processing process displayed on the display unit 110 described below is for convenience of description only, and the technical spirit of the present invention is not limited or restricted by the layout.

Before describing a method of editing a video and a caption in the user terminal 100, the hardware configuration of the user terminal 100 will be briefly described, and then the caption processing process driven on the user terminal 100 will be described with reference to the drawings.

FIG. 2 is a block diagram describing the hardware configuration of the user terminal 100. Referring to FIG. 2, the user terminal 100 includes a display 110, a storage 120, a controller 130, a sensor 140, a communicator 150, a camera 160, a video editing unit 170, a power supply 180, and a data bus 190 interconnecting them.

The display unit 110 is a display device that displays various applications (e.g., a video editing program, a web browser, a document editing program, a search program, a search engine, and data transmission) executable under the control of the controller 130, and provides a user interface adapted to them.

The display 110 may be implemented as a touch screen, and may receive at least one touch gesture through a user's body (eg, a finger including a thumb) or a senseable input means (eg, a stylus pen).

The display 110 converts a detection signal regarding a user gesture detected through the touch sensor into a digital signal (for example, X and Y coordinates) and transmits the signal to the controller 130. The controller 130 may perform a control operation corresponding to a user operation input through the display 110 using the received digital signal. For example, the controller 130 may select a predetermined keypad displayed on the display 110 in response to a user's operation or execute an application corresponding to a soft key.

The display unit 110 may display a screen divided into a video display area displaying a video preview screen, a marker area for searching a plurality of videos and displaying the location of the current video preview, and a clip display area displaying a video clip corresponding to the video displayed on the video preview screen.

The display 110 may detect a user gesture input for the video display region 111, the marker region 113, and the clip display region 115, and transmit the detected user gesture to the controller 130.
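
As an illustration of how a touch coordinate might be routed to one of the three areas, the following minimal sketch hit-tests an (x, y) point against rectangular regions. The `Region` class, the `hit_test` function, and all pixel bounds are hypothetical and chosen only for this example; the source does not specify a layout or coordinate values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle on the display, in pixels (hypothetical layout)."""
    name: str
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        # Left/top edges are inclusive, right/bottom edges are exclusive.
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


# Assumed bounds for a 1024 x 1200 screen; not taken from the source.
REGIONS = [
    Region("video_display_area_111", 0, 0, 1024, 600),
    Region("marker_area_113", 0, 600, 1024, 100),
    Region("clip_display_area_115", 0, 700, 1024, 500),
]


def hit_test(x: int, y: int) -> Optional[str]:
    """Return the name of the region containing the touch point, if any."""
    for region in REGIONS:
        if region.contains(x, y):
            return region.name
    return None
```

A controller could then dispatch the gesture to the handler for the returned region and ignore touches for which `hit_test` returns `None`.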

The storage unit 120 may store data regarding various applications such as a video editing program. The storage unit 120 may store various video files, subtitle files, and the like, which are editing targets of the video editing program. In addition, the storage unit 120 may store synchronization information between the video clip and the subtitle clip described below. Hereinafter, the term “storage unit” may include the storage unit 120, a ROM, a RAM, or a memory card (e.g., an SD card or a memory stick) that can be attached to or detached from the user terminal 100. The storage unit 120 may also include a nonvolatile memory, a volatile memory, a hard disk drive (HDD), or a solid state drive (SSD).

When the controller 130 detects a user gesture for the video clip displayed on the clip display area through the display 110, the controller 130 may control to enter an edit mode for the video clip in response to the detected user gesture. In this case, the editing mode for the video clip refers to a state in which various editing gestures can be input to edit the subtitles input to the video clip.

When the controller 130 detects a user gesture for at least one video clip in the edit mode, the controller 130 may edit the caption in the video clip according to the detected gesture.

The controller 130 may synchronize the caption clip with the video clip by extracting time information from the video clip to which the user gesture is input and reflecting the extracted time information in the caption clip.

The controller 130 may control the display 110 to display an input window for caption input when the caption input gesture is detected in the video clip displayed on the clip display area. The controller 130 may generate the caption clip based on the caption input through the input window, and control the display 110 to display the generated caption clip on the corresponding video clip.

The controller 130 may control the display 110 to delete the caption clip input on the video clip when the caption deletion gesture is detected in the video clip displayed on the clip display area.

The controller 130 may control to move the caption clip displayed on the original video clip onto the target video clip when the caption movement gesture is detected in the video clip displayed on the clip display area.

The controller 130 may control to copy a caption clip displayed on the original video clip and to paste the copied caption clip on the target video clip when the caption copy gesture is detected in the video clip displayed in the clip display area.

As described above, when moving or copying a subtitle clip, the controller 130 may modify the synchronization information of the corresponding video clip and the subtitle clip, and control to generate the subtitle file based on the modified synchronization information.
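
The move and copy operations described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `VideoClip`, `CaptionClip`, `move_caption`, and `copy_caption` are hypothetical names, times are assumed to be in seconds, and "modifying the synchronization information" is modeled as overwriting the caption's start time and duration with those of the target clip.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CaptionClip:
    text: str
    start: float     # seconds from the beginning of the edited video (assumed unit)
    duration: float  # play time in seconds


@dataclass
class VideoClip:
    name: str
    start: float
    duration: float
    caption: Optional[CaptionClip] = None


def _resync(caption: CaptionClip, target: VideoClip) -> CaptionClip:
    # Reflect the target clip's time information in the caption clip.
    return replace(caption, start=target.start, duration=target.duration)


def move_caption(source: VideoClip, target: VideoClip) -> None:
    """Move the caption from the original clip onto the target clip."""
    if source.caption is None:
        raise ValueError("source clip has no caption to move")
    target.caption = _resync(source.caption, target)
    source.caption = None


def copy_caption(source: VideoClip, target: VideoClip) -> None:
    """Copy the caption from the original clip and paste it on the target clip."""
    if source.caption is None:
        raise ValueError("source clip has no caption to copy")
    target.caption = _resync(source.caption, target)
```

After either operation, the caption on the target clip carries the target's timing, so a subtitle file generated afterwards would follow the modified synchronization information.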

The sensor unit 140 may be configured to include a touch sensor, a geomagnetic sensor, an acceleration sensor, a proximity sensor, and the like. In order to understand the various embodiments of the present disclosure, a detailed description of the remaining sensors in addition to the description of the touch sensor is omitted.

The touch sensor is a sensor capable of sensing a touch on the display unit 110 input by the user. The touch sensor may be classified as an electrostatic type or a piezoelectric type according to the method of sensing a user's touch. The touch sensor according to an embodiment of the present invention may be implemented by either of the two methods. The touch sensor may be included in the display unit 110 together with the display panel.

The touch sensor refers to a sensor that can select various video clips, caption clips, keypads, icons, etc. displayed on the display screen by pressing the display 110 with a body such as a finger or a detectable input means. The touch sensor uses capacitance change, resistance change, or light quantity change.

The communication unit 150 may include a wireless LAN module (not shown), a short range communication module (not shown), and a connector (not shown).

The communication unit 150 may include at least one of a wireless LAN module and a short range communication module. For example, it may include only a wireless LAN module, only a short range communication module, or may include both a wireless LAN module and a short range communication module.

The camera unit 160 may capture a still image or a video under the control of the controller 130. In some cases, the camera unit 160 may include two or more cameras. The camera unit 160 may be provided in the housing of the user terminal 100 or may be connected to the user terminal 100 using a separate connection means. The camera unit 160 may include an auxiliary light source (e.g., a flash (not shown)) that provides the amount of light required for photographing.

The camera unit 160 may detect a movement or a shape of a user and transmit it to the controller 130 as an input for executing or controlling an application. As an example, the movement of the user refers to the movement of the user's hand detected through the camera. The shape of the user may mean a body shape of a user's face, hand, or the like detected through the camera.

The video editing unit 170 may edit a video and / or a subtitle file. That is, the video editing unit 170 generates at least one video clip including the start frame or the last frame in the edited video, and copies various clips, moves the order of clips, and deletes clips in the generated video clip units. You can perform the video editing function.
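
The clip-unit operations named above (copying a clip, moving a clip to a new position, deleting a clip) can be illustrated with plain list manipulation. The function names and the use of strings as stand-ins for clips are assumptions made for this sketch; each function returns a new list and leaves its input unchanged.

```python
from typing import List


def copy_clip(clips: List[str], index: int) -> List[str]:
    """Return a new clip order with a duplicate of clips[index] placed right after it."""
    return clips[:index + 1] + [clips[index]] + clips[index + 1:]


def move_clip(clips: List[str], src: int, dst: int) -> List[str]:
    """Return a new clip order with the clip at src moved to position dst."""
    result = list(clips)
    clip = result.pop(src)
    result.insert(dst, clip)
    return result


def delete_clip(clips: List[str], index: int) -> List[str]:
    """Return a new clip order without the clip at index."""
    return clips[:index] + clips[index + 1:]
```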

The video editing unit 170 may display a representative frame in the clip as a thumbnail image on a part of the screen in order to display the clip. That is, the video clip may be displayed as a thumbnail image in the clip display area 115. Then, the manipulation of the thumbnail image displayed in the icon format in the clip display area 115 is performed on the corresponding movie clip. In addition, a user operation of editing a caption clip on a video clip of the clip display area 115 may be performed.

The video editing unit 170 may be designed in an independent configuration that performs various video and subtitle editing exclusively under the control of the controller 130. Alternatively, the video editing unit 170 may be supplied through an online or offline application program and installed in the user terminal 100 to execute video editing under the control of the controller 130.

According to various embodiments of the present disclosure, the editing operation for the caption clip input on the video clip may be directly processed by the controller 130 or may be processed by the video editing unit 170.

The power supply unit 180 is a component for supplying power to each component inside the user terminal 100. The power supply unit 180 may include a rechargeable battery and a power adapter capable of charging the rechargeable battery.

The data bus 190 is a component that provides a path through which data communication is possible to each component inside the user terminal 100 under the control of the controller 130.

FIG. 3 is a diagram illustrating a system hierarchy structure of a user terminal according to an exemplary embodiment.

Referring to FIG. 3, an operating system (OS) 220 controls the general operation of the hardware 210 and manages the hardware 210. In other words, the OS 220 is a layer that performs basic functions such as hardware management, memory, and security. The OS 220 may include modules such as a display driver for driving the display unit 110, a communication driver for transmitting and receiving data, a camera driver for driving a camera, an audio driver for driving an audio unit, an image processing codec for processing a video signal, and a power manager. It may also include libraries and runtimes that developers can access.

A framework layer 230 is located above the OS 220. The framework layer 230 connects the application layer 240 and the OS layer 220. The framework layer 230 includes a location manager, a notification manager, and a frame buffer for displaying an image on the display unit.

In the upper layer of the framework layer 230, an application layer 240 for implementing various functions of the user terminal 100 is located. For example, various applications may be included, such as a call application 241, a video editing application 242, a camera application 243, a browser application 244, and a gesture application 245.

When a user control command is input to the user terminal 100, the command is transmitted from the application layer 240 down to the hardware 210, a specific application corresponding to the input control command is executed, and the result may be displayed on the display 110.

Displaying a screen on the display unit 110 of the user terminal 100 will be described in more detail with reference to FIG. 4.

FIG. 4 is a block diagram illustrating a configuration of controlling an operation of a display unit using a frame buffer in a user terminal.

Referring to FIG. 4, the frame buffer 131 is configured to buffer an image frame to be displayed on the display 110. For example, an image frame digitally processed by a GPU (not shown) is stored in the frame buffer 131 in a bitmap form. In this case, the buffering area of the frame buffer 131 is allocated according to the maximum pixel size supported by the display 110. For example, when the maximum pixel that can be displayed on the display 110 is 1024 × 1200, the frame buffer 131 allocates a buffer storage area so that an image having a size of 1024 × 1200 bitmap format can be stored. The display driver 132 analyzes an image frame stored in a bitmap format in the frame buffer 131 and converts the image frame into an image source signal. The display driver 132 provides an image source signal to the display 110 to drive the display 110 to display an image frame.
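
To make the buffer allocation concrete, the sketch below computes the storage needed for a 1024 × 1200 bitmap. The source does not state a pixel depth, so the 4 bytes per pixel (for example, 8-bit RGBA) used here is an assumption, and the function names are hypothetical.

```python
def frame_buffer_size(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Bytes needed to hold one uncompressed bitmap frame.

    bytes_per_pixel=4 is an assumed pixel depth, not a value from the source.
    """
    return width * height * bytes_per_pixel


def allocate_frame_buffer(width: int, height: int, bytes_per_pixel: int = 4) -> bytearray:
    """Allocate a zero-filled buffer sized for the maximum supported resolution."""
    return bytearray(frame_buffer_size(width, height, bytes_per_pixel))


size = frame_buffer_size(1024, 1200)  # 1024 * 1200 * 4 = 4,915,200 bytes
```

Under that assumption, one frame occupies 4,915,200 bytes, roughly 4.7 MiB.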

The hardware configuration of the user terminal 100 has been described above to the extent necessary for the following description. The user terminal 100 may be upgraded in various ways, and the configuration and function of the hardware may be implemented differently according to the upgrade. However, as will be described below, as long as the method of realizing the technical idea of the present invention, which allows video and subtitles to be edited in clip units, is the same, such cases are also considered to fall within the technical idea of the present invention.

Subtitle input by video clip unit - First embodiment

A first embodiment according to the present invention relates to a method for inputting an arbitrary subtitle on a video clip basis. Here, the video clip refers to a video (eg, a plurality of frames) of a partial section extracted from the editing target video according to a user's selection.

The video clip described below may be pre-generated by the user and stored in the storage 120, or may be generated by the user selecting a frame in real time from the video. Alternatively, the user may download a previously generated video clip from an external server.

FIGS. 5 to 11 are diagrams illustrating a process of inputting a subtitle in units of a video clip according to a first embodiment of the present invention.

Referring to FIG. 5, the display unit 110 of the user terminal 100 may include a video display area 111 displaying a preview screen of at least one video, a marker area 113 displaying the location of at least one video, and a clip display area 115 displaying at least one video clip in thumbnail form.

For example, in FIG. 5, a preview screen for the video B is displayed at the center of the video display region 111. When the user performs a gesture of dragging to the clip display region 115 while touching the movie B display region, the clip display region 115 displays a thumbnail of the movie clip corresponding to the movie B (clip B in FIG. 5). May be displayed. Here, the clip display area 115 is a work space for editing a video clip and a subtitle clip by the user. The user may select a video clip or subtitle clip displayed on the clip display area 115 to perform an operation such as editing.

In this way, when the user executes a gesture of touching and dragging the video A, the video B, the video C, and the video D displayed on the video display area 111 to the clip display area 115, four video clips may be displayed as thumbnails in the clip display area 115. Although four video clips are displayed in the clip display area 115 shown in FIG. 5, more or fewer than four video clips may be displayed as thumbnails.

The user may search for at least one video by performing a gesture of touching at least one video displayed on the video display area 111 and sliding left or right.

Alternatively, the user may search for at least one video by touching a triangular icon displayed on the marker area 113 and performing a gesture of sliding left and right.

The video display region 111, the marker region 113, and the clip display region 115 displayed in FIG. 5 refer to a work space capable of processing a video in units of video clips. When the video editing or subtitle editing is completed by the user and the corresponding work is stored, the video clip and the subtitle clip as the work target may be converted into the video file and the subtitle file, respectively, and stored in the storage 120.

In this case, the video file and the subtitle file may be stored in the storage 120 in a separate file format. Alternatively, the storage 120 may be stored in an integrated file format in which a subtitle file is included in the video file.
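
The source does not name a subtitle file format. As one possible illustration of converting caption clips into a separate subtitle file, the sketch below serializes them in the widely used SubRip (SRT) text layout. The choice of SRT, the tuple layout `(start, duration, text)`, and the function names are all assumptions made for this example.

```python
from typing import Iterable, Tuple


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(captions: Iterable[Tuple[float, float, str]]) -> str:
    """Serialize (start, duration, text) caption clips as SRT text.

    Each entry is a sequence number, a time range, and the caption text;
    entries are separated by a blank line.
    """
    blocks = []
    for index, (start, duration, text) in enumerate(captions, start=1):
        time_range = f"{format_timestamp(start)} --> {format_timestamp(start + duration)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    return "\n".join(blocks)
```

For a caption attached to a clip that starts at 4 seconds and plays for 6 seconds, `to_srt([(4.0, 6.0, "ABCDEF")])` yields an entry spanning `00:00:04,000 --> 00:00:10,000`.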

Referring to FIG. 6, a user executes a long press gesture (shown as A in FIG. 6) on at least one video clip (clip B) displayed in the clip display area 115. Here, the long press gesture refers to a user gesture of contacting the user's finger, stylus pen, electronic pen, etc. with the display 110 for a predetermined time (for example, 1 second to 2 seconds or more). In this case, a finger, a stylus pen, an electronic pen, or the like may be implemented in an indirect contact manner in addition to the direct contact method with the display 110. If the predefined time is changeable by the user, the user may selectively set the desired time.

When a user inputs a long press gesture to at least one video clip (eg, clip B), the user terminal 100 detects that the long press gesture A is input and enters a video clip editing mode. When entering the clip editing mode, the user terminal 100 may display icons (X, Y) for editing a video clip in one region of at least one video clip (eg, clip B). Here, the first icon X is an input icon for inputting a new subtitle clip corresponding to the video clip, and the second icon Y is a delete icon for deleting the video clip.

Here, the first and second icons X and Y may be configured of various types of figures, images, colors, or a combination thereof. In FIG. 7, the first icon X is displayed as a rectangular box, but the first icon X may be configured as a pencil or pen icon. In addition, although the second icon Y is displayed in a form in which the “×” figure and the “○” figure are superimposed, the second icon Y may be configured as various other kinds of icons.

As shown in FIG. 8, the user performs a one-touch gesture (marked B in FIG. 8) in order to execute the first icon X. Here, the one-touch gesture B refers to a user gesture of contacting the user's finger, stylus pen, electronic pen, etc. with the display 110 for a defined time shorter than the touch time of the long press gesture A. One-touch gestures may also be implemented to touch the display 110 directly or indirectly.
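
The distinction between the long press gesture A and the one-touch gesture B can be sketched as a simple comparison of contact duration against a threshold. The 1.0-second default below is an assumption taken from the lower end of the example range given in the text, and `classify_touch` is a hypothetical name; the threshold is a parameter because the text allows the user to change the predefined time.

```python
LONG_PRESS_THRESHOLD_S = 1.0  # assumed default; the text gives "1 second to 2 seconds or more" as an example


def classify_touch(duration_s: float, threshold_s: float = LONG_PRESS_THRESHOLD_S) -> str:
    """Classify a touch by how long the contact lasted."""
    if duration_s < 0:
        raise ValueError("duration must be non-negative")
    return "long_press" if duration_s >= threshold_s else "one_touch"
```

A user who prefers a longer hold could call `classify_touch(duration, threshold_s=2.0)`, in which case a 1.5-second contact is treated as a one-touch gesture.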

When the user inputs the one-touch gesture to the first icon X, the user terminal 100 detects that the one-touch gesture is input, and enters a caption input mode for the corresponding video clip. When the user terminal 100 enters the caption input mode, the user terminal 100 displays an input window 117 for caption input for a video clip (eg, clip B) to which a one-touch gesture is input (see FIG. 9).
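
The mode changes described so far (a long press on a clip enters the clip editing mode, and a one-touch on the first icon X enters the caption input mode) can be modeled as a small transition table. The `Mode` enum, the gesture and target strings, and `next_mode` are hypothetical names for this sketch; only the two transitions stated in the text are included.

```python
from enum import Enum


class Mode(Enum):
    BROWSE = "browse"                # no clip selected for editing
    CLIP_EDIT = "clip_edit"          # icons X and Y are shown on the clip
    CAPTION_INPUT = "caption_input"  # input window 117 is shown


# (current mode, gesture, touched target) -> next mode
TRANSITIONS = {
    (Mode.BROWSE, "long_press", "clip"): Mode.CLIP_EDIT,
    (Mode.CLIP_EDIT, "one_touch", "icon_x"): Mode.CAPTION_INPUT,
}


def next_mode(mode: Mode, gesture: str, target: str) -> Mode:
    """Return the mode after a gesture; unrecognized input leaves the mode unchanged."""
    return TRANSITIONS.get((mode, gesture, target), mode)
```

Keeping the transitions in a table makes it straightforward to add further gestures, such as a one-touch on the delete icon Y, without changing `next_mode`.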

As shown in FIG. 9, the input window 117 for caption input for the clip B may include a display window 117-1 on which the input caption is displayed and a virtual keyboard 117-2 for typing the characters of the caption. The input window 117 may be displayed on the display 110 so as to cover some or all of the video display area 111, the marker area 113, and the clip display area 115.

As illustrated in FIG. 10, when a user inputs a caption using the virtual keyboard 117-2, the input caption may be displayed on the display window 117-1. When the user completes the caption input, the user terminal 100 generates a caption clip based on the previously input caption.

As shown in FIG. 11, when the user terminal 100 detects that the caption input is completed by the user, the user terminal 100 displays the input caption (indicated by ST-B in FIG. 11) on the corresponding video clip (clip B) of the clip display area 115. The user terminal 100 extracts start time and playback time information for the video clip (clip B), and reflects the extracted start time and playback time information in the subtitle clip (ST-B). Formation of a subtitle file using the subtitle clip (ST-B) having start time and playback time information will be described separately below.

As described above, according to the first embodiment of the present invention, the user can generate a caption clip by inputting a caption in units of video clips. Compared with the conventional method of inputting captions using a timeline, the user can therefore enter subtitles intuitively despite the size constraints of the display screen.
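
One way to picture the extraction of start time and playback time is shown below. The sketch assumes that the clips in the clip display area play back to back in their displayed order, so that a clip's start time is the sum of the playback times of the clips before it; the source does not state this, and the function names are hypothetical.

```python
from typing import List, Tuple


def clip_start_times(durations: List[float]) -> List[float]:
    """Start time of each clip, assuming the clips play consecutively in order."""
    starts = []
    elapsed = 0.0
    for duration in durations:
        starts.append(elapsed)
        elapsed += duration
    return starts


def caption_timing(durations: List[float], clip_index: int) -> Tuple[float, float]:
    """Return (start time, playback time) to reflect in the caption of one clip."""
    starts = clip_start_times(durations)
    return starts[clip_index], durations[clip_index]
```

With clips A, B, and C lasting 4, 6, and 3 seconds, `caption_timing([4.0, 6.0, 3.0], 1)` gives a caption on clip B a start time of 4.0 seconds and a playback time of 6.0 seconds.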

Editing subtitles input by video clip unit - Second embodiment

According to a second exemplary embodiment of the present invention, a user may intuitively edit a subtitle in a movie clip unit with respect to a video clip in which subtitles are input.

FIGS. 12 to 16 are views for explaining a process of editing a subtitle input to a video clip according to an embodiment of the present invention.

FIG. 12 illustrates a process in which a user inputs a long press gesture (indicated by an area A in FIG. 12) on at least one of the plurality of video clips (e.g., clip B) shown in FIG. 11. In FIG. 12, subtitles are input to only one video clip (clip B), and subtitles are not input to the remaining video clips (clip A, clip C, and clip D).

As illustrated in FIG. 13, when a long press gesture is input by a user to at least one of a plurality of video clips, various types of indicators notifying the contact of the long press gesture may be displayed in one region of the video clip (clip B).

This is described more specifically as follows. When the user terminal 100 detects a long press gesture input by the user on at least one video clip (e.g., clip B), the user terminal 100 enters a video clip editing mode. When the user terminal 100 enters the video clip editing mode, the user terminal 100 may display icons (X, Y) or the like for editing the video clip in one region of the video clip (e.g., clip B) to which the long press gesture is input.

Here, the first icon (X) is a caption editing icon for executing caption editing, and the second icon (Y) is a deletion icon for deleting a video clip.

In this case, the first and second icons X and Y may be configured of various types of figures, images, colors, or a combination thereof. In FIG. 13, the first icon X is displayed in the form of a rectangular box, but in addition, the first icon X may be configured in the form of an icon such as a pencil or a pen. In addition, although the second icon Y is displayed in a form of overlapping the “×” figure and the “○” figure, the second icon Y may be configured in various icon forms such as a trash can.

As shown in FIG. 13, while the edit icon is displayed on the video clip, the user terminal 100 may edit the caption clip by an edit gesture input by the user.

When the user executes the one-touch gesture on the first icon X, the user terminal 100 detects the one-touch gesture input by the user and displays the input window 117 for editing the caption of the video clip (clip B in FIG. 14) to which the one-touch gesture is input (see FIG. 14).

As shown in FIG. 14, the input window 117 for subtitle editing may include a display area 117-1 in which the contents of the subtitle clip to be edited are displayed, and a virtual keyboard area 117-2 for typing characters or the like to add or delete contents in the subtitle clip. The input window 117 may be displayed so as to cover some or all of the video display area 111, the marker area 113, and the clip display area 115.

In this case, the display area 117-1 may display the contents of the caption clip previously input to the video clip. By touching a keypad provided in the virtual keyboard area 117-2, the user may modify or edit the contents of a previously inputted subtitle clip displayed on the display area 117-1.

Referring to FIGS. 14 and 15, when the user terminal 100 detects a one-touch gesture (B in FIG. 15) input by the user on a key (e.g., the backspace key “←”) of the virtual keyboard area 117-2, it executes the command corresponding to the touched key. For example, when the backspace key is touched, one letter of the previously input subtitle (e.g., "ABCDEF") may be deleted. According to a subtitle editing gesture input by the user, the user terminal 100 may perform various editing functions on a per-video-clip basis, such as modifying part of a previously input subtitle or adding new content.

As illustrated in FIG. 16, when the user terminal 100 detects that the caption editing gesture by the user is completed, the user terminal 100 displays the edited caption (ST-B) on the corresponding video clip (e.g., clip B) in the clip display area 115. The user terminal 100 stores the content of the edited caption in the storage 120.

As described above, according to the second embodiment of the present invention, the user can edit subtitles on a per-video-clip basis without running a subtitle editing program that uses a timeline, and can therefore edit subtitles intuitively.

Deleting subtitles input by video clip unit - Third embodiment

According to a third embodiment of the present invention, a subtitle clip may be deleted on a per-video-clip basis during subtitle editing. This is described in more detail below with reference to the drawings.

FIGS. 17 to 20 are diagrams illustrating a process of deleting a caption input to a video clip according to an embodiment of the present invention.

Referring to FIG. 17, a preview screen for a plurality of videos is displayed in the video display region 111. In particular, a preview screen for the video B is displayed in the center area of the video display region 111. In the clip display area 115, a plurality of video clips are displayed in a thumbnail format. Pre-input subtitles ST-B are displayed on at least one clip (eg, clip B) of the plurality of video clips.

As shown in FIG. 17, when the user terminal 100 detects a long press gesture (marked as A in FIG. 17) input by the user on the subtitle clip ST-B, the user terminal 100 enters a subtitle editing mode for the corresponding subtitle clip.

In more detail, when the user inputs the long press gesture on at least one subtitle clip ST-B, the user terminal 100 detects the long press gesture and displays, in one region of the subtitle clip ST-B on which the gesture was input, a delete icon Y for deleting the subtitle and an edit indicator Z indicating that the subtitle clip is in the edit mode (see FIG. 18). Here, the delete icon Y is an icon for executing deletion of the corresponding subtitle clip. The edit indicator Z is an indicator that shows the user that the subtitle on which the long press gesture was input has entered the edit mode.

As shown in FIG. 18, the delete icon Y is displayed as an icon in which the "×" figure and the "○" figure are superimposed. However, the delete icon Y may also take various other icon forms, such as the shape of a trash can. In addition, the editing indicator Z may be configured as a rectangular box surrounding the caption with a semi-transparent color inside the box. Alternatively, the editing indicator Z may use various highlighting methods that let the user visually recognize the subtitle, such as a sparkling effect, a vibration effect, or a blinking effect.

Referring to FIG. 19, when the user terminal 100 detects a one-touch gesture (indicated by B in FIG. 19) of the delete icon Y input by the user, the user terminal 100 deletes the corresponding caption clip ST-B.

That is, the user terminal 100 deletes the caption clip ST-B from the video clip (e.g., clip B) in the clip display area 115 (see FIG. 20).

As described above, according to the third embodiment of the present invention, the user can select a subtitle displayed on a video clip and delete the corresponding subtitle clip, that is, delete subtitles one subtitle clip at a time, which improves user convenience.

Moving / Copying Subtitle Clips Input by Movie Clip Unit - Embodiment 4

According to the fourth embodiment of the present invention, a caption clip is input for each video clip, and the caption clip can be easily moved or copied to another video clip directly on the video clips. This is described in more detail below with reference to the drawings.

FIGS. 21 to 24 are diagrams for describing a process of moving a caption input to a video clip to another video clip according to an embodiment of the present invention.

FIG. 21 illustrates a state in which the user has input a long press gesture on at least one video clip (clip B) in the clip display area 115 and then inputs a first movement gesture (indicated by C1 in FIG. 21) for moving the subtitle clip ST-B, previously input to the video clip (clip B), in a specific direction. Here, the first movement gesture C1 refers to a gesture in which the user's finger, a stylus pen, an electronic pen, or the like is in contact with the subtitle clip ST-B to be moved.

As described above, the long press gesture A refers to a gesture of contacting the object for a predetermined time, and the one-touch gesture B refers to a gesture of contacting the object for a shorter time than the contact time of the long press gesture.

In contrast, the first movement gesture C1 refers to a gesture in which a finger or the like is continuously contacted with the target item until the movement of the subtitle clip which is the movement target item is completed.

Therefore, the first movement gesture C1 may remain in contact for longer than the contact time of the long press gesture A. However, unlike the long press gesture A, the first movement gesture C1 includes movement in a specific direction rather than staying in the initial contact area.
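The distinction drawn above between the one-touch gesture B, the long press gesture A, and the movement gesture C1 can be sketched as a small classifier. This is only an illustrative sketch: the threshold values, the `Touch` record, and the function name `classify_gesture` are assumptions introduced here and do not appear in the disclosure, which gives no numeric values.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the disclosure only speaks of "a predetermined time".
LONG_PRESS_SECONDS = 0.5
MOVE_TOLERANCE_PX = 10.0


@dataclass
class Touch:
    duration: float  # seconds the finger or pen stayed in contact
    dx: float        # horizontal displacement from the initial contact point (pixels)
    dy: float        # vertical displacement from the initial contact point (pixels)


def classify_gesture(touch: Touch) -> str:
    """Return 'move', 'long_press', or 'one_touch' for a completed contact."""
    distance = (touch.dx ** 2 + touch.dy ** 2) ** 0.5
    # A movement gesture leaves the initial contact area, however long it lasts.
    if distance > MOVE_TOLERANCE_PX:
        return "move"
    # Staying in place for at least the predetermined time is a long press.
    if touch.duration >= LONG_PRESS_SECONDS:
        return "long_press"
    # Anything shorter that stays in place is a one-touch gesture.
    return "one_touch"
```

Checking displacement before duration reflects the point made above that a movement gesture may last longer than a long press and is distinguished by its movement, not by its contact time.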

Referring to FIG. 22, it can be seen that the caption clip ST-B is moving from clip B to clip C by the second movement gesture C2. Here, the second movement gesture C2 means a continuous gesture of the first movement gesture C1.

Referring to FIG. 23, it can be seen that the caption clip ST-B is finally moved to the clip D and displayed by the third movement gesture C3. Here, the third movement gesture C3 means a continuous gesture of the second movement gesture C2.

Referring to FIG. 24, when the subtitle clip ST-B is moved from clip B to clip D by the user, the subtitle displayed in clip B is erased, and the subtitle clip ST-B is displayed in clip D.

As described above, according to the fourth embodiment of the present invention, the user can move or copy a subtitle clip to another video clip with an intuitive gesture on the video clip, so that subtitle clips can be moved and copied conveniently.
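As a rough illustration of the difference between moving and copying a subtitle clip, the association between video clips and subtitle clips can be modeled as a mapping from clip name to subtitle text. The dictionary model and the function names below are assumptions made for this sketch, and it assumes the target clip has no subtitle yet.

```python
def move_subtitle(subtitles: dict, source: str, target: str) -> None:
    """Move the subtitle of `source` to `target`; the source clip loses it."""
    subtitles[target] = subtitles.pop(source)


def copy_subtitle(subtitles: dict, source: str, target: str) -> None:
    """Copy the subtitle of `source` to `target`; the source clip keeps it."""
    subtitles[target] = subtitles[source]


# Moving ST-B from clip B to clip D leaves clip B without a subtitle.
clips = {"B": "ABCDEF"}
move_subtitle(clips, "B", "D")
```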

Synchronization of Video Clips and Subtitle Clips - Embodiment 5

According to a fifth embodiment of the present invention, a method of converting a caption clip input to a video clip into a caption file using time information extracted from the video clip is provided. A more detailed description thereof will be described below with reference to a separate drawing.

FIGS. 25 to 28 are diagrams illustrating a process of converting an input caption clip into a caption file based on time information of a video clip according to an embodiment of the present invention.

Referring to FIG. 25, a plurality of video preview screens are displayed in the video display region 111, and four video clips are displayed in the clip display region 115. A subtitle clip is input to each of the four video clips.

When a caption clip is input for each video clip by the user, the user terminal 100 extracts time information for each video clip. The extracted time information is reflected in the corresponding subtitle clip, and the subtitle clip is converted into a subtitle file based on the time information.

Referring to FIG. 26, the user terminal 100 analyzes time information of the four video clips 1100, 1200, 1300, and 1400. For example, the user terminal 100 extracts first time information (Time 1) for the first video clip 1100. In this case, the first time information (Time 1) includes start time information and playback time information of the video clip. In the same manner, the user terminal 100 extracts second time information (Time 2) for the second video clip 1200, third time information (Time 3) for the third video clip 1300, and fourth time information (Time 4) for the fourth video clip 1400.

The user terminal 100 detects that a first subtitle clip 2100 is input to clip A 1100 by the user, a second subtitle clip 2200 is input to clip B 1200, a third subtitle clip 2300 is input to clip C 1300, and a fourth subtitle clip 2400 is input to clip D 1400.

Referring to FIG. 27, the user terminal 100 analyzes the playback order and relative lengths of the four video clips 1100, 1200, 1300, and 1400 based on the extracted time information. The user terminal 100 generates virtual timelines for the four video clips according to the analyzed relative lengths.

As shown in FIG. 27, clip A is located to the left of clip B. That is, the plurality of clips displayed in the clip display area may be played sequentially from left to right. Therefore, the start time of clip A is earlier than the start time of clip B. Once the start time information of clip A has been extracted, the start time of clip B can be predicted by adding the play time of clip A to the start time of clip A.
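The prediction described above, in which each clip starts where the previous clip ends, amounts to a running sum of play times. The following is a minimal sketch under that reading; the function name `predict_start_times` and the use of seconds as the unit are assumptions introduced for illustration.

```python
from __future__ import annotations

from itertools import accumulate


def predict_start_times(first_start: float, play_times: list[float]) -> list[float]:
    """Return the start time of each clip, given clips played left to right.

    `first_start` is the start time of the first clip and `play_times` holds
    the play time of every clip in playback order (both in seconds).
    """
    if not play_times:
        return []
    # start(n+1) = start(n) + play_time(n); the last play time is not needed.
    return list(accumulate(play_times[:-1], initial=first_start))


# Four clips A to D lasting 4 s, 3 s, 5 s and 2 s, starting at time 0.
starts = predict_start_times(0.0, [4.0, 3.0, 5.0, 2.0])
# starts == [0.0, 4.0, 7.0, 12.0]
```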

Referring to FIG. 28, the user terminal 100 analyzes to which video clip each of the four subtitle clips 2100, 2200, 2300, and 2400 is input.

The user terminal 100 reflects the time information extracted for the four video clips in the four subtitle clips 2100, 2200, 2300, and 2400, respectively.

The user terminal 100 may combine the four video clips of which the input of the subtitle clip is completed into one video file and store the same in the storage 120. In this case, the user terminal 100 may also generate four corresponding subtitle clips as one subtitle file, and store them in the storage 120.
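To make the conversion concrete, the sketch below writes subtitle clips that carry a start time and a play time into a single subtitle file body. The disclosure does not name a subtitle file format; the SubRip (SRT) layout is used here purely as an assumed example, and the function names and the tuple layout are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(subtitle_clips) -> str:
    """Build SRT text from (start_seconds, play_seconds, text) tuples.

    Each tuple is one subtitle clip with the time information of its video
    clip already reflected in it.
    """
    blocks = []
    for index, (start, play, text) in enumerate(subtitle_clips, start=1):
        end = start + play  # the subtitle is shown for the whole video clip
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

In this sketch each subtitle clip spans exactly the duration of its video clip, which matches the per-clip management described in this embodiment; finer timing inside a clip is outside what the sketch models.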

Alternatively, the user terminal 100 may store in the storage 120 a single integrated file in which the caption file generated from the four caption clips is included in the video file generated from the four video clips.

If a user requests to play a completed video by combining four video clips, the user terminal 100 reads a video file stored in the storage 120, encodes the video, and plays the video through the display 110. In addition, the user terminal 100 may read the subtitle file stored in the storage unit 120, encode the subtitle file, and play the same together on the display unit 110 where the video is being played.

Alternatively, when the caption file is included in the video file, the integrated video file is read, encoded, and played back through the display 110.

As described above, according to the fifth embodiment of the present invention, even if the user is unaware of the time information of a video clip and simply inputs a caption clip on the video clip, the caption clip can be converted into a caption file based on the time information of the video clip.

Editing a plurality of subtitles input to a plurality of video clips - Embodiment 6

According to the sixth embodiment of the present disclosure, when a plurality of caption clips are respectively input to a plurality of video clips, the input caption clips may be deleted, combined, or moved. This is described in more detail below with reference to the drawings.

FIGS. 29 to 38 are views illustrating a process of editing a plurality of input caption clips according to an embodiment of the present invention.

Referring to FIG. 29, a preview screen of a plurality of videos is displayed in the video display area 111. The clip display area 115 displays a plurality of video clips in the form of thumbnails.

A plurality of subtitle clips ST-A, ST-B, ST-C, and ST-D are displayed in one region of the plurality of video clips (clip A, clip B, clip C, clip D).

When the user terminal 100 detects a long press gesture (indicated by A in FIG. 29) input by the user on at least one subtitle clip ST-B among the plurality of subtitles, the plurality of subtitle clips (ST-A, ST-B, ST-C, ST-D) enter the subtitle editing mode.

As illustrated in FIG. 30, when the plurality of subtitle clips ST-A, ST-B, ST-C, and ST-D enter the subtitle editing mode, the user terminal 100 may display the deletion icons Y1, Y2, Y3, and Y4 in one region of each subtitle clip. At the same time, the user terminal 100 may display the editing indicators Z1, Z2, Z3, and Z4 indicating that each subtitle clip has entered the editing mode.

In more detail, when the user inputs the long press gesture on at least one subtitle clip ST-B, the user terminal 100 detects the long press gesture and displays the delete icons Y1, Y2, Y3, and Y4 for deleting subtitles and the editing indicators Z1, Z2, Z3, and Z4 indicating that a subtitle clip is in the edit mode, not only for the subtitle clip ST-B on which the gesture was input but also for the other subtitle clips (see FIG. 30). Here, the deletion icons Y1, Y2, Y3, and Y4 are icons for deleting the subtitles input to the video clips. The editing indicators Z1, Z2, Z3, and Z4 are indicators that highlight the editable subtitles.

As shown in FIG. 30, the delete icons Y1, Y2, Y3, and Y4 are displayed in a form in which the "×" figure and the "○" figure are superimposed, but they may also take various other icon forms such as a trash can. The editing indicators Z1, Z2, Z3, and Z4 may be configured as a rectangular box surrounding the subtitle with a semi-transparent color inside the box. Alternatively, the editing indicators Z1, Z2, Z3, and Z4 may use various highlighting methods that let the user visually recognize the subtitles, such as a sparkling effect, a vibration effect, or a flickering effect.

Referring to FIG. 31, when the user terminal 100 detects a one-touch gesture (indicated by B in FIG. 31) input by the user on the delete icon Y4, the user terminal 100 deletes the subtitle clip ST-D input to clip D.

Referring to FIG. 32, when the user terminal 100 detects a delete command for a specific caption clip (e.g., ST-D) input to a specific video clip (e.g., clip D), the user terminal 100 deletes the subtitle displayed in one region of clip D.

Referring to FIG. 33, when the user inputs a long press gesture (denoted as A1 in FIG. 33) on at least one subtitle clip (e.g., ST-A) among the plurality of subtitle clips that have entered the edit mode, the subtitle clip ST-A on which the long press gesture was input may be moved or copied to another video clip.

In more detail, for the subtitle clip (e.g., ST-A) on which the long press gesture was input in the subtitle editing mode, the letters, symbols, numbers, shapes, or combination thereof constituting the subtitle clip may be displayed with a highlight effect (see FIG. 34).

In this case, the subtitle clip (e.g., ST-A) displayed with the highlight effect may be shown darker than the characters of the other subtitle clips. Alternatively, the subtitle clip to be moved or copied (e.g., ST-A) may be displayed so as to flicker, unlike the other subtitle clips, or may be displayed with an effect of scrolling from left to right or from right to left.

Referring to FIG. 35, when the user terminal 100 detects a long press gesture input by the user on a region of clip D, the user terminal 100 pastes the caption clip (e.g., ST-A) copied by the user into clip D.

Referring to FIG. 36, the user inputs a first touch gesture C1 on the subtitle clip to be moved or copied (e.g., ST-A) and drags it toward clip B. When the user inputs a second touch gesture C2 on that subtitle clip over clip B, the subtitle clip cut by the user (e.g., ST-A) is inserted into clip B and merged with the previously input subtitle clip ST-B. Here, combining subtitle clips means that a plurality of subtitle clips form one subtitle clip, on the same principle as combining multiple video clips into one video clip. The combined subtitle clips can be managed as one subtitle clip.

Referring to FIG. 37, when the user terminal 100 detects a command input by the user for combining the subtitle clip ST-A of clip A with the subtitle clip ST-B of clip B, the user terminal 100 combines the first subtitle clip (ST-A) and the second subtitle clip (ST-B) and converts them into the subtitle clip (ST-B) of the target video clip (e.g., clip B).

When the user terminal 100 detects a one-touch gesture (indicated by B in FIG. 38) input by the user in an area other than the clip display area 115, the user terminal 100 ends the editing, moving, or copying of the plurality of subtitle clips in progress and stores the video clips and subtitle clips as they were immediately before the end in the storage 120 (see FIG. 38).

As described above, according to the sixth exemplary embodiment of the present invention, a user interface can be provided with which the user can delete a subtitle clip input on a per-video-clip basis, or move, copy, or merge it into the subtitle clip of another video clip, without separately running a subtitle editing application.
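The merge behavior can be sketched with a mapping from clip name to subtitle text. The function name, the newline separator, and the ordering (existing subtitle first, then the incoming one) are all assumptions; the disclosure states only that the combined subtitle clips are managed as one subtitle clip of the target video clip.

```python
def merge_subtitles(subtitles: dict, source: str, target: str, separator: str = "\n") -> None:
    """Cut the subtitle of `source` and merge it into the subtitle of `target`."""
    moved = subtitles.pop(source)  # the source clip no longer has a subtitle
    existing = subtitles.get(target)
    if existing is None:
        # Nothing to merge with: this degenerates to a plain move.
        subtitles[target] = moved
    else:
        # The two subtitle clips become one subtitle clip on the target.
        subtitles[target] = existing + separator + moved
```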

According to various embodiments of the present disclosure, after generating a video clip by clipping a video, a subtitle clip may be input on the video clip so that the user may intuitively input and edit the subtitle clip.

In addition, the caption clip input to the video in the present invention can be managed in the unit of the video clip, thereby eliminating the cumbersome process of checking the time information on the video using the conventional timeline.

Clipping of Movies Containing Subtitles - Seventh Embodiment

According to the seventh embodiment of the present invention, when a video is clipped into at least two video clips, the subtitles previously input to the video may be generated as subtitle clips. This is described in more detail below with reference to the drawings.

Referring to FIG. 39, the video 300 may be composed of consecutive frames (frames 1 to 7). The subtitle file 400 is synchronized with the video 300. That is, when the video 300 is played on the display 110 of the user terminal 100, the subtitle 400 may be played on one region of the display 110.

The user can clip the video 300 into two video clips (clip A, clip B). In detail, the user may arbitrarily select at least one or more frames constituting the first video to generate a desired video clip.

That is, when the user selects a first frame and a last frame from the plurality of video frames, one video clip composed of the frames from the first frame through the last frame may be generated.

Referring to FIG. 39, when the first frame (frame 1) and the last frame (frame 4) of the video 300 are selected by the user, the user terminal 100 creates a first video clip (clip A) composed of the selected first to fourth frames. When the first frame (frame 5) and the last frame (frame 7) of the video 300 are selected by the user, the user terminal 100 creates a second video clip (clip B) composed of the selected fifth to seventh frames.

The user terminal 100 generates two video clips (clip A, clip B) by grouping the frames of the video according to the user's video clipping gesture, and at the same time may automatically clip the subtitles associated with the video.

That is, the user terminal 100 also performs clipping on the caption 400 using the synchronization information between the video 300 and the caption 400. In detail, the user terminal 100 analyzes the caption 400 displayed during the time span of the first video clip (clip A) of the video 300. According to the analysis result, the user terminal 100 clips the first subtitle synchronized with frames 1 to 3 (indicated by "Kanadaramabasa" in FIG. 39) into the first subtitle clip 2100, and clips the second subtitle synchronized with frame 4 (indicated by "ABCDEFG" in FIG. 39) into the second subtitle clip 2200-1. In the same manner, the user terminal 100 clips the second subtitle synchronized with frames 5 to 7 (indicated by "ABCDEFG" in FIG. 39) into the third subtitle clip 2200-2.
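The subtitle clipping in FIG. 39 can be read as intersecting each subtitle's frame range with each video clip's frame range. The sketch below follows that reading; the function name and the tuple layouts are assumptions, and frame ranges are treated as inclusive.

```python
def clip_subtitles(cues, clips):
    """Split subtitles along video clip boundaries.

    cues:  list of (text, first_frame, last_frame), inclusive frame ranges
    clips: dict mapping clip name to (first_frame, last_frame), inclusive
    Returns a dict mapping clip name to the subtitle pieces that fall inside it.
    """
    result = {}
    for name, (clip_first, clip_last) in clips.items():
        pieces = []
        for text, cue_first, cue_last in cues:
            # Keep only the part of the subtitle that overlaps this clip.
            first = max(cue_first, clip_first)
            last = min(cue_last, clip_last)
            if first <= last:
                pieces.append((text, first, last))
        result[name] = pieces
    return result


# The FIG. 39 example: frames 1-3 carry the first subtitle, frames 4-7 the second.
cues = [("Kanadaramabasa", 1, 3), ("ABCDEFG", 4, 7)]
clipped = clip_subtitles(cues, {"A": (1, 4), "B": (5, 7)})
# clipped["A"] == [("Kanadaramabasa", 1, 3), ("ABCDEFG", 4, 4)]
# clipped["B"] == [("ABCDEFG", 5, 7)]
```

The second subtitle straddles the boundary between clip A and clip B, so it yields two pieces, corresponding to subtitle clips 2200-1 and 2200-2 in the description.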

Referring to FIG. 40, a preview screen of two videos is displayed in the video display region 111, and two video clips are displayed in the clip display region 115.

In the clip display area 115, the first subtitle clip ST-A is displayed on the first video clip (clip A), and the second subtitle clip ST-B is displayed on the second video clip (clip B). In FIG. 40, only the contents of the first subtitle clip 2100 described above are displayed on the first subtitle clip ST-A, but the first subtitle clip ST-A includes both the first subtitle clip 2100 and the second subtitle clip 2200-1.

To check the contents of a subtitle clip that is composed of a plurality of subtitle clips, such as the first subtitle clip ST-A, the user may view the details of that subtitle clip.

Referring to FIG. 41, when the user terminal 100 detects a long press gesture (indicated by A in FIG. 41) input by the user on the first subtitle clip ST-A, it enters the edit mode for the first subtitle clip ST-A.

FIG. 42 illustrates a state in which the caption editing mode has been entered; for convenience of explanation, the delete icon and/or the edit icon are not shown on the caption clip. When the user inputs a touch gesture (indicated by D in FIG. 42) on the corresponding subtitle in the subtitle editing mode, the user terminal 100 displays the details of the touched subtitle.

Referring to FIG. 43, when a touch gesture D is input by the user while the subtitle clip ST-A input to clip A is in the edit mode, the user terminal 100 displays on the display 110 a separate window 119 showing the details of the subtitle clip ST-A.

As described above, the separate window 119 may display the details of the first subtitle clip (indicated as "Kanadaramabasa" in FIG. 43) and the second subtitle clip (indicated as "ABCDEFG" in FIG. 43).

According to the present invention, when a video having a caption is clipped, clipping can also be performed on the caption.

The above-described methods according to various embodiments of the present disclosure may be stored as code in a computer-readable storage medium. Code for performing the above-described methods may be stored in various types of recording media readable by an electronic device, such as RAM (Random Access Memory), flash memory, ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), a register, a hard disk, a removable disk, a memory card, a USB memory, and a CD-ROM.

Although exemplary embodiments and applications of the present invention have been shown and described, many changes and modifications are possible without departing from the scope and spirit of the present invention, as will be clearly understood by those skilled in the art. Accordingly, the described embodiments are to be considered illustrative and not restrictive, and the invention is not limited by the foregoing detailed description but may be modified within the technical scope of the claims.

The present invention can be used to edit video subtitles.

Claims

A subtitle editing apparatus comprising:

a display unit configured to display a video display area displaying at least one video preview screen and a clip display area displaying at least one video clip corresponding to the displayed preview screen; and

a control unit configured to, when a user gesture on a video clip displayed in the clip display area is detected, enter a subtitle editing mode for the video clip in response to the detected user gesture,

wherein, when the control unit detects a user gesture on the at least one video clip in the subtitle editing mode, the control unit edits a subtitle clip on the video clip according to the detected user gesture.
The subtitle editing apparatus of claim 1, wherein the control unit extracts start time and playback time information from the video clip, reflects the extracted start time and playback time information in a subtitle clip input to the video clip, and converts the subtitle clip into a subtitle file based on the start time and playback time information.
The subtitle editing apparatus of claim 1, wherein, when a caption input gesture is detected as the user gesture, the control unit displays an input window for caption input, generates a caption clip based on the caption input through the input window, and controls the caption clip to be displayed on the video clip.
The subtitle editing apparatus of claim 1, wherein, when a caption deletion gesture is detected as the user gesture, the control unit controls the caption clip input on the video clip to be deleted.
The subtitle editing apparatus of claim 1, wherein, when a caption movement gesture is detected as the user gesture, the control unit controls the caption clip displayed on an original video clip to be moved onto a target video clip.
The subtitle editing apparatus of claim 1, wherein, when a subtitle copy gesture is detected as the user gesture, the control unit copies the subtitle clip displayed on an original video clip and pastes the copied subtitle clip onto a target video clip.
A subtitle editing method using a subtitle editing apparatus capable of editing a video and subtitles displayed on a display unit, the method comprising:

displaying, on the display unit, a video display area displaying at least one video preview screen and a clip display area displaying at least one video clip corresponding to the displayed preview screen;

detecting a user gesture on a video clip displayed in the clip display area;

entering a subtitle editing mode for the video clip in response to the detected user gesture; and

when a user gesture on the at least one video clip is detected in the subtitle editing mode, editing a subtitle clip on the video clip according to the detected user gesture.
The method of claim 7, further comprising, after the editing of the subtitle clip:

extracting start time and playback time information from the video clip;

reflecting the extracted start time and playback time information in a caption clip input to the video clip; and

converting the subtitle clip into a subtitle file based on the start time and playback time information.
The method of claim 7, wherein the editing of the subtitle clip comprises:

detecting a caption input gesture as the user gesture;

displaying an input window for caption input in response to the detected caption input gesture;

generating a caption clip based on the caption input through the input window; and

displaying the generated caption clip on the video clip.
The method of claim 7, wherein the editing of the subtitle clip comprises:

detecting a subtitle deletion gesture as the user gesture; and

deleting a caption clip input on the video clip in response to the detected caption deletion gesture.
The method of claim 7, wherein the editing of the subtitle clip comprises:

detecting a subtitle movement gesture as the user gesture; and

moving the subtitle clip displayed on an original video clip onto a target video clip in response to the detected subtitle movement gesture.
The method of claim 7, wherein the editing of the subtitle clip comprises:

detecting a subtitle copy gesture as the user gesture;

copying a caption clip displayed on an original video clip in response to the detected caption copy gesture; and

pasting the copied subtitle clip onto a target video clip.