WO2017217612A1

WO2017217612A1 - Method for creating and sharing subtitles of video content by using one-touch feature

Info

Publication number: WO2017217612A1
Application number: PCT/KR2016/012880
Authority: WO
Inventors: 박문수
Original assignee: 주식회사 사이
Priority date: 2016-06-17
Filing date: 2016-11-10
Publication date: 2017-12-21
Also published as: WO2017217613A1

Abstract

A method for creating and sharing subtitles of video content by using a one-touch feature is provided. The method, which is a method implemented by a computer, comprises: a step of playing video content; a step of setting at least one time duration according to a touch operation by a user; a step of receiving, from the user, text corresponding to the at least one time duration; and a step of creating subtitles of the video content by combining the at least one time duration and the text corresponding thereto.

Description

How to create and share subtitles of video contents using one touch

The present invention relates to a method for generating and sharing captions of video content.

Korean Patent Publication No. 10-1419871 discloses a caption editing apparatus and a caption editing method. The caption editing method displays a video preview screen, displays a video clip corresponding to the preview screen, detects a user gesture for the video clip, and enters a caption editing mode for the video clip. The caption editing method may detect a user's gesture for inputting, deleting, moving, or copying a subtitle, and perform a subtitle editing operation in response to the detected gesture.

An object of the present invention is to provide a method for generating and sharing captions of video content.

Problems to be solved by the present invention are not limited to the above-mentioned problems, and other problems not mentioned will be clearly understood by those skilled in the art from the following description.

According to an aspect of the present invention, there is provided a computer-implemented method of generating and sharing captions of video content, the method including: playing back video content; setting at least one time interval (time duration) according to a user's touch operation; receiving text corresponding to the at least one time interval from the user; and generating a subtitle of the video content by combining the at least one time interval and the text corresponding to it. Setting the at least one time interval according to the user's touch operation includes: setting, according to the user's touch input, the touch input time point within the playback time of the video content as the start time of the time interval; and setting, according to the user's touch release, the touch release time point within the playback time of the video content as the end time of the time interval.

In some embodiments of the present invention, the method further includes displaying, adjacent to the played video content, an object that visualizes an audio signal of the video content, and displaying, on that object, a reference corresponding to the current time point.

In some embodiments of the present invention, the method further includes: transmitting the subtitle of the video content to a server; receiving, from the server, at least one subtitle of the video content generated by another user; and providing the at least one subtitle generated by the other user, or at least one time interval of that subtitle, in an editable state.

According to another aspect of the present invention, there is provided a computer-implemented method of generating and sharing captions of video content, the method including: playing back video content; setting at least one time interval according to a user's touch operation; receiving text corresponding to the at least one time interval from the user; and generating a subtitle of the video content by combining the at least one time interval and the text corresponding to it. Setting the at least one time interval according to the user's touch operation includes: setting, according to the user's touch input, the touch input time point within the playback time of the video content as the end time of the time interval; and setting the time point that precedes the touch input time point by a predetermined unit time as the start time of the time interval.

In some embodiments of the present invention, the method further includes displaying, adjacent to the played video content, a unit time manipulation window that provides a plurality of predetermined unit times. In that case, setting the at least one time interval according to the user's touch operation includes using the unit time selected by the user from among the plurality of predetermined unit times, and setting, as the start time of the time interval, the time point that precedes the user's touch input time point by that unit time within the playback time of the video content.

In some embodiments, setting the at least one time interval according to the user's touch operation includes using the unit time selected by the user's touch input itself from among the plurality of predetermined unit times, and setting, as the start time of the time interval, the time point that precedes the user's touch input time point by that unit time within the playback time of the video content.

In some embodiments of the present invention, setting the at least one time interval according to the user's touch operation includes using a unit time that is automatically determined based on the audio signal of the video content or on at least one time interval set by another user, and setting, as the start time of the time interval, the time point that precedes the user's touch input time point by that unit time within the playback time of the video content.

Other specific details of the invention are included in the detailed description and drawings.

According to the present invention, even in a mobile environment, a user can easily generate subtitles of video content in real time through touch operations, and can share and modify subtitles with other users in real time. Through this collective intelligence, highly reliable subtitles can be distributed.

Effects of the present invention are not limited to the effects mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 is a schematic block diagram illustrating a configuration of a caption generation and sharing system of video content.

FIG. 2 is a schematic block diagram for describing the configuration of the server of FIG. 1.

FIG. 3 is a schematic block diagram illustrating the configuration of the client of FIG. 1.

FIG. 4 is a flowchart schematically illustrating a method of generating and sharing captions of video content according to an exemplary embodiment of the present invention.

FIG. 5 is a flowchart schematically illustrating a method of playing video content by sharing captions of video content according to an exemplary embodiment of the present invention.

FIG. 6 is a flowchart schematically illustrating a method of generating a new subtitle of the video content by sharing the subtitles of the video content according to an embodiment of the present invention.

FIG. 7 is a schematic flowchart illustrating a subtitle generation interface according to an embodiment of the present invention.

FIG. 8 is a diagram schematically illustrating a time section setting screen of a caption generating interface according to an embodiment of the present invention.

FIG. 9 is a schematic diagram illustrating a time section setting method of a caption generating interface according to an embodiment of the present invention.

FIG. 10 is a schematic flowchart illustrating a subtitle generation interface according to another embodiment of the present invention.

FIG. 11 is a diagram schematically illustrating a time section setting screen of a caption generating interface according to another embodiment of the present invention.

FIG. 12 is a schematic diagram for explaining a method of setting a time interval of a caption generation interface according to another embodiment of the present invention.

FIG. 13 is a schematic diagram illustrating a subtitle generation interface according to another embodiment of the present invention.

FIGS. 14 to 15 are schematic diagrams for describing a time interval setting method of a caption generation interface according to another embodiment of the present invention.

FIG. 16 is a diagram schematically illustrating a time section selection screen of a subtitle generation interface according to embodiments of the present invention.

FIGS. 17 to 18 are schematic diagrams illustrating a time section modification screen of a caption generation interface according to embodiments of the present invention.

FIG. 19 is a diagram schematically illustrating a text input screen of a caption generating interface according to embodiments of the present invention.

FIG. 20 is a diagram schematically illustrating a caption selection screen of a caption generating interface according to embodiments of the present invention.

FIG. 21 is a diagram schematically illustrating a time interval sharing screen of a caption generating interface according to embodiments of the present invention.

FIG. 22 is a diagram schematically illustrating a time section and a text sharing screen of a caption generating interface according to embodiments of the present invention.

Advantages and features of the present invention, and methods for achieving them, will be apparent with reference to the embodiments described below in detail together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and can be embodied in various different forms. The present embodiments are provided only to make the disclosure of the present invention complete and to fully inform those of ordinary skill in the art to which the present invention belongs of the scope of the invention, which is defined only by the scope of the claims.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. In this specification, the singular also includes the plural unless specifically stated otherwise in the phrase. As used herein, "comprises" and / or "comprising" does not exclude the presence or addition of one or more other components in addition to the mentioned components. Like reference numerals refer to like elements throughout, and "and / or" includes each and all combinations of one or more of the mentioned components. Although "first", "second", etc. are used to describe various components, these components are of course not limited by these terms. These terms are only used to distinguish one component from another. Therefore, of course, the first component mentioned below may be a second component within the technical spirit of the present invention.

Unless otherwise defined, all terms used in the present specification (including technical and scientific terms) may be used in a sense that can be commonly understood by those skilled in the art. In addition, terms that are defined in a commonly used dictionary are not ideally or excessively interpreted unless they are specifically defined clearly.

The spatially relative terms "below", "beneath", "lower", "above", "upper", and the like can be used to easily describe a component's relationship with other components. Spatially relative terms are to be understood as encompassing different orientations of components in use or operation in addition to the orientations shown in the figures. For example, when a component shown in the drawings is flipped over, a component described as "below" or "beneath" another component may be placed "above" the other component. Thus, the exemplary term "below" can encompass both an orientation of above and below. Components may be oriented in other directions as well, so spatially relative terms may be interpreted according to orientation.

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings.

"Subtitle" refers to a character displayed on the screen together with the video content when the video content is played back. Subtitles include any text for explanation to the viewer regarding the title, cast, place, time, dialogue, production, etc. of the video content. The caption is configured to include a time duration for displaying the caption in the reproduction time of the video content and text displayed on the screen as the caption corresponding to the time period.

Referring to FIG. 1, a subtitle generation and sharing system of video content includes a server 100 and a plurality of clients 200.

The server 100 and the plurality of clients 200 communicate data and / or information with each other via a network. The network is provided in a wired and / or wireless network. The network can have any protocol, scale, or topology.

The server 100 stores at least one video content and provides it to the client 200 in response to a request of the client 200. The server 100 receives and stores at least one subtitle of the at least one video content from the client 200, and provides the client 200 with at least one subtitle of the at least one video content in response to a request of the client 200. Depending on the request of the client 200, the server 100 provides the video content only, the video content together with its subtitles, or the subtitles of the video content only.

Referring to FIG. 2, the server 100 includes a communication unit 110, a user management unit 120, a content providing unit 130, a subtitle providing unit 140, a user database 150, a content database 160, and a subtitle database 170.

The communication unit 110 performs wired and/or wireless communication with the client 200. The user management unit 120 registers a user and creates a user account. The user management unit 120 also adds, deletes, modifies, and searches user account information. The content providing unit 130 searches the at least one video content stored in the content database 160 for the video content requested by the client 200, and provides the found video content to the client 200 through the communication unit 110. The subtitle providing unit 140 searches the at least one subtitle stored in the subtitle database 170 for the subtitles of the video content requested by the client 200, and provides the found subtitles to the client 200 through the communication unit 110. The user database 150 stores user information such as user accounts, user profiles, and user logs. The content database 160 stores at least one video content. In some embodiments, the video content stored in the content database 160 is classified by country or type. For example, video content such as entertainment shows, dramas, movies, documentaries, and lectures may be stored in the content database 160, but the content is not limited thereto. The subtitle database 170 stores subtitles of at least one video content. In some embodiments, subtitles stored in the subtitle database 170 are classified according to the associated video content. In some embodiments, subtitle-related information, such as the creator of the subtitle, the date of creation, the subtitle language, or a description of the subtitle, is stored together in the subtitle database 170.
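As a rough illustration of how subtitles might be classified by their associated video content and stored with their related information, the following minimal in-memory sketch can be considered. The names `SubtitleRecord` and `SubtitleStore`, the field names, and the optional language filter are assumptions made for illustration only; the specification does not prescribe any particular schema or storage technology.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List, Optional


@dataclass
class SubtitleRecord:
    """Subtitle-related information kept alongside a subtitle (hypothetical fields)."""
    video_id: str          # identifies the associated video content
    creator: str           # user who created the subtitle
    created: str           # creation date, assumed to be an ISO date string
    language: str          # subtitle language
    description: str = ""  # free-form description of the subtitle


class SubtitleStore:
    """In-memory stand-in for the subtitle database, keyed by video content."""

    def __init__(self) -> None:
        self._by_video: DefaultDict[str, List[SubtitleRecord]] = defaultdict(list)

    def save(self, record: SubtitleRecord) -> None:
        # Classify the subtitle under the video content it belongs to.
        self._by_video[record.video_id].append(record)

    def find(self, video_id: str, language: Optional[str] = None) -> List[SubtitleRecord]:
        # Return all subtitles of the requested video, optionally filtered by language.
        records = self._by_video.get(video_id, [])
        if language is None:
            return list(records)
        return [r for r in records if r.language == language]
```

A client request for the subtitles of one video would then correspond to a call such as `store.find("video-1", language="en")`.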

Meanwhile, the components shown in FIG. 2 are not essential. In some embodiments, the server 100 may be modified to further include components not shown in FIG. 2, or to omit some of the components shown in FIG. 2.

Client 200 represents a computer device used by a user. For example, the client 200 may be provided as a mobile device such as a smart phone, a tablet, a personal digital assistant (PDA), but is not limited thereto. Client 200 may be provided to any non-exemplified computer device capable of communicating data and / or information over a network.

The client 200 receives only predetermined video content from the server 100, plays the received video content, and generates captions of the video content according to a user's manipulation. The client 200 transmits the caption of the video content generated by the user to the server 100. Alternatively, the client 200 receives predetermined video content together with a caption of the video content generated by another user from the server 100, and plays the video content by using the caption. The client 200 may also receive a caption of predetermined video content generated by another user from the server 100, and generate a new caption of the video content according to a user's operation based on that caption. The client 200 transmits the new caption of the video content generated by the user to the server 100. The client 200 executes a client program for generating and sharing captions of video content. For example, the client program may be provided in the form of a web browser, a desktop application, a mobile application, and the like, but is not limited thereto.

Referring to FIG. 3, the client 200 includes a wireless communication unit 210, an A/V input unit 220, a user input unit 230, a sensing unit 240, an output unit 250, a storage unit 260, an interface unit 270, a controller 280, and a power supply unit 290.

The wireless communication unit 210 communicates wirelessly with an external device such as the server 100. The wireless communication unit 210 wirelessly communicates using a wireless communication scheme such as mobile communication, WiBro, Bluetooth, Wi-Fi, Zigbee, ultrasound, infrared, RF, and the like. However, the wireless communication scheme of the client 200 is not limited to the specific embodiment. The wireless communication unit 210 transmits data and / or information received from the external device to the controller 280, and transmits data and / or information transmitted from the controller 280 to the external device. To this end, the wireless communication unit 210 may include a mobile communication module 211 and a short-range communication module 212.

In addition, the wireless communication unit 210 includes the location information module 213 to obtain location information of the client 200. Location information of the client 200 may be provided from, for example, a GPS positioning system, a WiFi positioning system, a cellular positioning system, or a beacon positioning system, but the present invention is not limited thereto. Location information may be provided from the positioning systems. The wireless communication unit 210 transmits the location information received from the positioning system to the control unit 280.

The A / V input unit 220 is for inputting a video or audio signal, and may include a camera module 221 and a microphone module 222.

The user input unit 230 receives various information from the user. The user input unit 230 includes input means such as a keypad, a button, a switch, a touch pad, and a jog wheel. When the touch pad has a mutual layer structure with the display module 251 described later, a touch screen may be configured.

The sensing unit 240 detects the state of the client 200 or the state of the user. The sensing unit 240 may include sensing means such as a touch sensor, a proximity sensor, a pressure sensor, a vibration sensor, a geomagnetic sensor, a gyro sensor, a speed sensor, an acceleration sensor, and a biometric sensor. In some embodiments, the sensing unit 240 is used for user input.

The output unit 250 notifies the user of various kinds of information. The output unit 250 outputs information in the form of text, video, or audio. To this end, the output unit 250 may include a display module 251 and a speaker module 252. The display module 251 may be provided as a plasma display panel (PDP), a liquid crystal display (LCD), a thin film transistor (TFT) LCD, an organic light emitting diode (OLED) display, a flexible display, a three-dimensional display, an electronic ink display, or in any other form well known in the art to which the present invention belongs. The output unit 250 may further include any form of output means well known in the art.

The storage unit 260 stores various data and commands. The storage unit 260 stores system software and various applications for the operation of the client 200. The storage unit 260 may include a random access memory (RAM), a read only memory (ROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), a flash memory, a hard disk, a removable disk, or any other form of computer-readable recording medium well known in the art to which the present invention belongs.

The interface unit 270 serves as a path to an external device connected to the client 200. The interface unit 270 receives data and/or information or power from an external device and passes it to components inside the client 200, or transmits data and/or information inside the client 200 to the external device or supplies internal power to it. The interface unit 270 may include, for example, a wired/wireless headset port, a charging port, a wired/wireless data port, a memory card port, a universal serial bus (USB) port, a port for connecting a device equipped with an identification module, an audio input/output (I/O) port, a video input/output (I/O) port, or the like.

The controller 280 controls the other components and thereby the overall operation of the client 200. The controller 280 executes the system software and various applications stored in the storage unit 260.

The power supply unit 290 supplies the power required for the operation of the wireless communication unit 210, the A/V input unit 220, the user input unit 230, the sensing unit 240, the output unit 250, the storage unit 260, the interface unit 270, and the controller 280. The power supply unit 290 may include an internal battery.

Meanwhile, the components shown in FIG. 3 are not essential. In some embodiments, the client 200 may be modified to further include components not shown in FIG. 3, or to omit some of the components shown in FIG. 3.

Meanwhile, although only one server 100 is illustrated in FIG. 1, in some embodiments, the server 100 may be modified to be provided in plural as necessary.

According to the system for generating and sharing captions of video content of FIG. 1, a user can directly generate captions of predetermined video content, or share captions of the video content generated by another user and watch the video content using those captions. Alternatively, the user may modify the subtitles of the video content generated by another user to make them more complete. Within the subtitle generation and sharing system of FIG. 1, at least some subtitles generated by a user may be sold to other users for a fee.

Referring to FIG. 4, in operation S310, the client 200 receives predetermined video content from the server 100.

Subsequently, in step S320, the client 200 generates a caption of the video content according to a user's manipulation. A method of generating subtitles of specific video content will be described in detail with reference to FIGS. 7 to 15.

Subsequently, in step S330, the client 200 transmits the caption of the video content generated by the user to the server 100.

Referring to FIG. 5, in operation S410, the client 200 receives predetermined video content from the server 100.

Subsequently, in step S420, the client 200 receives at least one subtitle of the video content generated by another user from the server 100.

Subsequently, in step S430, the client 200 plays the video content by using the caption selected by the user among the at least one caption received from the server 100.
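Playing video content with a selected caption implies that, at each playback time, the client must decide which text (if any) to show. The following sketch shows one way this lookup could be done. The tuple layout `(start, end, text)` in seconds, the function name `active_text`, and the assumption that cues are sorted by start time and do not overlap are all illustrative and not taken from the specification.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical cue layout: (start_seconds, end_seconds, text)
Cue = Tuple[float, float, str]


def active_text(cues: List[Cue], t: float) -> Optional[str]:
    """Return the caption text to display at playback time t, or None.

    Assumes `cues` is sorted by start time and that intervals do not overlap.
    """
    starts = [cue[0] for cue in cues]
    # Index of the last cue whose start time is <= t.
    i = bisect_right(starts, t) - 1
    if i >= 0 and cues[i][0] <= t < cues[i][1]:
        return cues[i][2]
    return None
```

The interval is treated as half-open here, so a caption disappears exactly at its end time; that boundary choice is also an assumption.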

Referring to FIG. 6, in operation S510, the client 200 receives predetermined video content from the server 100.

Subsequently, in step S520, the client 200 receives at least one subtitle of the video content generated by another user from the server 100.

Subsequently, in operation S530, the client 200 provides a subtitle selected by a user among the at least one subtitle generated by another user or a time interval of the subtitle in an editable state.

Subsequently, in operation S540, the client 200 generates a new caption of the video content according to the user's operation based on the caption or the time interval of the caption.

Subsequently, in step S550, the client 200 transmits the caption of the video content generated by the user to the server 100.
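The flow of FIG. 6 can be pictured as copying another user's subtitle, editing the copy, and uploading the result as a new subtitle. The sketch below illustrates only the copy-and-edit part. The dictionary layout of a cue and the function name `derive_subtitle` are hypothetical, and the actual client may represent the editable state quite differently.

```python
from copy import deepcopy
from typing import Dict, List


def derive_subtitle(base: List[dict], edits: Dict[int, dict]) -> List[dict]:
    """Return a new cue list based on another user's subtitle.

    `base` is the received subtitle (left untouched); `edits` maps a cue index
    to the fields the current user changed, e.g. {"text": ...} or {"end": ...}.
    """
    new_cues = deepcopy(base)  # work on a copy so the shared original is preserved
    for index, changes in edits.items():
        new_cues[index].update(changes)
    return new_cues
```

For example, reusing the time intervals of a Korean subtitle while replacing only the text yields a translated subtitle with identical timing:

```python
korean = [{"start": 1.0, "end": 2.5, "text": "안녕하세요"}]
english = derive_subtitle(korean, {0: {"text": "Hello"}})
```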

Referring to FIG. 7, in operation S610, the client 200 plays predetermined video content received from the server 100.

In operation S620, the client 200 displays an audio signal object of the video content. An audio signal object visualizes the audio signal of the video content. In some embodiments, a reference that corresponds to the current time is displayed on the audio signal object. The user may recognize the audio signal of the current time point through the reference and may refer to it for setting a time section to be described later.

In operation S630, the client 200 sets at least one time interval according to the user's touch input and touch release. In response to the user's touch input, the client 200 sets the touch input time point within the playback time of the video content as the start time of the time interval. In response to the user's touch release, the client 200 sets the touch release time point within the playback time of the video content as the end time of the time interval.

Subsequently, in step S640, the client 200 receives a text corresponding to the at least one time interval from the user.

In operation S650, the client 200 generates the caption of the video content by combining the at least one time interval with the text. Each time interval has its own text mapped to it.
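The combining step of operation S650 can be sketched as pairing each time interval with its text and serializing the result. The specification does not name a subtitle file format, so the use of WebVTT below is purely an assumption for illustration, as are the function names `fmt` and `build_vtt`.

```python
from typing import List, Tuple


def fmt(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def build_vtt(intervals: List[Tuple[float, float]], texts: List[str]) -> str:
    """Combine time intervals with their texts into one WebVTT document.

    Each interval must have exactly one text mapped to it. Texts are assumed
    to contain no blank lines and no "-->" sequence.
    """
    if len(intervals) != len(texts):
        raise ValueError("each time interval needs exactly one text")
    blocks = [
        f"{fmt(start)} --> {fmt(end)}\n{text}\n"
        for (start, end), text in zip(intervals, texts)
    ]
    # Cues are separated from the header and from each other by a blank line.
    return "WEBVTT\n\n" + "\n".join(blocks)
```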

Referring to FIG. 8, the video content 10 is played in the video playback window. The timeline 11 of the video content 10 is displayed adjacent to the video content 10. The timeline 11 represents the total playback time and the current time point of the video content 10. The timeline 11 is disposed inside or outside the video playback window. In some embodiments, when the timeline 11 is placed inside the video playback window, the timeline 11 is displayed overlapped on the video content 10 being played. The audio signal object 12 of the video content 10 is displayed in the audio signal window adjacent to the timeline 11. Below the audio signal object 12, a list of at least one time interval 15 of the video content 10 is displayed in the caption editing window.

On the audio signal object 12 a reference 13 corresponding to the current time is displayed. The region 14 corresponding to the time interval 15 set by the user among the audio signals of the video content 10 on the audio signal object 12 is displayed to be distinguished from other regions. In some embodiments, the region 14 is distinguished from other regions by using a bounding box as shown in FIG. 8. In some embodiments, the area 14 is displayed differently from other areas of different sizes or brightness. However, the display method of the region 14 is not limited thereto. In some embodiments, the user sets the time period 15 via a touch 30 to the audio signal window. In some embodiments, the user sets the time period 15 via a touch 30 to the audio signal window and the subtitle editing window. The user may set the time interval 15 through the touch 30 for an arbitrary region.
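The specification does not state how the audio signal object 12 is drawn. One common approach, shown here only as an assumed sketch, is to reduce the audio samples to a fixed number of peak-amplitude bars and to place the reference 13 on the bar that corresponds to the current playback time. The function names `waveform_bars` and `reference_bar` are hypothetical.

```python
from typing import List, Sequence


def waveform_bars(samples: Sequence[float], n_bars: int) -> List[float]:
    """Reduce audio samples to n_bars peak amplitudes for drawing."""
    if n_bars <= 0:
        raise ValueError("n_bars must be positive")
    if not samples:
        return [0.0] * n_bars
    bars = []
    for i in range(n_bars):
        lo = i * len(samples) // n_bars
        hi = max(lo + 1, (i + 1) * len(samples) // n_bars)  # never an empty slice
        bars.append(max(abs(x) for x in samples[lo:hi]))
    return bars


def reference_bar(current_time: float, duration: float, n_bars: int) -> int:
    """Index of the bar on which the current-time reference is drawn."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return min(n_bars - 1, int(current_time / duration * n_bars))
```

The same index calculation, applied to the start and end of a time interval, would give the range of bars to highlight as the region 14.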

Referring to FIG. 9, when a user's touch input 30 is provided at a first time point t1 and the user's touch release is provided at a second time point t2, the time between the first time point t1 and the second time point t2 is set as the time interval for displaying a subtitle. That is, the first time point t1 is set as the start time of the time interval, and the second time point t2 is set as the end time of the time interval.
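The press-and-release behavior of FIG. 9 can be sketched as a small event handler that records the playback position at touch-down as t1 and at touch-up as t2. The class name `HoldToMarkRecorder`, the callback names, and the decision to discard a release that has no matching press or that yields an empty interval are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


class HoldToMarkRecorder:
    """Collects (start, end) time intervals from touch-down and touch-up events."""

    def __init__(self) -> None:
        self.intervals: List[Tuple[float, float]] = []
        self._pressed_at: Optional[float] = None

    def on_touch_down(self, playback_time: float) -> None:
        # t1: the playback position when the touch begins becomes the start time.
        self._pressed_at = playback_time

    def on_touch_up(self, playback_time: float) -> None:
        # t2: the playback position when the touch is released becomes the end time.
        if self._pressed_at is None:
            return  # release without a matching press: ignore (assumed policy)
        start, end = self._pressed_at, playback_time
        self._pressed_at = None
        if end > start:  # skip empty or reversed intervals (assumed policy)
            self.intervals.append((start, end))
```

Because the handlers take the playback time rather than wall-clock time, the recorded interval stays correct if playback speed changes while the finger is held down.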

Referring to FIG. 10, in operation S710, the client 200 plays predetermined video content received from the server 100.

In operation S720, the client 200 displays an audio signal object of the video content. An audio signal object visualizes the audio signal of the video content. In some embodiments, the reference corresponding to the current time point is displayed on the audio signal object.

Then, in step S730, the client 200 displays the unit time operation window. The unit time operation window is for selecting the unit time of the user. The unit time operation window provides a plurality of predetermined unit times.

In operation S740, the client 200 sets at least one time interval according to the user's touch input and the unit time. In response to the user's touch input, the client 200 sets the touch input time point within the playback time of the video content as the end time of the time interval, and sets the time point that precedes the touch input time point by the predetermined unit time as the start time of the time interval.

Then, in step S750, the client 200 receives a text corresponding to the at least one time interval from the user.

In operation S760, the client 200 generates the caption of the video content by combining the at least one time interval with the text.

Referring to FIG. 11, the video content 10 is played in the video playback window. The timeline 11 of the video content 10 is displayed adjacent to the video content 10. The timeline 11 represents the total playback time and the current time point of the video content 10. The audio signal object 12 of the video content 10 is displayed in the audio signal window adjacent to the timeline 11. The unit time operation window 16 is displayed adjacent to the audio signal object 12. The unit time operation window 16 provides a plurality of predetermined unit times. In some embodiments, the user may set a plurality of predetermined unit times provided by the unit time manipulation window 16. As illustrated in FIG. 11, for example, the unit time operation window 16 may provide unit times such as 0.3 seconds, 1 second, 2 seconds, 4 seconds, AUTO, and the like, but is not limited thereto. A list of at least one time interval 15 of the video content 10 is displayed in the subtitle editing window at the bottom of the unit time operation window 16.

On the audio signal object 12 a reference 13 corresponding to the current time is displayed. The region 14 corresponding to the time interval 15 set by the user among the audio signals of the video content 10 on the audio signal object 12 is displayed to be distinguished from other regions.

The user selects the unit time for setting the time interval through the touch 30 on the unit time operation window 16. The client 200 sets the time interval 15 using the unit time selected by the user among a plurality of predetermined unit times on the unit time operation window 16.

In some embodiments, the user sets the time period 15 via a touch 30 to the audio signal window. In some embodiments, the user sets the time period 15 via a touch 30 to the audio signal window and the subtitle editing window. The user may set the time interval 15 through the touch 30 for an arbitrary region. The user selects a specific unit time by inputting the touch 30 to the unit time operation window 16 before setting the time section 15.

In some embodiments, the user sets the time interval 15 via a touch 30 to the unit time manipulation window 16. In this case, the touch input is not only for selecting a specific unit time but also for setting an end time of a time interval for displaying a subtitle.

In some embodiments, when the user selects AUTO, the client 200 automatically determines the unit time. In some embodiments, the client 200 automatically determines the unit time based on the audio signal of the video content 10. In some embodiments, the client 200 automatically determines the unit time based on subtitles generated by other users (at least one time interval set by another user). In some embodiments, the client 200 analyzes the unit time frequently used by the user, and automatically determines an appropriate unit time according to the analysis result.
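One of the AUTO options above, choosing a unit time from the unit times the user has frequently used, can be sketched as follows. This is only an illustration under stated assumptions: the function name `auto_unit_time`, the `history` log of past selections, and the fallback `default` value are hypothetical and do not appear in the description, and the description does not say how the client 200 actually performs this analysis.

```python
from collections import Counter


def auto_unit_time(history: list[float], default: float = 2.0) -> float:
    """Return the unit time (in seconds) the user has selected most often.

    `history` is a hypothetical log of previously selected unit times.
    `default` is an assumed fallback used when no history exists.
    """
    if not history:
        return default
    # most_common(1) yields the single most frequent value and its count.
    value, _count = Counter(history).most_common(1)[0]
    return value
```

For example, a history of 1, 4, 4, and 2 seconds would select 4 seconds. A determination based on the audio signal or on time intervals set by other users would need a different analysis and is not shown here.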

Referring to FIG. 12, when a user's touch input 30 is provided at a second time point t2 of the playback time of the video content, the span between a first time point t1 and the second time point t2 is set as the time interval for displaying a caption. The first time point t1 is determined as the time point that precedes the second time point t2 by a predetermined unit time. That is, the second time point t2 is first set as the end time of the time interval, and the first time point t1 is then set as the start time of the time interval. FIG. 12 illustrates, for example, a case where 4 seconds is selected as the unit time.
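The end-time-first rule of FIG. 12 can be sketched in a few lines. This is a minimal illustration rather than the implementation of the client 200: the function name `interval_from_touch` is hypothetical, and clamping the start time at zero for touches near the beginning of the video is an assumption not stated in the description.

```python
def interval_from_touch(t2: float, unit_time: float) -> tuple[float, float]:
    """Return (t1, t2) in seconds for a touch at playback time t2.

    t2 becomes the end time of the interval; t1 lies one unit time earlier.
    Clamping t1 at 0.0 is an assumption for touches near the start.
    """
    if unit_time <= 0:
        raise ValueError("unit_time must be positive")
    t1 = max(0.0, t2 - unit_time)
    return (t1, t2)
```

With a unit time of 4 seconds, a touch at 10 seconds of playback gives the interval from 6 to 10 seconds.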

Referring to FIG. 13, the subtitle providing unit 140 of the server 100 includes a machine learning module 141. In some embodiments, the machine learning module 141 learns the audio signal of the video content. In some embodiments, machine learning module 141 learns subtitles (at least one time interval set by another user) generated by another user. The machine learning module 141 may learn a plurality of subtitles related to one video content or learn a plurality of subtitles having different target video contents. Reference numeral 161 denotes predetermined video content, and reference numeral 171 denotes a plurality of subtitles related to the video content. In some embodiments, the machine learning module 141 learns at least one time interval setting pattern of the user. According to the learning result, the machine learning module 141 predicts in real time an optimal time interval for displaying captions of predetermined video content. Similarly, the machine learning module 141 may predict in real time an optimal unit time for setting the start time of the time interval.

The client 200 receives information on the optimal time interval predicted by using machine learning from the server 100 and provides the information to the user for reference in the process of setting the time interval.

Referring to FIG. 14, in some embodiments, an area 17 corresponding to an optimal time interval predicted using machine learning is displayed on the audio signal object 12 before the user's touch input for setting the time interval. The user may set a start time and an end time of a time interval for displaying a subtitle with reference to the area 17. Naturally, the user may set the time interval differently from the optimal time interval predicted using machine learning.

In addition, the client 200 receives information about the optimal unit time predicted by using machine learning from the server 100 and provides the information to the user for reference in the process of setting the time interval.

Referring to FIG. 15, in some embodiments, prior to a user's touch input for setting a time interval, an optimal unit time is provided in the unit time operation window 18. In some embodiments, the unit time manipulation window 18 provides one or a plurality of optimal unit times. In some embodiments, the unit time manipulation window 18 simultaneously provides the unit time set by the user and the optimal unit time predicted using machine learning. The user may select the optimal unit time with reference to the unit time operation window 18. Naturally, without using the optimal unit time, the user may select the unit time set by the user. Although not explicitly illustrated, an area 17 corresponding to the optimal time interval predicted by using machine learning may be displayed on the audio signal object 12.

Referring to FIG. 16, when any one time interval 15 in the list of at least one time interval 15 of the video content 10 is selected by the user, the image of the video content 10 corresponding to the start time of that time interval 15 is displayed in the video playback window. In the audio signal window, an area 14 corresponding to the time interval 15 of the audio signal of the video content 10 is displayed on the audio signal object 12.

Referring to FIG. 17, in some embodiments, the user can adjust the start time or end time of the time interval 15 by touching the area 14 corresponding to the time interval 15 on the audio signal object 12. For example, the user may adjust the start time or end time of the time interval 15 by inputting a predetermined gesture (e.g., a drag) after a touch input to the area 14, but the adjustment is not limited thereto.

Referring to FIG. 18, in some embodiments, the time adjustment object 19 is displayed in the audio signal window adjacent to the audio signal object 12. For example, the time adjustment object 19 may be disposed adjacent to the left and right of the audio signal object 12, but is not limited thereto. The user may adjust the start time or the end time of the time interval 15 through a touch on the time adjustment object 19.
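The adjustments of FIGS. 17 and 18 both amount to shifting one edge of an existing time interval. The sketch below illustrates this under stated assumptions: the function name `adjust_interval`, the `edge` and `delta` parameters, the clamping to the playback range, and the minimum interval length `min_length` are all hypothetical choices, since the description does not specify how an adjustment is bounded.

```python
def adjust_interval(start: float, end: float, edge: str, delta: float,
                    duration: float, min_length: float = 0.1) -> tuple[float, float]:
    """Shift the start or end time of an interval by `delta` seconds.

    Assumptions (not from the description): the result is clamped to
    [0, duration], and the interval is kept at least `min_length` long.
    """
    if edge == "start":
        # Move the start, but never before 0 or past the end time.
        start = min(max(0.0, start + delta), end - min_length)
    elif edge == "end":
        # Move the end, but never past the video duration or before the start.
        end = max(min(duration, end + delta), start + min_length)
    else:
        raise ValueError("edge must be 'start' or 'end'")
    return (start, end)
```

A drag could map its horizontal distance to `delta`, while a touch on the time adjustment object 19 could apply a fixed step; either mapping is a design choice left open here.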

Referring to FIG. 19, in some embodiments, when the user's first touch 30 is provided for any one time interval 15 in the list of at least one time interval 15 of the video content 10, the image of the video content 10 corresponding to the start time of that time interval 15 is displayed in the video playback window. When the user's second touch 30 is then provided for the time interval 15, the user may input the text 20 corresponding to the time interval 15. In some embodiments, the text input window is displayed overlapping the video playback window. In some embodiments, the text input window is disposed adjacent to the time interval 15.

Referring to FIG. 20, in some embodiments, when the predetermined video content 10 is selected by the user, the menu window 21 is displayed adjacent to the video playback window. For example, the menu window 21 may provide a plurality of menus for selecting subtitles, generating subtitles, and the like, but is not limited thereto. When subtitle selection is chosen, a list of at least one subtitle 22 of the video content 10 is displayed in the subtitle selection window at the bottom of the menu window 21. In some embodiments, subtitle-related information for each subtitle 22, such as its creator, date of creation, language, or description, is displayed together within the subtitle selection window. When any one subtitle 22 in the list of at least one subtitle 22 of the video content 10 is selected by the user, the selected subtitle is displayed overlapping the video playback window.

Referring to FIG. 21, in some embodiments, the user selects editing of any one subtitle 22 from the list of at least one subtitle 22 of the given video content 10. In this case, the caption generation screen described with reference to FIG. 8 is displayed, and the subtitle 22 selected by the user is provided in an editable state. The user may share only the at least one time interval 15 of the subtitle 22. On the audio signal object 12, an area 14 corresponding to at least one time interval 15 set by another user is displayed, and a list of the at least one time interval 15 set by the other user is displayed in the subtitle editing window at the bottom of the audio signal object 12.

The user may adjust the start time or end time of the time interval 15 set by another user. Alternatively, the user may delete the time interval 15 set by another user. Alternatively, the user may additionally set a time interval 15 not set by another user.

Referring to FIG. 22, in some embodiments, the user selects editing of any one subtitle 22 from the list of at least one subtitle 22 of the given video content 10. In this case, the caption generation screen described with reference to FIG. 8 is displayed, and the subtitle 22 selected by the user is provided in an editable state. The user may share both the at least one time interval 15 of the subtitle 22 and the corresponding text 23. In the subtitle editing window at the bottom of the audio signal object 12, a list of at least one time interval 15 set by another user and the corresponding text 23 is displayed.

The user may adjust the start time or end time of the time interval 15 set by another user. Alternatively, the user may delete the time interval 15 set by another user. Alternatively, the user may additionally set a time interval 15 not set by another user. In addition, the user can modify the text 23 input by another user.
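The editing operations described for FIGS. 21 and 22 can be illustrated with a small data model. This is a sketch under stated assumptions, not the structure used by the server 100 or the client 200: the class names `Cue` and `Subtitle`, the method names, and the `keep_text` flag (which distinguishes sharing time intervals only from sharing time intervals together with text) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cue:
    start: float  # start time of the time interval, in seconds
    end: float    # end time of the time interval, in seconds
    text: str = ""


@dataclass
class Subtitle:
    cues: list[Cue] = field(default_factory=list)

    def copy_for_editing(self, keep_text: bool = True) -> "Subtitle":
        """Return an editable copy of a subtitle made by another user.

        keep_text=False shares only the time intervals (as in FIG. 21);
        keep_text=True shares intervals and text (as in FIG. 22).
        """
        return Subtitle([Cue(c.start, c.end, c.text if keep_text else "")
                         for c in self.cues])

    def add(self, start: float, end: float, text: str = "") -> None:
        """Add an interval the other user did not set, keeping time order."""
        self.cues.append(Cue(start, end, text))
        self.cues.sort(key=lambda c: c.start)

    def delete(self, index: int) -> None:
        """Delete an interval set by the other user."""
        del self.cues[index]
```

Because `copy_for_editing` builds new `Cue` objects, changes made by the editing user do not alter the subtitle originally shared by the other user.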

The steps of a method or algorithm described in connection with an embodiment of the invention may be implemented directly in a hardware module, in a software module executed by hardware, or by a combination thereof. Software modules may reside in random access memory (RAM), read only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any other form of computer readable recording medium well known in the art.

In the above, embodiments of the present invention have been described with reference to the accompanying drawings, but those skilled in the art to which the present invention pertains will understand that the present invention may be implemented in other specific forms without changing its technical spirit or essential features. Therefore, it should be understood that the embodiments described above are exemplary in all respects and not restrictive.

Claims

As a method realized by a computer,

Playing video content;

Setting at least one time interval according to a user's touch manipulation;

Receiving a text corresponding to the at least one time interval from the user; And

Generating a subtitle of the video content by combining the at least one time interval and text corresponding to the at least one time interval,

Setting at least one time interval according to the touch operation of the user,

In response to the user's touch input, the user's touch input time point is set as a start time of the time interval in the playback time of the video content.

In response to the user's touch release, the user's touch release time is set as an end time of the time interval in the playback time of the video content.

How to create and share captions for video content.
The method of claim 1,

Displaying an object visualizing an audio signal of the video content adjacent to the reproduced video content, and displaying a reference element corresponding to a current time point on the visualized object of the audio signal;

How to create and share captions for video content.
The method of claim 1,

Transmitting a caption of the video content to a server;

Receiving at least one subtitle of the video content generated by another user from the server; And

Providing the at least one subtitle of the video content generated by the other user or at least one time interval of the subtitle in an editable state,

How to create and share captions for video content.
As a method realized by a computer,

Playing video content;

Setting at least one time interval according to a user's touch manipulation;

Receiving a text corresponding to the at least one time interval from the user; And

Generating a subtitle of the video content by combining the at least one time interval and text corresponding to the at least one time interval,

Setting at least one time interval according to the touch operation of the user,

In response to the user's touch input, the user's touch input time point is set as an end time of the time interval in the playback time of the video content.

Setting a time point before a predetermined unit time from the touch input time point of the user among the playing time of the video content as a start time of the time interval;

How to create and share captions for video content.
The method of claim 4, wherein

Displaying an object visualizing an audio signal of the video content adjacent to the reproduced video content, and displaying a reference element corresponding to a current time point on the visualized object of the audio signal;

How to create and share captions for video content.
The method of claim 4, wherein

Displaying a unit time manipulation window providing a plurality of predetermined unit times adjacent to the played video content;

Setting at least one time interval according to the touch operation of the user,

Using a predetermined unit time selected by the user from among the plurality of predetermined unit times, a time point that precedes the touch input time point of the user by the predetermined unit time, within the playback time of the video content, is set as the start time of the time interval,

How to create and share captions for video content.
The method of claim 6,

Setting at least one time interval according to the touch operation of the user,

Using the predetermined unit time selected by the touch input of the user from among the plurality of predetermined unit times, a time point that precedes the touch input time point of the user by the predetermined unit time, within the playback time of the video content, is set as the start time of the time interval,

How to create and share captions for video content.
The method of claim 4, wherein

Setting at least one time interval according to the touch operation of the user,

Using a predetermined unit time automatically determined based on an audio signal of the video content or at least one time interval set by another user, setting a time point that precedes the touch input time point of the user by the predetermined unit time, within the playback time of the video content, as the start time of the time interval,

How to create and share captions for video content.
The method of claim 4, wherein

Transmitting a caption of the video content to a server;

Receiving at least one subtitle of the video content generated by another user from the server; And

Providing the at least one subtitle of the video content generated by the other user or at least one time interval of the subtitle in an editable state,

How to create and share captions for video content.
An application, coupled to a computer, stored in a computer readable recording medium for carrying out the method of any one of claims 1 to 9.