CN112988005A

CN112988005A - Method for automatically loading captions

Info

Publication number: CN112988005A
Application number: CN202011367465.2A
Authority: CN
Inventors: 吴丹
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2020-11-27
Filing date: 2020-11-27
Publication date: 2021-06-18
Anticipated expiration: 2040-11-27
Also published as: CN112988005B; WO2022110844A1

Abstract

The present disclosure relates to a method for automatically loading subtitles. The method comprises: in response to a first predetermined operation for copying first text information, acquiring the first text information; and displaying, in a first subtitle area of a video editing interface, first subtitle information generated based on the first text information. With this scheme, the first subtitle information can be generated in the first subtitle area of the video editing interface from the copied text, which alleviates or even eliminates the need to edit subtitles by manually typing or pasting text and thereby improves subtitle editing efficiency.

Description

Method for automatically loading captions

Technical Field

The disclosure relates to the field of video production, and in particular, to a method for automatically loading subtitles.

Background

In the related art, subtitles must be added manually when making a video. The specific process is as follows: drag the video track to select the video frame to which a subtitle is to be added, click the text input, manually type or paste the required subtitle text, drag to the next video frame to which a subtitle is to be added, and repeat the process. When a video involves a large number of subtitles, entering them takes a lot of time.

Therefore, a method for improving the efficiency of subtitle editing in a video production process is needed.

Disclosure of Invention

The present disclosure provides a method for automatically loading subtitles to at least solve the problem of low subtitle editing efficiency in a video production process in the related art. The technical scheme of the disclosure is as follows:

According to a first aspect of the embodiments of the present disclosure, there is provided a method for automatically loading subtitles, including: in response to a first predetermined operation for copying first text information, acquiring the first text information; and displaying, in a first subtitle area of a video editing interface, first subtitle information generated based on the first text information.

Optionally, the step of acquiring the first text information in response to a first predetermined operation for copying the first text information includes: in response to the first predetermined operation, displaying prompt information on the video editing interface, wherein the prompt information is used to ask whether to determine the first subtitle information according to the first text information; and acquiring the first text information in response to a second predetermined operation acting on the prompt information.

Optionally, the step of displaying the first subtitle information generated based on the first text information in a first subtitle region of the video editing interface includes: determining the first subtitle information according to the first text information; and displaying the first subtitle information in the first subtitle region.

Optionally, the step of determining the first subtitle information according to the first text information includes: determining the first text information to be the first subtitle information.

Optionally, the step of determining the first subtitle information according to the first text information includes: determining whether a predetermined duration, namely the duration for which the first text information is displayed in a predetermined manner, is less than the duration of the video being edited; in a case where the predetermined duration is less than the duration of the video, comparing the first text information with predetermined subtitle segments in a subtitle database, wherein the subtitle database comprises a plurality of predetermined subtitle segments; and in a case where the similarity between the first text information and a target part is greater than a first predetermined threshold, determining, according to the duration of the video being edited, at least a part of a target subtitle segment that includes the target part as the first subtitle information, wherein the target part is a part of the target subtitle segment whose predetermined duration is the same as that of the first text information, and the target subtitle segment is the one of the plurality of predetermined subtitle segments that includes the target part.

Optionally, the step of displaying the first subtitle information in the first subtitle region includes: determining position information of a first subtitle subregion, wherein the position information comprises a length and an initial position, the length of the first subtitle subregion is a display length of the first subtitle subregion, the length of the first subtitle subregion represents a video time length corresponding to a first subtitle segment in the first subtitle subregion, the initial position of the first subtitle subregion represents a first video frame image corresponding to the first subtitle segment, and the first subtitle segment is obtained by dividing according to the first subtitle information; displaying a plurality of first subtitle sub-regions in the first subtitle region according to the position information, wherein one of the first subtitle sub-regions has one of the first subtitle segments.

Optionally, the video editing interface further includes an image area, where the image area displays a plurality of video frame images, the plurality of video frame images are sequentially arranged along a predetermined direction, the plurality of first subtitle sub-areas are sequentially arranged along the predetermined direction, one of the first subtitle sub-areas is located on one side of the corresponding plurality of video frame images, and the predetermined direction is a length direction of the first subtitle sub-area.

Optionally, the step of determining the position information of the first subtitle sub-region further includes: dividing the first subtitle information into a plurality of first subtitle segments; acquiring the duration of the video being edited; determining the length of each first subtitle sub-region according to the number of first subtitle segments and the duration; and determining that the starting position of the initial first subtitle sub-region is the starting point of the first subtitle region, and that the starting position of each other first subtitle sub-region is the ending position of the preceding first subtitle sub-region.

Optionally, the step of determining the first subtitle information according to the first text information includes: acquiring text content corresponding to the voice content of the video being edited; determining whether the similarity of the first text information and the text content is greater than a second preset threshold value; determining the text content as the first subtitle information under the condition that the similarity between the first text information and the text content is greater than the second predetermined threshold, wherein the step of determining the position information of the first subtitle subregion comprises: and determining the position information of the first caption segment according to the corresponding relation between the text content and the video frame image.

Optionally, after displaying the first subtitle information generated based on the first text information in a first subtitle region of the video editing interface, the method further includes: in response to a third predetermined operation acting on the first target subtitle sub-region, the start position of the first target subtitle sub-region is changed from the initial position to a predetermined position.

Optionally, after displaying the first subtitle information generated based on the first text information in a first subtitle region of the video editing interface, the method further includes: in response to a fourth predetermined operation acting on a second target subtitle sub-region, the length of the second target subtitle sub-region is changed from the initial length to a predetermined length.

Optionally, after displaying the first subtitle information generated based on the first text information in a first subtitle region of the video editing interface, the method further includes: responding to a fifth preset operation of copying second text information, and acquiring the second text information; and displaying second subtitle information in a second subtitle area of the video editing interface according to the second text information, wherein the second subtitle area is positioned at one side of the first subtitle area.

According to a second aspect of the embodiments of the present disclosure, there is provided a method for automatically loading subtitles, including: detecting whether a first predetermined operation for copying first text information exists; in a case where the first predetermined operation is detected, acquiring the first text information; and generating first subtitle information based on the first text information, and displaying the first subtitle information in a subtitle area of a video editing interface.

According to a third aspect of the embodiments of the present disclosure, there is provided a method for automatically loading subtitles, including: when a target video is edited, acquiring first text information from a memory area for storing the text information, wherein the memory area is used for storing the text information obtained by copying a target text; and generating first subtitle information based on the first text information, and displaying the first subtitle information in a first subtitle area of a video editing interface.

According to a fourth aspect of the embodiments of the present disclosure, there is provided an apparatus for automatically loading subtitles, including: a first obtaining unit configured to obtain first text information in response to a first predetermined operation for copying the first text information; and a first display unit configured to display first subtitle information generated based on the first text information in a first subtitle region of a video editing interface.

According to a fifth aspect of embodiments of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement any of the above-described methods of automatically loading subtitles.

According to a sixth aspect of the embodiments of the present disclosure, there is provided a non-volatile storage medium, wherein instructions of the non-volatile storage medium, when executed by a processor of an electronic device, enable the electronic device to perform any one of the above-mentioned methods of automatically loading subtitles.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

In this scheme, the first subtitle information can be generated in the first subtitle area of the video editing interface from the copied text, which alleviates or even eliminates the need to edit subtitles by manually typing or pasting text and thereby improves subtitle editing efficiency.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

Fig. 1 is an architecture diagram illustrating an application scenario of a method of automatically loading subtitles according to an exemplary embodiment.

Fig. 2 is a flowchart illustrating a method of automatically loading subtitles according to an exemplary embodiment.

Fig. 3 is a flowchart illustrating a method of automatically loading subtitles according to another exemplary embodiment.

Fig. 4 is a flowchart illustrating a method of automatically loading subtitles according to still another exemplary embodiment.

Fig. 5 is a schematic diagram illustrating a display interface corresponding to a method for automatically loading subtitles according to an exemplary embodiment.

Fig. 6 is an illustration of a display interface corresponding to a method for automatically loading subtitles, according to another exemplary embodiment.

Fig. 7 is a flowchart illustrating a method of automatically loading subtitles according to still another exemplary embodiment.

Fig. 8 is a flowchart illustrating a method of automatically loading subtitles according to another exemplary embodiment.

Fig. 9 is a block diagram illustrating a structure of an apparatus for automatically loading subtitles according to an exemplary embodiment.

Fig. 10 is a block diagram illustrating a structure of an apparatus for automatically loading subtitles according to another exemplary embodiment.

Fig. 11 is a block diagram illustrating a structure of an apparatus for automatically loading subtitles according to still another exemplary embodiment.

Fig. 12 is a block diagram illustrating a structure of an electronic device for performing a method of automatically loading subtitles according to an exemplary embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

As mentioned in the background, in the prior art, a user is required to manually add or paste subtitle text during the process of editing video, and the efficiency of adding subtitles is low.

Fig. 1 is an architecture diagram illustrating, according to an exemplary embodiment, an implementation environment to which the following method for automatically loading subtitles may be applied. As shown in fig. 1, the implementation environment includes an electronic device 01 and a server 02, which may be interconnected and communicate through a network.

The electronic device 01 may be a device that displays the first subtitle information. The electronic device 01 may obtain the corresponding first subtitle information from the server 02 and display the first subtitle information in the video editing interface. Alternatively, the electronic device 01 itself may generate the first subtitle information and display the first subtitle information.

The electronic device 01 may be any electronic product that can interact with a user through one or more modes, such as a keyboard, a touch pad, a touch screen, a remote controller, a voice interaction device, or a handwriting device, for example, a mobile phone, a tablet computer, a palmtop computer, a personal computer (PC), a wearable device, a smart television, and the like.

The server 02 may be one server, a server cluster composed of a plurality of servers, or a cloud computing service center. The server 02 may include a processor, memory, and a network interface, among others.

It will be understood by those skilled in the art that the foregoing electronic devices and servers are merely exemplary and that other electronic devices or servers, now known or later developed, that may be suitable for use in the present disclosure are intended to be included within the scope of the present disclosure and are hereby incorporated by reference.

Based on this, embodiments of the present disclosure provide a method for automatically loading subtitles.

The execution subject of the method provided by the embodiments of the present disclosure may be the electronic device or the server described above, or a functional module and/or functional entity in the electronic device or the server that is capable of implementing the method. This may be determined according to actual use requirements and is not limited in the embodiments of the present disclosure. The method is described below by way of example, taking the electronic device as the execution subject.

Fig. 2 is a flowchart illustrating a method for automatically loading subtitles according to an exemplary embodiment. As shown in fig. 2, the method is used in an electronic device and includes the following steps S11 and S12.

In step S11, the first text information is acquired in response to a first predetermined operation for copying the first text information.

The first predetermined operation described above may be any single operation, or a group of operations formed by a series of operations, for copying the first text information. In a case where the electronic device is a personal computer, the first predetermined operation may be a keyboard and/or mouse operation, for example the keyboard shortcut "Ctrl + C". In a case where the electronic device is a mobile phone or a tablet, the first predetermined operation may be a click operation or a long-press operation.

In step S12, the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface. That is, the first subtitle information is generated based on the first text information and displayed in the first subtitle region of the video editing interface.

In the above embodiment, the first text information is first acquired in response to a first predetermined operation for copying the first text information; the first subtitle information is then displayed on the video editing interface according to the first text information. With this scheme, the first subtitle information can be generated in the first subtitle region of the video editing interface from the copied text, which alleviates or even eliminates the need to edit subtitles by manually typing or pasting text and thereby improves subtitle editing efficiency.
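Steps S11 and S12 can be pictured with a minimal Python sketch. The `Clipboard` and `VideoEditingInterface` classes below are hypothetical stand-ins invented for illustration (a real editor would rely on the platform's clipboard and UI facilities), and the sketch assumes the simplest case in which the copied text is used directly as the subtitle information.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class VideoEditingInterface:
    """Hypothetical stand-in for the editor UI (not part of the disclosure)."""
    first_subtitle_area: List[str] = field(default_factory=list)

    def display_subtitles(self, subtitle_info: str) -> None:
        # Step S12: show the generated subtitle information in the first subtitle area.
        self.first_subtitle_area = [subtitle_info]


class Clipboard:
    """Hypothetical in-memory clipboard that notifies listeners on copy."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[str], None]] = []

    def on_copy(self, listener: Callable[[str], None]) -> None:
        self._listeners.append(listener)

    def copy(self, text: str) -> None:
        # Models the first predetermined operation (e.g. Ctrl + C or a long press).
        for listener in self._listeners:
            listener(text)


def attach_auto_subtitles(clipboard: Clipboard, editor: VideoEditingInterface) -> None:
    def handle_copy(first_text_information: str) -> None:
        # Step S11: the copied text is acquired in response to the copy operation.
        # Simplest assumption: the text itself is the subtitle information.
        editor.display_subtitles(first_text_information)

    clipboard.on_copy(handle_copy)


clipboard = Clipboard()
editor = VideoEditingInterface()
attach_auto_subtitles(clipboard, editor)
clipboard.copy("Hello world")
print(editor.first_subtitle_area)
```

The listener pattern is only one possible design; polling a clipboard or reading it when the editor gains focus would serve the same purpose.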

In practice, there may be multiple predetermined operations for copying text information, and only some of the copied text information may be intended for the subtitles of the video. To make the displayed subtitles more accurate in this case, in an embodiment of the present application, as shown in fig. 3, step S11 is implemented by step S110 and step S111.

In step S110, in response to the first predetermined operation, prompt information is displayed on the video editing interface, where the prompt information is used to ask whether to determine the first subtitle information according to the first text information. In practice, the prompt information may be a prompt pop-up window, which may display, for example, "The system has detected a text copy. Automatically add the text as subtitles?". Of course, the prompt information is not limited to a pop-up window with the above text; it may also be other designated prompt information in the video editor, for example a rectangular icon bearing the text "copy?". Those skilled in the art can set suitable prompt information, such as a prompt icon, according to the actual situation.

In step S111, the first text information is acquired in response to a second predetermined operation acting on the prompt information. The second predetermined operation may be at least one of a click operation, a long-press operation, a double-click operation, and a slide operation, and may be determined according to actual conditions. When a second predetermined operation on the prompt information is received, it can be concluded that the first subtitle information is to be determined according to the first text information, and the first text information is therefore acquired based on the second predetermined operation.
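The confirmation flow of steps S110 and S111 can be sketched as follows. The function name `acquire_text_with_prompt`, the `ask_user` callback and the prompt wording are assumptions made for illustration; in a real editor the callback would show a pop-up or icon and report whether the user performed the second predetermined operation on it.

```python
from typing import Callable, Optional

# Assumed prompt wording, for illustration only.
PROMPT_TEXT = "Text copy detected. Automatically add the text as subtitles?"


def acquire_text_with_prompt(
    copied_text: str,
    ask_user: Callable[[str], bool],
) -> Optional[str]:
    """Return the copied text only if the user confirms the prompt.

    ask_user is a hypothetical UI hook: it displays the prompt (step S110)
    and returns True when the user acts on it (step S111).
    """
    if ask_user(PROMPT_TEXT):
        return copied_text
    # The copy was not meant for subtitles; nothing is acquired.
    return None


# Usage with stand-in callbacks in place of a real pop-up window.
print(acquire_text_with_prompt("A line for the video", lambda prompt: True))
print(acquire_text_with_prompt("An unrelated URL", lambda prompt: False))
```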

As shown in fig. 4, step S12 can be implemented by step S120 and step S121. In step S120, the first subtitle information is determined according to the first text information; in step S121, the first subtitle information is displayed in the first subtitle region.

In practice, in some cases the first text information is itself the first subtitle information: for example, where the duration for which the first text information is displayed in a predetermined manner substantially coincides with the duration of the video being edited, or where that duration is shorter than the duration of the video being edited but only part of the video requires subtitles. Correspondingly, step S120 includes: determining the first text information to be the first subtitle information.

In some cases, the first text information does not fully meet the requirements of the first subtitle information and is only a part of it. To display the first subtitle information more accurately, efficiently, and completely, in an embodiment of the present application, step S120 includes: determining whether a predetermined duration, namely the duration for which the first text information is displayed in a predetermined manner, is shorter than the duration of the video being edited; in a case where the predetermined duration is less than the duration of the video, comparing the first text information with predetermined subtitle segments in a subtitle database, wherein the subtitle database comprises a plurality of predetermined subtitle segments; and in a case where the similarity between the first text information and a target part is greater than a first predetermined threshold, determining, according to the duration of the video being edited, at least a part of a target subtitle segment that includes the target part as the first subtitle information, wherein the target part is a part of the target subtitle segment whose predetermined duration is the same as that of the first text information, and the target subtitle segment is the one of the plurality of predetermined subtitle segments that includes the target part. With this scheme, the first subtitle information corresponding to the video being edited can be automatically completed from the first text information.

For example, the first text information is "Once, a true love was placed before me, but I did not treasure it", and the subtitle database is a database of classic movie lines, in which one predetermined subtitle segment is "Once, a true love was placed before me, but I did not treasure it. Only when I lost it did I regret it; nothing in the world is more painful than this. If heaven could give me another chance, I would say three words to that girl: I love you. If a time limit had to be placed on this love, I hope it would be ten thousand years". The part of this segment that reads "Once, a true love was placed before me, but I did not treasure it" is the target part. If the first predetermined threshold is 90%, the similarity between the target part and the first text information is 100%, which is greater than the first predetermined threshold, so this predetermined subtitle segment is determined to be the target subtitle segment. The corresponding first subtitle information is then determined according to the duration of the edited video. It is first determined whether the duration of the edited video is the same as the duration for which the target subtitle segment is displayed in the predetermined manner; if so, the target subtitle segment is determined to be the first subtitle information.

If the duration of the edited video is longer than the duration for which the target subtitle segment is displayed in the predetermined manner, the target subtitle segment is used as the first subtitle information, and the display manner of the first subtitle information is adjusted according to the duration of the edited video. For example, the display duration of one sentence is generally 2 s; the display duration of each sentence can be increased so that the display duration of the first subtitle information equals the duration of the video. If the duration of the edited video is less than the duration for which the target subtitle segment is displayed in the predetermined manner, a portion of the target subtitle segment that includes the target part and whose display duration equals the duration of the edited video is used as the first subtitle information. Specifically, this portion can be cut from the target subtitle segment by combining the voice information of the video with the target part.
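The database lookup described above can be approximated with the following Python sketch. It is an illustration under stated assumptions rather than the disclosed implementation: similarity is measured with the standard library's `difflib.SequenceMatcher`, text is split into sentences at `.`, `!` or `?`, each sentence is assumed to display for 2 s, the function name `complete_subtitles` is hypothetical, and falling back to the copied text when nothing matches is a choice made here.

```python
import re
from difflib import SequenceMatcher
from typing import List

SECONDS_PER_SENTENCE = 2.0  # assumed default display time per sentence


def split_sentences(text: str) -> List[str]:
    # Naive splitter (an assumption): break after '.', '!' or '?'.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def complete_subtitles(copied: str, database: List[str],
                       video_seconds: float, threshold: float = 0.9) -> List[str]:
    """Return the subtitle sentences to display for the video being edited."""
    copied_sentences = split_sentences(copied)
    n = len(copied_sentences)
    if n == 0:
        return []
    if n * SECONDS_PER_SENTENCE >= video_seconds:
        return copied_sentences  # the copied text already fills the video
    for segment in database:
        sentences = split_sentences(segment)
        # Slide a window with the same display duration as the copied text.
        for start in range(len(sentences) - n + 1):
            target_part = " ".join(sentences[start:start + n])
            score = SequenceMatcher(
                None, copied.strip().lower(), target_part.lower()).ratio()
            if score <= threshold:
                continue
            capacity = max(n, int(video_seconds // SECONDS_PER_SENTENCE))
            if len(sentences) <= capacity:
                # Video is at least as long: use the whole segment
                # (per-sentence display time can then be stretched).
                return sentences
            # Video is shorter: keep a window that still contains the target part.
            begin = min(start, len(sentences) - capacity)
            return sentences[begin:begin + capacity]
    return copied_sentences  # assumed fallback when no segment matches


segment = ("Once, a true love was placed before me, but I did not treasure it. "
           "Only when I lost it did I regret it. "
           "Nothing in the world is more painful than this.")
copied = "Once, a true love was placed before me, but I did not treasure it."
print(complete_subtitles(copied, [segment], video_seconds=6.0))
print(complete_subtitles(copied, [segment], video_seconds=4.0))
```

With a 6 s video the whole three-sentence segment fits; with a 4 s video only the first two sentences are kept, since they contain the target part.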

In a specific application, the subtitle database is in effect a material library, in which a predetermined subtitle segment may be lines from a classic movie scene, a classic passage popular on the internet, or any other suitable text such as an entry from a verse library, and a person skilled in the art may select a suitable subtitle database according to the actual situation.

Of course, in an actual application process, the method may further include a step of constructing a subtitle database, and the construction process may refer to a construction process of another language database, which is not described herein again.

To display the first subtitle information in the first subtitle region more accurately, in a specific embodiment of the present application, displaying the first subtitle information in the first subtitle region includes: determining position information of each first subtitle sub-region, where the position information includes a length and a starting position. The length of a first subtitle sub-region is its display length and represents the video duration corresponding to the first subtitle segment in that sub-region; the starting position of the first subtitle sub-region represents the first video frame image corresponding to that first subtitle segment. The position information therefore determines which video frame images the first subtitle segment corresponds to. The first subtitle segments are obtained by dividing the first subtitle information. According to the position information, a plurality of first subtitle sub-regions are displayed in the first subtitle region; that is, each first subtitle sub-region is displayed according to its length and starting position, and the first subtitle segments are located in the first subtitle sub-regions in one-to-one correspondence, as shown in fig. 5.

In a specific embodiment, the video editing interface further includes an image area, the image area displays a plurality of video frame images, the plurality of video frame images are sequentially arranged along a predetermined direction, the plurality of first subtitle sub-areas are sequentially arranged along the predetermined direction, one of the first subtitle sub-areas is located at one side of the corresponding plurality of video frame images, the plurality of video frame images and the corresponding first subtitle sub-area are correspondingly displayed, and the predetermined direction is a length direction of the first subtitle sub-area. In particular, the start position of the first subtitle subregion may be aligned with the start position of the first video frame image of the corresponding plurality of video frame images, i.e. the projection of the start position of the first subtitle subregion onto the video frame image region is located on the start edge of the first video frame image. The length of the first subtitle subregion may be the same as the display length of the corresponding plurality of video frame images. The projection of the starting position of the first subtitle subregion in the video frame image region may also be located in the first video frame image of the corresponding plurality of video frame images, and the length of the first subtitle subregion may also be smaller than the total display length of the corresponding plurality of video frame images. As shown in fig. 5, five video frame images on one side of the first subtitle subregion 101 are the five video frame images corresponding to the first subtitle segment in the first subtitle subregion.

Of course, in practical applications, the first subtitle sub-region is not limited to the side of the video frame images shown in fig. 5 and may also be located on the upper side of the video frame images. (Fig. 5 is viewed facing the screen; the upper, left, and right sides mentioned here and below are all defined in this view.)

It should be noted that the first subtitle sub-region may also be displayed at other positions, for example on the left or right side of the corresponding video frame images. In the prior art, the image frame display area is generally referred to as the main track and the subtitle display area as a sub-track, and the sub-track and the main track are distributed in a direction perpendicular to the predetermined direction. Specifically, as shown in fig. 5, the first subtitle area 100 can be regarded as the sub-track, and the frame image area 200 as the main track.

In order to determine the position information of the first subtitle sub-region more accurately, and thus display the first subtitle segments more accurately, in an embodiment of the present application, the step of determining the position information of the first subtitle sub-region further includes: dividing the first subtitle information into a plurality of first subtitle segments, where generally one sentence forms one first subtitle segment; acquiring the duration of the video being edited; determining the length of each first subtitle sub-region according to the number of first subtitle segments and the duration; and determining that the starting position of the first of the first subtitle sub-regions is the starting point of the first subtitle region, and that the starting position of each other first subtitle sub-region is at or after the ending position of the previous first subtitle sub-region, where "after" refers to after in the length direction of the first subtitle sub-region. Specifically, because the length of a first subtitle sub-region represents the video duration corresponding to that sub-region, the video duration corresponding to the first subtitle segment in each first subtitle sub-region can be determined from the duration of the video and the number of first subtitle segments; the number of video frame images corresponding to that video duration can then be determined, and the length of the first subtitle sub-region can be determined from that number of video frame images. The length of the first subtitle sub-region may be approximately equal to the total display length of the corresponding video frame images, or may be smaller than that total display length. The interval between two adjacent first subtitle segments may be set according to the actual situation: the two segments may be in contact, that is, the interval is 0, or may not be in contact, that is, the interval is greater than 0.
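The division described above can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the names `SubRegion`, `split_into_segments` and `layout_sub_regions`, the sentence-splitting rule, the use of seconds as the unit of length, and the `gap` parameter are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class SubRegion:
    text: str      # the first subtitle segment shown in this sub-region
    start: float   # starting position on the timeline, in seconds (assumed unit)
    length: float  # display length, expressed as video duration in seconds


def split_into_segments(subtitle_text: str) -> list[str]:
    """Split subtitle text so that one sentence forms one segment (assumed rule)."""
    parts = re.split(r"(?<=[.!?。！？])\s*", subtitle_text)
    return [p.strip() for p in parts if p.strip()]


def layout_sub_regions(subtitle_text: str, video_duration: float,
                       gap: float = 0.0) -> list[SubRegion]:
    """Give every segment an equal share of the video duration.

    The first sub-region starts at the start of the subtitle region (0.0);
    each later sub-region starts `gap` seconds after the previous one ends.
    """
    segments = split_into_segments(subtitle_text)
    if not segments:
        return []
    slot = video_duration / len(segments)  # video duration per segment
    length = slot - gap                    # may be smaller than the slot
    if length <= 0:
        raise ValueError("gap is too large for the available duration")
    return [SubRegion(seg, i * slot, length) for i, seg in enumerate(segments)]


regions = layout_sub_regions("First line. Second line. Third line.", 12.0)
for region in regions:
    print(region)
```

With `gap=0.0` adjacent sub-regions are in contact; a positive `gap` leaves an interval between them, matching the two cases described above. Converting a length in seconds to a number of video frame images would additionally require the frame rate, which this sketch leaves out.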

In order to determine the video frame images corresponding to the first subtitle segment more accurately and efficiently, in a specific embodiment of the present application, the step of determining the first subtitle information according to the first text information includes: acquiring the text content corresponding to the voice content of the video being edited; determining whether the similarity between the first text information and the text content is greater than a second predetermined threshold; and determining the text content as the first subtitle information when the similarity between the first text information and the text content is greater than the second predetermined threshold. The step of determining the position information of the first subtitle sub-region includes: determining the position information of the first subtitle segment according to the correspondence between the text content and the video frame images. Specifically, the text content corresponds to the voice content, and the voice content corresponds to the video frame images, so the text content corresponds to the video frame images. According to this correspondence, the video frame images corresponding to each part of the text content (each first subtitle segment) can be determined more accurately, and thus the position information of the first subtitle sub-region, that is, its starting position and length, can be determined more accurately. The starting position of the first subtitle sub-region may coincide with the starting position of the corresponding first frame image, or may lie after it; the length of the first subtitle sub-region may be the same as the total display length of the corresponding frame images, or may be smaller than that total display length.
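A minimal Python sketch of this embodiment follows. The disclosure does not specify how similarity is computed or how the recognized text is represented, so the use of `difflib.SequenceMatcher`, the threshold value 0.8, and the transcript format (a list of `(start_seconds, end_seconds, text)` tuples, as a speech recognizer might produce) are assumptions for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; SequenceMatcher is a stand-in for the unspecified measure."""
    return SequenceMatcher(None, a, b).ratio()


def subtitles_from_transcript(copied_text, transcript, threshold=0.8):
    """Return (text, start, length) per segment, or None if the texts differ too much.

    transcript: list of (start_seconds, end_seconds, text) tuples (assumed format).
    """
    transcript_text = " ".join(text for _, _, text in transcript)
    if similarity(copied_text, transcript_text) <= threshold:
        return None
    # The recognized text becomes the subtitle information, and each segment
    # inherits its position (start and length) from the speech timing.
    return [(text, start, end - start) for start, end, text in transcript]


transcript = [
    (0.0, 2.0, "Hello everyone."),
    (2.5, 5.0, "Welcome to the show."),
]
print(subtitles_from_transcript("Hello everyone. Welcome to the show", transcript))
```

Because the positions come from the speech timing rather than from an equal division of the video duration, each sub-region lines up with the frames during which its text is actually spoken.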

In an actual application, the display position of a first subtitle sub-region is sometimes not very accurate. To solve this problem, in an embodiment of the present application, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, the method further includes: in response to a third predetermined operation acting on a first target subtitle sub-region, changing the starting position of the first target subtitle sub-region from an initial position to a predetermined position. By adjusting the starting position of the first target subtitle sub-region, its display position can be adjusted, which ensures that the adjusted display position is more accurate and, in turn, that the video duration and the video frame images corresponding to the first subtitle segment in the first target subtitle sub-region are more accurate.

In another embodiment of the present application, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, the method further includes: in response to a fourth predetermined operation acting on a second target subtitle sub-region, changing the length of the second target subtitle sub-region from an initial length to a predetermined length, and changing the number of video frame images corresponding to the second target subtitle sub-region from a first predetermined number to a second predetermined number. In this way, the display position of the second target subtitle sub-region can be adjusted, so that its adjusted display position is more accurate and the video duration and the video frame images corresponding to the first subtitle segment in the second target subtitle sub-region are more accurate.
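The two adjustments can be sketched in Python as follows. The sketch is illustrative only: the `SubRegion` type, the function names, the use of seconds as the unit, and the frame rate `fps` used to derive the number of video frame images are assumptions not found in the disclosure.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SubRegion:
    start: float   # starting position, in seconds (assumed unit)
    length: float  # display length, in seconds of video


def frame_count(region: SubRegion, fps: float) -> int:
    """Number of video frame images covered by the sub-region (fps is assumed)."""
    return round(region.length * fps)


def move_start(region: SubRegion, new_start: float) -> SubRegion:
    """Third predetermined operation: change the starting position only."""
    if new_start < 0:
        raise ValueError("start must not be negative")
    return replace(region, start=new_start)


def resize(region: SubRegion, new_length: float) -> SubRegion:
    """Fourth predetermined operation: change the length, and with it the frame count."""
    if new_length <= 0:
        raise ValueError("length must be positive")
    return replace(region, length=new_length)


region = SubRegion(start=4.0, length=2.0)
moved = move_start(region, 4.5)
longer = resize(region, 3.0)
print(moved, longer, frame_count(region, 30), frame_count(longer, 30))
```

Returning a new `SubRegion` instead of mutating the old one is a design choice of this sketch; it makes it easy to keep the initial position or length for an undo operation.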

In some specific applications, a video needs to be loaded with dual subtitles. In order to better load subtitles for such videos, in a specific embodiment of the present application, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, the method further includes: acquiring second text information in response to a fifth predetermined operation for copying the second text information; and displaying second subtitle information in a second subtitle region of the video editing interface according to the second text information, where the second subtitle region is located on one side of the first subtitle region. The second subtitle region may be on either side of the first subtitle region and may be set according to the actual situation. In a specific embodiment, as shown in fig. 6, the second subtitle region 300 is located on the side of the first subtitle region 100 away from the frame image region 200.

It should be further noted that, in the present application, the display process of the second subtitle information may refer to the description about the display process of the first subtitle information, and is not described herein again.

In another embodiment of the present application, after the subtitles of the video are loaded, the method further includes: playing the video including the first subtitle information in response to a fifth predetermined operation.

Fig. 7 is a flowchart illustrating a method of automatically loading subtitles according to an exemplary embodiment. As shown in fig. 7, the method is used in an electronic device and includes the following steps S21-S23.

In step S21, it is detected whether there is a first predetermined operation for copying the first text information;

in step S22, when the first predetermined operation is detected, the first text information is acquired;

in step S23, first subtitle information is generated based on the first text information, and the first subtitle information is displayed in a subtitle region of a video editing interface.
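Steps S21-S23 can be sketched in Python as below. The `Clipboard` class is a stand-in for whatever the platform uses to hold copied text, and `SubtitleEditor`, its `poll` method, and the rule of one subtitle segment per line are hypothetical names and choices introduced only for this illustration.

```python
class Clipboard:
    """Stand-in for the system clipboard (hypothetical, for illustration)."""

    def __init__(self):
        self.text = ""
        self.version = 0  # incremented on every copy operation

    def copy(self, text: str) -> None:
        self.text = text
        self.version += 1


class SubtitleEditor:
    """Sketch of a video editor that turns copied text into subtitles."""

    def __init__(self, clipboard: Clipboard):
        self.clipboard = clipboard
        self._seen_version = clipboard.version
        self.subtitle_region: list[str] = []  # what the subtitle region displays

    def poll(self) -> bool:
        # S21: detect whether a copy operation has happened since the last check.
        if self.clipboard.version == self._seen_version:
            return False
        self._seen_version = self.clipboard.version
        # S22: acquire the copied text.
        text = self.clipboard.text
        # S23: generate subtitle information (here: one segment per non-empty
        # line, an assumed rule) and display it in the subtitle region.
        self.subtitle_region = [ln.strip() for ln in text.splitlines() if ln.strip()]
        return True


clipboard = Clipboard()
editor = SubtitleEditor(clipboard)
clipboard.copy("Line one\nLine two")
if editor.poll():
    print(editor.subtitle_region)
```

A real editor would more likely react to a clipboard-change notification from the operating system than poll; polling is used here only to keep the sketch self-contained.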

In the method, when the first predetermined operation is detected, the first subtitle information generated based on the first text information is displayed in the subtitle region of the video editing interface. With this scheme, the first subtitle information can be generated in the subtitle region of the video editing interface from the copied text, which alleviates or even eliminates the need to edit subtitles by manually typing or manually pasting text, so that subtitle editing is more efficient.

The specific process of step S22 and the specific process of step S23 may refer to the description in the above scheme, and are not repeated here.

Fig. 8 is a flowchart illustrating a method of automatically loading subtitles according to an exemplary embodiment. As shown in fig. 8, the method is used in an electronic device and includes the following steps S31-S32.

In step S31, when the target video is being edited, first text information is acquired from a memory area for storing text information, where the memory area stores text information obtained by performing a copy operation (a first predetermined operation) on target text;

in step S32, first subtitle information is generated based on the first text information, and the first subtitle information is displayed in a first subtitle region of a video editing interface.

In the above-described embodiment, the first subtitle information may be generated from the copied first text information, and the corresponding first subtitle information may be displayed in the first subtitle region. With this scheme, the first subtitle information can be generated in the first subtitle region of the video editing interface from the copied text, which alleviates or even eliminates the need to edit subtitles by manually typing or manually pasting text, so that subtitle editing during video editing is more efficient.

The specific process of step S31 and the specific process of step S32 may refer to the description in the above scheme, and are not repeated here.

It should be noted that the first through fifth predetermined operations in the present application may each be any operation feasible in the prior art, for example, an operation including at least one of a click operation, a slide operation, a long-press operation, and a double-click operation. Those skilled in the art can select a suitable operation or combination of operations for each of the five predetermined operations according to the actual situation.

Fig. 9 is a block diagram illustrating an apparatus for automatically loading subtitles according to an example embodiment. Referring to fig. 9, the apparatus includes a first acquisition unit 10 and a first display unit 20.

The first acquiring unit 10 is configured to acquire the first text information in response to a first predetermined operation for copying the first text information.

The first display unit 20 is configured to display first subtitle information generated based on the first text information in a first subtitle region of a video editing interface.

In an actual application, there may be a plurality of predetermined operations for copying text information, and only part of the copied text information may be intended for the subtitles of the video. In order to make the displayed subtitles more accurate in this case, in an embodiment of the present application, the first acquiring unit includes a first display module and an acquisition module.

The first display module is configured to display prompt information on the video editing interface in response to the first predetermined operation, where the prompt information asks whether to determine the first subtitle information according to the first text information. In an actual application, the prompt information may be a prompt pop-up window displaying, for example, "The system has detected that you copied text. Add subtitles automatically?" Of course, the prompt information is not limited to a pop-up window with the above text; it may also be other specified prompt information in the video editor, for example, a rectangular icon labeled "copy?". Those skilled in the art can set suitable prompt information according to the actual situation.

The acquisition module is configured to acquire the first text information in response to a second predetermined operation acting on the prompt information. The second predetermined operation may be at least one of a click operation, a long-press operation, a double-click operation, and a slide operation, and may be chosen according to the actual situation. When a second predetermined operation on the prompt information is received, it can be concluded that the first subtitle information is to be determined according to the first text information, and the first text information is therefore acquired based on the second predetermined operation.
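The interplay of the first display module and the acquisition module can be sketched as a small state holder in Python. The class name `PromptedAcquirer`, its method names, the prompt wording, and the dismiss behavior are assumptions for illustration; the disclosure only requires that the text be acquired after a second predetermined operation on the prompt information.

```python
class PromptedAcquirer:
    """Holds copied text until the user confirms it should become subtitles."""

    PROMPT = "Copied text detected. Add it as subtitles?"  # assumed wording

    def __init__(self):
        self.pending = None   # copied text awaiting confirmation
        self.acquired = None  # text confirmed for subtitle generation

    def on_copy(self, text: str) -> str:
        """First predetermined operation: remember the text and show the prompt."""
        self.pending = text
        return self.PROMPT

    def on_confirm(self):
        """Second predetermined operation on the prompt: acquire the text."""
        if self.pending is None:
            return None
        self.acquired, self.pending = self.pending, None
        return self.acquired

    def on_dismiss(self) -> None:
        """Assumed behavior: ignoring the prompt discards the pending text."""
        self.pending = None


acquirer = PromptedAcquirer()
print(acquirer.on_copy("A caption for the opening scene."))
print(acquirer.on_confirm())
```

Text that was copied for some other purpose never reaches subtitle generation in this sketch, because only confirmed text is moved into `acquired`.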

The first display unit comprises a determining module and a second display module, wherein the determining module is configured to determine the first caption information according to the first text information; the second display module is configured to display the first subtitle information in the first subtitle region.

In an actual application, in some cases the first text information itself is the first subtitle information, for example, where the duration of the first text information when displayed in a predetermined manner substantially coincides with the duration of the video being edited, or where that duration is shorter than the duration of the video being edited but only part of the video requires subtitles. Accordingly, the determining module is configured to determine that the first text information is the first subtitle information.

In some cases, the first text information cannot by itself meet the requirement for the first subtitle information and can serve only as a part of it. In order to display the first subtitle information accurately, efficiently and completely in such cases, in an embodiment of the present application, the determining module further includes a first determining sub-module, a second determining sub-module and a third determining sub-module. The first determining sub-module is configured to determine whether a predetermined duration, namely the duration of the first text information when displayed in a predetermined manner, is less than the duration of the video being edited. The second determining sub-module is configured to compare the first text information with predetermined subtitle segments in a subtitle database if the predetermined duration is less than the duration of the video, where the subtitle database includes a plurality of predetermined subtitle segments. The third determining sub-module is configured to, if the similarity between the first text information and a target portion is greater than a first predetermined threshold, determine at least a portion of a target subtitle segment, including the target portion, as the first subtitle information according to the duration of the video being edited, where the target subtitle segment is the one of the plurality of predetermined subtitle segments that includes the target portion, and the target portion is the portion of the target subtitle segment whose predetermined duration is the same as that of the first text information. With this scheme, the first subtitle information corresponding to the video being edited can be automatically completed from the first text information.
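The cooperation of the three determining sub-modules can be sketched in Python as follows. Several details are assumptions, since the disclosure leaves them open: display duration is estimated from a fixed rate `CHARS_PER_SECOND`, similarity is computed with `difflib.SequenceMatcher`, the target portion is searched for with a sliding window of the same character length as the copied text, and the copied text is returned unchanged when no database segment matches.

```python
from difflib import SequenceMatcher

CHARS_PER_SECOND = 4.0  # assumed display rate of the "predetermined manner"


def display_duration(text: str) -> float:
    """Predetermined duration of the text when displayed at the assumed rate."""
    return len(text) / CHARS_PER_SECOND


def complete_from_database(copied, video_duration, database, threshold=0.8):
    # First sub-module: is the text too short for the video being edited?
    if display_duration(copied) >= video_duration:
        return copied
    # Second sub-module: compare the text with same-sized portions of each
    # predetermined subtitle segment and keep the best match.
    window = len(copied)
    best_score, best_segment, best_start = 0.0, None, 0
    for segment in database:
        for start in range(max(1, len(segment) - window + 1)):
            portion = segment[start:start + window]
            score = SequenceMatcher(None, copied, portion).ratio()
            if score > best_score:
                best_score, best_segment, best_start = score, segment, start
    # Third sub-module: if the target portion is similar enough, take a part
    # of the target segment that contains it and is sized to the video duration.
    if best_segment is None or best_score <= threshold:
        return copied  # assumed fallback; not specified in the disclosure
    max_chars = max(window, int(video_duration * CHARS_PER_SECOND))
    return best_segment[best_start:best_start + max_chars]


database = ["Twinkle twinkle little star, how I wonder what you are."]
print(complete_from_database("Twinkle twinkle", 10.0, database))
```

In this sketch the completed text always begins at the target portion and extends forward; extending backward as well, or snapping to sentence boundaries, would be equally consistent with "at least a portion of the target subtitle segment including the target portion".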

In order to more accurately display the first subtitle information in the first subtitle region, in a specific embodiment of the present application, the second display module includes a fourth determining sub-module and a first display sub-module.

The fourth determining submodule is configured to determine position information of a first subtitle sub-region, where the position information includes a length and a start position, the length of the first subtitle sub-region is a display length of the first subtitle sub-region, the length of the first subtitle sub-region represents a video time length corresponding to a first subtitle segment in the first subtitle sub-region, and the start position of the first subtitle sub-region represents a first video frame image corresponding to the first subtitle segment, and which video frame images correspond to the first subtitle segment can be determined according to the position information, where the first subtitle segment is obtained by dividing according to the first subtitle information;

the first display sub-module is configured to display a plurality of first subtitle sub-regions in the first subtitle region according to the position information, that is, to display each first subtitle sub-region with the length and at the starting position given by its position information, where the first subtitle segments are located in the first subtitle sub-regions in one-to-one correspondence, as shown in fig. 5.

In a specific embodiment, the video editing interface further includes an image area, the image area displays a plurality of video frame images, the plurality of video frame images are sequentially arranged along a predetermined direction, the plurality of first subtitle sub-areas are sequentially arranged along the predetermined direction, one of the first subtitle sub-areas is located at one side of the corresponding plurality of video frame images, the plurality of video frame images and the corresponding first subtitle sub-area are correspondingly displayed, and the predetermined direction is a length direction of the first subtitle sub-area. In particular, the start position of the first subtitle subregion may be aligned with the start position of the first video frame image of the corresponding plurality of video frame images, i.e. the projection of the start position of the first subtitle subregion onto the video frame image region is located on the start edge of the first video frame image. The length of the first subtitle subregion may be the same as the display length of the corresponding plurality of video frame images. The projection of the starting position of the first subtitle subregion in the video frame image region may also be located in the first video frame image of the corresponding plurality of video frame images, and the length of the first subtitle subregion may also be smaller than the total display length of the corresponding plurality of video frame images. As shown in fig. 5, five video frame images 201 on one side of the first subtitle subregion are the five video frame images 201 corresponding to the first subtitle segment in the first subtitle subregion.

It should be noted that the first subtitle sub-region may also be displayed at other positions, for example, on the left side or the right side of the corresponding video frame images. In the prior art, the image frame display area is generally referred to as the main track and the subtitle display area as a sub-track, and the sub-track and the main track are distributed in a direction perpendicular to the predetermined direction. Specifically, as shown in fig. 5, the first subtitle area 100 can be regarded as the sub-track, and the frame image area 200 as the main track.

In order to more accurately determine the position information of the first subtitle sub-region, so as to more accurately display the first subtitle segment, in an embodiment of the present application, the fourth determining sub-module includes a dividing sub-module, a first obtaining sub-module, a fifth determining sub-module, and a sixth determining sub-module.

Wherein the dividing submodule is configured to divide the first subtitle information into a plurality of the first subtitle segments, and generally speaking, a sentence is divided into one first subtitle segment.

The first acquisition submodule is configured to acquire a duration of a video being edited.

The fifth determining sub-module is configured to determine the length of each first subtitle sub-region according to the number of first subtitle segments and the duration. Specifically, because the length of a first subtitle sub-region represents the video duration corresponding to that sub-region, the video duration corresponding to the first subtitle segment in each first subtitle sub-region can be determined from the duration of the video and the number of first subtitle segments, and the number of video frame images corresponding to that video duration can then be determined. The length of the first subtitle sub-region may thus be approximately equal to the total display length of the video frame images corresponding to the first subtitle sub-region, or may be smaller than that total display length.

The sixth determining sub-module is configured to determine that the starting position of the first of the first subtitle sub-regions is the starting point of the first subtitle region, and that the starting position of each other first subtitle sub-region is at or after the ending position of the previous first subtitle sub-region, where "after" refers to after in the length direction of the first subtitle sub-region. Specifically, the interval between two adjacent first subtitle segments may be set according to the actual situation: the two segments may be in contact, that is, the interval is 0, or may not be in contact, that is, the interval is greater than 0.

In order to determine the video frame image corresponding to the first caption segment more accurately and efficiently, in a specific embodiment of the present application, the determining module includes a second obtaining sub-module, a seventh determining sub-module, and an eighth determining sub-module.

The second obtaining sub-module is configured to acquire the text content corresponding to the voice content of the video being edited.

The seventh determining submodule is configured to determine whether the similarity between the first text information and the text content is greater than a second predetermined threshold.

The eighth determining sub-module is configured to determine that the text content is the first subtitle information if the similarity between the first text information and the text content is greater than the second predetermined threshold.

The fourth determining sub-module is configured to determine the position information of the first subtitle segment according to the correspondence between the text content and the video frame images. Specifically, the text content corresponds to the voice content, and the voice content corresponds to the video frame images, so the text content corresponds to the video frame images. According to this correspondence, the video frame images corresponding to each part of the text content (each first subtitle segment) can be determined more accurately, and thus the position information of the first subtitle sub-region, that is, its starting position and length, can be determined more accurately. The starting position of the first subtitle sub-region may coincide with the starting position of the corresponding first frame image, or may lie after it; the length of the first subtitle sub-region may be the same as the total display length of the corresponding frame images, or may be smaller than that total display length.

In order to solve the problem that the display position of a first subtitle sub-region is sometimes not very accurate in practice, in an embodiment of the present application, the apparatus further includes a first adjusting unit configured to, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, change the starting position of a first target subtitle sub-region from an initial position to a predetermined position in response to a third predetermined operation acting on the first target subtitle sub-region. By adjusting the starting position of the first target subtitle sub-region, its display position can be adjusted, which ensures that the adjusted display position is more accurate and, in turn, that the video duration and the video frame images corresponding to the first subtitle segment in the first target subtitle sub-region are more accurate.

In another embodiment of the present application, the apparatus further includes a second adjusting unit configured to, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, change the length of a second target subtitle sub-region from an initial length to a predetermined length and change the number of video frame images corresponding to the second target subtitle sub-region from a first predetermined number to a second predetermined number in response to a fourth predetermined operation acting on the second target subtitle sub-region. In this way, the display position of the second target subtitle sub-region can be adjusted, so that its adjusted display position is more accurate and the video duration and the video frame images corresponding to the first subtitle segment in the second target subtitle sub-region are more accurate.

In some specific applications, a video needs to be loaded with dual subtitles. In order to better load subtitles for such videos, in a specific embodiment of the present application, the apparatus further includes a second obtaining unit and a second display unit. The second obtaining unit is configured to, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, acquire second text information in response to a fifth predetermined operation for copying the second text information. The second display unit is configured to display second subtitle information in a second subtitle region of the video editing interface according to the second text information, where the second subtitle region is located on one side of the first subtitle region; it may be on either side of the first subtitle region and may be set according to the actual situation. In a specific embodiment, as shown in fig. 6, the second subtitle region is located on the side of the first subtitle region away from the video frame images.

In another embodiment of the present application, the apparatus further includes a playing unit, and the playing unit is configured to play the video including the first subtitle information in response to a fifth predetermined operation after the subtitle of the video is loaded.

Fig. 10 is a block diagram illustrating an apparatus for automatically loading subtitles, according to an exemplary embodiment, and as shown in fig. 10, the apparatus for automatically loading subtitles includes a detection unit 30, a second acquisition unit 40, and a second display unit 50.

The detection unit 30 is configured to detect whether there is a first predetermined operation for copying the first text information;

the second acquiring unit 40 is configured to acquire the first text information when the first predetermined operation is detected;

the second display unit 50 is configured to generate first subtitle information based on the first text information and display the first subtitle information in a subtitle region of a video editing interface.

In the above-described apparatus, when the first predetermined operation is detected, the second display unit 50 displays, in the subtitle region of the video editing interface, the first subtitle information generated based on the first text information. With this scheme, the first subtitle information can be generated in the subtitle region of the video editing interface from the copied text, which alleviates or even eliminates the need to edit subtitles by manually typing or manually pasting text, so that subtitle editing is more efficient.

For the specific implementation of the second acquiring unit 40 and the second display unit 50, reference may be made to the description in the above scheme, which is not repeated here.

Fig. 11 is a block diagram illustrating an apparatus for automatically loading subtitles according to an exemplary embodiment. As shown in fig. 11, the apparatus is used in an electronic device and includes a third obtaining unit 60 and a third display unit 70.

The third obtaining unit 60 is configured to obtain, when editing the target video, first text information from a memory area for storing text information, where the memory area is used for storing text information obtained by performing a copy operation (a first predetermined operation) on the target text;

the third display unit 70 is configured to generate first subtitle information based on the first text information and present the first subtitle information in a first subtitle region of the video editing interface.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Fig. 12 is a block diagram illustrating an electronic device 01 for performing automatic loading of subtitles according to an exemplary embodiment.

In an exemplary embodiment, a storage medium comprising executable instructions, such as the memory 402 for storing executable instructions, is also provided, which are executable by the processor 404 of the electronic device 01 to perform the above-described method. Alternatively, the storage medium may be a non-transitory computer readable storage medium, such as a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method for automatically loading subtitles, comprising:

acquiring first text information in response to a first predetermined operation for copying the first text information;

and displaying, in a first subtitle region of a video editing interface, first subtitle information generated based on the first text information.

2. The method of claim 1, wherein the step of acquiring the first text information in response to the first predetermined operation for copying the first text information comprises:

displaying prompt information on the video editing interface in response to the first predetermined operation, wherein the prompt information asks whether to determine the first subtitle information according to the first text information;

and acquiring the first text information in response to a second predetermined operation acting on the prompt information.

3. The method of claim 1, wherein the step of displaying the first subtitle information generated based on the first text information in a first subtitle region of the video editing interface comprises:

determining the first subtitle information according to the first text information;

and displaying the first subtitle information in the first subtitle area.

4. The method of claim 3, wherein the step of determining the first subtitle information according to the first text information comprises:

and determining that the first text information is the first subtitle information.

5. The method of claim 3, wherein the step of determining the first subtitle information according to the first text information comprises:

determining whether the preset time length of the first text information displayed according to the preset mode is less than the time length of the video being edited;

comparing the first text information with a preset subtitle segment in a subtitle database under the condition that the preset duration is less than the duration of the video, wherein the subtitle database comprises a plurality of preset subtitle segments;

and under the condition that the similarity between the first text information and a target part is greater than a first preset threshold value, determining at least one part of a target subtitle segment including the target part as the first subtitle information according to the length of the video being edited, wherein the target part is the part of the target subtitle segment with the same preset time length as the first text information, and the target subtitle segment is one of a plurality of preset subtitle segments including the target part.
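The matching step of claim 5 can be illustrated with a minimal sketch. It is not the claimed implementation: the fixed display rate `CHARS_PER_SECOND`, the use of `difflib.SequenceMatcher` as the similarity measure, the sliding character window, and the names `display_duration` and `expand_from_database` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

CHARS_PER_SECOND = 4.0  # assumed display rate standing in for the "preset mode"


def display_duration(text):
    """Seconds needed to display `text` at the assumed rate."""
    return len(text) / CHARS_PER_SECOND


def expand_from_database(copied, video_seconds, database, threshold=0.8):
    """Return subtitle text taken from a matching segment, or None."""
    if display_duration(copied) >= video_seconds:
        return None  # the database is searched only when the copied text is shorter
    window = len(copied)  # equal character count means equal duration in this sketch
    best_score, best_segment, best_start = 0.0, None, 0
    for segment in database:
        for start in range(max(1, len(segment) - window + 1)):
            part = segment[start:start + window]  # candidate target part
            score = SequenceMatcher(None, copied, part).ratio()
            if score > best_score:
                best_score, best_segment, best_start = score, segment, start
    if best_segment is None or best_score <= threshold:
        return None
    # Keep as much of the target segment, from the target part onward,
    # as fits within the duration of the video being edited.
    max_chars = int(video_seconds * CHARS_PER_SECOND)
    return best_segment[best_start:best_start + max_chars]
```

In this sketch, copying "row your boat" while editing a 5-second video would select the longer database segment containing that phrase and trim it to the 20 characters that fit the video.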

6. The method of claim 3, wherein the step of displaying the first subtitle information in the first subtitle area comprises:

determining position information of a first subtitle subregion, wherein the position information comprises a length and an initial position, the length of the first subtitle subregion is the display length of the first subtitle subregion and represents the video duration corresponding to a first subtitle segment in the first subtitle subregion, the initial position of the first subtitle subregion represents a first video frame image corresponding to the first subtitle segment, and the first subtitle segment is obtained by dividing the first subtitle information;

and displaying a plurality of first subtitle subregions in the first subtitle area according to the position information, wherein each first subtitle subregion contains one first subtitle segment.
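The position information of claim 6 can be pictured as a small record per subregion. The sketch below is only one possible reading: splitting on sentence punctuation and allotting frames in proportion to character count are assumptions, as are the names `SubRegion` and `layout_subregions`.

```python
import re
from dataclasses import dataclass


@dataclass
class SubRegion:
    text: str          # the subtitle segment shown in this subregion
    start_frame: int   # initial position: first video frame of the segment
    frame_count: int   # length: video duration covered, counted in frames


def layout_subregions(subtitle, total_frames):
    """Divide `subtitle` into segments and give each a share of the timeline."""
    pieces = re.split(r"[\n.!?。！？]+", subtitle)
    segments = [p.strip() for p in pieces if p.strip()]
    total_chars = sum(len(s) for s in segments)
    regions, cursor = [], 0
    for i, seg in enumerate(segments):
        if i == len(segments) - 1:
            frames = total_frames - cursor  # last segment absorbs rounding
        else:
            frames = total_frames * len(seg) // total_chars
        regions.append(SubRegion(seg, cursor, frames))
        cursor += frames
    return regions
```

Each returned record carries exactly the two quantities the claim names, so an editor could draw one subregion per record on the subtitle track.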

7. The method of claim 6, wherein

the step of determining the first subtitle information according to the first text information comprises:

acquiring text content corresponding to the voice content of the video being edited;

determining whether the similarity between the first text information and the text content is greater than a second preset threshold;

and determining the text content as the first subtitle information in a case where the similarity between the first text information and the text content is greater than the second preset threshold;

and the step of determining the position information of the first subtitle subregion comprises:

determining the position information of the first subtitle segment according to the correspondence between the text content and the video frame images.
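A minimal sketch of the flow in claim 7 follows. It assumes a speech recognizer has already produced a transcript as a list of `(start_frame, end_frame, text)` tuples; that input format, the `difflib.SequenceMatcher` similarity measure, the default threshold, and the name `subtitles_from_speech` are hypothetical.

```python
from difflib import SequenceMatcher


def subtitles_from_speech(copied, transcript, threshold=0.6):
    """Use the recognized speech as subtitles when it resembles `copied`.

    transcript: list of (start_frame, end_frame, text) tuples, assumed to
    come from a speech recognizer run on the video being edited.
    """
    spoken = " ".join(text for _, _, text in transcript)
    score = SequenceMatcher(None, copied.lower(), spoken.lower()).ratio()
    if score <= threshold:
        return None  # copied text does not match the voice content
    # The text/frame correspondence supplies the position of each segment.
    return [
        {"text": text, "start_frame": start, "frame_count": end - start}
        for start, end, text in transcript
    ]
```

When the copied text is close to what is spoken in the video, the timing already attached to the recognized text gives each subtitle segment its initial position and length without manual dragging.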

8. The method of any one of claims 1 to 7, wherein after displaying the first subtitle information generated based on the first text information in the first subtitle area of the video editing interface, the method further comprises:

in response to a third preset operation acting on a first target subtitle subregion, changing the start position of the first target subtitle subregion from the initial position to a predetermined position.

9. The method of any one of claims 1 to 7, wherein after displaying the first subtitle information generated based on the first text information in the first subtitle area of the video editing interface, the method further comprises:

in response to a fourth preset operation acting on a second target subtitle subregion, changing the length of the second target subtitle subregion from the initial length to a predetermined length.

10. The method of any one of claims 1 to 7, wherein after displaying the first subtitle information generated based on the first text information in the first subtitle area of the video editing interface, the method further comprises:

responding to a fifth preset operation of copying second text information, and acquiring the second text information;

and displaying second subtitle information in a second subtitle area of the video editing interface according to the second text information, wherein the second subtitle area is located on one side of the first subtitle area.