CN112988005B

CN112988005B - Method for automatically loading captions

Info

Publication number: CN112988005B
Application number: CN202011367465.2A
Authority: CN
Inventors: 吴丹
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2020-11-27
Filing date: 2020-11-27
Publication date: 2023-02-28
Anticipated expiration: 2040-11-27
Also published as: CN112988005A; WO2022110844A1

Abstract

The present disclosure relates to a method for automatically loading subtitles. The method comprises the following steps: in response to a first predetermined operation for copying first text information, acquiring the first text information; and displaying, in a first subtitle region of a video editing interface, first subtitle information generated based on the first text information. With this scheme, the first subtitle information can be generated in the first subtitle region of the video editing interface from the copied text, which alleviates or even eliminates the need to edit subtitles by manually typing or pasting text, so that subtitles can be edited efficiently.

Description

Method for automatically loading captions

Technical Field

The disclosure relates to the field of video production, and in particular, to a method for automatically loading subtitles.

Background

In the related art, subtitles must be added manually while making a video. The specific process is as follows: drag the video track to select the video frame to which a subtitle is to be added, click the text input, manually type or paste the required subtitle text, drag to the next video frame that needs a subtitle, and repeat. When a video involves a large number of subtitles, typing them takes a lot of time.

Therefore, a method for improving the efficiency of subtitle editing in a video production process is needed.

Disclosure of Invention

The present disclosure provides a method for automatically loading subtitles to at least solve the problem of low subtitle editing efficiency in a video production process in the related art. The technical scheme of the disclosure is as follows:

According to a first aspect of the embodiments of the present disclosure, there is provided a method for automatically loading subtitles, including: in response to a first predetermined operation for copying first text information, acquiring the first text information; and displaying first subtitle information generated based on the first text information in a first subtitle region of a video editing interface.

Optionally, the step of acquiring the first text information in response to a first predetermined operation for copying the first text information includes: in response to the first predetermined operation, displaying prompt information on the video editing interface, wherein the prompt information asks whether to determine the first subtitle information according to the first text information; and in response to a second predetermined operation acting on the prompt information, acquiring the first text information.

Optionally, the step of displaying the first subtitle information generated based on the first text information in a first subtitle region of the video editing interface includes: determining the first subtitle information according to the first text information; and displaying the first subtitle information in the first subtitle region.

Optionally, the step of determining the first subtitle information according to the first text information includes: determining that the first text information is the first subtitle information.

Optionally, the step of determining the first subtitle information according to the first text information includes: determining whether the preset time length of the first text information displayed according to the preset mode is less than the time length of the video being edited; comparing the first text information with a preset subtitle segment in a subtitle database under the condition that the preset duration is less than the duration of the video, wherein the subtitle database comprises a plurality of preset subtitle segments; and under the condition that the similarity between the first text information and a target part is greater than a first preset threshold value, determining at least one part of a target subtitle segment comprising the target part as the first subtitle information according to the length of the video being edited, wherein the target part is the part of the target subtitle segment with the same preset time length as the first text information, and the target subtitle segment is one of a plurality of preset subtitle segments comprising the target part.

Optionally, the step of displaying the first subtitle information in the first subtitle region includes: determining position information of a first subtitle subregion, wherein the position information comprises a length and an initial position, the length of the first subtitle subregion is a display length of the first subtitle subregion, the length of the first subtitle subregion represents a video time length corresponding to a first subtitle segment in the first subtitle subregion, the initial position of the first subtitle subregion represents a first video frame image corresponding to the first subtitle segment, and the first subtitle segment is obtained by dividing according to the first subtitle information; displaying a plurality of first subtitle sub-regions in the first subtitle region according to the position information, wherein one of the first subtitle sub-regions has one of the first subtitle segments.

Optionally, the video editing interface further includes an image area, where the image area displays a plurality of video frame images, the plurality of video frame images are sequentially arranged along a predetermined direction, the plurality of first subtitle sub-areas are sequentially arranged along the predetermined direction, one of the first subtitle sub-areas is located on one side of the corresponding plurality of video frame images, and the predetermined direction is a length direction of the first subtitle sub-area.

Optionally, the step of determining the position information of the first subtitle sub-region further includes: dividing the first subtitle information into a plurality of first subtitle segments; acquiring the duration of the video being edited; determining the length of each first subtitle sub-region according to the number of the first subtitle segments and the duration; and determining that the starting point of the first subtitle region is the starting position of the first of the first subtitle sub-regions, and that the starting position of each other first subtitle sub-region is the ending position of the preceding first subtitle sub-region.

Optionally, the step of determining the first subtitle information according to the first text information includes: acquiring text content corresponding to the voice content of the video being edited; determining whether the similarity between the first text information and the text content is greater than a second predetermined threshold; and determining the text content as the first subtitle information when the similarity between the first text information and the text content is greater than the second predetermined threshold. In this case, the step of determining the position information of the first subtitle sub-region includes: determining the position information of the first subtitle segment according to the correspondence between the text content and the video frame images.

Optionally, after displaying the first subtitle information generated based on the first text information in a first subtitle region of a video editing interface, the method further includes: in response to a third predetermined operation acting on the first target subtitle sub-region, the start position of the first target subtitle sub-region is changed from the initial position to a predetermined position.

Optionally, after displaying the first subtitle information generated based on the first text information in a first subtitle region of a video editing interface, the method further includes: in response to a fourth predetermined operation acting on a second target subtitle sub-region, the length of the second target subtitle sub-region is changed from the initial length to a predetermined length.

Optionally, after displaying the first subtitle information generated based on the first text information in a first subtitle region of a video editing interface, the method further includes: responding to a fifth preset operation of copying second text information, and acquiring the second text information; and displaying second subtitle information in a second subtitle area of a video editing interface according to the second text information, wherein the second subtitle area is positioned at one side of the first subtitle area.

According to a second aspect of the embodiments of the present disclosure, there is provided a method for automatically loading subtitles, including: detecting whether there is a first predetermined operation for copying first text information; acquiring the first text information when the first predetermined operation is detected; and generating first subtitle information based on the first text information, and displaying the first subtitle information in a subtitle region of a video editing interface.

According to a third aspect of the embodiments of the present disclosure, there is provided a method for automatically loading subtitles, including: when a target video is edited, acquiring first text information from a memory area for storing the text information, wherein the memory area is used for storing the text information obtained by copying a target text; and generating first subtitle information based on the first text information, and displaying the first subtitle information in a first subtitle area of a video editing interface.

According to a fourth aspect of the embodiments of the present disclosure, there is provided an apparatus for automatically loading subtitles, including: a first obtaining unit configured to obtain first text information in response to a first predetermined operation for copying the first text information; and a first display unit configured to display first subtitle information generated based on the first text information in a first subtitle region of a video editing interface.

According to a fifth aspect of an embodiment of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement any of the above-described methods of automatically loading subtitles.

According to a sixth aspect of the embodiments of the present disclosure, there is provided a non-volatile storage medium, wherein when instructions in the non-volatile storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute any one of the above-mentioned methods for automatically loading subtitles.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

In this scheme, the first subtitle information can be generated in the first subtitle region of the video editing interface from the copied text, which alleviates or even eliminates the need to edit subtitles by manually typing or pasting text, so that subtitles can be edited efficiently.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

Fig. 1 is an architecture diagram illustrating an application scenario of a method of automatically loading subtitles according to an exemplary embodiment.

Fig. 2 is a flowchart illustrating a method of automatically loading subtitles according to an exemplary embodiment.

Fig. 3 is a flowchart illustrating a method of automatically loading subtitles according to another exemplary embodiment.

Fig. 4 is a flowchart illustrating a method of automatically loading subtitles according to still another exemplary embodiment.

Fig. 5 is a schematic diagram illustrating a display interface corresponding to a method of automatically loading subtitles according to an exemplary embodiment.

Fig. 6 is an illustration of a display interface corresponding to a method for automatically loading subtitles, according to another exemplary embodiment.

Fig. 7 is a flowchart illustrating a method of automatically loading subtitles according to yet another exemplary embodiment.

Fig. 8 is a flowchart illustrating a method of automatically loading subtitles according to another exemplary embodiment.

Fig. 9 is a block diagram illustrating a structure of an apparatus for automatically loading subtitles according to an exemplary embodiment.

Fig. 10 is a block diagram illustrating a structure of an apparatus for automatically loading subtitles according to another exemplary embodiment.

Fig. 11 is a block diagram illustrating a structure of an apparatus for automatically loading subtitles according to still another exemplary embodiment.

Fig. 12 is a block diagram illustrating the structure of an electronic device for performing a method of automatically loading subtitles according to an exemplary embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in other sequences than those illustrated or described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

As mentioned in the background art, in the prior art, a user is required to manually add or paste subtitle texts during the process of editing videos, and the efficiency of adding subtitles is low.

Fig. 1 is an architecture diagram illustrating, according to an exemplary embodiment, an implementation environment to which the following method for automatically loading subtitles may be applied. As shown in fig. 1, the implementation environment includes an electronic device 01 and a server 02. The electronic device 01 and the server 02 may be interconnected and communicate through a network.

The electronic device 01 may be a device that displays the first subtitle information. The electronic device 01 may obtain the corresponding first subtitle information from the server 02 and display the first subtitle information in the video editing interface. Alternatively, the electronic device 01 itself may generate the first subtitle information and display the first subtitle information.

The electronic device 01 may be any electronic product that can interact with a user through one or more modes such as a keyboard, a touch pad, a touch screen, a remote controller, voice interaction, or a handwriting device, for example, a mobile phone, a tablet computer, a palmtop computer, a personal computer (PC), a wearable device, a smart television, and the like.

The server 02 may be one server, a server cluster composed of a plurality of servers, or a cloud computing service center. The server 02 may include a processor, memory, and a network interface, among others.

Those skilled in the art will appreciate that the above described electronic devices and servers are merely exemplary, and that other electronic devices or servers, now known or later developed, that may be suitable for use with the present disclosure are intended to be included within the scope of the present disclosure and are hereby incorporated by reference.

Based on this, the embodiment of the present disclosure provides a method for automatically loading subtitles.

The method provided in the embodiments of the present disclosure may be executed by the electronic device or the server described above, or by a functional module and/or functional entity in the electronic device or the server that is capable of implementing the method. The specific choice may be determined according to actual use requirements and is not limited in the embodiments of the present disclosure. The following description takes the electronic device as the execution subject by way of example.

Fig. 2 is a flowchart illustrating a method for automatically loading subtitles according to an exemplary embodiment. As shown in fig. 2, the method is used in an electronic device and includes the following steps S11 to S12.

In step S11, the first text information is acquired in response to a first predetermined operation for copying the first text information.

The first predetermined operation may be any single operation, or a group of operations, used to copy the first text information. When the electronic device is a personal computer, the first predetermined operation may be a keyboard and mouse operation, for example "Ctrl + C". When the electronic device is a mobile phone or a tablet, the first predetermined operation may be a click operation or a long-press operation.

In step S12, first subtitle information generated based on the first text information is displayed in a first subtitle region of the video editing interface. That is, the first subtitle information is generated based on the first text information and is then displayed in the first subtitle region of the video editing interface.

In the above embodiment, the first text information is first acquired in response to a first predetermined operation for copying the first text information; the first subtitle information is then displayed on the video editing interface according to the first text information. With this scheme, the first subtitle information can be generated in the first subtitle region of the video editing interface from the copied text, which alleviates or even eliminates the need to edit subtitles by manually typing or pasting text, so that subtitles can be edited efficiently.

In actual applications, there may be multiple predetermined operations for copying text information, and only some of the copied text may be intended for the subtitles of the video. To make the displayed subtitles more accurate in this case, in an embodiment of the present application, as shown in fig. 3, step S11 is implemented by step S110 and step S111.

In step S110, in response to the first predetermined operation, prompt information is displayed on the video editing interface; the prompt information asks whether the first subtitle information should be determined according to the first text information. In actual applications, the prompt information may be a prompt popup that displays, for example, "The system has detected copied text. Automatically add it as subtitles?". Of course, the prompt is not limited to this wording and may be other specified prompt information in the video editor, for example a rectangular icon showing "Copy?". Those skilled in the art can set appropriate prompt information, such as a prompt icon, according to the actual situation.

In step S111, the first text information is acquired in response to a second predetermined operation acting on the prompt information. The second predetermined operation may be at least one of a click operation, a long-press operation, a double-click operation, and a slide operation, and may be chosen according to the actual situation. Receiving the second predetermined operation on the prompt information confirms that the first subtitle information is to be determined according to the first text information, so the first text information is acquired based on the second predetermined operation.
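Steps S110, S111, and S12 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the `VideoEditor` class, its `confirm` callback (standing in for the second predetermined operation on the prompt), and the `subtitle_area` list are all hypothetical names, and the copied text is used directly as the subtitle information.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class VideoEditor:
    """Toy stand-in for a video editing interface (hypothetical)."""

    # confirm(prompt_text) -> bool models the second predetermined
    # operation: True means the user accepted the prompt.
    confirm: Callable[[str], bool]
    subtitle_area: List[str] = field(default_factory=list)

    def on_copy(self, copied_text: str) -> Optional[str]:
        """Handle the first predetermined operation (a copy action)."""
        # S110: ask whether the copied text should become subtitles.
        prompt = "Copied text detected. Add it automatically as subtitles?"
        if not self.confirm(prompt):
            return None  # user declined, so the copied text is ignored
        # S111: the user confirmed, so acquire the first text information.
        first_text = copied_text.strip()
        # S12: display the subtitle information in the subtitle area.
        # Assumption: the text is used as-is as the subtitle information.
        self.subtitle_area.append(first_text)
        return first_text
```

Passing the confirmation step in as a callback keeps the sketch independent of any particular UI toolkit; a real editor would wire it to a popup or icon.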

As shown in fig. 4, step S12 may be implemented by step S120 and step S121. In step S120, the first subtitle information is determined according to the first text information; in step S121, the first subtitle information is displayed in the first subtitle region.

In actual applications, in some cases the first text information is itself the first subtitle information. Examples are the case where the duration for which the first text information is displayed in a predetermined manner substantially matches the duration of the video being edited, and the case where that duration is shorter than the duration of the video being edited but only part of the video requires subtitles. Correspondingly, step S120 includes: determining the first text information as the first subtitle information.

In other cases, the first text information does not by itself satisfy the requirements of the first subtitle information and may be only a part of it. To display the first subtitle information more accurately, efficiently, and completely, in an embodiment of the present application, step S120 includes: determining whether the predetermined duration for which the first text information is displayed in a predetermined manner is shorter than the duration of the video being edited; when the predetermined duration is shorter than the duration of the video, comparing the first text information with the predetermined subtitle segments in a subtitle database, wherein the subtitle database includes a plurality of predetermined subtitle segments; and when the similarity between the first text information and a target portion is greater than a first predetermined threshold, determining at least a part of a target subtitle segment that includes the target portion as the first subtitle information according to the duration of the video being edited, wherein the target portion is a part of the target subtitle segment having the same predetermined duration as the first text information, and the target subtitle segment is the one of the plurality of predetermined subtitle segments that includes the target portion. With this scheme, the first subtitle information corresponding to the video being edited can be completed automatically from the first text information.

For example, suppose the first text information is "Once, a sincere love was placed in front of me, and I did not cherish it", and the subtitle database is a database of classic movie lines. One of the predetermined subtitle segments is "Once, a sincere love was placed in front of me, and I did not cherish it. Only after I lost it did I feel regret; nothing in the world is more painful than this. If heaven could give me a chance to do it again, I would say three words to that girl: I love you. If a time limit had to be put on this love, I hope it would be ten thousand years". Within this segment, the target portion is "Once, a sincere love was placed in front of me, and I did not cherish it". If the first predetermined threshold is 90%, the similarity between the target portion and the first text information is 100%, which is greater than the first predetermined threshold, so this predetermined subtitle segment is determined to be the target subtitle segment.

If the duration of the edited video is longer than the duration of the target subtitle segment displayed in the predetermined manner, the whole target subtitle segment is used as the first subtitle information, and the display manner of the first subtitle information is adjusted according to the duration of the edited video. For example, the display duration of one sentence is generally 2 s; to make the display duration of the first subtitle information equal to the duration of the video, the display duration of each sentence may be increased. If the duration of the edited video is shorter than the duration of the target subtitle segment displayed in the predetermined manner, a part of the target subtitle segment that includes the target portion and whose display duration equals the duration of the edited video is used as the first subtitle information. Specifically, this part may be cut out of the target subtitle segment by combining the voice information of the video with the target portion.
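The completion logic described above can be sketched roughly as follows. Several choices here are assumptions made only for illustration: sentences are split on full stops, every sentence is given the 2 s default display time, similarity is measured with Python's `difflib.SequenceMatcher` (the disclosure does not name a similarity measure), and the copied text is returned unchanged when no database segment matches. The function names `split_sentences` and `complete_subtitles` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List

SECONDS_PER_SENTENCE = 2.0  # assumed default display time per sentence


def split_sentences(text: str) -> List[str]:
    """Naive sentence split on full stops, for illustration only."""
    return [s.strip() for s in text.split(".") if s.strip()]


def complete_subtitles(first_text: str, database: List[str],
                       video_seconds: float,
                       threshold: float = 0.9) -> List[str]:
    copied = split_sentences(first_text)
    n = len(copied)
    # If the copied text already fills the video, use it as-is.
    if n == 0 or n * SECONDS_PER_SENTENCE >= video_seconds:
        return copied
    query = ". ".join(copied).lower()
    for segment in database:
        sentences = split_sentences(segment)
        # The target portion has the same display duration as the copied
        # text, i.e. the same number of sentences under the 2 s assumption.
        for start in range(len(sentences) - n + 1):
            target = ". ".join(sentences[start:start + n]).lower()
            if SequenceMatcher(None, query, target).ratio() > threshold:
                if len(sentences) * SECONDS_PER_SENTENCE <= video_seconds:
                    # Video is long enough: use the whole target segment
                    # (per-sentence display time can then be stretched).
                    return sentences
                # Video is shorter: keep a slice that contains the target
                # portion and fits the video duration.
                capacity = max(n, int(video_seconds // SECONDS_PER_SENTENCE))
                begin = min(start, len(sentences) - capacity)
                return sentences[begin:begin + capacity]
    return copied  # assumption: fall back to the copied text
```

For instance, with a database containing the single segment `"A one. B two. C three. D four. E five."` and copied text `"B two. C three."`, a 20 s video yields all five sentences, while a 6 s video yields only the three sentences starting at the matched portion.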

In a specific application, the subtitle database is in effect a material library, in which a predetermined subtitle segment may be the lines of a classic movie scene, a popular classic passage from the internet, or any other suitable text such as an entry in a poetry library. Those skilled in the art can select a suitable subtitle database according to the actual situation.

Of course, in actual applications, the method may further include a step of constructing the subtitle database. The construction process may follow that of other language databases and is not described here again.

To display the first subtitle information in the first subtitle region more accurately, in a specific embodiment of the present application, the step of displaying the first subtitle information in the first subtitle region includes the following. First, position information of a first subtitle sub-region is determined, wherein the position information includes a length and a starting position. The length of the first subtitle sub-region is its display length and represents the video duration corresponding to the first subtitle segment in that sub-region; the starting position of the first subtitle sub-region represents the first video frame image corresponding to the first subtitle segment. The position information therefore determines which video frame images correspond to the first subtitle segment. The first subtitle segments are obtained by dividing the first subtitle information. Second, a plurality of first subtitle sub-regions are displayed in the first subtitle region according to the position information. That is, each first subtitle sub-region is displayed according to its length and starting position, and the first subtitle segments are located in the first subtitle sub-regions in one-to-one correspondence, as shown in fig. 5.

In a specific embodiment, the video editing interface further includes an image area, the image area displays a plurality of video frame images, the plurality of video frame images are sequentially arranged along a predetermined direction, the plurality of first subtitle sub-areas are sequentially arranged along the predetermined direction, one of the first subtitle sub-areas is located at one side of the corresponding plurality of video frame images, the plurality of video frame images and the corresponding first subtitle sub-area are correspondingly displayed, and the predetermined direction is a length direction of the first subtitle sub-area. Specifically, the start position of the first subtitle subregion may be aligned with the start position of the first video frame image in the corresponding plurality of video frame images, that is, the projection of the start position of the first subtitle subregion on the video frame image region is located on the start edge of the first video frame image. The length of the first subtitle subregion may be the same as the display length of the corresponding plurality of video frame images. The projection of the starting position of the first subtitle subregion in the video frame image region may also be located in the first video frame image of the corresponding plurality of video frame images, and the length of the first subtitle subregion may also be smaller than the total display length of the corresponding plurality of video frame images. As shown in fig. 5, five video frame images on one side of the first subtitle subregion 101 are the five video frame images corresponding to the first subtitle segment in the first subtitle subregion.

Of course, in practical applications, the first subtitle sub-region is not limited to being located on the side of the video frame images shown in fig. 5, and may also be located on the upper side of the video frame images. (Fig. 5 is viewed facing the screen; the upper side, left side, and right side mentioned below are all described from this viewpoint.)

It should be noted that the first subtitle sub-region may also be displayed at other positions, for example on the left side or the right side of the corresponding video frame images. In the prior art, the image frame display area is generally called the main track and the subtitle display area a sub-track, and the sub-track and the main track are distributed in a direction perpendicular to the predetermined direction. Specifically, as shown in fig. 5, the first subtitle region 100 can be regarded as the sub-track, and the frame image region 200 is the main track.

In order to determine the position information of the first subtitle sub-region more accurately, and thus display the first subtitle segment more accurately, in an embodiment of the present application, the step of determining the position information of the first subtitle sub-region further includes: dividing the first subtitle information into a plurality of first subtitle segments, where generally one sentence forms one first subtitle segment; acquiring the duration of the video being edited; determining the length of each first subtitle sub-region according to the number of first subtitle segments and the duration; and determining the starting point of the first subtitle region as the start position of the initial first subtitle sub-region, and the start position of each other first subtitle sub-region as being at or after the end position of the preceding first subtitle sub-region, where "after" means after in the length direction of the first subtitle sub-region. Specifically, because the length of a first subtitle sub-region represents the video duration corresponding to that sub-region, the video duration corresponding to the first subtitle segment in each first subtitle sub-region can be determined from the duration of the video and the number of first subtitle segments. From that video duration, the number of corresponding video frame images can be determined, and from that number the length of the first subtitle sub-region can be determined. The length of the first subtitle sub-region may be approximately equal to, or smaller than, the total display length of the video frame images corresponding to it. The spacing between two adjacent first subtitle segments may be set according to actual conditions: the two may touch (a spacing of 0) or not touch (a spacing greater than 0).
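The even-division layout described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the present application: the names `SubRegion`, `split_sentences`, and `layout_subregions`, the frame rate, and the punctuation-based sentence rule are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class SubRegion:
    text: str         # the first subtitle segment shown in this sub-region
    start_frame: int  # index of the first video frame image it covers
    num_frames: int   # length of the sub-region, expressed in frames


def split_sentences(subtitle_info):
    """Split subtitle text into segments, one sentence each (assumed rule)."""
    parts = re.findall(r"[^.!?。！？]+[.!?。！？]*", subtitle_info)
    return [p.strip() for p in parts if p.strip()]


def layout_subregions(subtitle_info, video_seconds, fps=30, gap_frames=0):
    """Give every segment an equal share of the video's frames.

    gap_frames is the spacing between adjacent sub-regions (0 means they touch).
    """
    segments = split_sentences(subtitle_info)
    if not segments:
        return []
    total_frames = int(video_seconds * fps)
    slot = total_frames // len(segments)  # frames available per segment
    return [
        SubRegion(seg, i * slot, max(slot - gap_frames, 0))
        for i, seg in enumerate(segments)
    ]
```

With `gap_frames=0` each sub-region starts exactly where the previous one ends; a positive value leaves a gap, matching the two spacing cases in the text.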

In order to determine the video frame images corresponding to the first subtitle segment more accurately and efficiently, in a specific embodiment of the present application, the step of determining the first subtitle information according to the first text information includes: acquiring the text content corresponding to the voice content of the video being edited; determining whether the similarity between the first text information and the text content is greater than a second predetermined threshold; and determining the text content as the first subtitle information when the similarity between the first text information and the text content is greater than the second predetermined threshold. The step of determining the position information of the first subtitle sub-region includes: determining the position information of the first subtitle segment according to the correspondence between the text content and the video frame images. Specifically, the text content corresponds to the voice content, and the voice content corresponds to the video frame images, so the text content corresponds to the video frame images. According to this correspondence, the video frame images corresponding to each part of the text content (each first subtitle segment) can be determined more accurately, and therefore the position information of the first subtitle sub-region, that is, its start position and length, can be determined more accurately. The start position of the first subtitle sub-region may correspond to the start position of the corresponding first frame image or lie after it, and the length of the first subtitle sub-region may be equal to or smaller than the total display length of the corresponding frame images.
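A minimal sketch of this similarity check is given below. It assumes that speech recognition has already produced a transcript as a list of `(sentence, start_seconds, end_seconds)` tuples; that format, the function name `subtitles_from_transcript`, and the threshold value are hypothetical. The present application does not name a similarity measure, so Python's `difflib.SequenceMatcher` is used here purely as a stand-in.

```python
from difflib import SequenceMatcher


def subtitles_from_transcript(first_text, transcript, threshold=0.8):
    """Use the recognized speech as subtitles if the copied text matches it.

    transcript: list of (sentence, start_s, end_s) tuples (assumed format).
    Returns one dict per sentence with its start position and length in
    seconds, or None when the similarity does not exceed the threshold.
    """
    spoken = " ".join(sentence for sentence, _, _ in transcript)
    similarity = SequenceMatcher(None, first_text.lower(), spoken.lower()).ratio()
    if similarity <= threshold:
        return None
    # The speech timing gives each segment's start position and length.
    return [
        {"text": sentence, "start_s": start, "length_s": end - start}
        for sentence, start, end in transcript
    ]
```

Because the timings come from the speech track, each sub-region inherits the position of the frames during which its sentence is spoken, rather than an even share of the video.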

In practice, the display position of a first subtitle sub-region is sometimes not very accurate. To solve this problem, in an embodiment of the present application, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, the method further includes: in response to a third predetermined operation acting on a first target subtitle sub-region, changing the start position of the first target subtitle sub-region from an initial position to a predetermined position. By adjusting the start position of the first target subtitle sub-region, its display position can be adjusted, which further ensures that the adjusted display position is more accurate, and that the video duration and the video frame images corresponding to the first subtitle segment in the first target subtitle sub-region are more accurate.

In another embodiment of the present application, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, the method further includes: in response to a fourth predetermined operation acting on a second target subtitle sub-region, changing the length of the second target subtitle sub-region from an initial length to a predetermined length, so that the number of video frame images corresponding to the second target subtitle sub-region changes from a first predetermined number to a second predetermined number. In this way, the display position of the second target subtitle sub-region can be adjusted, which makes the adjusted display position more accurate, and makes the video duration and the video frame images corresponding to the first subtitle segment in the second target subtitle sub-region more accurate.
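The two adjustments can be illustrated with a small sketch. The `SubRegion` type and the functions `move_start` and `resize` are hypothetical names, and clamping the result to the bounds of the video is an assumption added for the example; the text itself only states that the start position or the length changes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SubRegion:
    start_frame: int  # start position, as a frame index
    num_frames: int   # length, as a number of video frame images


def move_start(region, new_start_frame, total_frames):
    """Third predetermined operation: change the start position only."""
    latest_start = total_frames - region.num_frames
    start = max(0, min(new_start_frame, latest_start))
    return replace(region, start_frame=start)


def resize(region, new_num_frames, total_frames):
    """Fourth predetermined operation: change the length, and with it the
    number of video frame images the sub-region corresponds to."""
    longest = total_frames - region.start_frame
    count = max(1, min(new_num_frames, longest))
    return replace(region, num_frames=count)
```

Returning a new `SubRegion` instead of mutating the old one keeps the initial position and length available, for example for an undo operation.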

In some applications, a video needs to be loaded with dual subtitles. In order to better load the subtitles of such a video, in a specific embodiment of the present application, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, the method further includes: acquiring second text information in response to a fifth predetermined operation of copying the second text information; and displaying second subtitle information in a second subtitle region of the video editing interface according to the second text information, where the second subtitle region is located on one side of the first subtitle region. It may be on any side of the first subtitle region and can be set according to actual conditions. In a specific embodiment, as shown in fig. 6, the second subtitle region 300 is located on the side of the first subtitle region 100 away from the frame image region 200.

It should be further noted that, in the present application, the description about the display process of the first subtitle information may be referred to for the display process of the second subtitle information, and is not repeated here.

In another embodiment of the present application, after the subtitles of the video are loaded, the method further includes: in response to a fifth predetermined operation, playing the video including the first subtitle information.

Fig. 7 is a flowchart illustrating a method for automatically loading subtitles according to an exemplary embodiment. As shown in fig. 7, the method is used in an electronic device and includes the following steps S21 to S23.

In step S21, it is detected whether there is a first predetermined operation for copying the first text information;

in step S22, when the first predetermined operation is detected, the first text information is acquired;

in step S23, first subtitle information is generated based on the first text information, and the first subtitle information is displayed in a subtitle region of a video editing interface.

In the method, when the first predetermined operation is detected, the first subtitle information generated based on the first text information is displayed in the subtitle region of the video editing interface. With this scheme, the first subtitle information can be generated in the first subtitle region of the video editing interface from the copied text, the problem that subtitles have to be edited by manually typing or manually pasting text is alleviated or even completely solved, and subtitle editing is efficient.
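Steps S21 to S23 can be sketched as follows. Detecting the copy operation by polling for a change in the clipboard text is an assumption made for the example, as are the names `CopyWatcher` and `generate_subtitles`; a real editor would more likely use the platform's clipboard notification mechanism. The clipboard reader is passed in as a function so that the sketch does not depend on any particular platform API.

```python
from typing import Callable, List, Optional


class CopyWatcher:
    """Detects a copy operation by noticing that the clipboard text changed."""

    def __init__(self, read_clipboard: Callable[[], str]):
        self._read = read_clipboard
        self._last = read_clipboard()

    def poll(self) -> Optional[str]:
        # S21: detect whether a copy happened since the last poll.
        current = self._read()
        if current and current != self._last:
            self._last = current
            # S22: the copy was detected, so acquire the copied text.
            return current
        return None


def generate_subtitles(first_text: str) -> List[str]:
    # S23: build subtitle entries, here one per non-empty line (assumed rule).
    return [line.strip() for line in first_text.splitlines() if line.strip()]
```

A caller would invoke `poll()` periodically while the video editing interface is open and pass any returned text to `generate_subtitles` before displaying the result in the subtitle region.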

The specific process of step S22 and the specific process of step S23 may refer to the description in the above scheme, and are not described herein again.

Fig. 8 is a flowchart illustrating a method for automatically loading subtitles according to an exemplary embodiment. As shown in fig. 8, the method is used in an electronic device and includes the following steps S31 to S32.

In step S31, when a target video is being edited, first text information is acquired from a memory area for storing text information, where the memory area stores text information obtained by performing a copy operation (a first predetermined operation) on target text;

in step S32, first subtitle information is generated based on the first text information, and the first subtitle information is displayed in a first subtitle region of a video editing interface.

In the above-described embodiment, the first subtitle information may be generated from the copied first text information, and the corresponding first subtitle information may be displayed in the first subtitle region. With this scheme, the first subtitle information can be generated in the first subtitle region of the video editing interface from the copied text, the problem that subtitles have to be edited by manually typing or manually pasting text is alleviated or even completely solved, and subtitle editing during video editing is efficient.

The specific process of step S31 and the specific process of step S32 may refer to the description in the above scheme, and are not described herein again.

It should be noted that the first through fifth predetermined operations in the present application may each be any operation feasible in the prior art, for example an operation including at least one of a click operation, a slide operation, a long-press operation, and a double-click operation. Those skilled in the art can select a suitable operation or combination of operations for each of the five predetermined operations according to the actual situation.

Fig. 9 is a block diagram illustrating an apparatus for automatically loading subtitles according to an example embodiment. Referring to fig. 9, the apparatus includes a first acquisition unit 10 and a first display unit 20.

The first acquiring unit 10 is configured to acquire the first text information in response to a first predetermined operation for copying the first text information.

The first predetermined operation may be any single operation, or a group of operations formed by a series of operations, for copying the first text information. When the electronic device is a personal computer, the first predetermined operation may be a keystroke operation such as "Ctrl + C". When the electronic device is a mobile phone or a tablet, the first predetermined operation may be a click operation or a long-press operation.

The first display unit 20 is configured to display first subtitle information generated based on the first text information in a first subtitle region of a video editing interface.

In practice, there may be a plurality of predetermined operations for copying text information, and only part of the copied text information may be intended for the subtitles of a video. In order to make the displayed subtitles more accurate in this case, in an embodiment of the present application, the first acquiring unit includes a first display module and an acquisition module.

The first display module is configured to display prompt information on the video editing interface in response to the first predetermined operation, where the prompt information asks whether the first subtitle information should be determined according to the first text information. In practice, the prompt information may be a prompt popup displaying, for example, "The system has detected that you copied text. Automatically add subtitles?". Of course, the prompt information is not limited to a popup with this text; it may also be some other prompt specified in the video editor, for example a rectangular icon bearing the text "Copy?". Those skilled in the art can set suitable prompt information according to the actual situation.

The acquisition module is configured to acquire the first text information in response to a second predetermined operation acting on the prompt information. The second predetermined operation may be at least one of a click operation, a long-press operation, a double-click operation, and a slide operation, and may be chosen according to actual conditions. When a second predetermined operation on the prompt information is received, it can be confirmed that the first subtitle information is to be determined according to the first text information, and the first text information is therefore acquired based on the second predetermined operation.
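The interaction between the first display module and the acquisition module can be sketched as a small state holder. The class name `CopyPrompt`, its method names, and the prompt wording are hypothetical; the sketch only shows that copied text becomes the first text information after the user confirms the prompt, and is discarded otherwise.

```python
from typing import Optional


class CopyPrompt:
    """Holds copied text until the user confirms or dismisses the prompt."""

    def __init__(self) -> None:
        self._pending: Optional[str] = None

    def on_copy(self, copied_text: str) -> str:
        # First display module: remember the text and return the prompt to show.
        self._pending = copied_text
        return "Copied text detected. Add it as subtitles?"

    def on_confirm(self) -> Optional[str]:
        # Acquisition module: the second predetermined operation on the prompt
        # releases the pending text as the first text information.
        text, self._pending = self._pending, None
        return text

    def on_dismiss(self) -> None:
        # The copied text was not meant for subtitles, so drop it.
        self._pending = None
```

This keeps unrelated copy operations, such as copying a file name, from generating subtitles unless the user explicitly accepts the prompt.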

The first display unit comprises a determining module and a second display module, wherein the determining module is configured to determine the first caption information according to the first text information; the second display module is configured to display the first subtitle information in the first subtitle region.

In practice, in some cases the first text information itself is the first subtitle information, for example when the duration of the first text information displayed in the predetermined manner is substantially the same as the duration of the video being edited, or when that duration is shorter than the duration of the video but only part of the video requires subtitles. Accordingly, the determining module is configured to determine that the first text information is the first subtitle information.

In some cases, the first text information cannot by itself meet the requirement for the first subtitle information, and may instead serve as a part of the first subtitle information. In order to display the first subtitle information accurately, efficiently, and completely in such cases, in an embodiment of the present application, the determining module further includes a first determining sub-module, a second determining sub-module, and a third determining sub-module. The first determining sub-module is configured to determine whether a predetermined duration, namely the duration of the first text information when displayed in a predetermined manner, is less than the duration of the video being edited. The second determining sub-module is configured to compare the first text information with predetermined subtitle segments in a subtitle database when the predetermined duration is less than the duration of the video, where the subtitle database includes a plurality of predetermined subtitle segments. The third determining sub-module is configured to, when the similarity between the first text information and a target portion is greater than a first predetermined threshold, determine, according to the duration of the video being edited, at least a part of a target subtitle segment that includes the target portion as the first subtitle information, where the target subtitle segment is the one of the plurality of predetermined subtitle segments that includes the target portion, and the target portion is the part of the target subtitle segment that has the same length as the first text information. With this scheme, the first subtitle information corresponding to the video being edited can be supplemented automatically from the first text information.

For example, the first text information is "Once a true love was placed before me, and I did not cherish it", and the subtitle database is a database of classic movie lines, in which one predetermined subtitle segment is "Once a true love was placed before me, and I did not cherish it; I regretted it only after I had lost it, and nothing in the world is more painful than that. If heaven could give me one more chance, I would say three words to that girl: I love you. If a time limit had to be placed on this love, I would want it to be ten thousand years". This predetermined subtitle segment is the target subtitle segment, and its target portion is "Once a true love was placed before me, and I did not cherish it". If the first predetermined threshold is 90%, the similarity between the target portion and the first text information is 100%, which is greater than the first predetermined threshold, so at least a part of the target subtitle segment that includes the target portion can be determined as the first subtitle information according to the duration of the video being edited.

If the duration of the video being edited is longer than the duration of the target subtitle segment when displayed in the predetermined manner, the whole target subtitle segment is used as the first subtitle information, and the display manner of the first subtitle information is adjusted according to the duration of the video. For example, the display duration of one sentence is generally 2 s; to make the display duration of the first subtitle information equal to the duration of the video, the display duration of each sentence may be increased. If the duration of the video being edited is shorter than the duration of the target subtitle segment when displayed in the predetermined manner, a part of the target subtitle segment that includes the target portion and whose display duration equals the duration of the video is used as the first subtitle information. Specifically, that part may be extracted from the target subtitle segment by combining the voice information of the video with the target portion.
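The completion logic above can be sketched as follows, under several assumptions: text is handled as lists of sentences, each sentence is displayed for 2 s by default as in the example, similarity is measured with Python's `difflib.SequenceMatcher` (the text does not name a measure), and the function name `complete_from_database` is hypothetical. When the video is too short for the whole segment, the sketch simply takes a contiguous run of sentences containing the target portion, rather than consulting the voice information.

```python
from difflib import SequenceMatcher

SECONDS_PER_SENTENCE = 2.0  # default display duration of one sentence


def complete_from_database(first, video_seconds, database, threshold=0.9):
    """Return (sentences, seconds_per_sentence) to use as subtitle information.

    first: copied text as a list of sentences.
    database: list of predetermined segments, each a list of sentences.
    """
    if len(first) * SECONDS_PER_SENTENCE >= video_seconds:
        return first, SECONDS_PER_SENTENCE  # copied text already fills the video
    for segment in database:
        for i in range(len(segment) - len(first) + 1):
            target = segment[i:i + len(first)]  # candidate target portion
            score = SequenceMatcher(None, " ".join(first), " ".join(target)).ratio()
            if score <= threshold:
                continue
            fit = max(len(first), int(video_seconds // SECONDS_PER_SENTENCE))
            if len(segment) <= fit:
                # Video is long enough: use the whole segment and stretch
                # each sentence so the subtitles span the video.
                return segment, video_seconds / len(segment)
            # Video is shorter: keep a run of sentences that still
            # contains the target portion.
            start = min(i, len(segment) - fit)
            return segment[start:start + fit], SECONDS_PER_SENTENCE
    return first, SECONDS_PER_SENTENCE  # no sufficiently similar segment
```

The first return value corresponds to the first subtitle information and the second to its display manner, which is only changed in the case where the video is longer than the segment's default display duration.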

Of course, in an actual application process, the method may further include a step of constructing a subtitle database, and the construction process may refer to a construction process of another language database, which is not described herein again.

In order to more accurately display the first subtitle information in the first subtitle region, in a specific embodiment of the present application, the second display module includes a fourth determining sub-module and a first display sub-module.

The fourth determining sub-module is configured to determine position information of a first subtitle sub-region, where the position information includes a length and a start position. The length of the first subtitle sub-region is its display length and represents the video duration corresponding to the first subtitle segment in the first subtitle sub-region; the start position of the first subtitle sub-region represents the first video frame image corresponding to the first subtitle segment. The position information therefore determines which video frame images correspond to the first subtitle segment. The first subtitle segments are obtained by dividing the first subtitle information.

The first display sub-module is configured to display a plurality of first subtitle sub-regions in the first subtitle region according to the position information, that is, to display each first subtitle sub-region with the determined length and at the determined start position, where the first subtitle segments are located in the first subtitle sub-regions in one-to-one correspondence, as shown in fig. 5.

In a specific embodiment, the video editing interface further includes an image area, the image area displays a plurality of video frame images, the plurality of video frame images are sequentially arranged along a predetermined direction, the plurality of first subtitle sub-areas are sequentially arranged along the predetermined direction, one of the first subtitle sub-areas is located at one side of the corresponding plurality of video frame images, the plurality of video frame images and the corresponding first subtitle sub-area are correspondingly displayed, and the predetermined direction is a length direction of the first subtitle sub-area. In particular, the start position of the first subtitle subregion may be aligned with the start position of the first video frame image of the corresponding plurality of video frame images, i.e. the projection of the start position of the first subtitle subregion onto the video frame image region is located on the start edge of the first video frame image. The length of the first subtitle subregion may be the same as the display length of the corresponding plurality of video frame images. The projection of the starting position of the first subtitle subregion in the video frame image region may also be located in the first video frame image of the corresponding plurality of video frame images, and the length of the first subtitle subregion may also be smaller than the total display length of the corresponding plurality of video frame images. As shown in fig. 5, five video frame images 201 on one side of the first subtitle sub-region are the five video frame images 201 corresponding to the first subtitle segment in the first subtitle sub-region.
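The two alignment options described above can be illustrated with a short calculation along the predetermined direction. The function name `subregion_box`, the use of pixels, the assumption that all video frame images are displayed at the same width, and the symmetric `inset_px` parameter are hypothetical choices made for the example.

```python
def subregion_box(first_image_index, image_count, image_width_px, inset_px=0):
    """Return (x, width) in pixels of a subtitle sub-region.

    inset_px == 0: the sub-region starts on the start edge of its first video
    frame image and is as long as all its video frame images together.
    inset_px > 0: the sub-region starts inside the first video frame image
    and is shorter than the total display length of its images.
    """
    total = image_count * image_width_px
    if inset_px < 0 or 2 * inset_px >= total:
        raise ValueError("inset must leave a positive width")
    x = first_image_index * image_width_px + inset_px
    return x, total - 2 * inset_px
```

For instance, a sub-region covering five 40-pixel-wide video frame images starting at the sixth image would span 200 pixels starting at x = 200 when no inset is applied.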

Of course, in practice, the first subtitle sub-region is not limited to being located on the side of the video frame images shown in fig. 5 (fig. 5 is viewed facing the screen, and the upper, left, and right sides mentioned below are all defined in this view); it may also be located on the upper side of the video frame images.

It should be noted that the first subtitle sub-region may also be displayed at other positions; for example, it may be located on the left side or the right side of the corresponding video frame images. In the prior art, the region where the image frames are displayed is generally called the main track and the region where the subtitles are displayed is called the sub-track, and the sub-track and the main track are arranged in a direction perpendicular to the predetermined direction. Specifically, in fig. 5, the first subtitle region 100 may be regarded as the sub-track and the frame image region 200 as the main track.

In order to more accurately determine the position information of the first subtitle sub-region, so as to more accurately display the first subtitle segment, in an embodiment of the present application, the fourth determining sub-module includes a dividing sub-module, a first obtaining sub-module, a fifth determining sub-module, and a sixth determining sub-module.

Wherein the dividing submodule is configured to divide the first subtitle information into a plurality of the first subtitle segments, and generally speaking, a sentence is divided into one first subtitle segment.

The first acquisition submodule is configured to acquire a duration of a video being edited.

The fifth determining sub-module is configured to determine the length of each first subtitle sub-region according to the number of first subtitle segments and the duration. Specifically, because the length of a first subtitle sub-region represents the video duration corresponding to that sub-region, the video duration corresponding to the first subtitle segment in each first subtitle sub-region can be determined from the duration of the video and the number of first subtitle segments. From that video duration, the number of corresponding video frame images can be determined, and from that number the length of the first subtitle sub-region can be determined. The length of the first subtitle sub-region may be approximately equal to, or smaller than, the total display length of the video frame images corresponding to it.

The sixth determining sub-module is configured to determine the starting point of the first subtitle region as the start position of the initial first subtitle sub-region, and the start position of each other first subtitle sub-region as being at or after the end position of the preceding first subtitle sub-region, where "after" means after in the length direction of the first subtitle sub-region. Specifically, the spacing between two adjacent first subtitle segments may be set according to actual conditions: the two may touch (a spacing of 0) or not touch (a spacing greater than 0).

In order to determine the video frame image corresponding to the first caption segment more accurately and efficiently, in a specific embodiment of the present application, the determining module includes a second obtaining sub-module, a seventh determining sub-module, and an eighth determining sub-module.

The second obtaining sub-module is configured to acquire the text content corresponding to the voice content of the video being edited.

The seventh determining submodule is configured to determine whether the similarity between the first text information and the text content is greater than a second predetermined threshold.

The eighth determining sub-module is configured to determine that the text content is the first subtitle information if the similarity between the first text information and the text content is greater than the second predetermined threshold.

The fourth determining sub-module is configured to determine the position information of the first subtitle segment according to the correspondence between the text content and the video frame images. Specifically, the text content corresponds to the voice content, and the voice content corresponds to the video frame images, so the text content corresponds to the video frame images. According to this correspondence, the video frame images corresponding to each part of the text content (each first subtitle segment) can be determined more accurately, and therefore the position information of the first subtitle sub-region, that is, its start position and length, can be determined more accurately. The start position of the first subtitle sub-region may correspond to the start position of the corresponding first frame image or lie after it, and the length of the first subtitle sub-region may be equal to or smaller than the total display length of the corresponding frame images.

In order to solve the problem that the display position of a first subtitle sub-region is sometimes not very accurate in practice, in an embodiment of the present application, the apparatus further includes a first adjusting unit configured to, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, change the start position of a first target subtitle sub-region from an initial position to a predetermined position in response to a third predetermined operation acting on the first target subtitle sub-region. By adjusting the start position of the first target subtitle sub-region, its display position can be adjusted, which further ensures that the adjusted display position is more accurate, and that the video duration and the video frame images corresponding to the first subtitle segment in the first target subtitle sub-region are more accurate.

In another embodiment of the present application, the apparatus further includes a second adjusting unit configured to, after the first subtitle information generated based on the first text information is displayed in the first subtitle region of the video editing interface, change the length of a second target subtitle sub-region from an initial length to a predetermined length in response to a fourth predetermined operation acting on the second target subtitle sub-region, so that the number of video frame images corresponding to the second target subtitle sub-region changes from a first predetermined number to a second predetermined number. In this way, the display position of the second target subtitle sub-region can be adjusted, which further ensures that the adjusted display position is more accurate, and that the video duration and the video frame images corresponding to the first subtitle segment in the second target subtitle sub-region are more accurate.

In some specific applications, some videos need to be loaded with double subtitles, and in order to better load the subtitles of such videos, in a specific embodiment of the present application, the apparatus further includes a second obtaining unit and a second display unit, where the second obtaining unit is configured to obtain the second text information in response to a fifth predetermined operation of copying the second text information after displaying the first subtitle information generated based on the first text information in the first subtitle region of the video editing interface; the second display unit is configured to display second subtitle information in a second subtitle region of the video editing interface according to the second text information, wherein the second subtitle region is located on one side of the first subtitle region, can be any side of the first subtitle region, and can be set according to actual conditions. In a specific embodiment, as shown in fig. 6, the second subtitle region is located on a side of the first subtitle region away from the video frame image.

In another embodiment of the present application, the apparatus further includes a playing unit, and the playing unit is configured to play the video including the first subtitle information in response to a fifth predetermined operation after the subtitle of the video is loaded.

Fig. 10 is a block diagram illustrating an apparatus for automatically loading subtitles according to an exemplary embodiment. As shown in fig. 10, the apparatus includes a detection unit 30, a second acquisition unit 40, and a second display unit 50.

The detection unit 30 is configured to detect whether there is a first predetermined operation for copying the first text information;

the second acquiring unit 40 is configured to acquire the first text information when the first predetermined operation is detected;

the second display unit 50 is configured to generate first subtitle information based on the first text information and display the first subtitle information in a subtitle region of a video editing interface.

In the above-described apparatus, when the first predetermined operation is detected, the second display unit displays, in a subtitle region of the video editing interface, first subtitle information generated based on the first text information. With this scheme, the first subtitle information can be generated in the first subtitle region of the video editing interface from the copied text, the problem that subtitles have to be edited by manually typing or manually pasting text is solved, and subtitle editing is efficient.

For the specific implementation processes of the first obtaining unit and the first display unit, reference may be made to the description in the foregoing scheme, and details are not described here again.

Fig. 11 is a block diagram illustrating an apparatus for automatically loading subtitles according to an exemplary embodiment. As shown in fig. 11, the apparatus is used in an electronic device and includes a third obtaining unit 60 and a third display unit 70.

The third obtaining unit 60 is configured to obtain, when editing the target video, first text information from a memory area for storing text information, where the memory area is used for storing text information obtained by performing a copy operation (a first predetermined operation) on the target text;

the third display unit 70 is configured to generate first subtitle information based on the first text information and present the first subtitle information in a first subtitle region of the video editing interface.
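As a rough illustration of the two units above, the sketch below reads copied text from a memory area when editing starts and places it in the subtitle region. The `MemoryArea` class is a hypothetical in-process stand-in; on a real platform the reader callable would presumably wrap the system clipboard, which is not shown here.

```python
from typing import Callable, List, Optional


class MemoryArea:
    """Hypothetical stand-in for the memory area that holds copied text."""

    def __init__(self) -> None:
        self._text: Optional[str] = None

    def store(self, text: str) -> None:
        # Called when the user performs the copy operation on the target text.
        self._text = text

    def read(self) -> Optional[str]:
        return self._text


def load_subtitle_on_edit(read_copied_text: Callable[[], Optional[str]],
                          subtitle_region: List[str]) -> bool:
    """When editing starts, pull the copied text and show it as a subtitle.

    Returns True if a subtitle was added, False if nothing usable was stored.
    """
    text = read_copied_text()
    if not text or not text.strip():
        return False
    subtitle_region.append(text.strip())
    return True
```

Passing the reader as a callable keeps the sketch independent of any particular clipboard API.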

In the above-described embodiment, the first subtitle information may be generated from the copied first text information, and the corresponding first subtitle information may be displayed in the first subtitle region. According to the scheme, the first subtitle information can be generated in the first subtitle region of the video editing interface according to the copied characters, the problem that the subtitles are edited by manually typing or manually pasting the characters is relieved and even thoroughly solved, and the subtitle editing efficiency in the video editing process is high.
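One way the generation step could work, following the database comparison recited in claim 1, is sketched below. Several details are assumptions made only for illustration: the display duration is estimated from a fixed characters-per-second rate, similarity is measured with Python's `difflib.SequenceMatcher`, and the constants and function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable, Tuple

CHARS_PER_SECOND = 4.0       # assumed display rate, not taken from the source
SIMILARITY_THRESHOLD = 0.8   # assumed value for the first predetermined threshold


def display_seconds(text: str) -> float:
    """Estimated time needed to display the text at the assumed rate."""
    return len(text) / CHARS_PER_SECOND


def best_window(text: str, segment: str) -> Tuple[float, int]:
    """Return (similarity, start) of the segment window most similar to text.

    The window has the same character count as the text, which under a fixed
    display rate means the same display duration.
    """
    width = len(text)
    best = (0.0, 0)
    for start in range(0, max(1, len(segment) - width + 1)):
        ratio = SequenceMatcher(None, text, segment[start:start + width]).ratio()
        if ratio > best[0]:
            best = (ratio, start)
    return best


def generate_subtitle(text: str, video_seconds: float,
                      database: Iterable[str]) -> str:
    # Empty text, or text that already fills the video, is used unchanged.
    if not text or display_seconds(text) >= video_seconds:
        return text
    max_chars = int(video_seconds * CHARS_PER_SECOND)
    for segment in database:
        ratio, start = best_window(text, segment)
        if ratio > SIMILARITY_THRESHOLD:
            # Take as much of the matching segment as fits the video length.
            return segment[start:start + max_chars]
    # No sufficiently similar preset segment: fall back to the copied text.
    return text
```

The fallback to the copied text corresponds to the simpler case in which the first text information itself is determined as the first subtitle information.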

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Fig. 12 is a block diagram illustrating an electronic device 01 for performing automatic subtitle loading according to an exemplary embodiment.

In an exemplary embodiment, a storage medium comprising executable instructions is also provided, such as the memory 402 storing executable instructions that are executable by the processor 404 of the electronic device 01 to perform the above-described method. Optionally, the storage medium may be a non-transitory computer-readable storage medium, such as a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method for automatically loading subtitles, comprising:

responding to a first preset operation for copying first text information, and acquiring the first text information;

displaying first subtitle information generated based on the first text information in a first subtitle area of a video editing interface,

the step of displaying the first subtitle information generated based on the first text information in the first subtitle area of the video editing interface comprises the following steps:

determining the first subtitle information according to the first text information;

displaying the first subtitle information in the first subtitle region,

the step of determining the first subtitle information according to the first text information includes:

determining whether a preset duration for which the first text information is displayed in a preset manner is less than the duration of the video being edited;

comparing the first text information with a preset subtitle segment in a subtitle database under the condition that the preset duration is less than the duration of the video, wherein the subtitle database comprises a plurality of preset subtitle segments;

and under the condition that the similarity between the first text information and a target part is greater than a first preset threshold value, determining at least one part of a target subtitle segment including the target part as the first subtitle information according to the length of the video being edited, wherein the target part is the part of the target subtitle segment with the same preset time length as the first text information, and the target subtitle segment is one of a plurality of preset subtitle segments including the target part.

2. The method of claim 1, wherein the step of obtaining the first textual information in response to the first predetermined operation for copying the first textual information comprises:

displaying, in response to the first preset operation, prompt information on the video editing interface, wherein the prompt information asks whether to determine the first subtitle information according to the first text information;

and acquiring the first text information in response to a second preset operation acting on the prompt information.

3. The method of claim 1, wherein the step of determining the first caption information based on the first text information comprises:

and determining the first text information as the first subtitle information.

4. The method of claim 1, wherein the displaying the first subtitle information in the first subtitle region comprises:

determining position information of a first subtitle sub-region, wherein the position information comprises a length and an initial position, the length of the first subtitle sub-region is a display length of the first subtitle sub-region, the length of the first subtitle sub-region represents a video time length corresponding to a first subtitle segment in the first subtitle sub-region, the initial position of the first subtitle sub-region represents a first video frame image corresponding to the first subtitle segment, and the first subtitle segment is obtained by dividing according to the first subtitle information;

and displaying a plurality of first subtitle sub-regions in the first subtitle region according to the position information, wherein one first subtitle sub-region has one first subtitle segment.

5. The method according to claim 4, wherein the video editing interface further comprises an image area, the image area displays a plurality of video frame images, the plurality of video frame images are sequentially arranged along a predetermined direction, the plurality of first subtitle sub-areas are sequentially arranged along the predetermined direction, one of the first subtitle sub-areas is located at one side of the corresponding plurality of video frame images, and the predetermined direction is a length direction of the first subtitle sub-area.

6. The method of claim 4, wherein the step of determining the position information of the first subtitle subregion further comprises:

dividing the first subtitle information into a plurality of first subtitle segments;

acquiring the duration of the video being edited;

determining the length of each first subtitle sub-region according to the number of the first subtitle segments and the duration;

determining that the starting position of the first of the first subtitle sub-regions is the starting point of the first subtitle region, and that the starting position of each other first subtitle sub-region is the ending position of the preceding first subtitle sub-region.

7. The method of claim 4,

acquiring text content corresponding to the voice content of the video being edited;

determining whether the similarity of the first text information and the text content is greater than a second preset threshold value;

determining the text content as the first subtitle information when the similarity between the first text information and the text content is greater than the second predetermined threshold,

the step of determining the position information of the first subtitle subregion comprises the following steps:

and determining the position information of the first subtitle segment according to the corresponding relation between the text content and the video frame image.

8. The method of any one of claims 1 to 7, wherein after displaying first subtitle information generated based on the first textual information in a first subtitle region of a video editing interface, the method further comprises:

changing, in response to a third predetermined operation acting on a first target subtitle sub-region, the start position of the first target subtitle sub-region from the initial position to a predetermined position.

9. The method of any of claims 1-7, wherein after displaying first caption information generated based on the first text information in a first caption region of a video editing interface, the method further comprises:

changing, in response to a fourth predetermined operation acting on a second target subtitle sub-region, the length of the second target subtitle sub-region from an initial length to a predetermined length.

10. The method of any of claims 1-7, wherein after displaying first caption information generated based on the first text information in a first caption region of a video editing interface, the method further comprises:

responding to a fifth preset operation of copying second text information, and acquiring the second text information;

and displaying second subtitle information in a second subtitle area of a video editing interface according to the second text information, wherein the second subtitle area is positioned at one side of the first subtitle area.

11. A method for automatically loading subtitles, comprising:

detecting whether a first preset operation for copying the first text information exists;

under the condition that the first preset operation is detected, acquiring the first text information;

generating first caption information based on the first text information, and displaying the first caption information in a caption area of a video editing interface,

the step of generating first subtitle information based on the first text information and displaying the first subtitle information in a subtitle area of a video editing interface comprises the following steps:

determining the first subtitle information according to the first text information;

displaying the first subtitle information in a first subtitle region,

determining whether the preset time length of the first text information displayed according to the preset mode is less than the time length of the video being edited;

and under the condition that the similarity between the first text information and a target part is greater than a first preset threshold value, determining at least one part of a target caption segment comprising the target part as the first caption information according to the length of the video being edited, wherein the target part is the part of the target caption segment with the same preset time length as the first text information, and the target caption segment is one of a plurality of preset caption segments comprising the target part.

12. A method for automatically loading subtitles, comprising:

when a target video is edited, acquiring first text information from a memory area for storing the text information, wherein the memory area is used for storing the text information obtained by copying a target text;

generating first subtitle information based on the first text information, and displaying the first subtitle information in a first subtitle area of a video editing interface,

the step of generating first subtitle information based on the first text information and displaying the first subtitle information in a subtitle area of a video editing interface comprises the steps of:

displaying the first subtitle information in the first subtitle region,

13. An apparatus for automatically loading subtitles, comprising:

a first acquisition unit configured to execute acquisition of first text information in response to a first predetermined operation for copying the first text information;

a first display unit configured to display first subtitle information generated based on the first text information in a first subtitle region of a video editing interface,

the first display unit comprises a determining module and a second display module, wherein the determining module is configured to determine the first caption information according to the first text information; the second display module is configured to display the first subtitle information in the first subtitle region,

the determining module further comprises a first determining submodule, a second determining submodule and a third determining submodule, wherein the first determining submodule is configured to determine whether a predetermined duration for which the first text information is displayed in a predetermined manner is less than the duration of the video being edited; the second determining submodule is configured to compare the first text information with predetermined subtitle segments in a subtitle database in the case that the predetermined duration is less than the duration of the video, wherein the subtitle database includes a plurality of the predetermined subtitle segments; and the third determining submodule is configured to, in the case that the similarity between the first text information and a target portion is greater than a first predetermined threshold, determine, according to the length of the video being edited, at least a part of a target subtitle segment including the target portion as the first subtitle information, wherein the target portion is the portion of the target subtitle segment having the same predetermined duration as the first text information, and the target subtitle segment is the one of the plurality of predetermined subtitle segments that includes the target portion.

14. An electronic device, comprising:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the method of automatically loading subtitles according to any one of claims 1 to 12.

15. A non-volatile storage medium, wherein instructions in the non-volatile storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of automatically loading subtitles of any one of claims 1 to 12.
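For illustration only, the even division of the subtitle region recited in claim 6 can be sketched as follows. The splitting rule (line breaks and sentence-ending punctuation), the `SubRegion` record, and the function names are assumptions introduced for this sketch and are not specified by the claims.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class SubRegion:
    text: str
    start: float    # seconds from the start of the video
    length: float   # seconds of video covered by this subtitle segment


def split_segments(subtitle: str) -> List[str]:
    """Split on line breaks and sentence punctuation (an assumed rule)."""
    parts = re.split(r"[\n.!?。！？]+", subtitle)
    return [p.strip() for p in parts if p.strip()]


def layout(subtitle: str, video_seconds: float) -> List[SubRegion]:
    """Give every segment an equal share of the video duration."""
    segments = split_segments(subtitle)
    if not segments:
        return []
    length = video_seconds / len(segments)
    # Each sub-region starts where the previous one ends.
    return [SubRegion(text, i * length, length)
            for i, text in enumerate(segments)]
```

For a 30-second video and three segments, each sub-region in this sketch covers 10 seconds, starting at 0, 10 and 20 seconds. Adjusting a start position or length afterwards, as in claims 8 and 9, would amount to editing the `start` or `length` field of one record.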