WO2022110844A1 - Method and electronic device for automatically loading subtitles - Google Patents

Method and electronic device for automatically loading subtitles

Info

Publication number
WO2022110844A1
Authority
WO
WIPO (PCT)
Prior art keywords
subtitle
information
text information
target
sub
Application number
PCT/CN2021/107903
Inventor
吴丹
Original Assignee
北京达佳互联信息技术有限公司
Application filed by 北京达佳互联信息技术有限公司
Publication of WO2022110844A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/04817 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance using icons
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04845 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range for image manipulation, e.g. dragging, rotation, expansion or change of colour
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers

Definitions

  • the present disclosure relates to the field of video production, and in particular, to a method and electronic device for automatically loading subtitles.
  • the process of adding subtitles is as follows: drag the video track to select the video frame to which subtitles are to be added, click to enter text, manually input or paste the required subtitle text, and then drag it to the video frame. Repeat the process for the next video frame for which you want to add subtitles.
  • the present disclosure provides a method, apparatus, electronic device, and non-volatile storage medium for automatically loading subtitles.
  • a method for automatically loading subtitles is provided, comprising: acquiring first text information in response to a first operation, where the first operation is used to copy the first text information; and displaying first subtitle information in a first subtitle area of a video editing interface, where the first subtitle information is generated based on the first text information.
  • an apparatus for automatically loading subtitles is provided, including: a first obtaining unit configured to obtain first text information in response to a first operation, where the first operation is used to copy the first text information; and a first display unit configured to display first subtitle information in a first subtitle area of a video editing interface, where the first subtitle information is generated based on the first text information.
  • an electronic device is provided, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to perform the following operations: obtaining first text information in response to a first operation, the first operation being used to copy the first text information; and displaying first subtitle information in a first subtitle area of a video editing interface, where the first subtitle information is generated based on the first text information.
  • a non-volatile storage medium is provided. When the instructions in the non-volatile storage medium are executed by a processor of an electronic device, the electronic device can execute any one of the above methods for automatically loading subtitles.
  • a computer program product is provided. When the instructions in the computer program product are executed by a processor of an electronic device, the electronic device can perform the following operations: obtaining first text information in response to a first operation, the first operation being used to copy the first text information; and displaying first subtitle information in a first subtitle area of a video editing interface, where the first subtitle information is generated based on the first text information.
  • FIG. 1 is an architectural diagram of an application scenario of a method for automatically loading subtitles according to an exemplary embodiment.
  • Fig. 2 is a schematic flowchart of a method for automatically loading subtitles according to an exemplary embodiment.
  • Fig. 3 is a schematic flowchart of a method for automatically loading subtitles according to another exemplary embodiment.
  • Fig. 4 is a schematic flowchart of a method for automatically loading subtitles according to yet another exemplary embodiment.
  • FIG. 5 is a schematic diagram of a display interface corresponding to a method for automatically loading subtitles according to an exemplary embodiment.
  • Fig. 6 is a schematic diagram of a display interface corresponding to a method for automatically loading subtitles according to another exemplary embodiment.
  • Fig. 7 is a schematic flowchart of a method for automatically loading subtitles according to another exemplary embodiment.
  • Fig. 8 is a schematic flowchart of a method for automatically loading subtitles according to another exemplary embodiment.
  • Fig. 9 is a structural block diagram of an apparatus for automatically loading subtitles according to an exemplary embodiment.
  • Fig. 10 is a structural block diagram of an apparatus for automatically loading subtitles according to another exemplary embodiment.
  • Fig. 11 is a structural block diagram of an apparatus for automatically loading subtitles according to yet another exemplary embodiment.
  • Fig. 12 is a structural block diagram of an electronic device for executing a method for automatically loading subtitles according to an exemplary embodiment.
  • FIG. 1 is an architectural diagram of an implementation environment according to an exemplary embodiment. As shown in FIG. 1 , the following method for automatically loading subtitles is applied to the implementation environment.
  • the implementation environment includes an electronic device 01 and a server 02 . Wherein, the electronic device 01 and the server 02 are interconnected and communicated through a network.
  • the electronic device 01 is a device that displays the first subtitle information.
  • the electronic device 01 obtains the corresponding first subtitle information from the server 02, and displays the first subtitle information in the video editing interface. Alternatively, the electronic device 01 generates the first subtitle information, and displays the first subtitle information.
  • Electronic device 01 is any electronic product that can interact with the user in one or more ways, such as through a keyboard, touchpad, touchscreen, remote control, voice interaction, or handwriting device, for example a handheld computer, a personal computer (PC), a wearable device, or a smart TV.
  • Server 02 is a server, or a server cluster composed of multiple servers, or a cloud computing service center.
  • the server 02 includes a processor, a memory, a network interface, and the like.
  • embodiments of the present disclosure provide a method, apparatus, electronic device, and non-volatile storage medium for automatically loading subtitles.
  • the execution subject of the method provided by the embodiments of the present disclosure is the above-mentioned electronic device or server, or a functional module and/or functional entity in the electronic device or server that can implement the method for automatically loading subtitles. This can be determined according to actual use requirements and is not limited by the embodiments of the present disclosure.
  • the method for automatically loading subtitles provided by the embodiments of the present disclosure is exemplarily described below by taking the execution subject as an electronic device as an example.
  • FIG. 2 is a flowchart of a method for automatically loading subtitles according to an exemplary embodiment. As shown in FIG. 2 , the method for automatically loading subtitles is applied to an electronic device, and includes the following steps:
  • In step 210, in response to a first predetermined operation for copying the first text information, the first text information is acquired.
  • the first predetermined operation is any single operation for copying the first text information, or an operation group formed by a series of operations.
  • the first predetermined operation is a keyboard or mouse operation, for example the keyboard shortcut "Ctrl+C".
  • the first predetermined operation is a click operation, a long-press operation, or the like.
  • the first predetermined operation may be referred to as the first operation.
  • That is, in step 210 the electronic device obtains the first text information in response to the first operation, where the first operation is used to copy the first text information.
  • the electronic device stores the copied first text information in a memory area in response to the first operation, and acquires the first text information from the memory area when editing the target video.
  • the memory area is used to store text information obtained by copying the target text.
  • In step 220, the first subtitle information generated based on the first text information is displayed in the first subtitle area of the video editing interface. That is, the first subtitle information is generated according to the first text information and displayed in the first subtitle area of the video editing interface.
  • In this method, the first text information is acquired first; then the first subtitle information is displayed on the video editing interface according to the first text information.
  • the first subtitle information is generated in the first subtitle area of the video editing interface according to the copied text, which improves the efficiency of editing subtitles.
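
The flow of steps 210 and 220 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `SubtitleEditor` class, its `on_copy` and `load_subtitles` methods, and the `_clipboard` field standing in for the memory area are all hypothetical names, and the copied text is used unchanged as the subtitle.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SubtitleEditor:
    """Hypothetical editor model; all names are illustrative assumptions."""
    _clipboard: Optional[str] = None  # stands in for the "memory area"
    subtitle_area: List[str] = field(default_factory=list)

    def on_copy(self, text: str) -> None:
        # Step 210, first half: the first operation copies the text,
        # and the copied text is stored in the memory area.
        self._clipboard = text

    def load_subtitles(self) -> List[str]:
        # Step 210, second half: read the first text information back
        # from the memory area when the video is being edited.
        first_text = self._clipboard
        if not first_text:
            return self.subtitle_area
        # Step 220: generate the first subtitle information and place it
        # in the first subtitle area (here the text is used as is).
        self.subtitle_area = [first_text]
        return self.subtitle_area


editor = SubtitleEditor()
editor.on_copy("Hello, world")
print(editor.load_subtitles())
```

The point of the sketch is that the user never pastes anything: copying the text is enough for the editor to populate the subtitle area.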
  • In some embodiments, prompt information is displayed on the video editing interface, and the first text information is acquired based on the displayed prompt information.
  • the process of acquiring the text information will be described below through the embodiment shown in FIG. 3 .
  • the method for automatically loading subtitles is applied to an electronic device and includes the following steps:
  • In step 310, in response to the first predetermined operation, prompt information is displayed on the video editing interface. The prompt information is used to ask whether to generate the first subtitle information based on the first text information.
  • the prompt information is a prompt pop-up window that displays "The system detected that you copied text. Add it automatically as a subtitle?".
  • the prompt information is not limited to the prompt pop-up window displaying the above text, but can also be other prescribed prompt information in the video editor.
  • the prompt information is a rectangular icon displaying "copy?".
  • the prompt information is a prompt icon and the like.
  • In step 320, in response to a second predetermined operation acting on the prompt information, the first text information is acquired.
  • the second predetermined operation is at least one of a click, long-press, double-click, or slide operation, and is determined according to the actual situation.
  • the first predetermined operation may be referred to as the first operation, and the second predetermined operation may be referred to as the second operation.
  • That is, in steps 310 and 320, prompt information is displayed on the video editing interface in response to the first operation, and the first text information is obtained in response to the second operation acting on the prompt information.
  • In step 330, the first subtitle information generated based on the first text information is displayed in the first subtitle area of the video editing interface.
  • the implementation of this step 330 is the same as the implementation of the above-mentioned step 220 .
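
The prompt-based variant in steps 310 to 330 can be sketched as below. This is only an illustration: `acquire_text_with_prompt`, `show_subtitle`, and the `ask_user` callback (which stands in for the pop-up window and the second operation) are hypothetical names, and the prompt wording is an assumption.

```python
from typing import Callable, List, Optional

PROMPT = "The system detected copied text. Add it automatically as a subtitle?"


def acquire_text_with_prompt(
    copied_text: str,
    ask_user: Callable[[str], bool],
) -> Optional[str]:
    """Return the copied text only if the user confirms the prompt."""
    # Step 310: the first operation (copy) triggers the prompt.
    confirmed = ask_user(PROMPT)
    # Step 320: the second operation (e.g. a click on the prompt)
    # confirms, and the first text information is acquired.
    return copied_text if confirmed else None


def show_subtitle(text: Optional[str]) -> List[str]:
    # Step 330: display the subtitle generated from the text, if any.
    return [text] if text is not None else []


accepted = acquire_text_with_prompt("Hello", ask_user=lambda msg: True)
declined = acquire_text_with_prompt("Hello", ask_user=lambda msg: False)
print(show_subtitle(accepted), show_subtitle(declined))
```

Passing the confirmation as a callback keeps the sketch independent of any particular UI toolkit.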
  • the electronic device needs to generate the first subtitle information first, and then display the first subtitle information in the first subtitle area.
  • the following describes the process of determining the first subtitle information through the embodiment shown in FIG. 4 .
  • the method for automatically loading subtitles is applied to an electronic device, and includes the following steps:
  • In step 410, in response to the first predetermined operation for copying the first text information, the first text information is acquired.
  • the implementation of step 410 is the same as that of step 210 above.
  • In step 420, the first subtitle information is determined according to the first text information.
  • In step 430, the first subtitle information is displayed in the first subtitle area.
  • In some cases, the duration for which the first text information is displayed in a predetermined manner is consistent with the duration of the video being edited; or that duration is shorter than the duration of the video being edited, but subtitles only need to be added to a segment of the video. In these cases, the first text information is itself the first subtitle information.
  • Accordingly, the above step 420 includes: determining the first text information as the first subtitle information.
  • In other cases, the first text information cannot meet the requirements of the first subtitle information and is only a part of the first subtitle information.
  • In that case, step 420 includes: determining whether the predetermined duration for which the first text information is displayed in a predetermined manner is less than the duration of the video being edited; if so, comparing the first text information with the predetermined subtitle segments in a subtitle database, where the subtitle database includes a plurality of predetermined subtitle segments; and, when the similarity between the first text information and a target part is greater than a first predetermined threshold, determining, according to the duration of the video being edited, at least part of the target subtitle segment that includes the target part as the first subtitle information. Here the target part is a part of the target subtitle segment whose display duration equals the predetermined duration of the first text information, and the target subtitle segment is the one of the plurality of predetermined subtitle segments that includes the target part.
  • For example, suppose the first subtitle information for the video being edited is to be determined, the first text information is "Once there was a sincere love in front of me, and I did not cherish it", and the subtitle database is a database of classic movie lines. One of the predetermined subtitle segments is "Once there was a sincere love in front of me, and I did not cherish it. Only when I lost it did I regret it. Nothing in the world is more painful than this." This predetermined subtitle segment is the target subtitle segment, and its target part is "Once there was a sincere love in front of me, and I did not cherish it".
  • If the first predetermined threshold is 90%, the similarity between the target part and the first text information is 100%, which is greater than the first predetermined threshold. Therefore, the predetermined subtitle segment quoted above can be determined as the target subtitle segment.
  • The target subtitle segment is then used as the first subtitle information, and its display mode is adjusted according to the duration of the edited video. For example, if the display duration of a sentence is 2 s, the display duration of each sentence is increased so that the total display duration of the first subtitle information equals the video duration. If the duration of the edited video is less than the duration of the target subtitle segment displayed in the predetermined manner, the part of the target subtitle segment that includes the target part and has the same duration as the edited video is used as the first subtitle information; this part is cut from the target subtitle segment by combining the voice information of the video with the target subtitle segment.
  • the subtitle database is actually a material library
  • the predetermined subtitle segments are lines of classic segments in movies, or classic segments popular on the Internet, or any other suitable text segments such as a poetry library, etc.
  • Those skilled in the art select a suitable subtitle database according to the actual situation.
  • the method also includes the step of constructing a subtitle database, and the construction process can refer to the construction process of other language databases, which will not be repeated here.
  • the predetermined subtitle segment in the subtitle database may be referred to as a subtitle segment, the first predetermined threshold as a first threshold, the target portion as reference text information, the predetermined duration as a first duration, and the predetermined manner as a target manner (or target mode).
  • the electronic device generates the first subtitle information based on the first text information, including: the electronic device compares the first text information with a plurality of subtitle segments in the subtitle database; when the similarity between the first text information and the reference text information is greater than a first threshold, A target subtitle segment is selected from a plurality of subtitle segments; based on the target subtitle segment and the video being edited, the first subtitle information is determined.
  • the subtitle database includes multiple subtitle segments
  • the target subtitle segment is any subtitle segment including reference text information among the multiple subtitle segments
  • the first duration corresponding to the first text information is equal to the second duration corresponding to the reference text information
  • the first duration is the duration for which the first text information is displayed according to the target mode
  • the second duration is the duration for which the reference text information is displayed according to the target mode.
  • the target mode is the display mode of the subtitle information.
  • the target mode means that the display duration of each character is fixed, so that the display duration of the subtitle information is determined by the number of characters it contains; alternatively, the target mode is some other mode, which is not limited here.
  • for each subtitle segment, the electronic device determines, based on the first text information, the text information in the subtitle segment whose display duration in the target manner equals the first duration. The determined text information is the reference text information. The electronic device then determines the similarity between the first text information and the reference text information.
  • a subtitle segment includes one or more reference text information.
  • when the first duration is less than the duration of the video, the first text information is compared with the multiple subtitle segments. When the first duration is greater than the duration of the video being edited, the first text information is determined as the first subtitle information; this already satisfies the subtitle requirement of the video, so there is no need to compare the first text information with the subtitle segments in the subtitle database.
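
The database comparison described above can be sketched as follows. This is a hedged illustration, not the disclosed algorithm: the source does not name a similarity measure, so `difflib.SequenceMatcher.ratio()` is swapped in as an assumption; the target manner is assumed to be a fixed display time per character (`SECONDS_PER_CHAR` is a made-up value), which makes "equal display duration" the same as "equal character count"; and `find_target_segment` is a hypothetical name.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

SECONDS_PER_CHAR = 0.1  # assumed target manner: fixed display time per character


def display_duration(text: str) -> float:
    return len(text) * SECONDS_PER_CHAR


def find_target_segment(
    first_text: str,
    segments: List[str],
    video_duration: float,
    first_threshold: float = 0.9,
) -> Optional[Tuple[str, str]]:
    """Return (target_segment, reference_text), or None if nothing is matched."""
    # Only search the database when the copied text is too short for the video.
    if display_duration(first_text) >= video_duration:
        return None
    n = len(first_text)  # same character count means same display duration
    for segment in segments:
        if len(segment) < n:
            continue  # no part of this segment can have the first duration
        # Slide a window with the first duration over the segment; each
        # window is a candidate reference text.
        for start in range(len(segment) - n + 1):
            reference = segment[start:start + n]
            similarity = SequenceMatcher(None, first_text, reference).ratio()
            if similarity > first_threshold:
                return segment, reference
    return None


database = [
    "To be, or not to be, that is the question.",
    "All the world's a stage.",
]
result = find_target_segment("To be, or not to be", database, video_duration=5.0)
print(result)
```

The sliding window is one simple way to enumerate the candidate reference texts in a segment; a production system would likely index the database rather than scan it.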
  • At least a portion may be referred to as target textual information. Based on the relationship between the duration of the video and the duration of the target subtitle segment displayed in the target manner, it is determined whether to use the entire target subtitle segment as the first subtitle information or to use part of the text information in the target subtitle segment as the first subtitle information. That is, when the duration of the video is equal to the duration of the target subtitle segment displayed in the target manner, the target subtitle segment is determined as the first subtitle information. For example, the duration of the video is 5 seconds, and the duration of the target subtitle segment displayed in the target manner is 5 seconds. In this case, the target subtitle segment is directly determined as the first subtitle information.
  • when the duration of the video is greater than the duration of the target subtitle segment displayed in the target manner, the target subtitle segment is determined as the first subtitle information, and the target manner used to display it is adjusted so that the display duration of the target subtitle segment under the adjusted target manner equals the duration of the video.
  • when the duration of the video is less than the duration of the target subtitle segment displayed in the target manner, target text information is selected from the target subtitle segment and determined as the first subtitle information. The target text information includes the reference text information, and its display duration in the target manner equals the duration of the video.
  • For example, the duration of the video is 4 seconds, the target subtitle segment begins "Once there was a sincere love in front of me, and I did not cherish it. Only when I lost it did I regret it.", and the reference text information is "Once there was a sincere love in front of me, and I did not cherish it". Because the duration of the video is less than the duration of the target subtitle segment, target text information with a display duration of 4 seconds is selected, namely "Once there was a sincere love in front of me, and I did not cherish it. Only when I lost it did I regret it. The most painful thing in the world is nothing more than".
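
The three cases (video duration equal to, greater than, or less than the display duration of the target subtitle segment) can be sketched as below. The sketch assumes a fixed display time per character expressed in integer milliseconds, and the rule for placing the cut (start at the reference text, shift left if the cut would run past the end) is an assumption; the source also mentions using the video's voice information for the cut, which is not modelled here. `select_first_subtitle` is a hypothetical name.

```python
from typing import Tuple


def select_first_subtitle(
    target_segment: str,
    reference: str,
    video_ms: int,
    ms_per_char: int = 100,  # assumed target manner: fixed time per character
) -> Tuple[str, float]:
    """Return (first_subtitle_text, milliseconds_per_character_to_use)."""
    if not target_segment:
        raise ValueError("target segment must not be empty")
    segment_ms = len(target_segment) * ms_per_char
    if video_ms >= segment_ms:
        # Equal durations: the segment is used unchanged.
        # Longer video: keep the whole segment and slow the display down
        # so that it spans the video.
        return target_segment, video_ms / len(target_segment)
    # Shorter video: cut out a part that contains the reference text and
    # whose display duration matches the video duration.
    max_chars = video_ms // ms_per_char
    ref_start = target_segment.find(reference)
    if ref_start < 0:
        raise ValueError("reference text must occur in the target segment")
    start = min(ref_start, len(target_segment) - max_chars)
    return target_segment[start:start + max_chars], float(ms_per_char)


segment = "abcdefghij"  # 10 characters, 1000 ms at the default rate
print(select_first_subtitle(segment, "cd", video_ms=1000))
print(select_first_subtitle(segment, "cd", video_ms=2000))
print(select_first_subtitle(segment, "cd", video_ms=400))
```

Integer milliseconds are used so that the character budget is computed with exact integer division rather than floating-point division.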
  • the step of displaying the first subtitle information in the first subtitle area includes: determining position information of each first subtitle sub-area, where the position information includes a length and a starting position. The length of a first subtitle sub-area is its display length and represents the video duration corresponding to the first subtitle segment in that sub-area. The starting position of a first subtitle sub-area represents the first video frame image corresponding to its first subtitle segment. The video frame images corresponding to a first subtitle segment can therefore be determined from the position information. The first subtitle segments are obtained by dividing the first subtitle information.
  • a plurality of first subtitle sub-areas are then displayed in the first subtitle area: each first subtitle sub-area is displayed according to its length and starting position, and the first subtitle segments are located in the first subtitle sub-areas in one-to-one correspondence, as shown in FIG. 5.
  • that is, the electronic device determines the position information of each first subtitle sub-area in the first subtitle area, and displays the corresponding first subtitle segment in each first subtitle sub-area based on that position information.
  • the video editing interface further includes an image area, the image area displays multiple video frame images, the multiple video frame images are sequentially arranged along a predetermined direction, and the multiple first subtitle sub-areas are sequentially arranged along the predetermined direction, One first subtitle sub-area is located on one side of the corresponding multiple video frame images, and the multiple video frame images are displayed correspondingly with their corresponding first subtitle sub-areas, and the predetermined direction is the length direction of the first subtitle sub-area.
  • In some embodiments, the starting position of the first subtitle sub-area is aligned with the starting position of the first video frame image among the corresponding multiple video frame images; that is, the projection of the starting position of the first subtitle sub-area onto the video frame image area falls on the starting edge of that first video frame image.
  • the length of the first subtitle sub-region is the same as the total display length of the corresponding multiple video frame images.
  • In other embodiments, the projection of the starting position of the first subtitle sub-region onto the video frame image region falls within the first video frame image of the corresponding multiple video frame images, and the length of the first subtitle sub-region is smaller than the total display length of the corresponding multiple video frame images.
  • the five video frame images 520 on one side of the first subtitle sub-area 510 are the video frame images corresponding to the first subtitle segment in that sub-area.
  • the predetermined direction may be referred to as the target direction. The multiple video frame images and the multiple first subtitle sub-regions are displayed in the video editing interface as follows: the electronic device displays the multiple video frame images along the target direction in the image area of the video editing interface, where the target direction is the length direction of the first subtitle sub-area; and displays the multiple first subtitle sub-areas along the target direction in the first subtitle area, with each first subtitle sub-area located on one side of its corresponding multiple video frame images.
  • for example, if the length direction of the first subtitle sub-region is from left to right, the multiple video frame images are displayed sequentially from left to right, and the multiple first subtitle sub-regions are likewise displayed sequentially from left to right.
  • the first subtitle sub-region is not limited to the side of the video frame images shown in the figure (as viewed from the perspective of the figure); it can also be located on the upper side of the video frame images.
  • the first subtitle sub-area may also be displayed in other positions; for example, the first subtitle sub-area is located on the left or right side of the corresponding multiple video frame images.
  • the area in which the image frames are displayed is the main track, and the area in which the subtitles are displayed is the sub-track. The sub-track and the main track are distributed in the direction perpendicular to the predetermined direction. As shown in FIG. 5, the first sub-track area 530 is the sub-track and the frame image area 540 is the main track.
  • the step of determining the position information of the first subtitle sub-region further includes: dividing the first subtitle information into A plurality of first subtitle segments, for example, a sentence is divided into one first subtitle segment; the duration of the video being edited is obtained; the length of each first subtitle sub-region is determined according to the number and duration of the first subtitle segments. Since the length of the first subtitle sub-region represents the duration of its corresponding video, the video duration corresponding to the first subtitle segment in each first subtitle sub-region can be determined according to the duration of the video and the number of the first subtitle segments. The video duration corresponding to the first subtitle segment can determine the number of video frame images corresponding to the first subtitle segment, thereby determining the length of the corresponding first subtitle sub-region according to the number of corresponding video frame images.
  • The length of the first subtitle sub-region is equal to the total display length of the multiple video frame images corresponding to that sub-region, or is smaller than that total display length. The starting point of the first subtitle area is determined as the starting position of the first of the first subtitle sub-regions, and the starting position of each other first subtitle sub-region is at or after the end position of the previous first subtitle sub-region, where "after" means further along the length direction of the first subtitle sub-region. The spacing between two adjacent first subtitle segments is determined according to the actual situation.
  • Two adjacent first subtitle segments are in contact, that is, the separation distance is 0; or two adjacent first subtitle segments are not in contact, and the separation distance is greater than 0.
  • The electronic device determines the starting point of the first subtitle area as the starting position of the first of the first subtitle sub-areas, and determines either the end position of any first subtitle sub-area, or a position after that end position, as the starting position of the next first subtitle sub-area. That is, there is a target distance between two adjacent first subtitle sub-areas. When the target distance is 0, the end position of the current first subtitle sub-area is the starting position of the next first subtitle sub-area. When the target distance is greater than 0, the starting position of the next first subtitle sub-area is the position that lies after the end position of the current first subtitle sub-area, at the target distance from it.
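The length and start-position rules above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `SubRegion`, `layout_sub_regions`, `pixels_per_second` and `gap_px` are hypothetical and do not come from the disclosure, every segment is assumed to receive an equal share of the video duration, and the target distance between neighbours is modelled as a fixed pixel gap.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubRegion:
    text: str      # the first subtitle segment shown in this sub-region
    start: float   # start position along the track, in pixels
    length: float  # display length along the track, in pixels


def layout_sub_regions(segments: List[str], video_duration_s: float,
                       pixels_per_second: float,
                       gap_px: float = 0.0) -> List[SubRegion]:
    """Give each segment an equal share of the video and place them in order."""
    if not segments:
        return []
    # Display length of the frame images that belong to one segment.
    slot_px = video_duration_s / len(segments) * pixels_per_second
    if not 0 <= gap_px < slot_px:
        raise ValueError("gap_px must be non-negative and smaller than one slot")
    regions = []
    start = 0.0  # the starting point of the first subtitle area
    for text in segments:
        # With a gap, the sub-region is shorter than its frame images.
        length = slot_px - gap_px
        regions.append(SubRegion(text, start, length))
        # The next sub-region starts gap_px after the end of this one.
        start = start + length + gap_px
    return regions
```

With `gap_px=0` each sub-region starts exactly where the previous one ends; with a positive gap each sub-region is shorter than the display length of its frame images, which corresponds to the second alternative described above.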
  • The step of determining the first subtitle information according to the first text information includes: acquiring text content corresponding to the voice content of the video being edited; determining whether the similarity between the first text information and the text content is greater than a second predetermined threshold; and, if the similarity between the first text information and the text content is greater than the second predetermined threshold, determining the text content as the first subtitle information.
  • the step of determining the position information of the first subtitle sub-region includes: determining the position information of the first subtitle segment according to the correspondence between the text content and the video frame image.
  • The text content corresponds to the voice content, and the voice content corresponds to the video frame images, so the text content and the video frame images also correspond. From this correspondence, the video frame images corresponding to each part of the text content (each first subtitle segment) can be determined more accurately, and so can the position information of the first subtitle sub-region.
  • That is, the length of the first subtitle sub-region is the same as the total display length of the corresponding multiple frame images, or is smaller than that total display length.
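The comparison against the second predetermined threshold might look like the following sketch. The disclosure does not say how similarity is measured; here it is assumed to be the ratio computed by Python's standard `difflib.SequenceMatcher`, and the function name `choose_subtitle_text` and the threshold value 0.8 are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional


def choose_subtitle_text(copied_text: str, transcript: str,
                         threshold: float = 0.8) -> Optional[str]:
    """Return the transcript as subtitle text if it is similar enough.

    copied_text: the first text information (what the user copied).
    transcript:  text content recognised from the video's voice content.
    threshold:   stands in for the second predetermined threshold (assumed).
    """
    similarity = SequenceMatcher(None, copied_text, transcript).ratio()
    if similarity > threshold:
        # The transcript is already tied to the audio, and therefore to the
        # video frames, so it is the better source of subtitle timing.
        return transcript
    # The behaviour below the threshold is not described in this passage.
    return None
```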
  • The above method further includes: in response to a third predetermined operation acting on the first target subtitle sub-area, changing the starting position of the first target subtitle sub-area from the initial position to the predetermined position.
  • the third predetermined operation may be referred to as a third operation, and the predetermined position may be referred to as a target position.
  • After displaying the first subtitle information in the first subtitle area, the electronic device changes the start position of the first target subtitle sub-area to the target position in response to the third operation acting on the first target subtitle sub-area.
  • the target position is different from the starting position, and the third operation is a drag operation, a stretching operation, or other operations on the first target subtitle sub-region, and the embodiment of the present disclosure does not limit the operation mode of the third operation.
  • The above method further includes: in response to a fourth predetermined operation acting on the second target subtitle sub-area, changing the length of the second target subtitle sub-area from the initial length to a predetermined length, so that the number of video frame images corresponding to the second target subtitle sub-area changes from a first predetermined number to a second predetermined number.
  • In this way the display position of the second target subtitle sub-area is adjusted, which helps ensure that the adjusted display position is more accurate and, in turn, that the video duration and the video frame images corresponding to the first subtitle segment in the second target subtitle sub-area are more accurate.
  • the fourth predetermined operation may be referred to as the fourth operation, and the predetermined length may be referred to as the target length.
  • After displaying the first subtitle information in the first subtitle area, the electronic device changes the length of the second target subtitle sub-area to the target length in response to the fourth operation acting on the second target subtitle sub-area.
  • the fourth operation is a stretching operation or other operations, and the embodiment of the present disclosure does not limit the operation manner of the fourth operation.
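The effect of the third and fourth operations on a sub-region can be illustrated with a small sketch. It is not the disclosed implementation: `SubRegion`, `drag`, `stretch` and `frames_covered` are hypothetical names, and frame thumbnails are assumed to have a fixed display width along the track.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SubRegion:
    start_px: float   # start position along the track
    length_px: float  # display length along the track


def drag(region: SubRegion, target_start_px: float) -> SubRegion:
    """Third operation: move the start position to the target position."""
    return replace(region, start_px=target_start_px)


def stretch(region: SubRegion, target_length_px: float) -> SubRegion:
    """Fourth operation: change the length to the target length."""
    if target_length_px <= 0:
        raise ValueError("target length must be positive")
    return replace(region, length_px=target_length_px)


def frames_covered(region: SubRegion, frame_width_px: float) -> int:
    """Count the frame thumbnails that the sub-region overlaps."""
    first = math.floor(region.start_px / frame_width_px)
    last = math.ceil((region.start_px + region.length_px) / frame_width_px)
    return last - first
```

For instance, a sub-region of length 100 over thumbnails 50 wide covers two frame images; stretching it to 250 makes it cover five, which mirrors the change from a first predetermined number to a second predetermined number.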
  • The above method further includes: in response to a fifth predetermined operation of copying second text information, acquiring the second text information, that is, the electronic device acquires the second text information in response to a fifth operation, where the fifth operation is used to copy the second text information; and, according to the second text information, displaying second subtitle information in the second subtitle area of the video editing interface.
  • the second subtitle information is generated based on the second text information
  • the second subtitle area is located on either side of the first subtitle, and can be set according to the actual situation.
  • the second subtitle area 610 is located on a side of the first subtitle area 620 away from the video frame image area 630 .
  • the above method further includes: in response to the fifth predetermined operation, playing the video including the first subtitle information. That is, the electronic device plays the video including the first subtitle information in response to the sixth operation.
  • FIG. 7 is a flow chart of a method for automatically loading subtitles according to an exemplary embodiment. As shown in FIG. 7 , the method for automatically loading subtitles is applied to an electronic device, and includes the following steps 710 to 730 .
  • In step 710, it is detected whether there is a first predetermined operation for copying the first text information. That is, it is detected whether there is a first operation, where the first operation is used to copy the first text information.
  • In step 720, when the first predetermined operation is detected, the first text information is acquired.
  • In step 730, first subtitle information is generated based on the first text information, and the first subtitle information is displayed in the subtitle area of the video editing interface.
  • the first subtitle information generated based on the first text information is displayed in the subtitle area of the video editing interface.
  • the first subtitle information is generated in the first subtitle area of the video editing interface according to the copied text, which improves the efficiency of editing subtitles.
  • the implementation process of the above step 720 and the implementation process of the step 730 are the same as the implementation process of the above steps 210 and 220, and will not be repeated here.
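Steps 710 to 730 can be read as a detect, acquire, generate pipeline. The sketch below is one possible reading, not the disclosed implementation: the clipboard is abstracted as a hypothetical `read_clipboard` callable, a copy is assumed to be detected by noticing that the clipboard text changed (so copying identical text twice is not detected), and subtitle generation is reduced to splitting the text into sentences.

```python
import re
from typing import Callable, List, Optional


class ClipboardWatcher:
    """Step 710: detect a copy by noticing that the clipboard text changed."""

    def __init__(self, read_clipboard: Callable[[], str]):
        self._read = read_clipboard    # hypothetical clipboard accessor
        self._last = read_clipboard()  # text present before watching began

    def poll(self) -> Optional[str]:
        """Step 720: return newly copied text, or None if nothing is new."""
        current = self._read()
        if current and current != self._last:
            self._last = current
            return current
        return None


def split_into_segments(text: str) -> List[str]:
    """Split text after sentence-ending punctuation (Latin or CJK)."""
    parts = re.split(r"(?<=[.!?。！？])\s*", text.strip())
    return [p for p in parts if p]


def load_subtitles(watcher: ClipboardWatcher) -> List[str]:
    """Step 730: turn newly copied text into subtitle segments."""
    text = watcher.poll()
    return split_into_segments(text) if text else []
```

In a real editor the segments returned by `load_subtitles` would then be handed to the layout logic for the subtitle area; that hand-off is omitted here.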
  • FIG. 8 is a flow chart of a method for automatically loading subtitles according to an exemplary embodiment. As shown in FIG. 8 , the method for automatically loading subtitles is applied to an electronic device, and includes the following steps 810 to 820 .
  • In step 810, when the target video is edited, the first text information is obtained from a memory area used for storing text information, wherein the memory area stores the text information obtained by copying the target text (the first predetermined operation).
  • the first text information may be referred to as first text information
  • the target text may be referred to as target text.
  • That is, in the above step 810, when the target video is being edited, the first text information is obtained from the memory area, wherein the memory area is used to store the text information obtained by copying the target text.
  • the target video is any video
  • the target text is any text that can be copied.
  • The text information stored in the memory area is obtained by copying the target text before the target video is edited, or by copying the target text while the target video is being edited. The embodiment of the present disclosure does not limit when the text information is stored in the memory area.
  • In step 820, first subtitle information is generated based on the first text information, and the first subtitle information is displayed in the first subtitle area of the video editing interface. That is, based on the first text information, the first subtitle information corresponding to the first text information is generated, and the first subtitle information is displayed in the first subtitle area of the video editing interface.
  • the first subtitle information is generated according to the copied first text information, and the corresponding first subtitle information is displayed in the first subtitle area.
  • the first subtitle information is generated in the first subtitle area of the video editing interface according to the copied text, which improves the efficiency of editing subtitles.
  • In step 810, the stored first text information is obtained from the memory area, whereas in step 210 the first text information is obtained in response to a copy operation on the first text information.
  • The first predetermined operation, the second predetermined operation, the third predetermined operation, the fourth predetermined operation, and the fifth predetermined operation in the embodiment of the present disclosure can be any feasible operations, for example, at least one of a click operation, a sliding operation, a long-press operation, and a double-click operation.
  • Those skilled in the art can select appropriate operations or operation combinations according to actual situations to correspond to the five predetermined operations of the present disclosure.
  • Fig. 9 is a block diagram of an apparatus for automatically loading subtitles according to an exemplary embodiment.
  • the apparatus includes a first acquisition unit 910 and a first display unit 920 .
  • The first obtaining unit 910 is configured to obtain the first text information in response to a first predetermined operation for copying the first text information. That is, the first obtaining unit 910 is configured to obtain the first text information in response to the first operation, where the first operation is used for copying the first text information.
  • the first predetermined operation is any single operation for copying the first text information, or an operation group formed by a series of operations.
  • the first predetermined operation is a keyboard and mouse operation, for example, the first predetermined operation is a keyboard and mouse operation of "Ctrl+C".
  • the first predetermined operation is a click operation or a long-press operation or the like.
  • the first display unit 920 is configured to display the first subtitle information generated based on the first text information in the first subtitle area of the video editing interface. That is, the first display unit 920 is configured to display the first subtitle information in the first subtitle area of the video editing interface, where the first subtitle information is generated based on the first text information.
  • the first subtitle information is generated in the first subtitle area of the video editing interface according to the copied text, which improves the efficiency of editing subtitles.
  • the first acquisition unit includes a first display module and an acquisition module.
  • The first display module is configured to display prompt information on the video editing interface in response to the first predetermined operation, where the prompt information is used to ask whether the first subtitle information is to be determined according to the first text information, that is, whether to generate the first subtitle information based on the first text information.
  • The prompt information is a prompt pop-up window that displays "The system has detected that you copied text. Add it automatically as a subtitle?". Of course, the prompt is not limited to a pop-up window with text; it can also be other prescribed prompt information in the video editor.
  • the prompt information is a rectangular icon displaying "copy?". Those skilled in the art can set appropriate prompt information according to the actual situation.
  • The acquiring module is configured to acquire the first text information in response to the second predetermined operation acting on the prompt information.
  • the second predetermined operation is at least one of a click operation, a long-press operation, a double-click operation, and a slide operation, and is determined according to the actual situation.
  • the acquiring module is configured to display prompt information on the video editing interface in response to the first operation, and acquire first text information in response to the second operation acting on the prompt information.
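The two-stage interaction (a first operation raises the prompt, a second operation on the prompt releases the text) can be modelled as a tiny state holder. This is a sketch only; the class name `SubtitlePrompt` and its methods are hypothetical, and the actual display of the pop-up window is left out.

```python
from typing import Optional


class SubtitlePrompt:
    """Holds copied text until the user confirms the prompt."""

    def __init__(self) -> None:
        self.visible = False                 # is the prompt being shown?
        self._pending: Optional[str] = None  # copied text awaiting confirmation

    def on_copy(self, text: str) -> None:
        """First operation: text was copied, so show the prompt."""
        self._pending = text
        self.visible = True

    def on_confirm(self) -> Optional[str]:
        """Second operation on the prompt: hand over the first text information."""
        if not self.visible:
            return None
        self.visible = False
        text, self._pending = self._pending, None
        return text

    def on_dismiss(self) -> None:
        """The user declined, so drop the pending text."""
        self.visible = False
        self._pending = None
```

Keeping the copied text pending until confirmation means that text copied for some unrelated purpose never reaches the subtitle track unless the user agrees.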
  • the first display unit includes a determination module and a second display module, wherein the determination module is configured to determine the first subtitle information according to the first text information; the second display module is configured to display the first subtitle information in the first subtitle area.
  • The duration for which the first text information is displayed in a predetermined manner is consistent with the duration of the video being edited; or that duration is shorter than the duration of the video being edited, but subtitles only need to be added to some of the video clips in the video.
  • In both cases the first text information is determined as the first subtitle information.
  • Accordingly, the determining module is configured to determine that the first text information is the first subtitle information.
  • The determining module further includes a first determination sub-module, a second determination sub-module, and a third determination sub-module, wherein the first determination sub-module is configured to determine whether the predetermined duration for which the first text information is displayed in a predetermined manner is less than the duration of the video being edited;
  • the second determination sub-module is configured to compare the first text information with the predetermined subtitle segments in the subtitle database when the predetermined duration is less than the duration of the video, wherein the subtitle database includes a plurality of predetermined subtitle segments;
  • the third determination sub-module is configured to determine that at least a portion of the target subtitle segment including the target portion is the first subtitle information according to the length of the video being edited when the similarity between the first text information and the target portion is greater than a first predetermined threshold, wherein, The target portion is a portion of the
  • For example, the first text information is "Once there was a sincere love in front of me, and I did not cherish it".
  • The subtitle database is a database of classic movie lines.
  • One of the predetermined subtitle segments is "Once there was a sincere love in front of me, and I did not cherish it. When I lost it, I regretted it too late. The most painful thing in the world is this."
  • This predetermined subtitle segment is the target subtitle segment, and its target part is "Once there was a sincere love in front of me, and I did not cherish it".
  • The first predetermined threshold is 90%. The similarity between the target part and the first text information is 100%, which is greater than the first predetermined threshold. Therefore, the predetermined subtitle segment "Once there was a sincere love in front of me, and I did not cherish it. When I lost it, I regretted it too late. The most painful thing in the world is this." can be determined to be the target subtitle segment.
  • The target subtitle segment is used as the first subtitle information, and the display mode of the corresponding first subtitle information is adjusted according to the duration of the edited video. For example, where the display duration corresponding to a sentence is 2 s, the display duration of a sentence can be increased.
  • If the duration of the edited video is less than the duration of the target subtitle segment displayed in a predetermined manner, the part of the target subtitle segment that includes the target part and has the same duration as the edited video is used as the first subtitle information. Which part of the target subtitle segment to intercept can be determined by combining the voice information of the video with the target part.
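The database lookup in the movie-line example can be sketched as follows. Several things here are assumptions rather than part of the disclosure: each predetermined subtitle segment is stored as a list of sentences, similarity is measured with the standard library's `difflib.SequenceMatcher`, every contiguous run of sentences is tried as a candidate target part, and the function name `find_target_segment` is hypothetical. The example strings are an English rendering of the line quoted above.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

Match = Tuple[float, List[str], int, int]  # score, segment, start, end


def find_target_segment(first_text: str, database: List[List[str]],
                        threshold: float = 0.9) -> Optional[Match]:
    """Find the segment whose best-matching part exceeds the threshold.

    Returns (score, segment, start, end), where segment[start:end] is the
    target part, or None if no part is similar enough.
    """
    best: Optional[Match] = None
    for segment in database:
        for start in range(len(segment)):
            for end in range(start + 1, len(segment) + 1):
                candidate = " ".join(segment[start:end])
                score = SequenceMatcher(None, first_text, candidate).ratio()
                if score > threshold and (best is None or score > best[0]):
                    best = (score, segment, start, end)
    return best


# Example database with one predetermined subtitle segment.
MOVIE_LINES = [[
    "Once there was a sincere love in front of me, and I did not cherish it.",
    "When I lost it, I regretted it too late.",
    "The most painful thing in the world is this.",
]]
```

Trying every contiguous run is quadratic in the number of sentences per segment, which is acceptable for short quotations but would need an index for a large material library.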
  • The above-mentioned subtitle database is in effect a material library, in which the predetermined subtitle segments are lines from classic movie scenes, passages popular on the Internet, or any other suitable text segments, such as those in a poetry library. Those skilled in the art can select an appropriate subtitle database according to the actual situation.
  • the method also includes the step of constructing a subtitle database, and the construction process refers to the construction process of other language databases, and will not be repeated here.
  • The second determination sub-module is configured to compare the first text information with a plurality of subtitle segments in the subtitle database; the third determination sub-module is configured to select the target subtitle segment from the plurality of subtitle segments when the similarity between the first text information and the reference text information is greater than a first threshold; the third determination sub-module is further configured to determine the first subtitle information based on the target subtitle segment and the video being edited.
  • the subtitle database includes multiple subtitle segments
  • the target subtitle segment is any subtitle segment including reference text information among the multiple subtitle segments
  • the first duration corresponding to the first text information is equal to the second duration corresponding to the reference text information
  • the first duration is the duration for which the first text information is displayed according to the target mode
  • the second duration is the duration for which the reference text information is displayed according to the target mode.
  • the second determining sub-module is configured to compare the first text information with the plurality of subtitle segments when the first duration is less than the duration of the video.
  • The third determination sub-module is configured to: when the duration of the video is less than the duration for which the target subtitle segment is displayed in the target manner, select target text information from the target subtitle segment and determine the target text information as the first subtitle information, where the target text information includes the reference text information and the duration for which the target text information is displayed in the target manner is equal to the duration of the video; and, when the duration of the video is not less than the duration for which the target subtitle segment is displayed in the target manner, determine the target subtitle segment as the first subtitle information.
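The two branches handled by the third determination sub-module can be sketched like this. It is an approximation under stated assumptions: the target mode is taken to show each sentence for a fixed time (2 s, as in the earlier example), the target subtitle segment is a list of sentences, the reference text occupies `segment[ref_start:ref_end]`, and only whole sentences are selected, so the selected duration matches the video duration only to the nearest sentence. The function name is hypothetical.

```python
from typing import List


def select_subtitle_sentences(segment: List[str], ref_start: int, ref_end: int,
                              video_duration_s: float,
                              seconds_per_sentence: float = 2.0) -> List[str]:
    """Pick the sentences of the target segment to use as subtitles."""
    # How many sentences fit into the video at the assumed display rate.
    max_sentences = int(video_duration_s // seconds_per_sentence)
    if max_sentences >= len(segment):
        # Video is not shorter than the segment: use the whole segment.
        return list(segment)
    # Video is shorter: keep a window that still contains the reference text.
    count = max(max_sentences, ref_end - ref_start)
    # Start at the reference text, shifting left if the window would overrun.
    start = min(ref_start, len(segment) - count)
    return segment[start:start + count]
```

The window here begins at the reference text and extends forward; centring it on the reference text would be an equally valid choice that the passage above does not rule out.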
  • the second display module includes a fourth determination sub-module and a first display sub-module.
  • The fourth determination sub-module is configured to determine position information of the first subtitle sub-region, where the position information includes a length and a starting position. The length of the first subtitle sub-region is its display length and represents the video duration corresponding to the first subtitle segment in that sub-region.
  • The starting position of the first subtitle sub-region represents the first video frame image corresponding to the first subtitle segment, so the video frame images corresponding to the first subtitle segment can be determined from the position information. The first subtitle segments are obtained by dividing the first subtitle information.
  • The first display sub-module is configured to display a plurality of first subtitle sub-regions in the first subtitle region according to the position information, that is, according to the length and starting position of each first subtitle sub-region. The first subtitle segments are located in the first subtitle sub-regions in one-to-one correspondence, as shown in FIG. 5.
  • The fourth determination sub-module is configured to determine the position information of each first subtitle sub-region in the first subtitle region.
  • The first display sub-module is configured to display, based on the position information of each first subtitle sub-region, the corresponding first subtitle segment in each first subtitle sub-region.
  • The video editing interface further includes an image area that displays multiple video frame images. The multiple video frame images are arranged in sequence along a predetermined direction, and the multiple first subtitle sub-areas are arranged in sequence along the same predetermined direction.
  • Each first subtitle sub-area is located on one side of its corresponding plurality of video frame images, and the video frame images and their corresponding first subtitle sub-area are displayed in correspondence with each other. The predetermined direction is the length direction of the first subtitle sub-area.
  • The starting position of the first subtitle sub-area is aligned with the starting position of the first video frame image in the corresponding multiple video frame images, that is, the projection of the starting position of the first subtitle sub-area onto the video frame image area falls at the starting position of that first video frame image.
  • the length of the first subtitle sub-region is the same as the display length of the corresponding multiple video frame images.
  • the projection of the starting position of the first subtitle sub-area in the video frame image area is located in the first video frame image of the corresponding plurality of video frame images, and the length of the first subtitle sub-area is smaller than the display of the corresponding plurality of video frame images. total length.
  • the five video frame images 520 on one side of the first subtitle sub-area 510 are the five video frame images 520 corresponding to the first subtitle segment in the first subtitle sub-area.
  • the first display unit is also configured to display a plurality of video frame images along the target direction in the image area of the video editing interface, and the target direction is the length direction of the first subtitle sub-area; In the area, a plurality of first subtitle sub-areas are displayed along the target direction, and the first subtitle sub-areas are located on one side of the corresponding plurality of video frame images.
  • The first subtitle sub-region is not limited to the side of the video frame images shown in the figure (as viewed from the perspective of the figure); it can also be located on the upper side of the video frame images.
  • the first subtitle sub-area can also be displayed in other display positions, for example, the first subtitle sub-area is located on the left or right side of the corresponding multiple video frame images.
  • The area in which the image frames are displayed is the main track, and the area in which the subtitles are displayed is the sub-track.
  • The sub-track and the main track are distributed in the direction perpendicular to the predetermined direction. As shown in FIG. 5, the first sub-track area 530 is the sub-track and the frame image area 540 is the main track.
  • The fourth determination sub-module includes a division sub-module, a first acquisition sub-module, a fifth determination sub-module, and a sixth determination sub-module.
  • the dividing submodule is configured to perform dividing the first subtitle information into a plurality of first subtitle segments, for example, a sentence is divided into one first subtitle segment.
  • the first obtaining submodule is configured to perform obtaining the duration of the video being edited.
  • the fifth determining submodule is configured to perform determining the length of each first subtitle subregion according to the number and duration of the first subtitle segments. Since the length of the first subtitle sub-region represents the duration of its corresponding video, the video duration corresponding to the first subtitle segment in each first subtitle sub-region can be determined according to the duration of the video and the number of the first subtitle segments. The video duration corresponding to the first subtitle segment can determine the number of video frame images corresponding to the first subtitle segment, thereby determining the length of the corresponding first subtitle sub-region according to the number of corresponding video frame images.
  • the length of the first subtitle sub-region is equal to the total display length of multiple video frame images corresponding to the first subtitle sub-region, or the length of the first subtitle sub-region is smaller than the display length of the corresponding multiple video frame images total length.
  • The sixth determination sub-module is configured to determine that the starting point of the first subtitle area is the starting position of the first of the first subtitle sub-areas, and that the starting position of each other first subtitle sub-area is at or after the end position of the previous first subtitle sub-area, where "after" means further along the length direction of the first subtitle sub-area. Specifically, the spacing between two adjacent first subtitle segments is determined according to the actual situation. In some embodiments, two adjacent first subtitle segments are in contact, that is, the separation distance is 0; or two adjacent first subtitle segments are not in contact, and the separation distance is greater than 0.
  • the sixth determination sub-module is configured to perform determining the starting point of the first subtitle area as the starting position of the first subtitle sub-area, and determining the ending position of any first sub-title sub-area as the next first subtitle sub-area. The starting position of the subtitle subarea.
  • the determination module includes a second acquisition submodule, a seventh determination submodule, and an eighth determination submodule.
  • the second obtaining submodule is configured to execute obtaining the text content corresponding to the voice content of the video being edited.
  • the seventh determination submodule is configured to perform determining whether the similarity between the first textual information and the textual content is greater than a second predetermined threshold.
  • the eighth determination submodule is configured to perform determining that the text content is the first subtitle information when the similarity between the first text information and the above-mentioned text content is greater than the second predetermined threshold.
  • The fourth determination sub-module is configured to determine the position information of the first subtitle segment according to the correspondence between the text content and the video frame images. Specifically, the text content corresponds to the voice content, and the voice content corresponds to the video frame images, so the text content and the video frame images also correspond. From this correspondence, the video frame images corresponding to each part of the text content (each first subtitle segment) can be determined more accurately, and so can the position information of the first subtitle sub-region.
  • Determining the position information means determining the starting position and the length of the first subtitle sub-area: the starting position of the first subtitle sub-area is at, or after, the starting position of the corresponding first frame image, and the length of the first subtitle sub-area is the same as, or less than, the total display length of the corresponding multiple frame images.
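When the text content comes from the video's own voice content, each segment carries start and end times, and the position information follows directly from them. The sketch below assumes such timestamps are available (for example from a speech recogniser, which the disclosure does not name) and that the track has a fixed scale; `region_from_speech`, `fps` and `pixels_per_second` are hypothetical names.

```python
import math
from typing import Dict


def region_from_speech(start_s: float, end_s: float, fps: float,
                       pixels_per_second: float) -> Dict[str, float]:
    """Map a spoken segment's time span to frames and track coordinates."""
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    first_frame = math.floor(start_s * fps)  # first frame the speech touches
    end_frame = math.ceil(end_s * fps)       # one past the last frame touched
    return {
        "first_frame": first_frame,
        "frame_count": end_frame - first_frame,
        "start_px": start_s * pixels_per_second,
        "length_px": (end_s - start_s) * pixels_per_second,
    }
```

A segment spoken from 1 s to 3 s in a 30 fps video would, under these assumptions, start at frame 30 and span 60 frames.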
  • The device further includes a first adjustment unit, which is configured to, after the first subtitle information generated based on the first text information is displayed in the first subtitle area of the video editing interface, change the start position of the first target subtitle sub-region from the initial position to the predetermined position in response to a third predetermined operation acting on the first target subtitle sub-region.
  • That is, the first adjustment unit is configured to change the starting position of the first target subtitle sub-area to the target position in response to the third operation acting on the first target subtitle sub-area.
  • The above-mentioned apparatus further includes a second adjustment unit, which is configured to, after the first subtitle information generated based on the first text information is displayed in the first subtitle area of the video editing interface, change the length of the second target subtitle sub-region from the initial length to the predetermined length in response to a fourth predetermined operation acting on the second target subtitle sub-region, so that the number of video frame images corresponding to the second target subtitle sub-region changes from the first predetermined number to a second predetermined number.
  • the display position of the second target subtitle sub-area is adjusted, thereby further ensuring that the adjusted display position of the second target subtitle sub-area is more accurate, thereby further ensuring that the corresponding first subtitle segment in the second target subtitle sub-area is Video duration and video frame images are more accurate.
  • The second adjustment unit is configured to change the length of the second target subtitle sub-region to the target length in response to the fourth operation acting on the second target subtitle sub-region.
  • In some embodiments, when the video needs dual subtitles, for example Chinese subtitles and English subtitles, the above apparatus further includes a second acquisition unit and a second display unit. After the first subtitle information generated based on the first text information is displayed in the first subtitle area of the video editing interface, the second acquisition unit is configured to acquire second text information in response to a fifth predetermined operation of copying the second text information.
  • That is, the second acquisition unit is configured to acquire the second text information in response to a fifth operation, where the fifth operation is used to copy the second text information.
  • The second display unit is configured to display, according to the second text information, second subtitle information in a second subtitle area of the video editing interface.
  • The second subtitle information is generated based on the second text information. The second subtitle area is located on either side of the first subtitle area and can be set according to actual conditions. In some embodiments, as shown in FIG. 6, the second subtitle area 610 is located on the side of the first subtitle area 620 away from the video frame image area 630.
  • the above apparatus further includes a playing unit, which is configured to play the video including the first subtitle information in response to the fifth predetermined operation after the subtitles of the video are loaded.
  • FIG. 10 is a block diagram of an apparatus for automatically loading subtitles according to an exemplary embodiment.
  • the apparatus for automatically loading subtitles includes a detection unit 1010 , a second acquisition unit 1020 and a second display unit 1030 .
  • The detection unit 1010 is configured to detect whether there is a first predetermined operation for copying the first text information.
  • the second obtaining unit 1020 is configured to obtain the first text information when the first predetermined operation is detected.
  • the second display unit 1030 is configured to generate the first subtitle information based on the first text information, and display the first subtitle information in the subtitle area of the video editing interface.
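  • In the apparatus provided by this embodiment, when the first predetermined operation is detected, the second display unit 1030 displays, in the subtitle area of the video editing interface, the first subtitle information generated based on the first text information.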
  • the first subtitle information can be generated in the first subtitle area of the video editing interface, which improves the efficiency of editing subtitles.
  • FIG. 11 is a block diagram of an apparatus for automatically loading subtitles according to an exemplary embodiment. As shown in FIG. 11, the apparatus is used in an electronic device and includes a third acquisition unit 1110 and a third display unit 1120.
  • The third acquisition unit 1110 is configured to, when a target video is edited, acquire the first text information from a memory area used for storing text information, where the memory area stores text information obtained by performing a copy operation (the first predetermined operation) on target text.
  • the third display unit 1120 is configured to perform generating the first subtitle information based on the first text information, and display the first subtitle information in the first subtitle area of the video editing interface.
  • the first subtitle information is generated according to the copied first text information, and the corresponding first subtitle information is displayed in the first subtitle area.
  • the first subtitle information is generated in the first subtitle area of the video editing interface according to the copied text, which improves the efficiency of editing subtitles.
  • FIG. 12 is a block diagram of an electronic device 01 for performing automatic subtitle loading according to an exemplary embodiment.
  • a storage medium including executable instructions such as a memory 1210 for storing executable instructions, is also provided, and the above-mentioned instructions can be executed by the processor 1220 of the electronic device 01 to complete the above-mentioned method.
  • the storage medium may be a non-transitory computer-readable storage medium, for example, the above-mentioned non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.
  • Embodiments of the present disclosure also provide an electronic device, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the following operations: in response to a first operation, acquiring first text information, the first operation being used to copy the first text information; and displaying first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.
  • the processor is configured to execute the instructions to implement the method for automatically loading subtitles provided by other embodiments of the foregoing method embodiments.
  • An embodiment of the present disclosure also provides a non-volatile storage medium. When the instructions in the non-volatile storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the following operations: in response to a first operation, acquiring first text information, the first operation being used to copy the first text information; and displaying first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.
  • the electronic device when the instructions in the non-volatile storage medium are executed by the processor of the electronic device, the electronic device is enabled to execute the method for automatically loading subtitles provided by other embodiments of the foregoing method embodiments.
  • An embodiment of the present disclosure provides a method for automatically loading subtitles, including: in response to a first predetermined operation for copying first text information, acquiring the first text information; and displaying, in a first subtitle area of a video editing interface, first subtitle information generated based on the first text information.
  • In some embodiments, the step of determining the first subtitle information according to the first text information includes: determining whether a predetermined duration, for which the first text information is displayed in a predetermined manner, is less than the duration of the video being edited; if the predetermined duration is less than the duration of the video, comparing the first text information with predetermined subtitle segments in a subtitle database, where the subtitle database includes a plurality of predetermined subtitle segments; and, when the similarity between the first text information and a target part is greater than a first predetermined threshold, determining, according to the length of the video being edited, that at least a part of a target subtitle segment that includes the target part is the first subtitle information. Here the target part is the part of the target subtitle segment whose display duration equals the predetermined duration of the first text information, and the target subtitle segment is the one of the plurality of predetermined subtitle segments that includes the target part.
  • In some embodiments, the step of displaying the first subtitle information in the first subtitle area includes: determining position information of a first subtitle sub-region, where the position information includes a length and a start position, the length of the first subtitle sub-region is its display length and represents the video duration corresponding to the first subtitle segment in that sub-region, the start position represents the first video frame image corresponding to the first subtitle segment, and the first subtitle segments are obtained by dividing the first subtitle information; and displaying, according to the position information, a plurality of first subtitle sub-regions in the first subtitle area, where each first subtitle sub-region contains one first subtitle segment.
  • In some embodiments, the video editing interface further includes an image area that displays multiple video frame images. The video frame images are arranged sequentially along a predetermined direction, and the first subtitle sub-regions are arranged sequentially along the same direction. Each first subtitle sub-region is located on one side of its corresponding video frame images, and the predetermined direction is the length direction of the first subtitle sub-region.
  • In some embodiments, the step of determining the position information of the first subtitle sub-region further includes: dividing the first subtitle information into a plurality of first subtitle segments; obtaining the duration of the video being edited; determining the length of each first subtitle sub-region according to the number of first subtitle segments and the duration; and determining the starting point of the first subtitle area as the start position of the first of the first subtitle sub-regions, with the start position of each other first subtitle sub-region at the end position of the preceding first subtitle sub-region.
  • the step of determining the first subtitle information according to the first text information includes: acquiring text content corresponding to the voice content of the video being edited; determining whether the similarity between the first text information and the text content is greater than a second predetermined threshold; When the similarity between the text information and the text content is greater than the second predetermined threshold, determine that the text content is the first subtitle information; the step of determining the position information of the first subtitle sub-region includes: according to the correspondence between the text content and the video frame image, determining Location information of the first subtitle segment.
  • An embodiment of the present disclosure provides a method for automatically loading subtitles, including: detecting whether there is a first predetermined operation for copying first text information; when the first predetermined operation is detected, acquiring the first text information; and generating first subtitle information based on the first text information and displaying the first subtitle information in a subtitle area of a video editing interface.
  • An embodiment of the present disclosure provides a method for automatically loading subtitles, including: when a target video is edited, acquiring first text information from a memory area used for storing text information, where the memory area stores text information obtained by copying target text; and generating first subtitle information based on the first text information and displaying the first subtitle information in a first subtitle area of a video editing interface.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Studio Circuits (AREA)
  • Television Signal Processing For Recording (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

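The present disclosure relates to a method for automatically loading subtitles and an electronic device. The method includes: in response to a first operation, acquiring first text information, the first operation being used to copy the first text information; and displaying first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.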

Description

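Method for automatically loading subtitles and electronic device
This application is based on and claims priority to Chinese patent application No. 202011367465.2, filed on November 27, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of video production, and in particular to a method for automatically loading subtitles and an electronic device.
Background
In the related art, subtitles must be added manually during video production. The process is as follows: drag the video track to select the video frame to be subtitled, tap to enter text, type or paste the required subtitle text, then drag to the next video frame to be subtitled and repeat.
Summary
The present disclosure provides a method and an apparatus for automatically loading subtitles, an electronic device, and a non-volatile storage medium.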
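According to one aspect of the embodiments of the present disclosure, a method for automatically loading subtitles is provided, including: in response to a first operation, acquiring first text information, the first operation being used to copy the first text information; and displaying first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.
According to another aspect of the embodiments of the present disclosure, an apparatus for automatically loading subtitles is provided, including: a first acquisition unit configured to acquire first text information in response to a first operation, the first operation being used to copy the first text information; and a first display unit configured to display first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.
According to another aspect of the embodiments of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the following operations: in response to a first operation, acquiring first text information, the first operation being used to copy the first text information; and displaying first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.
According to another aspect of the embodiments of the present disclosure, a non-volatile storage medium is provided. When instructions in the non-volatile storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform any one of the above methods for automatically loading subtitles.
According to another aspect of the embodiments of the present disclosure, a computer program product is provided. When instructions in the computer program product are executed by a processor of an electronic device, the electronic device is enabled to perform the following operations: in response to a first operation, acquiring first text information, the first operation being used to copy the first text information; and displaying first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.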
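Brief Description of the Drawings
FIG. 1 is an architecture diagram of an application scenario of a method for automatically loading subtitles according to an exemplary embodiment.
FIG. 2 is a schematic flowchart of a method for automatically loading subtitles according to an exemplary embodiment.
FIG. 3 is a schematic flowchart of a method for automatically loading subtitles according to another exemplary embodiment.
FIG. 4 is a schematic flowchart of a method for automatically loading subtitles according to yet another exemplary embodiment.
FIG. 5 is a schematic diagram of a display interface corresponding to a method for automatically loading subtitles according to an exemplary embodiment.
FIG. 6 is a schematic diagram of a display interface corresponding to a method for automatically loading subtitles according to another exemplary embodiment.
FIG. 7 is a schematic flowchart of a method for automatically loading subtitles according to still another exemplary embodiment.
FIG. 8 is a schematic flowchart of a method for automatically loading subtitles according to another exemplary embodiment.
FIG. 9 is a structural block diagram of an apparatus for automatically loading subtitles according to an exemplary embodiment.
FIG. 10 is a structural block diagram of an apparatus for automatically loading subtitles according to another exemplary embodiment.
FIG. 11 is a structural block diagram of an apparatus for automatically loading subtitles according to yet another exemplary embodiment.
FIG. 12 is a structural block diagram of an electronic device for performing a method for automatically loading subtitles according to an exemplary embodiment.
Detailed Description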
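In the related art, when editing a video, the user has to add or paste subtitle text manually, so adding subtitles is inefficient.
FIG. 1 is an architecture diagram of an implementation environment according to an exemplary embodiment. As shown in FIG. 1, the method for automatically loading subtitles described below is applied in this implementation environment, which includes an electronic device 01 and a server 02. The electronic device 01 and the server 02 are interconnected and communicate through a network.
The electronic device 01 is the device that displays the first subtitle information. The electronic device 01 obtains the corresponding first subtitle information from the server 02 and displays it in the video editing interface. Alternatively, the electronic device 01 generates the first subtitle information and displays it.
The electronic device 01 is any electronic product that can interact with a user through one or more of a keyboard, a touchpad, a touchscreen, a remote control, voice interaction, or a handwriting device, for example a mobile phone, a tablet computer, a palmtop computer, a personal computer (PC), a wearable device, or a smart TV.
The server 02 is a single server, a server cluster composed of multiple servers, or a cloud computing service center. The server 02 includes a processor, a memory, a network interface, and the like.
Those skilled in the art should understand that the above electronic device and server are only examples. Other existing or future electronic devices or servers, if applicable to the present disclosure, should also fall within the protection scope of the present disclosure and are incorporated herein by reference.
Based on the implementation environment shown in FIG. 1, embodiments of the present disclosure provide a method and an apparatus for automatically loading subtitles, an electronic device, and a non-volatile storage medium.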
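The method provided by the embodiments of the present disclosure is executed by the above electronic device or server, or by a functional module and/or functional entity in the electronic device or server that can implement the method. This can be determined according to actual requirements and is not limited by the embodiments of the present disclosure. The method for automatically loading subtitles is described below by way of example, with the electronic device as the executing entity.
FIG. 2 is a flowchart of a method for automatically loading subtitles according to an exemplary embodiment. As shown in FIG. 2, the method is applied to an electronic device and includes the following steps:
In step 210, in response to a first predetermined operation for copying first text information, the first text information is acquired.
The first predetermined operation is any single operation used to copy the first text information, or an operation group formed by a series of operations. When the electronic device is a personal computer, the first predetermined operation is a keyboard or mouse operation, for example the "Ctrl+C" key combination. When the electronic device is a mobile phone, a tablet (PAD), or a similar device, the first predetermined operation is a tap operation, a long-press operation, or the like.
The first predetermined operation may be referred to as the first operation. Step 210 thus means that the electronic device, in response to the first operation, acquires the first text information, where the first operation is used to copy the first text information.
In some embodiments, in response to the first operation, the electronic device stores the copied first text information in a memory area and, when the target video is edited, acquires the first text information from that memory area. The memory area is used to store text information obtained by performing a copy operation on target text.
In step 220, first subtitle information generated based on the first text information is displayed in a first subtitle area of a video editing interface. That is, the first subtitle information is generated from the first text information and displayed in the first subtitle area of the video editing interface.
In this embodiment, the first text information is first acquired in response to the first predetermined operation for copying it; the first subtitle information is then displayed on the video editing interface according to the first text information. Because the first subtitle information is generated in the first subtitle area of the video editing interface from the copied text, the efficiency of editing subtitles is improved.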
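In practice, there may be multiple predetermined operations that copy text information, with only some of the copied text intended for the video's subtitles. To improve subtitle accuracy in that case, prompt information is first displayed on the editing interface, and the first text information is acquired based on the displayed prompt information. The process of acquiring the text information is described below through the embodiment shown in FIG. 3. As shown in FIG. 3, the method for automatically loading subtitles is applied to an electronic device and includes the following steps:
In step 310, in response to the first predetermined operation, prompt information is displayed on the video editing interface. The prompt information asks whether to determine the first subtitle information according to the first text information, that is, whether to generate the first subtitle information based on the first text information. In practice, the prompt information is a pop-up window showing, for example, "The system detected that you copied text. Add it as a subtitle automatically?". The prompt is not limited to a pop-up window with this text; it can be any other prompt defined in the video editor, for example a rectangular icon showing "copy?". Those skilled in the art can set suitable prompt information according to the actual situation, for example a prompt icon.
In step 320, in response to a second predetermined operation acting on the prompt information, the first text information is acquired. The second predetermined operation is at least one of a tap, a long press, a double tap, and a swipe, determined according to the actual situation. Receiving the second predetermined operation on the prompt information confirms that the first subtitle information is to be determined according to the first text information, so the first text information is acquired based on the second predetermined operation.
The first predetermined operation may be referred to as the first operation, and the second predetermined operation as the second operation. Steps 310 and 320 thus mean: in response to the first operation, displaying prompt information on the video editing interface; and, in response to the second operation acting on the prompt information, acquiring the first text information.
In step 330, the first subtitle information generated based on the first text information is displayed in the first subtitle area of the video editing interface. Step 330 is implemented in the same way as step 220 above.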
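As an illustration only, the prompt-based flow of steps 310 to 330 could be sketched in Python as follows. The class name `SubtitlePrompt`, its method names, and the prompt wording are hypothetical and are not part of the disclosure; a real editor would connect these calls to its own clipboard and pop-up window events.

```python
class SubtitlePrompt:
    """Hypothetical model of the copy -> prompt -> confirm flow."""

    def __init__(self):
        self.pending_text = None   # text copied but not yet confirmed
        self.subtitle_text = None  # text accepted as subtitle source

    def on_copy(self, text):
        """First operation: remember the copied text and return the prompt."""
        self.pending_text = text
        return "Copied text detected. Add it as a subtitle automatically?"

    def on_prompt(self, accepted):
        """Second operation: acquire the text only if the user confirms."""
        if accepted and self.pending_text is not None:
            self.subtitle_text = self.pending_text
        self.pending_text = None
        return self.subtitle_text


prompt = SubtitlePrompt()
prompt.on_copy("first text information")
print(prompt.on_prompt(accepted=True))
```

Keeping the copied text in a pending state until the prompt is answered is one way to avoid turning every copy operation into a subtitle, which is the accuracy concern this embodiment raises.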
在一些实施例中,电子设备需要先生成第一字幕信息,再在第一字幕区域显示该第一字幕信息,下面通过图4所示的实施例对确定第一字幕信息的过程进行说明。如图4所示,该自动加载字幕的方法应用于电子设备中,包括以下步骤:
在步骤410中,响应于用于复制第一文字信息的第一预定操作,获取第一文字信息。该步骤410的实施方式与上述步骤210的实施方式同理。
在步骤420中,根据第一文字信息确定第一字幕信息。
在步骤430中,在第一字幕区域显示第一字幕信息。
在实际的应用过程中,在一些实施例中,例如,在第一文字信息按照预定方式显示时的时长和正在编辑的视频的时长一致的情况下,或者,在虽然第一文字信息按照预定方式显示时的时长相对正在编辑的视频的时长较短,但是,该视频仅部分需要字幕的情况中,第一文 字信息为第一字幕信息,也即是在第一文字信息按照预定方式显示的时长小于正在编辑的视频的时长,且只需对视频中的视频片段添加字幕的情况下,将第一文字信息确定为第一字幕信息。对应地,上述步骤420包括:将第一文字信息确定为第一字幕信息。
在一些实施例中,第一文字信息不能满足第一字幕信息对应的需求,第一文字信息作为第一字幕信息中的一部分,为了进一步准确、高效且完整地显示第一字幕信息,上述步骤420包括:确定第一文字信息按照预定方式显示的预定时长是否小于正在编辑的视频的时长;在预定时长小于视频的时长的情况下,将第一文字信息与字幕数据库中的预定字幕段进行比较,其中,字幕数据库包括多个预定字幕段;在第一文字信息与目标部分的相似度大于第一预定阈值的情况下,根据正在编辑的视频的长度,确定目标字幕段的包括目标部分的至少部分为第一字幕信息,其中,目标部分为目标字幕段中与第一文字信息的预定时长相同的一部分,目标字幕段为多个预定字幕段中包括目标部分的一个。该方案中,根据第一文字信息自动补全正在编辑的视频对应的第一字幕信息。
例如,第一文字信息为“曾经有一份真诚的爱情放在我面前,我没有珍惜”,字幕数据库为电影中的经典台词片段的数据库,其中的一个预定字幕段为“曾经有一份真诚的爱情放在我面前,我没有珍惜,等我失去的时候我才后悔莫及,人世间最痛苦的事莫过于此,如果上天能够给我一个再来一次的机会,我会对那个女孩子说三个字,我爱你,如果非要在这份爱上加上一个期限,我希望是一万年”,这个预定字幕片段就是目标字幕片段,其中的目标部分为的“曾经有一份真诚的爱情放在我面前,我没有珍惜”,如果第一预定阈值为90%,目标部分和第一文字信息的相似度为100%大于第一预定阈值,因此,能够确定预定字幕段“曾经有一份真诚的爱情放在我面前,我没有珍惜,等我失去的时候我才后悔莫及,人世间最痛苦的事莫过于此,如果上天能够给我一个再来一次的机会,我会对那个女孩子说三个字,我爱你,如果非要在这份爱上加上一个期限,我希望是一万年”为目标字幕段,后续根据编辑的视频的时长来确定对应的第一字幕信息,首先确定编辑的视频时长与该目标字幕段按照预定方式显示的时长是否相同,如果相同,则确定目标字幕段为第一字幕信息。如果编辑的视频时长大于该目标字幕段按照预定方式显示的时长,则根据该目标字幕段作为第一字幕信息,并根据编辑的视频的时长来调整对应的第一字幕信息的显示方式,例如,一句话对应的显示时长为2s,为了使得第一字幕信息的显示时长和视频的时长相同,则将一句话的显示时长增加。如果编辑的视频时长小于该目标字幕段按照预定方式显示的时长,则将目标字幕段的包括目标部分的且时长与编辑的视频时长相同的部分作为第一字幕信息,结合视频的语音信息和目标部分来截取目标字幕段的部分作为第一字幕信息。
在一些实施例中,字幕数据库实际上就是一个素材库,其中的预定字幕段为电影中经典 片段的台词,或者为网路上流行的经典的段子,或者为诗词库等其他任何合适的文字片段,本领域技术人员根据实际情况选择合适的字幕数据库。
当然,实际的应用过程中,该方法还包括构建字幕数据库的步骤,该构建过程能够参考其他的语言数据库的构建过程,此处就不再赘述了。
在一些实施例中,字幕数据库中的预定字幕段可称为字幕段,第一预定阈值可称为第一阈值,目标部分可称为参考文字信息,预定时长可称为第一时长,预定方式可称为目标方式。电子设备基于第一文字信息,生成第一字幕信息,包括:电子设备对比第一文字信息与字幕数据库中的多个字幕段;在第一文字信息与参考文字信息的相似度大于第一阈值的情况下,从多个字幕段中选取目标字幕段;基于目标字幕段和正在编辑的视频,确定第一字幕信息。其中,字幕数据库包括多个字幕段,目标字幕段为多个字幕段中包括参考文字信息的任一字幕段,第一文字信息对应的第一时长等于参考文字信息对应的第二时长,第一时长为第一文字信息按照目标方式显示的时长,第二时长为参考文字信息按照目标方式显示的时长。目标方式为字幕信息的显示方式,例如,目标方式是指每个字的显示时长固定,基于字幕信息包括的字数确定字幕信息显示的时长,或者目标方式为其他方式,本公开对目标方式不做限制。
在一些实施例中,对于每个字幕段,电子设备基于第一文字信息,确定该字幕段中按照目标方式显示的时长与第一时长相同的文字信息,确定的文字信息即为参考文字信息,之后确定第一文字信息与该参考文字信息的相似度。其中,一个字幕段包括一个或多个参考文字信息。
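In some embodiments, the first text information is compared with the plurality of subtitle segments when the first duration is less than the duration of the video. When the first duration is greater than the duration of the video being edited, the first text information is determined as the first subtitle information, which already meets the subtitle needs of the video, so there is no need to compare the first text information with the subtitle segments in the subtitle database.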
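In some embodiments, the "at least a part" above may be referred to as the target text information. Whether the complete target subtitle segment or only part of its text is used as the first subtitle information depends on the relationship between the duration of the video and the duration for which the target subtitle segment is displayed in the target manner. When the two durations are equal, the target subtitle segment is determined as the first subtitle information. For example, if the video lasts 5 seconds and the target subtitle segment displayed in the target manner also lasts 5 seconds, the target subtitle segment is used directly as the first subtitle information. When the duration of the video is greater than the display duration of the target subtitle segment, the target subtitle segment is determined as the first subtitle information, and the target manner is then adjusted so that the segment, displayed in the adjusted manner, lasts as long as the video.
When the duration of the video is less than the duration for which the target subtitle segment is displayed in the target manner, target text information is selected from the target subtitle segment and determined as the first subtitle information. The target text information includes the reference text information, and its display duration in the target manner equals the duration of the video. For example, suppose the video lasts 4 seconds, the target subtitle segment is "Once, a sincere love was placed before me, and I did not cherish it; only when I lost it did I regret it, and nothing in the world is more painful than that. If heaven could give me one more chance, I would say three words to that girl: I love you. If this love had to have a time limit, I hope it would be ten thousand years", which corresponds to 10 seconds, and the reference text information is "Once, a sincere love was placed before me, and I did not cherish it". Because the video is shorter than the target subtitle segment, target text information that includes the reference text information and corresponds to 4 seconds is selected from the segment, namely "Once, a sincere love was placed before me, and I did not cherish it; only when I lost it did I regret it, and nothing in the world is more painful than that".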
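The selection logic described above can be sketched in Python under some stated assumptions. The disclosure does not fix the target manner or the similarity measure, so this sketch assumes that every character is displayed for 200 ms and uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity score. The function names are hypothetical.

```python
from difflib import SequenceMatcher

MS_PER_CHAR = 200  # assumed target manner: each character is shown for 200 ms


def display_ms(text):
    """Display duration of `text` under the assumed per-character manner."""
    return len(text) * MS_PER_CHAR


def find_target_segment(first_text, segments, threshold=0.9):
    """Return (segment, start) for the first segment that contains a reference
    text similar enough to `first_text`, or None if there is no such segment."""
    n = len(first_text)
    if n == 0:
        return None
    for segment in segments:
        # A reference text has the same display duration as first_text, which
        # under this manner means a window with the same number of characters.
        for start in range(len(segment) - n + 1):
            window = segment[start:start + n]
            if SequenceMatcher(None, first_text, window).ratio() > threshold:
                return segment, start
    return None


def choose_subtitle(first_text, segments, video_ms, threshold=0.9):
    """Pick the first subtitle information for a video lasting `video_ms`."""
    if display_ms(first_text) >= video_ms:
        return first_text  # the copied text already covers the video
    match = find_target_segment(first_text, segments, threshold)
    if match is None:
        return first_text  # nothing to complete from; keep the copied text
    segment, start = match
    if video_ms >= display_ms(segment):
        return segment  # the display manner may then be stretched to fit
    # The video is shorter than the segment: cut a part that contains the
    # reference text and lasts as long as the video.
    chars = video_ms // MS_PER_CHAR
    begin = min(start, len(segment) - chars)
    return segment[begin:begin + chars]


print(choose_subtitle("abc", ["abcdefghij"], 1000))
```

Sliding a fixed-size window over each segment is the simplest way to enumerate candidate reference texts; a production system would likely index the database instead of scanning it linearly.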
为了更加精确地在第一字幕区域中显示第一字幕信息,在一些实施例中,在第一字幕区域显示第一字幕信息步骤包括:确定第一字幕子区域的位置信息,位置信息包括长度和起始位置,第一字幕子区域的长度为第一字幕子区域的显示长度,第一字幕子区域的长度表征第一字幕子区域中的第一字幕段对应的视频时长,第一字幕子区域的起始位置表征第一字幕段对应的第一个视频帧图像,根据位置信息就能够确定该第一字幕段对应的视频帧图像为哪些,其中,第一字幕段为根据第一字幕信息划分得到的;根据位置信息,在第一字幕区域显示多个第一字幕子区域,即根据第一字幕子区域的长度和起始位置,显示对应的第一字幕子区域,第一字幕段一一对应地位于第一字幕子区域中,如图5所示。
也就是说,电子设备分别确定第一字幕区域中每个第一字幕子区域的位置信息,基于每个第一字幕子区域的位置信息,在每个第一字幕子区域显示对应的第一字幕段。其中,每个第一字幕子区域中显示一个对应的第一字幕段。
在一些实施例中,视频编辑界面还包括图像区域,图像区域显示有多个视频帧图像,多个视频帧图像沿着预定方向依次排列,多个第一字幕子区域沿着预定方向依次排列,一个第一字幕子区域的位于对应的多个视频帧图像的一侧,多个视频帧图像和其对应的第一字幕子区域对应显示,预定方向为第一字幕子区域的长度方向。第一字幕子区域的起始位置与对应的多个视频帧图像中的第一个视频帧图像的起始位置对齐,即第一字幕子区域的起始位置在视频帧图像区域的投影位于第一个视频帧图像的起始边上。第一字幕子区域的长度与对应的多个视频帧图像的显示长度相同。第一字幕子区域的起始位置在视频帧图像区域的投影位于对应的多个视频帧图像的第一个视频帧图像内,第一字幕子区域的长度小于对应的多个视频帧图像的显示总长度。如图5所示,第一字幕子区域510的一侧五个视频帧图像520就是该第一字幕子区域中的第一字幕段对应的五个视频帧图像530。
在一些实施例中,预定方向可称为目标方向,视频编辑界面中多个视频帧图像和多个第 一字幕子区域的显示方式包括:电子设备在视频编辑界面的图像区域中,沿着目标方向显示多个视频帧图像,该目标方向为第一字幕子区域的长度方向;在第一字幕区域中,沿着目标方向显示多个第一字幕子区域,第一字幕子区域位于对应的多个视频帧图像的一侧。例如,如图5所示,第一字幕子区域的长度方向是从左向右的方向,多个视频帧图像即是从左向右依次显示的,多个第一字幕子区域即是从左向右依次显示的。
当然,实际的应用过程中,第一字幕子区域并不限于图5所示的位于视频帧图像的一侧(以面对屏幕的视角观看图5,以下的上侧、左侧和右侧均以该视角观察),还能够位于视频帧图像的上侧。
需要说明的是,第一字幕子区域的还显示在其他的显示位置,例如,第一字幕子区域位于对应的多个视频帧图像的左侧或者右侧。图像帧显示的区域为主轨道,字幕显示的区域为副轨道,副轨道和主轨道在垂直于预定方向的方向上分布,参见图5所示,图5中的第一字幕区域530为副轨道,帧图像区域540为主轨道。
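To determine the position information of the first subtitle sub-areas more accurately, and thus display the first subtitle segments more accurately, in some embodiments the step of determining the position information further includes: dividing the first subtitle information into a plurality of first subtitle segments, for example one sentence per segment; obtaining the duration of the video being edited; and determining the length of each first subtitle sub-area according to the number of first subtitle segments and the duration. Because the length of a first subtitle sub-area represents its corresponding video duration, the video duration corresponding to the first subtitle segment in each sub-area can be determined from the duration of the video and the number of segments. That video duration gives the number of video frame images corresponding to the segment, and the number of video frame images in turn gives the length of the corresponding first subtitle sub-area.
In some embodiments, the length of a first subtitle sub-area equals the total display length of its corresponding video frame images, or is less than that total display length. The starting point of the first subtitle area is determined as the start position of the first of the first subtitle sub-areas, and the start position of each other first subtitle sub-area is at or after the end position of the preceding first subtitle sub-area, where "after" means after along the length direction of the first subtitle sub-area. The spacing between two adjacent first subtitle segments is determined according to the actual situation: in some embodiments they touch, so the spacing is 0, or they do not touch, so the spacing is greater than 0.
That is, the electronic device determines the starting point of the first subtitle area as the start position of the first of the first subtitle sub-areas, and determines either the end position of any first subtitle sub-area, or a position after that end position, as the start position of the next first subtitle sub-area. In other words, two adjacent first subtitle sub-areas are separated by a target distance. When the target distance is 0, the end position of the current sub-area is the start position of the next one; when the target distance is greater than 0, the start position of the next sub-area is the position that lies after the current sub-area's end position at the target distance from it.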
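A minimal Python sketch of this layout step follows, assuming that the subtitle track is measured in pixels, that its total length stands for the video duration, and that every segment receives an equal share. The names `SubRegion`, `split_segments`, and `layout_sub_regions`, and the punctuation set used for splitting, are illustrative assumptions rather than anything specified in the disclosure.

```python
import re
from dataclasses import dataclass


@dataclass
class SubRegion:
    text: str        # the first subtitle segment shown in this sub-area
    start_px: int    # start position along the length direction
    length_px: int   # display length, standing for the video duration covered


def split_segments(subtitle_text):
    """Split subtitle text into segments at sentence or clause punctuation."""
    parts = re.split(r"[,。!?,.!?]+", subtitle_text)
    return [part.strip() for part in parts if part.strip()]


def layout_sub_regions(segments, track_length_px, gap_px=0):
    """Give every segment an equal share of the track, `gap_px` apart."""
    count = len(segments)
    if count == 0:
        return []
    usable = track_length_px - gap_px * (count - 1)
    if usable < count:
        raise ValueError("track too short for the requested gap")
    length = usable // count
    regions = []
    start = 0  # the first sub-area starts at the start of the subtitle area
    for text in segments:
        regions.append(SubRegion(text, start, length))
        start += length + gap_px  # next start is the previous end plus the gap
    return regions


for region in layout_sub_regions(split_segments("a,b.c"), 300, gap_px=10):
    print(region)
```

With `gap_px=0` each sub-area starts exactly where the previous one ends, which corresponds to the touching case; a positive `gap_px` corresponds to the case where the target distance is greater than 0.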
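To determine the video frame images corresponding to a first subtitle segment more accurately and efficiently, in some embodiments the step of determining the first subtitle information according to the first text information includes: acquiring text content corresponding to the speech content of the video being edited; determining whether the similarity between the first text information and the text content is greater than a second predetermined threshold; and, when it is, determining the text content as the first subtitle information.
In some embodiments, the step of determining the position information of the first subtitle sub-areas includes determining the position information of the first subtitle segments according to the correspondence between the text content and the video frame images. The text content corresponds to the speech content, and the speech content corresponds to the video frame images, so the text content corresponds to the video frame images. This correspondence identifies the video frame images for each part of the text content (each first subtitle segment) more accurately, and hence the position information of each first subtitle sub-area, that is, its start position and length. The start position of a first subtitle sub-area coincides with the start position of its first corresponding frame image or lies after it, and its length equals the total display length of its corresponding frame images or is less than that length.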
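The speech-based variant can be sketched as follows, assuming that a speech recognizer has already produced a transcript as a list of `(text, start_seconds, end_seconds)` tuples. That transcript format, the function name, the default threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are assumptions made for illustration; the disclosure does not specify them.

```python
from difflib import SequenceMatcher


def subtitle_from_transcript(first_text, transcript, fps, threshold=0.8):
    """Return [(segment_text, first_frame, frame_count), ...] when the copied
    text is similar enough to the spoken text, otherwise None.

    `transcript` is a list of (text, start_seconds, end_seconds) tuples.
    """
    spoken = "".join(text for text, _, _ in transcript)
    similarity = SequenceMatcher(None, first_text, spoken).ratio()
    if similarity <= threshold:
        return None  # fall back to another way of generating the subtitle
    placed = []
    for text, start_s, end_s in transcript:
        first_frame = round(start_s * fps)  # start position of the sub-area
        frame_count = max(1, round(end_s * fps) - first_frame)  # its length
        placed.append((text, first_frame, frame_count))
    return placed


transcript = [("hello ", 0.0, 1.0), ("world", 1.0, 2.5)]
print(subtitle_from_transcript("hello world", transcript, fps=10))
```

Here the first frame index plays the role of the sub-area's start position and the frame count plays the role of its length, so each segment lines up with the frames during which it is spoken.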
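In practice, in some embodiments, after the first subtitle information generated based on the first text information is displayed in the first subtitle area of the video editing interface, the method further includes: in response to a third predetermined operation acting on a first target subtitle sub-area, changing the start position of the first target subtitle sub-area from an initial position to a predetermined position. Adjusting the start position adjusts the display position of the first target subtitle sub-area, so that the video duration and the video frame images corresponding to the first subtitle segment in that sub-area can be set more accurately.
The third predetermined operation may be referred to as the third operation, and the predetermined position as the target position. After displaying the first subtitle information in the first subtitle area, the electronic device, in response to the third operation acting on the first target subtitle sub-area, changes the start position of that sub-area to the target position. The target position differs from the original start position. The third operation is a drag operation, a stretch operation, or another operation on the first target subtitle sub-area; the embodiments of the present disclosure do not limit how the third operation is performed.
In some embodiments, after the first subtitle information generated based on the first text information is displayed in the first subtitle area of the video editing interface, the method further includes: in response to a fourth predetermined operation acting on a second target subtitle sub-area, changing the length of the second target subtitle sub-area from an initial length to a predetermined length, so that the number of video frame images corresponding to the second target subtitle sub-area changes from a first predetermined number to a second predetermined number. Adjusting the length in this way adjusts the display position of the second target subtitle sub-area, so that the video duration and the video frame images corresponding to the first subtitle segment in that sub-area can be set more accurately.
The fourth predetermined operation may be referred to as the fourth operation, and the predetermined length as the target length. After displaying the first subtitle information in the first subtitle area, the electronic device, in response to the fourth operation acting on the second target subtitle sub-area, changes the length of that sub-area to the target length. The target length differs from the original length. The fourth operation is a stretch operation or another operation; the embodiments of the present disclosure do not limit how the fourth operation is performed.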
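The two adjustments can be sketched in Python as below. The pixel-based model, the clamping to the track bounds, and the names `SubRegion`, `move_start`, `resize`, and `frame_count` are assumptions for illustration; the disclosure only states that the start position or length changes in response to the operation.

```python
from dataclasses import dataclass


@dataclass
class SubRegion:
    start_px: int   # start position along the track
    length_px: int  # display length along the track


def move_start(region, target_start_px, track_length_px):
    """Third operation (for example a drag): move the start position,
    keeping the whole sub-area inside the track."""
    max_start = max(track_length_px - region.length_px, 0)
    region.start_px = min(max(target_start_px, 0), max_start)
    return region


def resize(region, target_length_px, track_length_px):
    """Fourth operation (for example a stretch): change the length,
    keeping the end of the sub-area inside the track."""
    max_length = track_length_px - region.start_px
    region.length_px = min(max(target_length_px, 1), max_length)
    return region


def frame_count(region, frame_width_px):
    """Number of frame thumbnails the sub-area spans (partial ones count)."""
    return -(-region.length_px // frame_width_px)  # ceiling division


region = SubRegion(start_px=0, length_px=100)
resize(region, 120, track_length_px=300)
print(region, frame_count(region, frame_width_px=50))
```

Recomputing `frame_count` after `resize` reflects the statement that the number of corresponding video frame images changes together with the length.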
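In some embodiments, when the video needs dual subtitles, for example Chinese subtitles and English subtitles, then to load the subtitles better, after the first subtitle information generated based on the first text information is displayed in the first subtitle area of the video editing interface, the method further includes: acquiring second text information in response to a fifth predetermined operation of copying the second text information, that is, the electronic device acquires the second text information in response to a fifth operation that is used to copy the second text information; and displaying, according to the second text information, second subtitle information in a second subtitle area of the video editing interface. The second subtitle information is generated based on the second text information. The second subtitle area is located on either side of the first subtitle area and can be set according to the actual situation. In some embodiments, as shown in FIG. 6, the second subtitle area 610 is located on the side of the first subtitle area 620 away from the video frame image area 630.
It should also be noted that the display process of the second subtitle information can refer to the description of the display process of the first subtitle information and is not repeated here.
In some embodiments, after the subtitles of the video have been loaded, the method further includes: in response to a sixth predetermined operation, playing the video that includes the first subtitle information. That is, the electronic device plays the video including the first subtitle information in response to a sixth operation.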
图7是根据一示例性实施例示出的一种自动加载字幕的方法的流程图,如图7所示,该自动加载字幕的方法应用于电子设备中,包括以下步骤710-步骤730。
在步骤710中,检测是否存在用于复制第一文字信息的第一预定操作。也即是检测是否存在第一操作,该第一操作用于复制第一文字信息。
在步骤720中,在检测到第一预定操作的情况下,获取第一文字信息。
在步骤730中,基于第一文字信息生成第一字幕信息,并在视频编辑界面的字幕区域显示第一字幕信息。
本公开实施例提供的方法中,在检测到第一预定操作的情况下,在在视频编辑界面的字幕区域显示基于第一文字信息生成的第一字幕信息。该方案中,根据复制的文字,在视频编辑界面的第一字幕区域中生成第一字幕信息,提高了编辑字幕的效率。
上述步骤720的实施过程和步骤730的实施过程与上述步骤210和220的实施过程同理,此处就不再赘述了。
图8是根据一示例性实施例示出的一种自动加载字幕的方法的流程图,如图8所示,该自动加载字幕的方法应用于电子设备中,包括以下步骤810-步骤820。
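In step 810, when a target video is edited, first text information is acquired from a memory area used for storing text information, where the memory area stores text information obtained by performing a copy operation (the first predetermined operation) on target text.
Step 810 thus means that, when the target video is edited, the first text information is acquired from the memory area, which stores text information obtained by copying target text. The target video is any video, and the target text is any text that can be copied.
In some embodiments, the text information stored in the memory area is obtained by copying the target text before the target video is edited, or by copying the target text while the target video is being edited. The embodiments of the present disclosure do not limit when the text information is stored in the memory area.
In step 820, first subtitle information is generated based on the first text information and displayed in the first subtitle area of the video editing interface. That is, the first subtitle information corresponding to the first text information is generated and then displayed in the first subtitle area of the video editing interface.
In this embodiment, the first subtitle information is generated from the copied first text information, and the corresponding first subtitle information is displayed in the first subtitle area. Because the first subtitle information is generated in the first subtitle area of the video editing interface from the copied text, the efficiency of editing subtitles is improved.
Steps 810 and 820 are implemented in the same way as steps 210 and 220 above and are not described again here. The difference is that step 810 acquires the stored first text information from the memory area, whereas step 210 acquires the first text information by way of the copy operation performed on it.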
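For illustration, the memory-area variant of steps 810 and 820 could look like the following Python sketch. `MemoryArea` is a hypothetical stand-in for whatever clipboard or buffer the platform provides, and `load_first_subtitle` is likewise an assumed name; the sketch simply uses the stored text as the subtitle, which is only one of the generation options described in this disclosure.

```python
class MemoryArea:
    """Hypothetical stand-in for the memory area holding copied text."""

    def __init__(self):
        self._text = None

    def store(self, text):
        """Called when the copy operation is performed on the target text."""
        self._text = text

    def read(self):
        """Called when the target video is being edited."""
        return self._text


def load_first_subtitle(memory):
    """Step 810: read the stored text. Step 820: derive the subtitle from it."""
    text = memory.read()
    if not text or not text.strip():
        return None  # nothing was copied, so no subtitle is loaded
    return text.strip()


memory = MemoryArea()
memory.store("  first text information  ")
print(load_first_subtitle(memory))
```

Because the text is read when editing starts rather than when it is copied, the copy may happen before or during editing, as the embodiment allows.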
需要说明的是,本公开实施例中的第一预定操作、第二预定操作、第三预定操作、第四预定操作以及第五预定操作均能够为任何可行的操作,例如包括点击操作、滑动操作、长按操作与双击操作中的至少一个的操作。本领域技术人员根据实际情况选择合适的操作或者操作组合来对应于本公开的五个预定操作。
图9是根据一示例性实施例示出的一种自动加载字幕的装置框图。参照图9,该装置包括第一获取单元910和第一显示单元920。
该第一获取单元910被配置为执行响应于用于复制第一文字信息的第一预定操作,获取第一文字信息。也即是第一获取单元910被配置为执行响应于第一操作,获取第一文字信息,第一操作用于复制第一文字信息。
其中,第一预定操作为用于复制第一文字信息的任何单个操作,或者为一系列操作形成的操作组。在该电子设备为个人计算机的情况下,该第一预定操作为键鼠操作,例如,第一预定操作为“Ctrl+C”的键鼠操作。在该电子设备为手机或者PAD等设备时,该第一预定操作为点击操作或者长按操作等。
该第一显示单元920被配置为执行在视频编辑界面的第一字幕区域显示基于第一文字信息生成的第一字幕信息。也即是第一显示单元920被配置为在视频编辑界面的第一字幕区域显示第一字幕信息,第一字幕信息是基于第一文字信息生成的。
上述的方案中,根据复制的文字,在视频编辑界面的第一字幕区域中生成第一字幕信息,提高了编辑字幕的效率。
在实际的应用过程中,在存在多个复制文字信息的预定操作,且复制的多个文字信息中,的部分文字信息是用于视频的字幕的情况下,为了提高字幕的准确率,本公开实施例中,第一获取单元包括第一显示模块和获取模块。
第一显示模块被配置为执行响应于第一预定操作,在视频编辑界面上显示提示信息,该提示信息用于提醒是否根据第一文字信息确定第一字幕信息,即该提示信息用于提醒是否基于第一文字信息生成第一字幕信息。实际的应用过程中,该提示信息为提示弹窗,该提示弹窗显示有“系统检测到您进行了文字复制,是否自动添加为字幕”。当然,并不限于显示有文字的提示弹窗,还为其他在视频编辑器中的规定的提示信息,例如,该提示信息为一个显示有“copy?”的矩形图标。本领域技术人员能够根据实际情况设置合适的提示信息。
获取模块被配置为执行响应于作用在提示信息上的第二预定操作,获取第一文字信息。该第二预定操作为点击操作、长按操作、双击操作和滑动操作中的至少一个,根据实际情况来确定。在接收到在该提示信息上的第二预定操作的情况下,即能够确定根据第一文字信息来确定第一字幕信息,因而,基于该第二预定操作,获取第一文字信息。
也即是获取模块被配置为执行响应于第一操作,在视频编辑界面上显示提示信息,响应于作用在提示信息上的第二操作,获取第一文字信息。
第一显示单元包括确定模块和第二显示模块,其中,确定模块被配置为执行根据第一文字信息确定第一字幕信息;第二显示模块被配置为执行在第一字幕区域显示第一字幕信息。
在实际的应用过程中,在一些实施例中,例如,在第一文字信息按照预定方式显示时的时长和正在编辑的视频的时长一致的情况下,或者,在虽然第一文字信息按照预定方式显示时的时长相对正在编辑的视频的时长较短,但是,该视频仅部分需要字幕的情况中,第一文字信息为第一字幕信息,也即是第一文字信息按照预定方式显示的时长小于正在编辑的视频的时长,且只需对视频中的视频片段添加字幕的情况下,将第一文字信息确定为第一字幕信息。对应地,确定模块被配置为执行确定第一文字信息为第一字幕信息,也即是确定模块被配置为执行将第一文字信息确定为第一字幕信息。
在一些实施例中,第一文字信息不能满足第一字幕信息对应的需求,第一文字信息作为第一字幕信息中的一部分,为了进一步准确、高效且完整地显示第一字幕信息,本确定模块 还包括第一确定子模块、第二确定子模块和第三确定子模块,其中,第一确定子模块被配置为执行确定第一文字信息按照预定方式显示的预定时长是否小于正在编辑的视频的时长;第二确定子模块被配置为执行在预定时长小于视频的时长的情况下,将第一文字信息与字幕数据库中的预定字幕段进行比较,其中,字幕数据库包括多个预定字幕段;第三确定子模块被配置为执行在第一文字信息与目标部分的相似度大于第一预定阈值的情况下,根据正在编辑的视频的长度,确定目标字幕段的包括目标部分的至少部分为第一字幕信息,其中,目标部分为目标字幕段中与第一文字信息的预定时长相同的一部分,目标字幕段为多个预定字幕段中包括目标部分的一个。该方案中,根据第一文字信息自动补全正在编辑的视频对应的第一字幕信息。
例如,第一文字信息为“曾经有一份真诚的爱情放在我面前,我没有珍惜”,字幕数据库为电影中的经典台词片段的数据库,其中的一个预定字幕段为“曾经有一份真诚的爱情放在我面前,我没有珍惜,等我失去的时候我才后悔莫及,人世间最痛苦的事莫过于此,如果上天能够给我一个再来一次的机会,我会对那个女孩子说三个字,我爱你,如果非要在这份爱上加上一个期限,我希望是一万年”,这个预定字幕片段就是目标字幕片段,其中的目标部分为的“曾经有一份真诚的爱情放在我面前,我没有珍惜”,如果第一预定阈值为90%,目标部分和第一文字信息的相似度为100%大于第一预定阈值,因此,能够确定预定字幕段为“曾经有一份真诚的爱情放在我面前,我没有珍惜,等我失去的时候我才后悔莫及,人世间最痛苦的事莫过于此,如果上天能够给我一个再来一次的机会,我会对那个女孩子说三个字,我爱你,如果非要在这份爱上加上一个期限,我希望是一万年”为目标字幕段,后续根据编辑的视频的时长来确定对应的第一字幕信息,首先确定编辑的视频时长与该目标字幕段按照预定方式显示的时长是否相同,如果相同,则确定目标字幕段为第一字幕信息。如果编辑的视频时长大于该目标字幕段按照预定方式显示的时长,则根据该目标字幕段作为第一字幕信息,并根据编辑的视频的时长来调整对应的第一字幕信息的显示方式,例如,一句话对应的显示时长为2s,为了使得第一字幕信息的显示时长和视频的时长相同,则能够将一句话的显示时长增加。如果编辑的视频时长小于该目标字幕段按照预定方式显示的时长,则将目标字幕段的包括目标部分的且时长与编辑的视频时长相同的部分作为第一字幕信息,结合视频的语音信息和目标部分来截取目标字幕段的部分作为第一字幕信息。
在一些实施例中,上述的字幕数据库实际上就是一个素材库,其中的预定字幕段为电影中经典片段的台词,或者为网路上流行的经典的段子,或者为诗词库等其他任何合适的文字片段,本领域技术人员根据实际情况选择合适的字幕数据库。
当然,实际的应用过程中,该方法还包括构建字幕数据库的步骤,该构建过程参考其他 的语言数据库的构建过程,此处就不在赘述了。
也即是第二确定子模块被配置为执行对比第一文字信息与字幕数据库中的多个字幕段;第三确定子模块,被配置为执行在第一文字信息与参考文字信息的相似度大于第一阈值的情况下,从多个字幕段中选取目标字幕段;第三确定子模块,还被配置为执行基于目标字幕段和正在编辑的视频,确定第一字幕信息。其中,字幕数据库包括多个字幕段,目标字幕段为多个字幕段中包括参考文字信息的任一字幕段,第一文字信息对应的第一时长等于参考文字信息对应的第二时长,第一时长为第一文字信息按照目标方式显示的时长,第二时长为参考文字信息按照目标方式显示的时长。
在一些实施例中,第二确定子模块,被配置为执行在第一时长小于视频的时长的情况下,对比第一文字信息与多个字幕段。
在一些实施例中,第三确定子模块,被配置为执行在视频的时长小于目标字幕段按照目标方式显示的时长的情况下,从目标字幕段中选取目标文字信息,将目标文字信息确定为第一字幕信息,目标文字信息包括参考文字信息,且目标文字信息按照目标方式显示的时长等于视频的时长;在视频的时长不小于目标字幕段按照目标方式显示的时长的情况下,将目标字幕段确定为第一字幕信息。
为了更加精确地在第一字幕区域中显示第一字幕信息,在一些实施例中,第二显示模块包括第四确定子模块和第一显示子模块。
其中,第四确定子模块被配置为执行确定第一字幕子区域的位置信息,位置信息包括长度和起始位置,第一字幕子区域的长度为第一字幕子区域的显示长度,第一字幕子区域的长度表征第一字幕子区域中的第一字幕段对应的视频时长,第一字幕子区域的起始位置表征第一字幕段对应的第一个视频帧图像,根据位置信息就能够确定该第一字幕段对应的视频帧图像为哪些,其中,第一字幕段为根据第一字幕信息划分得到的。
第一显示子模块被配置为执行根据位置信息,在第一字幕区域显示多个第一字幕子区域,即根据第一字幕子区域的长度和起始位置,确定第一字幕子区域的长度和起始位置,第一字幕段一一对应地位于第一字幕子区域中,如图5所示。
也即是第四确定子模块被配置为执行分别确定第一字幕区域中每个第一字幕子区域的位置信息,第一显示子模块被配置为执行基于每个第一字幕子区域的位置信息,在每个第一字幕子区域显示对应的第一字幕段。
在一些实施例中,视频编辑界面还包括图像区域,图像区域显示有多个视频帧图像,多个视频帧图像沿着预定方向依次排列,多个第一字幕子区域沿着预定方向依次排列,一个第一字幕子区域的位于对应的多个的视频帧图像的一侧,多个视频帧图像和其对应的第一字幕 子区域对应显示,预定方向为第一字幕子区域的长度方向。第一字幕子区域的起始位置与对应的多个视频帧图像中的第一个视频帧图像的起始位置对齐,即第一字幕子区域的起始位置在视频帧图像区域的投影位于第一个视频帧图像的起始边上。第一字幕子区域的长度与对应的多个视频帧图像的显示长度相同。第一字幕子区域的起始位置在视频帧图像区域的投影位于对应的多个视频帧图像的第一个视频帧图像内,第一字幕子区域的长度小于对应的多个视频帧图像的显示总长度。如图5所示,第一字幕子区域510的一侧五个视频帧图像520就是该第一字幕子区域中的第一字幕段对应的五个视频帧图像520。
也即是第一显示单元,还被配置为执行在视频编辑界面的图像区域中,沿着目标方向显示多个视频帧图像,该目标方向为第一字幕子区域的长度方向;在第一字幕区域中,沿着目标方向显示多个第一字幕子区域,第一字幕子区域位于对应的多个视频帧图像的一侧。
当然,实际的应用过程中,第一字幕子区域并不限于图5所示的位于视频帧图像的一侧(以面对屏幕的视角观看图5,以下的上侧、左侧和右侧均以该视角观察),还能够位于视频帧图像的上侧。
需要说明的是,第一字幕子区域的还能够显示在其他的显示位置,例如,第一字幕子区域位于对应的多个视频帧图像的左侧或者右侧。图像帧显示的区域为主轨道,字幕显示的区域为副轨道,副轨道和主轨道在垂直于预定方向的方向上分布,参见图5所示,图5中的第一字幕区域530为副轨道,帧图像区域540为主轨道。
为了更准确地确定第一字幕子区域的位置信息,从而更加准确地显示第一字幕段,在一些实施例中,第四确定子模块包括划分子模块、第一获取子模块、第五确定子模块和第六确定子模块。
其中,划分子模块被配置为执行将第一字幕信息分为多个第一字幕段,例如,一句话划分为一个第一字幕段。
第一获取子模块被配置为执行获取正在编辑的视频的时长。
第五确定子模块被配置为执行根据第一字幕段的数量和时长,确定每个第一字幕子区域的长度。由于第一字幕子区域的长度表征其对应的视频的时长,所以根据视频的时长和第一字幕段的数量,能够确定每个第一字幕子区域中的第一字幕段对应的视频时长,根据第一字幕段对应的视频时长,即可确定该第一字幕段对应的视频帧图像的数量,从而根据对应的视频帧图像的数量确定对应的第一字幕子区域的长度。
在一些实施例中,第一字幕子区域的长度等于该第一字幕子区域对应的多个视频帧图像的显示总长度,或者第一字幕子区域的长度小于对应的多个视频帧图像的显示总长度。
第六确定子模块被配置为执行确定第一字幕区域的起始点为第一个所述第一字幕子区域 的起始位置,其他的所述第一字幕子区域的所述起始位置在前一个所述第一字幕子区域的终止位置,这里的“之后”是指在第一字幕子区域的长度方向上的之后,具体地,相邻的两个第一字幕段的分布间隔根据实际情况确定,在一些实施例中,相邻的两个第一字幕段接触,即间隔距离为0,或者相邻的两个第一字幕段不接触,间隔距离大于0。
也即是第六确定子模块被配置为执行将第一字幕区域的起始点确定为第一个字幕子区域的起始位置,将任一第一字幕子区域的终止位置确定为下一个第一字幕子区域的起始位置。
为了更准确且高效地确定第一字幕段对应的视频帧图像,在一些实施例中,确定模块包括第二获取子模块、第七确定子模块和第八确定子模块。
其中,第二获取子模块被配置为执行获取正在编辑的视频的语音内容对应的文字内容。
第七确定子模块被配置为执行确定第一文字信息与文字内容的相似度是否大于第二预定阈值。
第八确定子模块被配置为执行在第一文字信息与上述文字内容的相似度大于第二预定阈值的情况下,确定文字内容为第一字幕信息。
第四确定子模块被配置为执行根据文字内容与视频帧图像的对应关系,确定第一字幕段的位置信息,具体地,文字内容与语音内容有对应关系,语音内容与视频帧图像有对应关系,因此,文字内容和视频帧图像具有对应关系,根据该对应关系,更加准确地确定每一部分的文字内容(第一字幕段)对应的视频帧图像,从而更准确地确定第一字幕子区域的位置信息,即确定第一字幕子区域的起始位置和长度,第一字幕子区域的起始位置与对应的第一个帧图像的起始位置对应,或者在第一个帧图像的起始位置之后,第一字幕子区域的长度与对应的多个帧图像的显示总长度相同,或者小于对应的多个帧图像的显示总长度。
在实际的应用过程中,在一些实施例中,装置还包括第一调整单元,第一调整单元被配置为执行在在视频编辑界面的第一字幕区域显示基于第一文字信息生成的第一字幕信息之后,响应于作用在第一目标字幕子区域上的第三预定操作,第一目标字幕子区域的起始位置从初始位置变更至预定位置。通过调整第一目标字幕子区域的起始位置,进而调整第一目标字幕子区域的显示位置,进一步保证调整后的第一目标字幕子区域的显示位置更加准确,从而进一步保证第一目标字幕子区域中的第一字幕段对应的视频时长以及视频帧图像更加准确。
也即是第一调整单元被配置为执行响应于作用在第一目标字幕子区域的第三操作,将第一目标字幕子区域的起始位置变更至目标位置。
在一些实施例中,上述装置还包括第二调整单元,第二调整单元被配置为执行在在视频编辑界面的第一字幕区域显示基于第一文字信息生成的第一字幕信息之后,响应于作用在第 二目标字幕子区域上的第四预定操作,第二目标字幕子区域的长度从初始长度变为预定长度,且第二目标字幕子区域对应的视频帧图像的数量从第一预定数量变为第二预定数量。通过该方式,调整第二目标字幕子区域的显示位置,从而进一步保证调整后的第二目标字幕子区域的显示位置更加准确,从而进一步保证第二目标字幕子区域中的第一字幕段对应的视频时长以及视频帧图像更加准确。
也即是第二调整单元被配置为执行响应于作用在第二目标字幕子区域上的第四操作,将第二目标字幕子区域的长度变为目标长度。
在一些实施例中,在视频需要加载双字幕的情况下,例如在视频需要加载中文字幕和英文字幕的情况下,上述装置还包括第二获取单元和第二显示单元,其中,第二获取单元被配置为执行在在视频编辑界面的第一字幕区域显示基于第一文字信息生成的第一字幕信息之后,响应于复制第二文字信息的第五预定操作,获取第二文字信息,也即是第二获取单元被配置为执行响应于第五操作,获取第二文字信息,该第五操作用于复制第二文字信息;第二显示单元被配置为执行根据第二文字信息,在视频编辑界面的第二字幕区域显示第二字幕信息。其中,第二字幕信息是基于第二文字信息生成的,该第二字幕区域位于第一字幕区域的任一侧,能够根据实际情况进行设置。在一些实施例中,如图6所示,第二字幕区域610位于第一字幕区域620远离视频帧图像620的一侧。
还需要说明的是,本公开实施例中的第二字幕信息的显示过程能够参考关于第一字幕信息的显示过程的描述,此处就不再赘述了。
在一些实施例中,上述装置还包括播放单元,播放单元被配置为执行在视频的字幕加载完之后,响应于第五预定操作,播放包括第一字幕信息的视频。
图10是根据一示例性实施例示出的一种自动加载字幕的装置框图,如图10所示,该自动加载字幕的装置包括检测单元1010、第二获取单元1020和第二显示单元1030。
检测单元1010被配置为执行检测是否存在用于复制第一文字信息的第一预定操作。
第二获取单元1020被配置为执行在检测到第一预定操作的情况下,获取第一文字信息。
第二显示单元1030被配置为执行基于第一文字信息生成第一字幕信息,并在视频编辑界面的字幕区域显示第一字幕信息。
本公开实施例提供的装置中,在检测到第一预定操作的情况下,第一显示单元在视频编辑界面的字幕区域显示基于第一文字信息生成的第一字幕信息。该方案中,根据复制的文字,就能够在视频编辑界面的第一字幕区域中生成第一字幕信息,提高了编辑字幕的效率。
上述的第一获取单元和第一显示单元的实施过程参考上述方案中的描述,此处就不再赘述了。
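FIG. 11 is a block diagram of an apparatus for automatically loading subtitles according to an exemplary embodiment. As shown in FIG. 11, the apparatus is used in an electronic device and includes a third acquisition unit 1110 and a third display unit 1120.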
其中,第三获取单元1110被配置为执行在对目标视频进行编辑时,从用于存储文本信息的内存区域中获取第一文本信息,其中,内存区域用于存储通过对目标文本进行复制操作(第一预定操作)得到的文本信息。
第三显示单元1120被配置为执行基于第一文本信息生成第一字幕信息,并在视频编辑界面的第一字幕区域展示第一字幕信息。
上述的实施例中,根据复制得到的第一文本信息来生成第一字幕信息,并且,将对应的第一字幕信息显示在第一字幕区域中。该方案中,根据复制的文字,在视频编辑界面的第一字幕区域中生成第一字幕信息,提高了编辑字幕的效率。
关于上述实施例中的装置,其中各个模块执行操作的实施方式已经在有关该方法的实施例中进行了详细描述,此处将不做详细阐述说明。
图12是根据一示例性实施例示出的一种用于执行字幕自动加载的电子设备01的框图。
在示例性实施例中,还提供了一种包括可执行指令的存储介质,例如用于存储可执行指令的存储器1210,上述指令可由电子设备01的处理器1220执行以完成上述方法。存储介质可以是非临时性计算机可读存储介质,例如,上述非临时性计算机可读存储介质可以是ROM、随机存取存储器(RAM)、CD-ROM、磁带、软盘和光数据存储设备等。
本公开的实施例还提供了一种电子设备,包括:处理器;用于存储处理器可执行指令的存储器;其中,处理器被配置为执行指令,以实现如下操作:响应于第一操作,获取第一文字信息,第一操作用于复制第一文字信息;在视频编辑界面的第一字幕区域显示第一字幕信息,第一字幕信息是基于第一文字信息生成的。
在一些实施例中,处理器被配置为执行指令,以实现上述方法实施例中的其他实施例提供的自动加载字幕的方法。
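An embodiment of the present disclosure also provides a non-volatile storage medium. When instructions in the non-volatile storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the following operations: in response to a first operation, acquiring first text information, the first operation being used to copy the first text information; and displaying first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.
In some embodiments, when the instructions in the non-volatile storage medium are executed by the processor of the electronic device, the electronic device is enabled to perform the method for automatically loading subtitles provided by the other method embodiments above.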
本公开实施例提供了一种自动加载字幕的方法,包括:响应于用于复制第一文字信息的第一预定操作,获取第一文字信息;在视频编辑界面的第一字幕区域显示基于第一文字信息 生成的第一字幕信息。
在一些实施例中,根据第一文字信息确定第一字幕信息步骤包括:确定第一文字信息按照预定方式显示的预定时长是否小于正在编辑的视频的时长;在预定时长小于视频的时长的情况下,将第一文字信息与字幕数据库中的预定字幕段进行比较,其中,字幕数据库包括多个预定字幕段;在第一文字信息与目标部分的相似度大于第一预定阈值的情况下,根据正在编辑的视频的长度,确定目标字幕段的包括目标部分的至少部分为第一字幕信息,其中,目标部分为目标字幕段中与第一文字信息的预定时长相同的部分,目标字幕段为多个预定字幕段中包括目标部分的一个。
在一些实施例中,在第一字幕区域显示第一字幕信息步骤包括:确定第一字幕子区域的位置信息,位置信息包括长度和起始位置,第一字幕子区域的长度为第一字幕子区域的显示长度,第一字幕子区域的长度表征第一字幕子区域中的第一字幕段对应的视频时长,第一字幕子区域的起始位置表征第一字幕段对应的第一个视频帧图像,第一字幕段为根据第一字幕信息划分得到的;根据位置信息,在第一字幕区域显示多个第一字幕子区域,其中,一个第一字幕子区域中具有一个第一字幕段。
在一些实施例中,视频编辑界面还包括图像区域,图像区域显示有多个视频帧图像,多个视频帧图像沿着预定方向依次排列,多个第一字幕子区域沿着预定方向依次排列,一个第一字幕子区域的位于对应的多个的视频帧图像的一侧,预定方向为第一字幕子区域的长度方向。
在一些实施例中,确定第一字幕子区域的位置信息步骤还包括:将第一字幕信息分为多个第一字幕段;获取正在编辑的视频的时长;根据第一字幕段的数量和时长,确定每个第一字幕子区域的长度;确定第一字幕区域的起始点为第一个第一字幕子区域的起始位置,其他的第一字幕子区域的起始位置在前一个第一字幕子区域的终止位置。
在一些实施例中,根据第一文字信息确定第一字幕信息步骤包括:获取正在编辑的视频的语音内容对应的文字内容;确定第一文字信息与文字内容的相似度是否大于第二预定阈值;在第一文字信息与文字内容的相似度大于第二预定阈值的情况下,确定文字内容为第一字幕信息;确定第一字幕子区域的位置信息步骤包括:根据文字内容与视频帧图像的对应关系,确定第一字幕段的位置信息。
本公开实施例提供了一种自动加载字幕的方法,包括:检测是否存在用于复制第一文字信息的第一预定操作;在检测到第一预定操作的情况下,获取第一文字信息;基于第一文字信息生成第一字幕信息,并在视频编辑界面的字幕区域显示第一字幕信息。
本公开实施例提供了一种自动加载字幕的方法,包括:在对目标视频进行编辑时,从用 于存储文本信息的内存区域中获取第一文本信息,其中,内存区域用于存储通过对目标文本进行复制操作得到的文本信息;基于第一文本信息生成第一字幕信息,并在视频编辑界面的第一字幕区域展示第一字幕信息。
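All embodiments of the present disclosure may be performed alone or in combination with other embodiments, and all are regarded as falling within the scope of protection claimed by the present disclosure.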

Claims (32)

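  1. A method for automatically loading subtitles, comprising:
    in response to a first operation, acquiring first text information, the first operation being used to copy the first text information; and
    displaying first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.
  2. The method according to claim 1, wherein the acquiring first text information in response to a first operation comprises:
    in response to the first operation, displaying prompt information on the video editing interface, the prompt information being used to ask whether to generate the first subtitle information based on the first text information; and
    in response to a second operation acting on the prompt information, acquiring the first text information.
  3. The method according to claim 1, further comprising:
    generating the first subtitle information based on the first text information.
  4. The method according to claim 3, wherein the generating the first subtitle information based on the first text information comprises:
    determining the first text information as the first subtitle information.
  5. The method according to claim 3, wherein the generating the first subtitle information based on the first text information comprises:
    comparing the first text information with a plurality of subtitle segments in a subtitle database;
    when a similarity between the first text information and reference text information is greater than a first threshold, selecting a target subtitle segment from the plurality of subtitle segments, wherein the target subtitle segment is a subtitle segment, among the plurality of subtitle segments, that includes the reference text information, a first duration corresponding to the first text information is equal to a second duration corresponding to the reference text information, the first duration is the duration for which the first text information is displayed in a target manner, and the second duration is the duration for which the reference text information is displayed in the target manner; and
    determining the first subtitle information based on the target subtitle segment and the video being edited.
  6. The method according to claim 5, wherein the determining the first subtitle information based on the target subtitle segment and the video being edited comprises:
    when the duration of the video is less than the duration for which the target subtitle segment is displayed in the target manner, selecting target text information from the target subtitle segment and determining the target text information as the first subtitle information, the target text information including the reference text information, and the duration for which the target text information is displayed in the target manner being equal to the duration of the video; and
    when the duration of the video is not less than the duration for which the target subtitle segment is displayed in the target manner, determining the target subtitle segment as the first subtitle information.
  7. The method according to claim 5, wherein the comparing the first text information with a plurality of subtitle segments in a subtitle database comprises:
    when the first duration is less than the duration of the video, comparing the first text information with the plurality of subtitle segments.
  8. The method according to claim 3, wherein the displaying first subtitle information in a first subtitle area of a video editing interface comprises:
    separately determining position information of each first subtitle sub-area in the first subtitle area, the position information including a length and a start position, the length of the first subtitle sub-area representing the video duration corresponding to a first subtitle segment in the first subtitle sub-area, the start position of the first subtitle sub-area representing the first video frame image corresponding to the first subtitle segment, and the first subtitle segment being obtained by dividing the first subtitle information; and
    displaying, based on the position information of each first subtitle sub-area, the corresponding first subtitle segment in each first subtitle sub-area.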
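  9. The method according to claim 8, further comprising:
    displaying a plurality of video frame images along a target direction in an image area of the video editing interface, the target direction being the length direction of the first subtitle sub-areas; and
    displaying a plurality of the first subtitle sub-areas along the target direction in the first subtitle area, each first subtitle sub-area being located on one side of the corresponding plurality of video frame images.
  10. The method according to claim 8, wherein the separately determining position information of each first subtitle sub-area in the first subtitle area comprises:
    dividing the first subtitle information into a plurality of the first subtitle segments;
    acquiring the duration of the video being edited;
    determining the length of each first subtitle sub-area based on the number of the first subtitle segments and the duration; and
    determining the start point of the first subtitle area as the start position of the first of the first subtitle sub-areas, and determining the end position of any first subtitle sub-area as the start position of the next first subtitle sub-area.
  11. The method according to claim 8, wherein the generating the first subtitle information based on the first text information comprises:
    acquiring text content corresponding to speech content of the video being edited;
    determining a similarity between the first text information and the text content; and
    when the similarity between the first text information and the text content is greater than a second threshold, determining the text content as the first subtitle information;
    and wherein the separately determining position information of each first subtitle sub-area in the first subtitle area comprises:
    determining the position information of each first subtitle sub-area based on a correspondence between the text content and each video frame image.
  12. The method according to any one of claims 1 to 11, further comprising:
    in response to a third operation acting on a first target subtitle sub-area, changing the start position of the first target subtitle sub-area to a target position.
  13. The method according to any one of claims 1 to 11, further comprising:
    in response to a fourth operation acting on a second target subtitle sub-area, changing the length of the second target subtitle sub-area to a target length.
  14. The method according to any one of claims 1 to 11, further comprising:
    in response to a fifth operation, acquiring second text information, the fifth operation being used to copy the second text information; and
    displaying, based on the second text information, second subtitle information in a second subtitle area of the video editing interface, the second subtitle area being located on one side of the first subtitle area, and the second subtitle information being generated based on the second text information.
  15. The method according to any one of claims 1 to 11, wherein the acquiring first text information in response to a first operation comprises:
    in response to the first operation, storing the copied first text information in a memory area; and
    when a target video is being edited, acquiring the first text information from the memory area.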
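  16. An apparatus for automatically loading subtitles, comprising:
    a first acquiring unit, configured to acquire first text information in response to a first operation, the first operation being used to copy the first text information; and
    a first display unit, configured to display first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.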
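  17. An electronic device, comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to implement the following operations:
    in response to a first operation, acquiring first text information, the first operation being used to copy the first text information; and
    displaying first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.
  18. The electronic device according to claim 17, wherein the processor is configured to execute the instructions to implement the following operations:
    in response to the first operation, displaying prompt information on the video editing interface, the prompt information being used to ask whether to generate the first subtitle information based on the first text information; and
    in response to a second operation acting on the prompt information, acquiring the first text information.
  19. The electronic device according to claim 17, wherein the processor is configured to execute the instructions to implement the following operation:
    generating the first subtitle information based on the first text information.
  20. The electronic device according to claim 19, wherein the processor is configured to execute the instructions to implement the following operation:
    determining the first text information as the first subtitle information.
  21. The electronic device according to claim 19, wherein the processor is configured to execute the instructions to implement the following operations:
    comparing the first text information with a plurality of subtitle segments in a subtitle database;
    when a similarity between the first text information and reference text information is greater than a first threshold, selecting a target subtitle segment from the plurality of subtitle segments, wherein the target subtitle segment is any subtitle segment, among the plurality of subtitle segments, that includes the reference text information, a first duration corresponding to the first text information is equal to a second duration corresponding to the reference text information, the first duration is the duration for which the first text information is displayed in a target manner, and the second duration is the duration for which the reference text information is displayed in the target manner; and
    determining the first subtitle information based on the target subtitle segment and the video being edited.
  22. The electronic device according to claim 21, wherein the processor is configured to execute the instructions to implement the following operations:
    when the duration of the video is less than the duration for which the target subtitle segment is displayed in the target manner, selecting target text information from the target subtitle segment and determining the target text information as the first subtitle information, the target text information including the reference text information, and the duration for which the target text information is displayed in the target manner being equal to the duration of the video; and
    when the duration of the video is not less than the duration for which the target subtitle segment is displayed in the target manner, determining the target subtitle segment as the first subtitle information.
  23. The electronic device according to claim 21, wherein the processor is configured to execute the instructions to implement the following operation:
    when the first duration is less than the duration of the video, comparing the first text information with the plurality of subtitle segments.
  24. The electronic device according to claim 21, wherein the processor is configured to execute the instructions to implement the following operations:
    separately determining position information of each first subtitle sub-area in the first subtitle area, the position information including a length and a start position, the length of the first subtitle sub-area representing the video duration corresponding to a first subtitle segment in the first subtitle sub-area, the start position of the first subtitle sub-area representing the first video frame image corresponding to the first subtitle segment, and the first subtitle segment being obtained by dividing the first subtitle information; and
    displaying, based on the position information of each first subtitle sub-area, the corresponding first subtitle segment in each first subtitle sub-area.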
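  25. The electronic device according to claim 24, wherein the processor is configured to execute the instructions to implement the following operations:
    displaying a plurality of video frame images along a target direction in an image area of the video editing interface, the target direction being the length direction of the first subtitle sub-areas; and
    displaying a plurality of the first subtitle sub-areas along the target direction in the first subtitle area, each first subtitle sub-area being located on one side of the corresponding plurality of video frame images.
  26. The electronic device according to claim 24, wherein the processor is configured to execute the instructions to implement the following operations:
    dividing the first subtitle information into a plurality of the first subtitle segments;
    acquiring the duration of the video being edited;
    determining the length of each first subtitle sub-area based on the number of the first subtitle segments and the duration; and
    determining the start point of the first subtitle area as the start position of the first of the first subtitle sub-areas, and determining the end position of any first subtitle sub-area as the start position of the next first subtitle sub-area.
  27. The electronic device according to claim 24, wherein the processor is configured to execute the instructions to implement the following operations:
    acquiring text content corresponding to speech content of the video being edited;
    determining a similarity between the first text information and the text content;
    when the similarity between the first text information and the text content is greater than a second threshold, determining the text content as the first subtitle information; and
    determining the position information of each first subtitle sub-area based on a correspondence between the text content and each video frame image.
  28. The electronic device according to any one of claims 17 to 27, wherein the processor is configured to execute the instructions to implement the following operation:
    in response to a third operation acting on a first target subtitle sub-area, changing the start position of the first target subtitle sub-area to a target position.
  29. The electronic device according to any one of claims 17 to 27, wherein the processor is configured to execute the instructions to implement the following operation:
    in response to a fourth operation acting on a second target subtitle sub-area, changing the length of the second target subtitle sub-area to a target length.
  30. The electronic device according to any one of claims 17 to 27, wherein the processor is configured to execute the instructions to implement the following operations:
    in response to a fifth operation, acquiring second text information, the fifth operation being used to copy the second text information; and
    displaying, based on the second text information, second subtitle information in a second subtitle area of the video editing interface, the second subtitle area being located on one side of the first subtitle area, and the second subtitle information being generated based on the second text information.
  31. The electronic device according to any one of claims 17 to 27, wherein the processor is configured to execute the instructions to implement the following operations:
    in response to the first operation, storing the copied first text information in a memory area; and
    when a target video is being edited, acquiring the first text information from the memory area.
  32. A non-volatile storage medium, wherein instructions in the non-volatile storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the following operations:
    in response to a first operation, acquiring first text information, the first operation being used to copy the first text information; and
    displaying first subtitle information in a first subtitle area of a video editing interface, the first subtitle information being generated based on the first text information.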
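
As an illustration of the adjustment operations recited in claims 12 and 13 (changing the start position or the length of a subtitle sub-area), the following sketch models a sub-area as an immutable record. The names `SubArea`, `move_sub_area` and `resize_sub_area` are hypothetical, and clamping the result so that the sub-area stays inside the video is an assumption that the claims do not state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SubArea:
    text: str      # subtitle segment shown in this sub-area
    start: float   # start position, in seconds
    length: float  # display length, in seconds


def move_sub_area(area, target_start, video_seconds):
    """Change the start position (third operation), keeping the length."""
    # Assumed behaviour: clamp so the sub-area stays within the video.
    start = max(0.0, min(target_start, video_seconds - area.length))
    return replace(area, start=start)


def resize_sub_area(area, target_length, video_seconds):
    """Change the length (fourth operation), keeping the start position."""
    # Assumed behaviour: the sub-area may not extend past the end of the video.
    length = max(0.0, min(target_length, video_seconds - area.start))
    return replace(area, length=length)
```

Returning a new record instead of mutating the old one makes it straightforward to support undo in an editing interface, although that design choice is likewise only a suggestion.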
PCT/CN2021/107903 2020-11-27 2021-07-22 Method for automatically loading subtitles and electronic device WO2022110844A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011367465.2 2020-11-27
CN202011367465.2A CN112988005B (zh) 2020-11-27 2020-11-27 Method for automatically loading subtitles

Publications (1)

Publication Number Publication Date
WO2022110844A1 true WO2022110844A1 (zh) 2022-06-02

Family

ID=76344834

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/107903 WO2022110844A1 (zh) 2020-11-27 2021-07-22 Method for automatically loading subtitles and electronic device

Country Status (2)

Country Link
CN (1) CN112988005B (zh)
WO (1) WO2022110844A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112988005B (zh) * 2020-11-27 2023-02-28 北京达佳互联信息技术有限公司 Method for automatically loading subtitles

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102752552A (zh) * 2011-09-23 2012-10-24 新奥特(北京)视频技术有限公司 Method and system for adding subtitles to a manuscript
EP2662836A1 (en) * 2012-05-09 2013-11-13 NI Group Limited A method of publishing digital content
CN111565330A (zh) * 2020-07-13 2020-08-21 北京美摄网络科技有限公司 Method and apparatus for adding synchronized subtitles, electronic device, and storage medium
CN112988005A (zh) * 2020-11-27 2021-06-18 北京达佳互联信息技术有限公司 Method for automatically loading subtitles

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
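CN105657395A (zh) * 2015-08-17 2016-06-08 乐视致新电子科技(天津)有限公司 Subtitle playback method and apparatus for 3D video
CN106303659A (zh) * 2016-08-22 2017-01-04 暴风集团股份有限公司 Method and system for loading image-and-text subtitles in a player
CN106775336A (zh) * 2016-11-30 2017-05-31 努比亚技术有限公司 Content copy processing method, apparatus, and terminal
CN107170452A (zh) * 2017-04-27 2017-09-15 广东小天才科技有限公司 Method and apparatus for joining an electronic conference
CN107992210A (zh) * 2017-10-11 2018-05-04 捷开通讯(深圳)有限公司 Input method vocabulary recommendation method, smart terminal, and apparatus with storage function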

Also Published As

Publication number Publication date
CN112988005A (zh) 2021-06-18
CN112988005B (zh) 2023-02-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21896351

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13/09/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21896351

Country of ref document: EP

Kind code of ref document: A1