WO2022042593A1 - Subtitle editing method, apparatus, and electronic device - Google Patents

Subtitle editing method, apparatus, and electronic device

Info

Publication number
WO2022042593A1
Authority
WO
WIPO (PCT)
Prior art keywords
subtitle
editing
video
subtitle text
area
Prior art date
Application number
PCT/CN2021/114504
Other languages
English (en)
French (fr)
Inventor
欧桐桐
黄嘉豪
张倓
席有方
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Priority to US 18/023,711 (published as US20230308730A1)
Publication of WO2022042593A1


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/166Editing, e.g. inserting or deleting
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/34Indicating arrangements 
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43074Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of additional data with content streams on the same device, e.g. of EPG data or interactive icon with a TV program
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/8126Monomedia components thereof involving additional data, e.g. news, sports, stocks, weather forecasts
    • H04N21/8133Monomedia components thereof involving additional data, e.g. news, sports, stocks, weather forecasts specifically related to the content, e.g. biography of the actors in a movie, detailed information about an article seen in a video program
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/131Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/166Editing, e.g. inserting or deleting
    • G06F40/169Annotation, e.g. comment data or footnotes

Definitions

  • the present disclosure relates to the field of Internet technologies, and in particular, to a subtitle editing method, apparatus, and electronic device.
  • editing subtitles may involve adjusting the text, adjusting the timeline, etc.
  • an embodiment of the present disclosure provides a method for editing subtitles.
  • the method includes: displaying a video playing area and a subtitle editing area, wherein the video playing area is used for playing a target video, and the subtitle editing area is used for editing the candidate subtitles corresponding to the target video; and displaying, in linkage, the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area.
  • an embodiment of the present disclosure provides a method for editing subtitles.
  • the method includes: acquiring candidate subtitles, wherein the candidate subtitles include at least one subtitle text item, the subtitle text item is bound to a video time period, and the video time period bound to the subtitle text item is used for the linked display of subtitles and video; and splitting or merging the subtitle text item according to a unit editing operation for the subtitle text item, and binding a video time period to the newly generated subtitle text item.
  • an embodiment of the present disclosure provides a method for editing subtitles, the method comprising: displaying a video playback interface, wherein the video playback interface is used to play a video and display subtitles corresponding to video frames; and, in response to detecting a triggering operation for the subtitles, displaying a video playing area and a subtitle editing area, wherein the subtitle editing area is used for editing subtitles, and the video playing area is used for playing the target video.
  • an embodiment of the present disclosure provides a subtitle editing apparatus, including: a first display unit, configured to display a video playing area and a subtitle editing area, wherein the video playing area is used to play a target video, and the subtitle editing area is used to edit the candidate subtitles corresponding to the target video; and a second display unit, configured to display, in linkage, the video frames currently displayed in the video playing area and the subtitles displayed in the subtitle editing area.
  • an embodiment of the present disclosure provides a subtitle editing device, comprising: an acquisition unit, configured to acquire candidate subtitles, wherein the candidate subtitles include at least one subtitle text item, the subtitle text item is bound to a video time period, and the video time period bound to the subtitle text item is used for the linked display of the subtitle and the video; and a binding unit, configured to split or merge the subtitle text item according to a unit editing operation for the subtitle text item, and bind a video time period to the newly generated subtitle text item.
  • an embodiment of the present disclosure provides a subtitle editing device, including: a third display unit, configured to display a video playback interface, wherein the video playback interface is used to play a video and display subtitles corresponding to video frames; and a fourth display unit, configured to display a video playing area and a subtitle editing area in response to detecting a trigger operation for the subtitles, wherein the subtitle editing area is used for editing subtitles, and the video playing area is used for playing a target video.
  • embodiments of the present disclosure provide an electronic device, including: one or more processors; and a storage device for storing one or more programs, which, when executed by the one or more processors, cause the one or more processors to implement the subtitle editing method as described in the first aspect or the second aspect.
  • an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the steps of the subtitle editing method according to the first aspect or the second aspect.
  • the subtitle editing method, device, and electronic device provided by the embodiments of the present disclosure display a video playing area and a subtitle editing area, and display the video frame currently shown in the video playing area in linkage with the subtitles shown in the subtitle editing area. This provides a new way of editing subtitles: the user can compare the playing target video against the candidate subtitles to determine whether the candidate subtitles are wrong, which improves the convenience of editing subtitles against the video content, improves the user's subtitle editing efficiency, and improves the accuracy of the target video's subtitles.
  • FIG. 1 is a flowchart of one embodiment of a subtitle editing method according to the present disclosure
  • FIG. 2 is a schematic diagram of an application scenario of the subtitle editing method according to the present disclosure
  • FIG. 3 is a schematic diagram of an application scenario of the subtitle editing method according to the present disclosure.
  • FIGS. 4A and 4B are schematic diagrams of another application scenario of the subtitle editing method according to the present disclosure.
  • FIGS. 5A and 5B are schematic diagrams of another application scenario of the subtitle editing method of the present disclosure.
  • FIG. 6 is a flowchart of yet another embodiment of a subtitle editing method according to the present disclosure.
  • FIG. 7 is a flowchart of yet another embodiment of a subtitle editing method according to the present disclosure.
  • FIG. 8 is a schematic structural diagram of an embodiment of a subtitle editing apparatus according to the present disclosure.
  • FIG. 9 is a schematic structural diagram of an embodiment of a subtitle editing apparatus according to the present disclosure.
  • FIG. 10 is a schematic structural diagram of an embodiment of a subtitle editing apparatus according to the present disclosure.
  • FIG. 11 is an exemplary system architecture to which the subtitle editing method of one embodiment of the present disclosure may be applied;
  • FIG. 12 is a schematic diagram of a basic structure of an electronic device provided according to an embodiment of the present disclosure.
  • the term “including” and variations thereof are open-ended inclusions, i.e., “including but not limited to”.
  • the term “based on” is “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • FIG. 1 shows a flow of an embodiment of a subtitle editing method according to the present disclosure.
  • the subtitle editing method is applied to a terminal device.
  • the subtitle editing method includes the following steps:
  • Step 101, displaying the video playing area and the subtitle editing area.
  • the execution body (for example, a terminal device) of the subtitle editing method can display the video playing area and the subtitle editing area.
  • the above-mentioned video playing area can be used to play the target video.
  • the above-mentioned subtitle editing area can be used for editing candidate subtitles.
  • the subtitle editing area may display the candidate subtitles, and may display the edited and modified subtitles in response to user editing operations on the candidate subtitles.
  • the word "target" in "target video" is added for convenience of description in the present disclosure, and is not a limitation on the video.
  • the target video can be any video.
  • the above target video may correspond to candidate subtitles.
  • the candidate subtitles can be subtitles of the target video.
  • the subtitle of the target video may be the text corresponding to the voice bound to the target video.
  • the subtitle editing area can also serve as a subtitle display area, allowing the user to check whether the video currently played in the video playing area corresponds to the subtitles currently shown in the subtitle editing area.
  • there may be no overlap between the subtitle editing area and the video playing area; the two may partially overlap; or one may be set within the other (for example, the subtitle editing area may be set within the video playing area).
  • the subtitle editing area is capable of displaying at least two subtitle text items.
  • the above-mentioned predefined subtitle editing initiation operation may be a predefined operation for initiating batch editing.
  • the specific operation method of the subtitle editing initiating operation can be set according to the actual application scenario, and is not limited here.
  • Step 102, displaying the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area in linkage.
  • the above-mentioned execution body may display, in linkage, the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area.
  • the above-mentioned linked presentation may include synchronous presentation of video frames and subtitles that have a corresponding relationship.
  • the subtitles displayed in the subtitle editing area may be adjusted as the video frames displayed in the video playback area change.
  • the currently displayed video frame may change due to normal playback of the video, or due to skipping within the video.
  • each subtitle in the candidate subtitles may be traversed, and each subtitle has a start time point and an end time point.
  • the interval between the start time point and the end time point is the video time period. If the video time point of the currently displayed video frame falls within the video time period of a subtitle, that subtitle is displayed accordingly. If the video time point of the currently displayed video frame is not within the video time period of any subtitle, there are no editable subtitles at the current playback progress, and the subtitle editing area is not adjusted.
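  • as an illustrative, non-limiting sketch of this lookup (the names SubtitleItem and find_subtitle_for_time are hypothetical and not part of this disclosure):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SubtitleItem:
    text: str     # subtitle text of this subtitle text item
    start: float  # start time point of the bound video time period, in seconds
    end: float    # end time point of the bound video time period, in seconds

def find_subtitle_for_time(items: list[SubtitleItem], t: float) -> Optional[SubtitleItem]:
    """Traverse the candidate subtitles and return the item whose bound
    video time period contains video time point t, if any."""
    for item in items:
        if item.start <= t < item.end:
            return item
    return None  # no editable subtitle at the current playback progress
```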
  • the video frames displayed in the video playback area may be adjusted as the subtitles displayed in the subtitle editing area change.
  • the highlighted subtitle in the subtitle editing area may be changed in response to detecting the user's selection operation.
  • the video playback progress in the video playing area can be positioned to the start time of the selected subtitle, so as to synchronize the subtitles with the playback progress.
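  • continuing the hypothetical sketch above (reusing SubtitleItem), the reverse direction of the linkage — a subtitle selection driving the playback position — might look like this; the VideoPlayer class and its seek method are assumptions standing in for whatever playback component is actually used:

```python
class VideoPlayer:
    """Minimal stand-in for the video playing area."""
    def __init__(self) -> None:
        self.position = 0.0  # current playback progress, in seconds

    def seek(self, t: float) -> None:
        self.position = t

def on_subtitle_selected(player: VideoPlayer, item: SubtitleItem) -> None:
    # Position the playback progress at the start time of the selected
    # subtitle, synchronizing the video with the subtitle editing area.
    player.seek(item.start)
```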
  • the subtitle editing method shown in this embodiment displays a video playing area and a subtitle editing area, and displays the video frame currently shown in the video playing area in linkage with the subtitles shown in the subtitle editing area. This provides a new way of editing subtitles: the user can compare the playing target video against the candidate subtitles to determine whether the candidate subtitles are wrong, which improves the convenience of editing subtitles against the video content, improves the user's subtitle editing efficiency, and improves the accuracy of the target video's subtitles.
  • the method further includes: using a predefined progress indication manner, in the subtitle editing area, indicating the subtitle corresponding to the video frame currently being played in the video playing area.
  • the subtitle corresponding to the video frame currently being played may be determined in the following manner: determine whether the current playback progress (for example, the video time point played to) falls within the video time period bound to a subtitle text item; if so, the subtitle text item corresponding to that video time period is determined as the subtitle text item corresponding to the currently playing video frame; if not, the current playback progress has no corresponding subtitle, and accordingly the predefined progress indication manner may not be displayed.
  • the progress indication manner may include highlighting color, adding underline, changing font, displaying in a preset designated area, and the like.
  • the progress indication manner may be display in a preset designated area: the subtitle text item corresponding to the currently playing video can be scrolled into that preset designated area.
  • the user can be reminded of the subtitle corresponding to the currently playing video frame, which is convenient for the user to check whether the subtitle matches the video frame, and improves the accuracy of the subtitle of the target video.
  • FIG. 2 and FIG. 3 illustrate exemplary application scenarios of the subtitle editing method according to the present disclosure.
  • the terminal device can play the target video on the screen; as an example, the currently playing video frame in the target video includes an anchor image.
  • the terminal device may also display subtitles 201 corresponding to the currently playing video frame.
  • the user can click on the subtitle 201, and this click operation can be understood as a subtitle editing initiation operation.
  • the terminal device can display a video playing area 301 and a subtitle editing area 302 in response to the user clicking on the subtitle 201 .
  • the target video can be played in the video playing area 301.
  • the subtitle editing area 302 may display subtitle candidates corresponding to the target video, and may respond to subtitle editing operations for the candidate subtitles.
  • the candidate subtitles may include the candidate subtitle text "The cat is an alien who came to Earth. Because of its cute appearance, it gains the trust of humans and becomes one of the few friends who interact with humans on an equal footing. Cats can't drink milk. Because cats do not have the enzymes to digest milk. Cats will have diarrhea if they drink too much milk. Occasionally drinking two small sips is fine."
  • the subtitle corresponding to the currently playing video frame is "The cat is an alien who came to Earth", and "The cat is an alien who came to Earth" can be displayed in a larger font and underlined as an indication.
  • the larger font and the underlined manner in FIG. 3 can be understood as a predefined progress indication manner.
  • the method further comprises: in response to a selection operation for a subtitle in the subtitle editing area, displaying a video frame corresponding to a subtitle text item in the video playback area.
  • the above-mentioned executive body may display the video frame corresponding to the subtitle in response to the selection operation on the subtitle.
  • the selection operation for a subtitle text item may be performed in the subtitle editing area, and the video frame corresponding to the subtitle text item may be displayed in the video playing area.
  • when the user selects a subtitle in the subtitle editing area, the video playing area can display the video frame corresponding to that subtitle. Specifically, if the video frame displayed in the video playing area before the selection operation differs from the video frame corresponding to the selected subtitle, the video playing area can jump directly to the video frame corresponding to the selected subtitle.
  • the selection operation can be performed in the video pause state or the video playback state.
  • the user can switch the video frames displayed in the video playback area by selecting subtitles, which enables the user to conveniently view the video frames corresponding to the selected subtitles when editing subtitles in batches, thereby improving the subtitle editing efficiency.
  • the method may further include: playing audio corresponding to the video frame currently displayed in the video playing area.
  • when the video is played, the corresponding audio can be played.
  • the user can refer to the real sound to determine whether the subtitles are correct. Therefore, it is convenient for the user to edit the subtitles, and the accuracy of the subtitles is improved.
  • the above method may further include displaying target video indication information.
  • the target video indication information may be used to indicate the target video.
  • the specific form of the target video indication information may be set and determined according to an actual application scenario, and is not limited herein.
  • the target video indication information may be the target video itself, the cover of the target video, or the name of the target video.
  • displaying the target video indication information may include, but is not limited to, at least one of the following: displaying the name of the target video, and playing the target video.
  • the above-mentioned subtitle editing initiation operation may be performed in a video playback state, or may be performed in a video pause state.
  • the method further includes: playing the target video in the video playing interface, and in response to detecting the predefined subtitle editing initiation operation, displaying the subtitle editing area.
  • the terminal device can play the target video, and in the played video, candidate subtitles can be displayed. Users can initiate operations by editing subtitles while playing the video. The terminal device can then display the video playback area and the subtitle editing area.
  • the terminal responds to the user's operation and enters the subtitle editing mode (displaying the video playing area and the subtitle editing area), which enables the user who wants to edit subtitles while watching the video to quickly enter the subtitle editing mode, thereby shortening the time from when the user decides to edit the subtitles to when subtitle editing starts, and improving the overall speed of subtitle editing.
  • the above-mentioned subtitle editing initiation operation may include a triggering operation for candidate subtitles displayed on the video playback interface.
  • the presented subtitles may be candidate subtitles presented in the target video.
  • the subtitle 201 in FIG. 2 can be understood as a displayed candidate subtitle.
  • detecting the trigger operation for the subtitle may include: identifying the touch point position of an operation, and when the touch point position is located within the display area of the subtitle, determining that a trigger operation for the subtitle is detected.
  • an editing control associated with the subtitle can also be provided, and the user's triggering of this editing control can likewise serve as a trigger operation for the subtitle.
  • in this way, the user's intention to edit the subtitles can be effectively captured, so that the user who wants to edit subtitles while watching the video can quickly enter subtitle editing, improving the speed from when the user decides to edit the subtitles to when subtitle editing starts.
  • the above method may further include: determining the subtitles triggered in the video playback interface as subtitles to be edited in the subtitle editing area.
  • the subtitle text of the subtitle 201 may be determined as the subtitle to be edited in the subtitle editing area.
  • the above-mentioned subtitle editing initiating operation may include: a triggering operation of a preset subtitle editing initiating control.
  • the preset subtitle editing initiating control may refer to a control for initiating subtitle editing.
  • the specific display form and display position of the subtitle editing initiation control can be set according to the actual application scenario, which is not limited here.
  • a control marked with the words "edit subtitles" may be set on the video playback screen as a subtitle editing initiation control.
  • the entry for initiating subtitle editing can be effectively prompted, and the time for the user to search for the subtitle editing entry can be reduced.
  • the above method further includes: in the playback state of the target video, according to the detected subtitle browsing operation for the subtitle editing area, using a free browsing mode to display the subtitles in the subtitle editing area.
  • the above-mentioned subtitle browsing operation may be used to trigger the subtitles (or subtitle text items) displayed in the free browsing mode in the subtitle editing area.
  • the specific implementation manner of the subtitle browsing operation can be set according to the actual application scenario, and is not limited here.
  • the above-mentioned subtitle browsing operation may be a page turning operation, or may be a sliding operation in the subtitle editing area.
  • the free browsing mode may include a mode in which the user can browse subtitles in the subtitle editing area, and no subtitle text item is selected.
  • the display effect of this mode is analogous to scrolling a document with the mouse wheel, where the document is displayed according to the user's operation.
  • in the playback state of the target video, the user can freely browse the subtitle text in the subtitle editing area, which makes it convenient to view the not-yet-displayed part of the candidate subtitles. Therefore, when the size of the subtitle editing area is relatively fixed, the displayed content can be updated in time according to user operations, improving information display efficiency, facilitating viewing, and improving subtitle editing efficiency.
  • the subtitle editing area described above is capable of displaying at least two subtitle text items of the candidate subtitles.
  • the subtitle text item is tied to the video time period. And, within the bound video time period, the voice indicated by the subtitle text item is played synchronously with the video frame displayed in the video playback area.
  • a candidate subtitle may include one or at least two subtitle text items.
  • the subtitle text item can be understood as the measurement unit of subtitle display.
  • a subtitle text item can be understood as one subtitle; a subtitle may include one or more characters; and each subtitle is usually divided according to the semantic relationships between the characters.
  • the above-mentioned subtitle editing area can display at least two subtitle text items, and the user can edit the subtitle text items in batches to improve the operation efficiency.
  • the candidate subtitles are obtained based on speech recognition of the speech corresponding to the target video.
  • the candidate subtitles may include subtitle text items, and the subtitle text items are bound with the video time period.
  • voice recognition is performed on the voice of the target video, and the text corresponding to the voice of the target video can be obtained.
  • the text (e.g., characters, words, or sentences) can be bound to each segment of speech data.
  • each recognized sentence can be automatically bound to a video time period.
  • the subtitle text item obtained by recognizing the speech within a target video time period can be bound to that target video time period.
  • the target video time period may be any time period of the target video.
  • the target in the target video time period is added for convenience of narration and does not constitute a restriction on the video time period.
  • subtitle text items can be played within the bound video time period.
  • the subtitle text item "Meow is an alien who came to earth” can be bound to the video time period from the moment when the video starts (00:00) to the 10th second (00:10) of the video .
  • the subtitle text item "Meow is an alien who came to Earth” can be displayed during the video time period 00:00-00:10.
  • each word in "Meow is an alien who came to the earth” can be displayed together; it can also be displayed in several parts. Earth's", finally showing "Aliens”.
  • obtaining the candidate subtitles through speech recognition and binding the video time period to the subtitle text items in the candidate subtitles can improve the speed and accuracy of binding the subtitle text items to the video time period.
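  • as an illustrative sketch (not part of this disclosure) of how recognized sentences might be bound to video time periods, reusing the hypothetical SubtitleItem class above; the shape of the recognition result here is an assumption:

```python
# Hypothetical speech-recognition output: one (text, start, end) tuple per sentence.
recognition_result = [
    ("The cat is an alien who came to Earth", 0.0, 10.0),
    ("Because of its cute appearance, it gains the trust of humans", 10.0, 22.0),
]

# Build the candidate subtitles: each recognized sentence becomes one subtitle
# text item automatically bound to its video time period.
candidate_subtitles = [
    SubtitleItem(text=text, start=start, end=end)
    for (text, start, end) in recognition_result
]
```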
  • the method further includes: in response to determining that the target video is in a paused state, enabling a response function of the subtitle editing area to a subtitle editing operation.
  • the response function of the subtitle editing area to the editing operation may include that the subtitle editing area can detect the subtitle editing operation, and update the subtitles displayed in the subtitle editing area according to the detected subtitle editing operation.
  • if video playback and subtitle editing are performed at the same time, the rapid progress of the video played in the video playing area may make it difficult for the user to pay attention both to the subtitle being edited and to the subtitle corresponding to the video frame being played. As a result, the user may miss the subtitles corresponding to some video frames and need to replay the video to check them, which results in more user operations and lower accuracy.
  • the manner of pausing the target video may include a triggering operation for the subtitle text item in the subtitle editing area.
  • the triggering operation for the subtitle text item in the subtitle editing area may include the above-mentioned selection operation for the subtitle text item in the subtitle editing area.
  • the subtitle text item is bound to the subtitle editing sub-area, and is bound to the playback time period.
  • the subtitle text in the bound subtitle text item is displayed within the bound play time period (in the video play area).
  • pausing the target video can speed up the process of the user entering the subtitle editing, thereby improving the speed of the user editing subtitles.
  • the manner of pausing the target video may include: a preset triggering operation for the video playing area.
  • the above-mentioned triggering operation for the video playing area may include an operation performed at a position within the video playing area.
  • the preset trigger operation for the video playback area may include a click operation in the video playback area, and in response to the click operation, the video playback may be paused.
  • the method further includes: updating the candidate subtitles displayed in the subtitle editing area according to the subtitle editing operation in the subtitle editing area.
  • the subtitle editing operation may include a text editing operation for subtitle text and a unit editing operation for subtitle text items.
  • the unit editing operation of subtitle text items may include modifying the relationship between subtitle text items, such as splitting subtitle text items or merging subtitle text items.
  • updating the candidate subtitles displayed in the subtitle editing area can display the effect after editing in time when the user edits the subtitle, so as to facilitate the user to determine whether the editing is correct. Thereby, the efficiency of subtitle editing by the user can be improved.
  • subtitle editing operations may include text editing operations.
  • the method further includes: in response to detecting a text editing operation on the subtitle text, updating the subtitle text in the subtitle editing area, and maintaining the time period bound to the subtitle text item targeted by the text editing operation unchanged.
  • when a subtitle text item is modified by a text editing operation, the subtitle text in the subtitle text item changes, but the video time period bound to the subtitle text item does not change.
  • the video time period bound to the subtitle text item can be guaranteed to be relatively stable, and the consistency between the subtitle text item and the corresponding video frame can be ensured.
  • sentence segmentation between subtitle text items is accurate, that is, words in sentence A are not confused with words in sentence B. Since the start time point and the end time point of sentence A are relatively accurate, the addition or deletion of words in sentence A is within the video time period bound to sentence A. Therefore, it can be ensured that the subtitle text item is consistent with the corresponding video frame during the process of modifying the text, and the situation that the video frame does not correspond to the subtitle can be avoided as much as possible.
  • the editing mode in the text editing operation may include, but is not limited to, adding, deleting, modifying, and the like.
  • the user can add words and the like to the candidate subtitles.
  • the user can delete words and the like in the candidate subtitles.
  • the user can modify the words and the like in the candidate subtitles.
  • when a user performs text editing operations, the subtitle editing area can be understood as a text box; it can be understood that operations that can be performed in a general text box can also be performed in the subtitle editing area.
  • the user can modify the text of the candidate subtitles in time to improve the accuracy of the subtitles corresponding to the target video.
  • the candidate subtitles may be obtained through speech recognition, and through text editing operations, the user can correct the results obtained by speech recognition to improve the accuracy of the subtitles corresponding to the target video.
  • the subtitle editing operations include unit editing operations.
  • the method may further include: splitting or merging the subtitle text items according to the unit editing operation for the subtitle text items, and binding a video time period to the newly generated subtitle text items.
  • by splitting one subtitle text item, at least two subtitle text items can be generated; by merging, at least two subtitle text items can be combined into one subtitle text item.
  • the unit editing operation may take the subtitle text item as a unit to perform splitting or merging of the subtitle text item.
  • the execution body can automatically bind the video time period to the newly generated subtitle text item, thereby eliminating the need to manually adjust the binding of video time periods to subtitles, reducing the difficulty of subtitle editing and improving subtitle editing efficiency.
  • the convenient splitting or merging of subtitle text items can effectively make up for the shortcomings of speech recognition in sentence segmentation and improve the overall accuracy of candidate subtitles.
  • the splitting operation can be used to split a subtitle text item into at least two subtitle text items.
  • the above-mentioned splitting operation may include an operation of segmenting a subtitle text item.
  • the above-mentioned execution body splits the subtitle text item to generate at least two subtitle text items, wherein the generated at least two subtitle text items are displayed in a time-sharing manner or in different display areas.
  • time-sharing display can include displaying the items at different times.
  • the two subtitle text items are two different subtitles: one can be displayed before the other, or the two subtitles can be displayed in different display areas, for example, one above the other.
  • splitting or merging the subtitle text items according to the unit editing operation may include: in response to detecting a splitting operation, dividing the video time period of the subtitle text item before splitting according to the proportion that each subtitle text item obtained by splitting occupies in the subtitle text item before splitting; and binding each video time period obtained by the division to the corresponding subtitle text item obtained by the splitting.
  • the video time period bound to the subtitle text item before the split is divided according to the split subtitle text items, so as to keep each split subtitle text item aligned with its corresponding video frames as much as possible.
  • the speed of binding the video time period for the newly generated subtitle text item is increased.
  • FIG. 4A and FIG. 4B illustrate exemplary application scenarios of the split operation.
  • the user can perform a line break operation in the middle of the subtitle text item "because of its cute appearance, it gains the trust of humans and becomes one of the few friends who interact with humans on an equal footing" (for example, at the comma), and the position of the line break operation is represented by the indication symbol 401.
  • the original subtitle text item is split into two subtitle text items, and the two split subtitle text items are shown in the indication box 402 in FIG. 4B.
  • the proportion of a subtitle text item obtained by splitting in the subtitle text item before splitting may be: the ratio between the number of characters in the subtitle text item obtained by splitting and the total number of characters in the subtitle text item before splitting.
  • the proportion can be counted based on the number of characters. Since "because of its cute appearance, it gains the trust of humans" has 15 characters and "becoming one of the few friends who interact with humans on an equal footing" has 14 characters, the line break at the position indicated by the indication symbol 401 in FIG. 4A splits the subtitle text item before splitting at 15:14 (15 to 14), and the time period bound to the subtitle text item before the split is divided according to 15:14.
  • suppose the video time period bound to the subtitle text item before splitting is from 1 minute 1 second to 1 minute 29 seconds; then the split subtitle text item "because of its cute appearance, it gains the trust of humans" is bound to the video time period from 1 minute 1 second to 1 minute 15 seconds, and the split subtitle text item "becoming one of the few friends who interact with humans on an equal footing" is bound to the video time period from 1 minute 16 seconds to 1 minute 29 seconds.
  • the proportion can be quickly determined and split to improve the splitting speed.
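  • a minimal sketch of this character-count split, continuing the hypothetical SubtitleItem above (split_by_char_count is an assumed name, not from this disclosure):

```python
def split_by_char_count(item: SubtitleItem, break_pos: int) -> list[SubtitleItem]:
    """Split one subtitle text item at character index break_pos, dividing the
    bound video time period in proportion to the character counts."""
    first_text, second_text = item.text[:break_pos], item.text[break_pos:]
    duration = item.end - item.start
    # Divide the original time period at the same ratio as the character counts.
    split_time = item.start + duration * len(first_text) / len(item.text)
    return [
        SubtitleItem(text=first_text, start=item.start, end=split_time),
        SubtitleItem(text=second_text, start=split_time, end=item.end),
    ]
```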
  • the proportion of a subtitle text item obtained by splitting in the subtitle text item before splitting may also be: the ratio between the speech duration corresponding to the subtitle text item obtained by splitting and the total speech duration corresponding to the subtitle text item before splitting.
  • the proportion can be counted according to the voice duration.
  • the speech duration of "gain the trust of humans because of their cute appearance” is 12 seconds
  • the speech duration of "become one of the few friends who interact with humans on an equal footing" is 8 seconds, that is, the first half of the sentence accounts for 60%
  • the second half Sentences account for 40%.
  • the line break at the position indicated by the indication symbol 401 in FIG. 4A therefore splits the subtitle text item before splitting at 3:2 (3 to 2), and the time period bound to the subtitle text item before the split is divided according to 3:2.
  • suppose the video time period bound to the subtitle text item before splitting is from 1 minute 1 second to 1 minute 30 seconds; then the split subtitle text item "because of its cute appearance, it gains the trust of humans" is bound to the video time period from 1 minute 1 second to 1 minute 18 seconds, and the split subtitle text item "becoming one of the few friends who interact with humans on an equal footing" is bound to the video time period from 1 minute 19 seconds to 1 minute 30 seconds.
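  • a corresponding sketch for the speech-duration variant (split_by_speech_duration is an assumed name; in practice the two speech durations would come from the recognizer's timing output):

```python
def split_by_speech_duration(item: SubtitleItem, break_pos: int,
                             first_speech_s: float,
                             second_speech_s: float) -> list[SubtitleItem]:
    """Split one subtitle text item, dividing the bound video time period in
    proportion to the speech durations of the two halves."""
    first_text, second_text = item.text[:break_pos], item.text[break_pos:]
    duration = item.end - item.start
    # E.g. 12 s and 8 s of speech give a 60%/40% (3:2) division.
    split_time = item.start + duration * first_speech_s / (first_speech_s + second_speech_s)
    return [
        SubtitleItem(text=first_text, start=item.start, end=split_time),
        SubtitleItem(text=second_text, start=split_time, end=item.end),
    ]
```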
  • the unit editing operations described above may include merging operations.
  • the above-mentioned merging operation can be used to merge at least two subtitle text items.
  • the above step of splitting or merging the subtitle text items according to the unit editing operation may include: in response to detecting a merging operation, merging the at least two subtitle text items targeted by the merging operation, and merging the video time periods bound to the at least two subtitle text items.
  • the above-mentioned merging operation may include an operation of deleting the segmentation (e.g., a segmentation identifier) between two subtitle text items.
  • the above-mentioned execution body may, in response to detecting a merging operation for the subtitle text items, merge at least two subtitle text items to generate a new subtitle text item, wherein the generated new subtitle text item is displayed during the video time period bound to the new subtitle text item.
  • FIGS. 5A and 5B illustrate exemplary application scenarios of the merge operation.
  • the user can perform an operation of deleting a segment at the position indicated by the delete symbol 501 .
  • the operation of deleting segments can be understood as a merge operation.
  • as shown in FIG. 5B, in response to the above-mentioned merging operation, the original two subtitle text items are merged into one subtitle text item, and the merged subtitle text item is shown in the indication box 502 in FIG. 5B.
  • the video time period bound to the subtitle text item "Cats can't drink milk" before the merger is from 1 minute 31 seconds to 1 minute 40 seconds, and the time period bound to the subtitle text item "Because cats do not have the enzymes to digest milk" before the merger is from 1 minute 41 seconds to 1 minute 50 seconds; after merging, the merged subtitle text item is bound to the video time period from 1 minute 31 seconds to 1 minute 50 seconds.
  • at least two subtitle text items targeted by the merging operation are merged, and a video time period is automatically bound to the merged subtitle text item, so that the overall time for editing the candidate subtitles can be reduced, that is, the efficiency of subtitle editing can be improved.
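  • a minimal sketch of such a merge, continuing the hypothetical SubtitleItem above (merge_items is an assumed name):

```python
def merge_items(first: SubtitleItem, second: SubtitleItem) -> SubtitleItem:
    """Merge two adjacent subtitle text items into one new subtitle text item,
    merging their bound video time periods as well."""
    return SubtitleItem(
        text=first.text + second.text,  # concatenated subtitle text
        start=first.start,              # e.g. 1:31 from the first item
        end=second.end,                 # e.g. 1:50 from the second item
    )
```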
  • subtitle text in a single subtitle text item is stored in the same storage unit, and different subtitle text items are stored in different storage units.
  • the storage unit may be an array.
  • subtitle text within a single subtitle text item may be stored in the same array, and different subtitle text items may be stored in different arrays.
  • the subtitle text items can be distinguished, which facilitates setting corresponding properties (e.g., video time period, text controls, etc.) for the subtitle text items.
  • the method further includes: setting a corresponding self attribute for the storage unit in which the single subtitle text item is stored.
  • the self attribute is used to indicate the self characteristic of the subtitle text item.
  • the self-property may include, but is not limited to, at least one of the following: a bound multimedia time period (for example, a video time period and/or an audio time period), a bound text control, and a bound edit trigger control (when the edit trigger control or area of a certain subtitle text item is triggered, editing of this subtitle text item can be triggered), and so on.
  • the own property can be set in association with the storage unit.
  • for example, subtitle text item A may be stored in storage unit B, and the video time period bound to subtitle text item A can be bound to storage unit B, thereby setting the corresponding self-property for subtitle text item A.
  • the integrity of the subtitle text item can be ensured because the changed subtitle text is kept in an independent storage unit, thereby ensuring the stability of its own attributes.
  • the properties of the subtitle text item can be kept unchanged, for example, the time period to which the subtitle text item is bound remains unchanged. As an example, the time period bound to the subtitle text item "Cats can't drink milk. Because cats do not have the enzymes to digest milk" can be from 1 minute 31 seconds to 1 minute 50 seconds. If "cat" in this subtitle text item is changed to "dog", that is, "Dogs can't drink milk. Because dogs do not have the enzymes to digest milk", the bound time period remains unchanged, still from 1 minute 31 seconds to 1 minute 50 seconds.
  • the characters in the array can be quickly modified, the storage space occupied can be reduced as much as possible, and the subtitle editing speed can be improved.
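  • an illustrative sketch of this storage layout (the field names are hypothetical): each subtitle text item occupies its own storage unit with its self-properties attached, and a text editing operation replaces the characters without touching the bound time period:

```python
from dataclasses import dataclass

@dataclass
class StorageUnit:
    chars: list[str]          # subtitle text of one item; one storage unit per item
    start: float              # self-property: start of the bound video time period (s)
    end: float                # self-property: end of the bound video time period (s)
    text_control_id: int = 0  # self-property: bound text control (hypothetical)

def edit_text(unit: StorageUnit, new_text: str) -> None:
    # A text editing operation replaces the characters in the storage unit;
    # the bound video time period is deliberately left unchanged.
    unit.chars = list(new_text)
```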
  • At least two subtitle text items generated according to the split operation are stored in different storage units.
  • the at least two subtitle text items generated after splitting are stored in different storage units, so that self-properties can be effectively set for each newly generated subtitle text item. Therefore, subtitle splitting can be quickly realized in the subtitle splitting scenario, and the consistency between the subtitles and the video frames can be ensured.
  • the subtitle text items generated according to the merge operation are stored in the same storage unit.
  • the subtitle text item generated after the merging operation is stored in the same storage unit, and its own properties can be set for the newly generated subtitle text item. Therefore, subtitle merging can be quickly realized in a subtitle merging scene, and the consistency of subtitles and video frames can be ensured.
  • the subtitle editing area includes a text control bound to a subtitle text item.
  • the subtitle editing area may include at least two subtitle editing sub-areas.
  • the subtitle editing sub-area is used to display a subtitle, and the subtitle editing sub-area can be bound with the subtitle text item.
  • the size of the subtitle sub-area can be adjusted according to the number of words in the subtitle text item.
  • the subtitle editing sub-area can be understood as a text control, and the display position of the text control can be adjusted according to the video playback progress.
  • accordingly, the subtitle text item can move within the subtitle editing area in response to the playback progress.
  • the text control is used to display the subtitle text in the bound subtitle text item.
  • when editing a subtitle text item, the user can be prompted to distinguish each subtitle text item, and the content of one subtitle text item can be quickly modified without disturbing the other subtitle text items. Thus, it can be ensured that the video time periods bound to the subtitle text items are not confused.
  • the text control can be a text box.
  • subtitle editing can be performed by using the existing text editing function of the text box, thereby reducing development difficulty and improving development efficiency.
  • the user's familiarity with the text box can be utilized to reduce the user's operational difficulty and improve the user's efficiency in editing subtitles.
  • the method further includes generating a new text control based on the split operation, and binding the newly generated text control with the newly generated subtitle text item.
  • when a subtitle text item is split into two, the first subtitle text item obtained from the split can be bound to the original text control, and the second subtitle text item obtained from the split can be bound to the newly generated text control.
  • the method further includes: based on the merge operation, deleting the text control bound to the merged subtitle text item.
  • a new text control may be generated for the merged subtitle text item, and the merged subtitle text item is displayed in the new text control. Generating a new text control reduces the probability of confusing text controls; in contrast, keeping one text control and deleting the other involves more kinds of operations and may cause operation errors.
  • alternatively, only one of the text controls bound to the subtitle text items before merging may be retained, and the merged subtitle text item may be displayed in the retained text control. In this way, the computation required to generate a new text control can be saved, and the display speed after merging can be improved.
  • the above method may further include: displaying the video frame corresponding to the updated subtitle text item according to the subtitle editing operation.
  • a text editing operation or a unit editing operation may be performed on the subtitle text item, and after the user operates on the subtitle text item, the updated subtitle text item may be displayed in the subtitle editing area.
  • the above-mentioned execution body can quickly apply the updated subtitle text item to the corresponding video frame in the video playing area.
  • the user can preview the display effect of the video frame after the subtitle update in time, which is convenient for the user to adjust in time according to the preview effect, and improves the overall efficiency of the user in editing the subtitles.
  • FIG. 6 shows the flow of one embodiment of the subtitle editing method according to the present disclosure.
  • the subtitle editing method includes the following steps:
  • Step 601, acquiring candidate subtitles.
  • the candidate subtitles include at least one subtitle text item, the subtitle text item is bound to a video time period, and the video time period bound to the subtitle text item is used for the linked display of the subtitle and the video.
  • Step 602, splitting or merging the subtitle text item according to the unit editing operation for the subtitle text item, and binding a video time period to the newly generated subtitle text item.
  • the execution body can automatically bind the video time period to the newly generated subtitle text item, thereby eliminating the need to manually adjust the binding of video time periods to subtitles, reducing the difficulty of subtitle editing and improving subtitle editing efficiency.
  • binding a video time period to the newly generated subtitle text item includes: binding a video time period to the newly generated subtitle text item based on the video time period bound to the subtitle text item targeted by the splitting operation or the merging operation.
  • the unit editing operation includes a splitting operation; and splitting or merging the subtitle text item according to the unit editing operation for the subtitle text item, and binding a video time period to the newly generated subtitle text item, includes: in response to detecting the splitting operation, dividing the video time period of the subtitle text item before splitting according to the proportion that each subtitle text item obtained by splitting occupies in the subtitle text item before splitting; and binding each video time period obtained by the division to the corresponding subtitle text item obtained by the splitting.
  • the unit editing operation includes a merging operation; and the subtitle text item is split or merged according to the unit editing operation for the subtitle text item, and the video time is bound to the newly generated subtitle text item
  • the segment includes: in response to detecting a merge operation, merging at least two subtitle text items targeted by the merge operation, and merging video time periods bound to the at least two subtitle text items.
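  • As a rough sketch of the merge behavior described above (names such as SubtitleItem and merge_items are illustrative assumptions, not taken from the disclosure), merging concatenates the texts and merges the bound video time periods into one period:

        from dataclasses import dataclass

        @dataclass
        class SubtitleItem:
            text: str
            start: float  # start of the bound video time period, in seconds
            end: float    # end of the bound video time period, in seconds

        def merge_items(items):
            # A merge operation targets at least two subtitle text items:
            # concatenate their texts and merge their bound time periods into
            # one period spanning the earliest start to the latest end.
            assert len(items) >= 2
            return SubtitleItem(
                text="".join(item.text for item in items),
                start=min(item.start for item in items),
                end=max(item.end for item in items),
            )

  • For example, merging an item bound to 91-100 s with one bound to 101-110 s would yield a single item bound to 91-110 s, mirroring the merged time periods described above.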
  • the subtitle editing operation may include a text editing operation for subtitle text and a unit editing operation for subtitle text items.
  • Editing operations for subtitle text items may include modifying the relationship between subtitle text items, such as splitting subtitle text items or merging subtitle text items.
  • subtitle editing operations may include text editing operations.
  • the method also includes updating the subtitle text in the subtitle editing area in response to detecting a text editing operation on the subtitle text.
  • the editing mode in the text editing operation may include, but is not limited to, adding, deleting, modifying, and the like.
  • the user can delete words and the like in the candidate subtitles.
  • the user can modify the words and the like in the candidate subtitles.
  • when a user performs text editing operations in the subtitle editing area, the subtitle editing area can be understood as a text box. It can be understood that operations that can be performed in a general text box can also be performed in the subtitle editing area.
  • the user can modify the text of the candidate subtitles in time to improve the accuracy of the subtitles corresponding to the target video.
  • the candidate subtitles may be obtained through speech recognition, and through text editing operations, the user can correct the results obtained by speech recognition to improve the accuracy of the subtitles corresponding to the target video.
  • the subtitle editing operations include unit editing operations.
  • Splitting one subtitle text item can generate at least two subtitle text items.
  • At least two subtitle text items can be merged into one subtitle text item.
  • the unit editing operation may take the subtitle text item as a unit to perform splitting or merging of the subtitle text item.
  • the execution body can automatically bind video time periods to newly generated subtitle text items, eliminating the manual step of adjusting and binding video time periods for subtitles, which reduces the difficulty of subtitle editing and improves subtitle editing efficiency. In addition, convenient splitting or merging of subtitle text items can effectively compensate for the shortcomings of speech recognition in sentence segmentation and improve the overall accuracy of the candidate subtitles.
  • the unit editing operations described above may include splitting operations.
  • the splitting operation can be used to split a subtitle text item into at least two subtitle text items.
  • the above-mentioned splitting operation may include an operation of segmenting a subtitle text item.
  • the execution body splits the subtitle text item to generate at least two subtitle text items, wherein the generated at least two subtitle text items are displayed in a time-sharing manner.
  • splitting or merging the subtitle text items according to the unit editing operation for the subtitle text items may include: in response to detecting the splitting operation, dividing the video time period of the pre-split subtitle text item according to the proportion that each post-split subtitle text item occupies in the pre-split subtitle text item, and binding each divided video time period to the corresponding post-split subtitle text item.
  • in response to the splitting operation, the video time period bound to the pre-split subtitle text item is divided according to the post-split subtitle text items, keeping the time period bound to each post-split item matched to the speech as much as possible.
  • this also increases the speed of binding video time periods to the newly generated subtitle text items.
  • the proportion that a post-split subtitle text item occupies in the pre-split subtitle text item includes: the ratio between the number of text words in the post-split subtitle text item and the total number of text words in the pre-split subtitle text item.
  • based on the word-count proportion, the proportion can be determined and the split performed quickly, improving splitting speed.
  • the proportion that a post-split subtitle text item occupies in the pre-split subtitle text item includes: the ratio between the speech duration corresponding to the post-split subtitle text item and the total speech duration corresponding to the pre-split subtitle text item.
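  • As a minimal sketch of the two proportioning strategies above (reusing the illustrative SubtitleItem from the earlier sketch; split_item is a hypothetical name), the pre-split time period is divided by word count unless per-part speech durations are available:

        def split_item(item, parts, durations=None):
            # Divide the pre-split video time period among the post-split
            # parts, weighted by text length or, if given, by speech duration.
            weights = durations if durations is not None else [len(p) for p in parts]
            total_time, total_weight = item.end - item.start, sum(weights)
            result, cursor = [], item.start
            for part, weight in zip(parts, weights):
                span = total_time * weight / total_weight
                result.append(SubtitleItem(part, cursor, cursor + span))
                cursor += span
            result[-1].end = item.end  # absorb rounding so the periods tile exactly
            return result

  • For instance, splitting a 28-second item whose two parts contain 15 and 14 characters divides the period in a 15:14 ratio; passing speech durations of 12 and 8 seconds instead divides it 3:2.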
  • the unit editing operations described above may include merging operations.
  • the above-mentioned merging operation can be used to merge at least two subtitle text items.
  • the above step of splitting or merging the subtitle text items according to the unit editing operation for the subtitle text items may include: in response to detecting a merging operation, merging the at least two subtitle text items targeted by the merging operation, and merging the video time periods bound to the at least two subtitle text items.
  • the merging operation described above may include an operation of deleting the segmentation (e.g., a segment break) between two subtitle text items.
  • the above-mentioned execution body may, in response to detecting a merging operation on subtitle text items, merge at least two subtitle text items to generate a new subtitle text item, where the generated new subtitle text item is played during the video time period bound to the new item.
  • the at least two subtitle text items targeted by the merging operation are merged, and a video time period is automatically bound to the merged subtitle text item, which shortens the overall time of subtitle editing for the candidate subtitles, i.e., improves subtitle editing efficiency.
  • subtitle text in a single subtitle text item is stored in the same storage unit, and different subtitle text items are stored in different storage units.
  • the storage unit may be an array.
  • subtitle text within a single subtitle text item may be stored in the same array, and different subtitle text items may be stored in different arrays.
  • the subtitle text items can be distinguished from one another, which facilitates setting corresponding properties (e.g., video time period, text controls) for each subtitle text item.
  • storing subtitle text items in an array data format allows the characters in the array to be modified quickly, minimizes the occupied storage space, and improves subtitle editing speed.
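  • As an illustrative sketch only (the disclosure does not prescribe a concrete layout), each storage unit can be an array holding the characters of exactly one subtitle text item, with that unit's properties kept alongside:

        # One storage unit per subtitle text item: the characters live in the
        # unit's own array, and properties such as the bound video time period
        # are attached to the unit.
        unit_a = {"chars": list("Cats cannot drink milk."),
                  "time_period": (91.0, 100.0)}
        unit_b = {"chars": list("Because cats lack the enzyme to digest milk."),
                  "time_period": (101.0, 110.0)}

        # Editing characters inside one unit never disturbs another unit, and
        # the unit's bound time period stays as it was.
        unit_a["chars"][0:4] = list("Dogs")
        print("".join(unit_a["chars"]), unit_a["time_period"])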
  • At least two subtitle text items generated according to the split operation are stored in different storage units.
  • the subtitle text items generated according to the merge operation are stored in the same storage unit.
  • the subtitle text item generated after the merging operation is stored in the same storage unit, and its own properties can be set for the newly generated subtitle text item.
  • subtitle merging can be quickly realized in a subtitle merging scene, and the consistency between the subtitles and the video frame can be ensured.
  • the subtitle editing area includes a text control bound to a subtitle text item.
  • the subtitle editing area may include at least two subtitle editing sub-areas.
  • the subtitle editing sub-area is used to display a subtitle, and the subtitle editing sub-area can be bound with the subtitle text item.
  • the size of the subtitle editing sub-area can be adjusted according to the number of words in the subtitle text item.
  • the subtitle editing sub-area can be understood as a text control, and the display position of the text control can be adjusted according to the video playback progress.
  • visually, a subtitle text item can move within the subtitle editing area accordingly.
  • the text control is used to display the subtitle text in the bound subtitle text item.
  • by displaying each subtitle text item in its own text control, the user can be prompted to distinguish between subtitle text items when editing, and the content of one subtitle text item can be quickly modified without disturbing the other subtitle text items. This ensures that the video time periods bound to the respective subtitle text items are not confused.
  • the text control can be a text box.
  • subtitle editing can be performed by using the existing text editing function of the text box, thereby reducing development difficulty and improving development efficiency.
  • the user's familiarity with the text box can be utilized to reduce the user's operational difficulty and improve the user's efficiency in editing subtitles.
  • when a subtitle text item is split into two, the first subtitle text item obtained from the split can be bound to the original text control, and the second subtitle text item obtained from the split can be bound to the newly generated text control.
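  • A sketch of the control rebinding described above, with a purely hypothetical TextControl standing in for whatever UI widget an implementation uses:

        class TextControl:
            # Stand-in for a text box bound to one subtitle text item.
            def __init__(self, item):
                self.item = item

        def rebind_after_split(original_control, first_item, second_item):
            # The first post-split item keeps the original text control;
            # the second post-split item is bound to a newly generated one.
            original_control.item = first_item
            return original_control, TextControl(second_item)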
  • the method further includes: based on the merge operation, deleting the text control bound to the merged subtitle text item.
  • alternatively, a new text control may be generated for the merged subtitle text item, and the merged subtitle text item displayed in that new control. Generating a fresh text control reduces the probability of text-control mix-ups; by contrast, keeping one text control while deleting another involves more operation types and may lead to operational errors.
  • alternatively, only one of the text controls bound to the pre-merge subtitle text items may be retained, and the merged subtitle text item displayed in the retained control. This saves the computation of generating a new text control and improves the display speed after merging.
  • the above method may further include: displaying the video frame corresponding to the updated subtitle text item according to the subtitle editing operation.
  • a text editing operation or a unit editing operation may be performed on a subtitle text item, and after the user operates on the subtitle text item, the updated subtitle text item may be displayed in the subtitle editing area.
  • the above-mentioned execution body can quickly apply the updated subtitle text item to the corresponding video frame in the video playing area.
  • FIG. 7 shows the flow of one embodiment of the subtitle editing method according to the present disclosure.
  • the subtitle editing method includes the following steps:
  • Step 701: Display a video playing interface.
  • the video playing interface is used to play the video and display the subtitles corresponding to the video frames.
  • Step 702: In response to detecting the triggering operation for the subtitles, display a video playing area and a subtitle editing area.
  • the subtitle editing area is used for editing subtitles
  • the video playing area is used for playing the target video.
  • displaying a video playing area and a subtitle editing area in response to detecting the triggering operation for the subtitles includes: in response to detecting the triggering operation for the subtitles, reducing the size of the video playback area in the video playing interface, playing the video in the reduced-size video playback area, and displaying the subtitle editing area.
  • playing the video in the reduced-size video playing area can reduce the interface occupied by the video playing, facilitate the display of the subtitle editing area, improve the utilization rate of the interface, and improve the subtitle editing efficiency.
  • the subtitle editing area is capable of displaying at least two subtitle text items of the candidate subtitles.
  • the subtitle text item is bound with a video time period, wherein, in the bound video time period, the voice indicated by the subtitle text item is played synchronously with the video frame displayed in the video playing area.
  • a candidate subtitle may include one or at least two subtitle text items.
  • the subtitle text item can be understood as the measurement unit of subtitle display.
  • a subtitle text item can be understood as a subtitle.
  • the above-mentioned subtitle editing area can display at least two subtitle text items, and the user can edit the subtitle text items in batches to improve the operation efficiency.
  • the method further includes: in response to detecting a first trigger operation in the subtitle editing area, displaying the subtitles in the subtitle editing area in a free browsing mode.
  • the above-mentioned subtitle browsing operation may be used to trigger display of the subtitles (or subtitle text items) in the subtitle editing area in the free browsing mode.
  • the specific implementation manner of the subtitle browsing operation can be set according to the actual application scenario, and is not limited here.
  • the above-mentioned subtitle browsing operation may be a page turning operation, or may be a sliding operation in the subtitle editing area.
  • the free browsing mode may include a mode in which the user can browse subtitles in the subtitle editing area with no subtitle text item selected. This mode is analogous to scrolling through a document with the mouse wheel, where the document is displayed according to the user's operation.
  • in the playback state of the target video, the user can freely browse the subtitle text in the subtitle editing area, which makes it convenient to view the undisplayed part of the candidate subtitles. Thus, even when the size of the subtitle editing area is relatively fixed, the area can be updated promptly according to user operations, improving information display efficiency, facilitating viewing, and improving subtitle editing efficiency.
  • the method further includes: in response to detecting a second trigger operation in the subtitle editing area, selecting a subtitle text item in the subtitle editing area.
  • detecting the triggering operation for the subtitle includes: identifying the touch point position of the triggering operation, and when the touch point position is located within the display area of the subtitle, determining that a triggering operation for the subtitle is detected.
  • an editing control associated with the subtitle can also be provided, and the user triggering the editing control can also serve as a triggering operation for the subtitle.
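  • The touch-point check above amounts to a rectangle hit test; a minimal sketch, assuming the subtitle's display area is an axis-aligned rectangle:

        def is_subtitle_triggered(touch_x, touch_y, subtitle_rect):
            # subtitle_rect = (left, top, width, height) of the display area.
            # The operation counts as a trigger for the subtitle only when the
            # touch point lies inside that area.
            left, top, width, height = subtitle_rect
            return left <= touch_x <= left + width and top <= touch_y <= top + height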
  • the present disclosure provides an embodiment of a subtitle editing apparatus.
  • the apparatus embodiment corresponds to the method embodiment shown in FIG. 1.
  • the apparatus may specifically be used in various electronic devices.
  • the apparatus includes units for performing corresponding steps or operations, and the above units may be implemented by means of software, hardware, or a combination thereof.
  • the subtitle editing apparatus of this embodiment includes: a first presentation unit 801 and a second presentation unit 802 .
  • the first display unit is configured to display a video playing area and a subtitle editing area, wherein the video playing area is used to play the target video and the subtitle editing area is used to edit the candidate subtitles corresponding to the target video;
  • the second display unit is configured to display, in linkage, the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area.
  • For the specific processing of the first display unit 801 and the second display unit 802 and the technical effects thereof, reference may be made to the descriptions of step 101 and step 102 in the embodiment corresponding to FIG. 1, which are not repeated here.
  • the linked display of the video frame currently displayed in the video playback area and the subtitles displayed in the subtitle editing area includes: using a predefined progress indication method, in the subtitle editing area, indicating Subtitles corresponding to the currently playing video frame in the video playing area.
  • performing the linked display of the video frame currently displayed in the video playback area and the subtitles displayed in the subtitle editing area includes: in response to detecting a selection operation for a subtitle in the subtitle editing area, displaying, in the video playback area, the video frame corresponding to the selected subtitle.
  • the apparatus is further configured to: play audio corresponding to the video frame currently displayed in the video playing area.
  • the apparatus is further configured to: play the target video in the video playing interface, and display the subtitle editing area in response to detecting the predefined subtitle editing initiation operation.
  • the subtitle editing initiating operation includes a triggering operation for the subtitle displayed in the video playback interface.
  • the apparatus is further configured to: determine the subtitles triggered in the video playback interface as subtitles to be edited in the subtitle editing area.
  • the subtitle editing initiating operation includes a triggering operation of a preset subtitle editing initiating control.
  • the apparatus is further configured to: in the playback state of the target video, display the subtitles in the subtitle editing area in a free browsing mode according to a detected subtitle browsing operation for the subtitle editing area.
  • the subtitle editing area is capable of displaying at least two subtitle text items of the candidate subtitles, and the subtitle text items are bound with a video time period, wherein in the bound video time period, the subtitle text items The indicated voice is played synchronously with the video frame displayed in the video playback area.
  • the candidate subtitles are obtained based on speech recognition of the speech corresponding to the target video, and a subtitle text item obtained by recognizing the speech in the target video time period is bound to the target video time period.
  • the apparatus is further configured to: in response to determining that the target video is in a paused state, enable a response function of the subtitle editing area to an editing operation.
  • the way of pausing the target video includes: selecting an item of subtitle text in the subtitle editing area.
  • the manner of pausing the target video includes: a preset trigger operation for the video playing area.
  • the apparatus is further configured to: update the candidate subtitles displayed in the subtitle editing area according to the subtitle editing operation in the subtitle editing area.
  • the subtitle editing operation may include a text editing operation; and the apparatus is further configured to: in response to detecting the text editing operation on the subtitle text, update the subtitle text in the subtitle editing area, and keep the time period bound to the subtitle text item targeted by the text editing operation unchanged.
  • the subtitle editing operation includes a unit editing operation; and the apparatus is further configured to split or merge the subtitle text item according to the unit editing operation for the subtitle text item, and to bind a video time period to the newly generated subtitle text item.
  • the unit editing operation includes a splitting operation; and splitting or merging the subtitle text items according to the unit editing operation, and binding video time periods to the newly generated subtitle text items, includes: in response to detecting the splitting operation, dividing the video time period of the pre-split subtitle text item according to the proportion that each post-split subtitle text item occupies in the pre-split subtitle text item, and binding each divided video time period to the corresponding post-split subtitle text item.
  • the proportion that a post-split subtitle text item occupies in the pre-split subtitle text item includes at least one of: the ratio between the number of text words in the post-split subtitle text item and the total number of text words in the pre-split subtitle text item; and the ratio between the speech duration corresponding to the post-split subtitle text item and the total speech duration corresponding to the pre-split subtitle text item.
  • the unit editing operation includes a merging operation; and splitting or merging the subtitle text item according to the unit editing operation for the subtitle text item, and binding a video time period to the newly generated subtitle text item,
  • includes: in response to detecting a merging operation, merging the at least two subtitle text items targeted by the merging operation, and merging the video time periods bound to the at least two subtitle text items.
  • subtitle text in a single subtitle text item is stored in the same storage unit, and different subtitle text items are stored in different storage units.
  • a corresponding own attribute is set for the storage unit in which a single subtitle text item is stored, wherein the own attribute is used to indicate characteristics of the subtitle text item itself.
  • the at least two subtitle text items generated according to the split operation are stored in different storage units; the subtitle text items generated according to the merge operation are stored in the same storage unit.
  • the subtitle editing area includes a text control bound to the subtitle text item, wherein the text control is used to display the subtitle text in the bound subtitle text item.
  • the apparatus is further configured for at least one of the following: based on the splitting operation, generating a new text control and binding the newly generated text control to the newly generated subtitle text item; based on the merging operation, deleting the text control bound to a merged subtitle text item.
  • the apparatus is further configured to: display the video frame corresponding to the updated subtitle text item in the video playback area.
  • the present disclosure provides an embodiment of a subtitle editing apparatus.
  • the apparatus embodiment corresponds to the method embodiment shown in FIG. 6, and the apparatus may specifically be used in various electronic devices.
  • the apparatus includes units for performing corresponding steps or operations, and the above units may be implemented by means of software, hardware, or a combination thereof.
  • the subtitle editing apparatus of this embodiment includes: an obtaining unit 901 and a binding unit 902 .
  • the obtaining unit is configured to obtain candidate subtitles, wherein the candidate subtitles include at least one subtitle text item, each subtitle text item is bound to a video time period, and the video time period bound to a subtitle text item is used for the linked display of the subtitle and the video;
  • the binding unit is configured to split or merge subtitle text items according to the unit editing operation for the subtitle text items, and to bind video time periods to newly generated subtitle text items.
  • For the specific processing of the obtaining unit 901 and the binding unit 902 of the subtitle editing apparatus and the technical effects thereof, reference may be made to the descriptions of step 601 and step 602 in the embodiment corresponding to FIG. 6, which are not repeated here.
  • binding a video time period to the newly generated subtitle text item includes: binding a video time period to the newly generated subtitle text item based on the video time period bound to the subtitle text item targeted by the splitting operation or the merging operation.
  • the unit editing operation includes a merging operation; and splitting or merging the subtitle text item according to the unit editing operation for the subtitle text item, and binding a video time period to the newly generated subtitle text item,
  • includes: in response to detecting a merging operation, merging the at least two subtitle text items targeted by the merging operation, and merging the video time periods bound to the at least two subtitle text items.
  • the present disclosure provides an embodiment of a subtitle editing apparatus.
  • the apparatus embodiment corresponds to the method embodiment shown in FIG. 7, and the apparatus may specifically be used in various electronic devices.
  • the apparatus includes units for performing corresponding steps or operations, and the above units may be implemented by means of software, hardware, or a combination thereof.
  • the subtitle editing apparatus of this embodiment includes: a third display unit 1001 and a fourth display unit 1002 .
  • the third display unit is configured to display a video playback interface, wherein the video playback interface is used to play the video and display the subtitles corresponding to video frames;
  • the fourth display unit is configured to display, in response to detecting a triggering operation for the subtitles, a video playing area and a subtitle editing area, wherein the subtitle editing area is used to edit subtitles and the video playing area is used to play the target video.
  • For the specific processing of the third display unit 1001 and the fourth display unit 1002 and the technical effects thereof, reference may be made to the descriptions of step 701 and step 702 in the embodiment corresponding to FIG. 7, which are not repeated here.
  • displaying a video playing area and a subtitle editing area in response to detecting the triggering operation for the subtitles includes: in response to detecting the triggering operation for the subtitles, reducing the size of the video playback area in the video playing interface, playing the video in the reduced-size video playback area, and displaying the subtitle editing area.
  • the subtitle editing area is capable of displaying at least two subtitle text items of the candidate subtitles.
  • the apparatus is further configured to: in response to detecting a first trigger operation in the subtitle editing area, use a free browsing mode to display the subtitles in the subtitle editing area.
  • the apparatus is further configured to select a subtitle text item in the subtitle editing area in response to detecting a second trigger operation in the subtitle editing area.
  • detecting the triggering operation for the subtitle includes: identifying the touch point position of the triggering operation, and when the touch point position is located within the display area of the subtitle, determining that a triggering operation for the subtitle is detected.
  • FIG. 11 illustrates an exemplary system architecture to which a subtitle editing method according to an embodiment of the present disclosure may be applied.
  • the system architecture may include terminal devices 1101 , 1102 , and 1103 , a network 1104 , and a server 1105 .
  • the network 1104 is a medium used to provide a communication link between the terminal devices 1101 , 1102 , 1103 and the server 1105 .
  • the network 1104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the terminal devices 1101, 1102, 1103 can interact with the server 1105 through the network 1104 to receive or send messages and the like.
  • Various client applications may be installed on the terminal devices 1101 , 1102 and 1103 , such as web browser applications, search applications, and news information applications.
  • the client applications in the terminal devices 1101, 1102, and 1103 can receive the user's instruction and complete corresponding functions according to the user's instruction, such as adding corresponding information according to the user's instruction.
  • the terminal devices 1101, 1102, and 1103 may be hardware or software.
  • the terminal devices 1101, 1102, and 1103 can be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III, Moving Picture Experts Compression Standard Audio Layer 3), MP4 players (Moving Picture Experts Group Audio Layer IV, Moving Picture Experts Compression Standard Audio Layer 4), laptop computers, desktop computers, and the like.
  • when the terminal devices 1101, 1102, and 1103 are software, they can be installed in the electronic devices listed above; they may be implemented as multiple pieces of software or software modules (e.g., software or software modules for providing distributed services), or as a single piece of software or software module, which is not specifically limited here.
  • the server 1105 may be a server that provides various services, such as receiving information acquisition requests sent by the terminal devices 1101, 1102, and 1103, acquiring, in various ways according to an information acquisition request, the display information corresponding to that request, and sending the relevant data of the display information to the terminal devices 1101, 1102, and 1103.
  • the subtitle editing method provided by the embodiment of the present disclosure may be executed by a terminal device, and correspondingly, the subtitle editing apparatus may be set in the terminal devices 1101 , 1102 , and 1103 .
  • the subtitle editing method provided by the embodiment of the present disclosure may also be executed by the server 1105 , and correspondingly, the subtitle editing apparatus may be provided in the server 1105 .
  • the numbers of terminal devices, networks, and servers in FIG. 11 are merely illustrative; there may be any number of terminal devices, networks, and servers according to implementation needs.
  • Referring to FIG. 12, it shows a schematic structural diagram of an electronic device (e.g., the terminal device or the server in FIG. 11) suitable for implementing an embodiment of the present disclosure.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), vehicle-mounted terminals (eg, mobile terminals such as in-vehicle navigation terminals), etc., and stationary terminals such as digital TVs, desktop computers, and the like.
  • the electronic device shown in FIG. 12 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • the electronic device may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 1201, which can perform various appropriate operations and processes according to a program stored in a read-only memory (ROM) 1202 or a program loaded from a storage device 1208 into a random access memory (RAM) 1203.
  • the RAM 1203 also stores various programs and data required for the operation of the electronic device 1200.
  • the processing device 1201, the ROM 1202, and the RAM 1203 are connected to each other through a bus 1204.
  • An input/output (I/O) interface 1205 is also connected to bus 1204 .
  • The following devices may be connected to the I/O interface 1205: input devices 1206 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 1207 including, for example, a liquid crystal display (LCD), speakers, and vibrators; storage devices 1208 including, for example, magnetic tape and hard disks; and a communication device 1209.
  • The communication device 1209 may allow the electronic device to communicate wirelessly or by wire with other devices to exchange data. While FIG. 12 shows an electronic device having various devices, it should be understood that it is not required to implement or provide all of the illustrated devices; more or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 1209 , or from the storage device 1208 , or from the ROM 1202 .
  • when the computer program is executed by the processing device 1201, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples of computer readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable Programmable read only memory (EPROM or flash memory), fiber optics, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
  • the client and server can communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: display a video playing area and a subtitle editing area, wherein the video playing area is used to play the target video and the subtitle editing area is used to edit the candidate subtitles corresponding to the target video; and display, in linkage, the video frame currently displayed in the video playback area and the subtitles displayed in the subtitle editing area.
  • the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: obtain candidate subtitles, wherein the candidate subtitles include at least one subtitle text item, each subtitle text item is bound to a video time period, and the video time period bound to a subtitle text item is used for the linked display of the subtitle and the video; and split or merge subtitle text items according to the unit editing operation for the subtitle text items, and bind video time periods to newly generated subtitle text items.
  • the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: display a video playback interface, wherein the video playback interface is used to play videos and display subtitles corresponding to video frames; and in response to detecting a triggering operation for the subtitles, display a video playback area and a subtitle editing area, wherein the subtitle editing area is used to edit subtitles and the video playback area is used to play the target video.
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (eg, using an Internet service provider through Internet connection).
  • each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner.
  • the name of a unit does not, under certain circumstances, constitute a limitation on the unit itself; for example, the first display unit may also be described as "a unit for displaying target video indication information".
  • exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logical Devices (CPLDs) and more.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Television Signal Processing For Recording (AREA)
  • Studio Circuits (AREA)

Abstract

Embodiments of the present disclosure disclose a subtitle editing method and apparatus, and an electronic device. A specific implementation of the method includes: displaying a video playing area and a subtitle editing area, wherein the video playing area is used to play a target video and the subtitle editing area is used to edit candidate subtitles corresponding to the target video; and displaying, in linkage, the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area. A new way of editing subtitles is thereby provided.

Description

Subtitle editing method and apparatus, and electronic device
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 202010868161.8, filed on August 25, 2020 and entitled "Subtitle editing method and apparatus, and electronic device", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present disclosure relates to the field of Internet technologies, and in particular to a subtitle editing method and apparatus, and an electronic device.
BACKGROUND
With the development of the Internet, users increasingly use terminal devices to browse various kinds of information. For example, a user may produce a video on a terminal. When producing a video, the user may wish to add corresponding subtitles to the video. Producing subtitles may involve adjusting text, adjusting the timeline, and so on.
SUMMARY
This summary is provided to introduce concepts in a brief form, and these concepts are described in detail in the detailed description below. This summary is not intended to identify key or essential features of the claimed technical solutions, nor is it intended to limit the scope of the claimed technical solutions.
In a first aspect, an embodiment of the present disclosure provides a subtitle editing method, including: displaying a video playing area and a subtitle editing area, wherein the video playing area is used to play a target video and the subtitle editing area is used to edit candidate subtitles corresponding to the target video; and displaying, in linkage, the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area.
In a second aspect, an embodiment of the present disclosure provides a subtitle editing method, including: obtaining candidate subtitles, wherein the candidate subtitles include at least one subtitle text item, each subtitle text item is bound to a video time period, and the video time period bound to a subtitle text item is used for the linked display of the subtitle and the video; and splitting or merging subtitle text items according to a unit editing operation for the subtitle text items, and binding video time periods to newly generated subtitle text items.
In a third aspect, an embodiment of the present disclosure provides a subtitle editing method, including: displaying a video playing interface, wherein the video playing interface is used to play a video and display subtitles corresponding to video frames; and in response to detecting a triggering operation for the subtitles, displaying a video playing area and a subtitle editing area, wherein the subtitle editing area is used to edit subtitles and the video playing area is used to play the target video.
In a fourth aspect, an embodiment of the present disclosure provides a subtitle editing apparatus, including: a first display unit configured to display a video playing area and a subtitle editing area, wherein the video playing area is used to play a target video and the subtitle editing area is used to edit candidate subtitles corresponding to the target video; and a second display unit configured to display, in linkage, the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area.
In a fifth aspect, an embodiment of the present disclosure provides a subtitle editing apparatus, including: an obtaining unit configured to obtain candidate subtitles, wherein the candidate subtitles include at least one subtitle text item, each subtitle text item is bound to a video time period, and the video time period bound to a subtitle text item is used for the linked display of the subtitle and the video; and a binding unit configured to split or merge subtitle text items according to a unit editing operation for the subtitle text items, and to bind video time periods to newly generated subtitle text items.
In a sixth aspect, an embodiment of the present disclosure provides a subtitle editing apparatus, including: a third display unit configured to display a video playing interface, wherein the video playing interface is used to play a video and display subtitles corresponding to video frames; and a fourth display unit configured to display, in response to detecting a triggering operation for the subtitles, a video playing area and a subtitle editing area, wherein the subtitle editing area is used to edit subtitles and the video playing area is used to play the target video.
In a seventh aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the subtitle editing method according to the first aspect or the second aspect.
In an eighth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, and when the program is executed by a processor, the steps of the subtitle editing method according to the first aspect or the second aspect are implemented.
According to the subtitle editing method and apparatus and the electronic device provided by the embodiments of the present disclosure, a video playing area and a subtitle editing area are displayed, and the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area are displayed in linkage. A new way of editing subtitles is thereby provided, and the user can check against the playing target video whether the candidate subtitles are incorrect. This makes it more convenient for the user to edit subtitles against the video content, improves the user's subtitle editing efficiency, and improves the accuracy of the subtitles of the target video.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and other features, advantages, and aspects of the embodiments of the present disclosure become more apparent with reference to the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
FIG. 1 is a flowchart of an embodiment of a subtitle editing method according to the present disclosure;
FIG. 2 is a schematic diagram of an application scenario of the subtitle editing method according to the present disclosure;
FIG. 3 is a schematic diagram of an application scenario of the subtitle editing method according to the present disclosure;
FIG. 4A and FIG. 4B are schematic diagrams of another application scenario of the subtitle editing method according to the present disclosure;
FIG. 5A and FIG. 5B are schematic diagrams of yet another application scenario of the subtitle editing method according to the present disclosure;
FIG. 6 is a flowchart of yet another embodiment of the subtitle editing method according to the present disclosure;
FIG. 7 is a flowchart of yet another embodiment of the subtitle editing method according to the present disclosure;
FIG. 8 is a schematic structural diagram of an embodiment of a subtitle editing apparatus according to the present disclosure;
FIG. 9 is a schematic structural diagram of an embodiment of a subtitle editing apparatus according to the present disclosure;
FIG. 10 is a schematic structural diagram of an embodiment of a subtitle editing apparatus according to the present disclosure;
FIG. 11 is an exemplary system architecture to which the subtitle editing method of an embodiment of the present disclosure can be applied;
FIG. 12 is a schematic diagram of a basic structure of an electronic device provided according to an embodiment of the present disclosure.
DETAILED DESCRIPTION
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although some embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only and are not intended to limit the scope of protection of the present disclosure.
It should be understood that the steps recited in the method implementations of the present disclosure may be performed in different orders and/or in parallel. Furthermore, method implementations may include additional steps and/or omit the steps shown. The scope of the present disclosure is not limited in this respect.
The term "including" and variants thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms are given in the description below.
It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different apparatuses, modules, or units, and are not used to limit the order of or interdependence between the functions performed by these apparatuses, modules, or units.
It should be noted that the modifiers "one" and "multiple" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
The names of the messages or information exchanged between multiple apparatuses in the implementations of the present disclosure are for illustrative purposes only and are not used to limit the scope of these messages or information.
Referring to FIG. 1, which shows the flow of an embodiment of the subtitle editing method according to the present disclosure. The subtitle editing method is applied to a terminal device. As shown in FIG. 1, the subtitle editing method includes the following steps:
Step 101: Display a video playing area and a subtitle editing area.
In this embodiment, the execution body of the subtitle editing method (e.g., a terminal device) may display a video playing area and a subtitle editing area.
Here, the video playing area may be used to play the target video.
Here, the subtitle editing area may be used to edit candidate subtitles. In other words, the subtitle editing area may display the candidate subtitles and, in response to the user's editing operations on the candidate subtitles, display the edited subtitles.
In this embodiment, the word "target" in "target video" is added for convenience of description in the present disclosure and does not specifically designate a particular video. In actual application scenarios, the target video may be any video.
In this embodiment, the target video may correspond to candidate subtitles.
In general, the candidate subtitles may be the subtitles of the target video. The subtitles of the target video may be the text corresponding to the speech bound to the target video.
It can be understood that if the user performs no editing operation, i.e., the user watches the subtitles without editing them, the subtitle editing area can serve as a subtitle display area, allowing the user to check whether the current video played in the video playing area corresponds to the current subtitle in the subtitle editing area.
In this embodiment, the subtitle editing area and the video playing area may not overlap, may partially overlap, or one may be arranged within the other (for example, the subtitle editing area arranged within the video playing area).
In some embodiments, the subtitle editing area can display at least two subtitle text items.
Here, the predefined subtitle editing initiation operation may be an operation predefined for initiating batch editing. The specific manner of the subtitle editing initiation operation may be set according to the actual application scenario and is not limited here.
Step 102: Display, in linkage, the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area.
In this embodiment, the above execution body may display, in linkage, the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area.
Here, the linked display may include synchronously displaying video frames and subtitles that have a correspondence relationship.
In some application scenarios, the subtitles displayed in the subtitle editing area may be adjusted as the video frame displayed in the video playing area changes.
Optionally, the currently displayed video frame may change because of normal playback of the video, or because of jump playback of the video, etc.
Optionally, when the currently displayed video frame changes, each subtitle in the candidate subtitles may be traversed; each subtitle has a start time point and an end time point, and the interval between the start time point and the end time point is the video time period. If the video time point of the currently displayed video frame is found to fall within the video time period of a certain subtitle, that subtitle is displayed correspondingly. If the video time point of the currently displayed video frame does not fall within the video time period of any subtitle, there is no editable subtitle at the current playback progress, and the subtitle editing area is not adjusted.
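A minimal sketch of this traversal (reusing the illustrative SubtitleItem shape from the sketches above; returning None models the case where the current progress has no editable subtitle):

    def subtitle_for_time(candidate_subtitles, video_time):
        # Traverse each subtitle; the one whose bound video time period
        # (start..end) contains the current video time point is displayed.
        for item in candidate_subtitles:
            if item.start <= video_time <= item.end:
                return item
        return None  # no subtitle covers this progress; leave the area as-is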
In some application scenarios, the video frame displayed in the video playing area may be adjusted as the subtitles displayed in the subtitle editing area change.
In some application scenarios, in response to detecting a user's selection operation, the subtitle highlighted in the subtitle editing area may be changed. After the user selects a subtitle in the subtitle editing area, the video playback progress in the video playing area may be positioned to the start time of the selected subtitle, thereby synchronizing the subtitles with the playback progress.
It should be noted that the subtitle editing method shown in this embodiment displays a video playing area and a subtitle editing area, and displays, in linkage, the video frame currently displayed in the video playing area and the subtitles displayed in the subtitle editing area. A new way of editing subtitles is thereby provided, and the user can check against the playing target video whether the candidate subtitles are incorrect. This makes it more convenient for the user to edit subtitles against the video content, improves the user's subtitle editing efficiency, and improves the accuracy of the subtitles of the target video.
In some embodiments, the method further includes: using a predefined progress indication manner to indicate, in the subtitle editing area, the subtitle corresponding to the video frame currently playing in the video playing area.
In some application scenarios, the subtitle corresponding to the currently playing video frame may be determined as follows: determine whether the current playback progress (e.g., the video time point played to) is within the video time period corresponding to a subtitle text item; if so, determine the subtitle text item corresponding to the video time period containing the current playback progress as the subtitle text item of the currently playing video frame; if not, the current playback progress has no corresponding subtitle and, accordingly, the predefined progress indication manner may not be displayed.
As an example, the progress indication manner may include highlight colors, underlining, changing fonts, displaying in a preset designated area, and the like.
In some application scenarios, the progress indication manner may take the form of displaying in a preset designated area: the subtitle text item corresponding to the currently playing video may be scrolled to the preset designated area.
It should be noted that displaying the progress indication in the subtitle editing area can remind the user of the subtitle corresponding to the currently playing video frame, which makes it convenient for the user to check whether the subtitle matches the video frame and improves the accuracy of the subtitles of the target video.
Referring to FIG. 2 and FIG. 3, which show exemplary application scenarios of the subtitle editing method according to the present disclosure.
As shown in FIG. 2, the terminal device may play the target video on the screen; as an example, the currently playing video frame in the target video includes a cat image. The terminal device may also display the subtitle 201 corresponding to the currently playing video frame. The user may click the subtitle 201, and the user's operation of clicking the subtitle can be understood as a subtitle editing initiation operation.
As shown in FIG. 3, in response to the user clicking the subtitle 201, the terminal device may display a video playing area 301 and a subtitle editing area 302. The target video may be played in the video playing area 301. The candidate subtitles corresponding to the target video may be displayed in the subtitle editing area 302, which may respond to subtitle editing operations on the candidate subtitles.
As an example, the candidate subtitles may include the candidate subtitle text: "Cats are aliens that came to Earth. They gained humans' trust with their adorable appearance and became one of the few friends that interact with humans as equals. Cats cannot drink milk, because cats lack the enzyme to digest milk. A cat that drinks too much milk will get diarrhea. An occasional small sip or two is still fine."
As an example, referring to FIG. 3, the subtitle corresponding to the currently playing video frame is "喵星人是来到地球的外星人" ("Cats are aliens that came to Earth"), which may be displayed in a larger font and indicated with underlining. The larger font and the underlining in FIG. 3 can be understood as the predefined progress indication manner.
In some embodiments, the method further includes: in response to a selection operation on a subtitle in the subtitle editing area, displaying, in the video playing area, the video frame corresponding to the subtitle text item.
Here, the above execution body may, in response to the selection operation on a subtitle, display the video frame corresponding to that subtitle.
It can be understood that the selection operation on a subtitle text item may be performed in the subtitle editing area, and the video frame corresponding to the subtitle text item may be displayed in the video playing area.
In some application scenarios, when the user selects a subtitle in the subtitle editing area, the video playing area may display the video frame corresponding to that subtitle. Specifically, if the video frame displayed in the video playing area before the user's selection operation differs from the video frame corresponding to the selected subtitle, the video playing area can quickly jump to display the video frame corresponding to the selected subtitle.
Optionally, the selection operation may be performed in the video-paused state or in the video-playing state.
It should be noted that the user can switch the video frame displayed in the video playing area by selecting a subtitle, so that when editing subtitles in batches, the user can conveniently view the video frame corresponding to the selected subtitle, improving subtitle editing efficiency.
In some embodiments, the method may further include: playing the audio corresponding to the video frame currently displayed in the video playing area.
Here, when the video is played, the audio may be played correspondingly. The user can thus refer to the real sound to judge whether the subtitle is correct, which facilitates subtitle editing and improves subtitle accuracy.
In some embodiments, before step 101, the method may further include displaying target video indication information. Here, the target video indication information may be used to indicate the target video. The specific form of the target video indication information may be set and determined according to the actual application scenario and is not limited here.
In some embodiments, the target video indication information may be the target video itself, the cover of the target video, or the name of the target video. As an example, displaying the target video indication information may include at least one of the following, but is not limited thereto: the name of the target video, and playing the target video.
In some application scenarios, the subtitle editing initiation operation may be performed in the video-playing state or in the video-paused state.
In some embodiments, the method further includes: playing the target video in a video playing interface, and displaying the subtitle editing area in response to detecting a predefined subtitle editing initiation operation.
In some application scenarios, the terminal device may play the target video, and the candidate subtitles may be displayed in the playing video. The user may perform the subtitle editing initiation operation while the video is playing, and the terminal device may then display the video playing area and the subtitle editing area.
It should be noted that, while the target video is playing, the terminal enters the subtitle editing mode (displaying the video playing area and the subtitle editing area) in response to the user operation. This allows the user, when watching the video and wishing to edit the subtitles, to quickly use the subtitle editing mode to edit them, which improves the speed from wishing to edit subtitles to starting to edit, and thus the user's overall subtitle editing speed.
In some embodiments, the subtitle editing initiation operation may include: a triggering operation for the candidate subtitle displayed in the video playing interface.
Here, the displayed subtitle may be a candidate subtitle displayed in the target video. As an example, referring to the subtitle 201 in FIG. 2, the subtitle 201 can be understood as the displayed candidate subtitle.
As an example, detecting the triggering operation for the subtitle includes: identifying the touch point position of the triggering operation, and when the touch point position is within the display area of the subtitle, determining that a triggering operation for the subtitle is detected.
As an example, an editing control associated with the subtitle may also be provided, and the user triggering the editing control may also serve as a triggering operation for that subtitle.
It should be noted that, by taking a triggering operation on the displayed candidate subtitle as the subtitle editing operation, the user's intention to edit subtitles can be effectively captured, so that when watching the video and wishing to edit subtitles, the user can quickly use the subtitle editing mode, improving the speed from wishing to edit subtitles to starting to edit.
In some embodiments, the method may further include: determining the subtitle triggered in the video playing interface as the subtitle to be edited in the subtitle editing area.
As an example, referring to the subtitle 201 in FIG. 2, if the user triggers the subtitle 201, the subtitle text of the subtitle 201 may be determined as the subtitle to be edited in the subtitle editing area.
In some application scenarios, for the subtitle determined as to-be-edited in the subtitle editing area, the cursor may be placed in the display area of that subtitle. This saves the user the time of searching for the subtitle to be edited and improves subtitle editing efficiency.
In some embodiments, the subtitle editing initiation operation may include: a triggering operation on a preset subtitle-editing initiation control.
Here, the preset subtitle-editing initiation control may refer to a control used to initiate subtitle editing.
Here, the specific display form and display position of the subtitle-editing initiation control may be set according to the actual application scenario and are not limited here. As an example, a control labeled "Edit subtitles" may be provided on the video playing screen as the subtitle-editing initiation control.
It should be noted that the preset subtitle-editing initiation operation can effectively indicate the entry for initiating subtitle editing, reducing the time the user spends searching for it.
In some embodiments, the method further includes: in the playing state of the target video, displaying the subtitles in the subtitle editing area in a free browsing mode according to a detected subtitle browsing operation for the subtitle editing area.
As an example, the subtitle browsing operation may be used to trigger display of the subtitles (or subtitle text items) in the subtitle editing area in the free browsing mode. The specific implementation of the subtitle browsing operation may be set according to the actual application scenario and is not limited here.
As an example, the subtitle browsing operation may be a page-turning operation, or a sliding operation in the subtitle editing area.
Here, the free browsing mode may include a mode in which the user can browse subtitles in the subtitle editing area with no subtitle text item selected. The display effect of this mode is analogous to scrolling through a document with the mouse wheel, where the document is displayed according to the user's operation.
It should be noted that, in the playing state of the target video, the user can freely browse the subtitle text in the subtitle editing area, which makes it convenient to view the undisplayed part of the candidate subtitles. With the size of the subtitle editing area relatively fixed, the area can be updated promptly according to user operations, improving information display efficiency, facilitating viewing, and improving subtitle editing efficiency.
In some embodiments, the subtitle editing area can display at least two subtitle text items of the candidate subtitles.
Here, a subtitle text item is bound to a video time period, and within the bound video time period, the speech indicated by the subtitle text item is played synchronously with the video frame displayed in the video playing area.
In other words, the candidate subtitles may include one or at least two subtitle text items. A subtitle text item can be understood as the unit of measurement of subtitle display. Colloquially, a subtitle text item can be understood as one subtitle line; one subtitle line may include one or more characters, and the subtitle lines (i.e., the subtitle text items) are usually obtained by division according to the semantic relationships between characters.
It should be noted that the subtitle editing area can display at least two subtitle text items, so the user can edit subtitle text items in batches, improving operation efficiency.
In some embodiments, the candidate subtitles are obtained by performing speech recognition on the speech corresponding to the target video.
Here, the candidate subtitles may include subtitle text items, and a subtitle text item is bound to a video time period.
Here, performing speech recognition on the speech of the target video can produce the text corresponding to that speech. Moreover, during speech recognition of the target video, the text (e.g., character, word, sentence) corresponding to each speech data segment can be bound, so that each recognized sentence can be automatically bound to a video time period.
Here, a subtitle text item obtained by recognizing the speech within a target video time period may be bound to that target video time period. The target video time period may be any time period of the target video; the word "target" in "target video time period" is added for convenience of description and does not limit the video time period.
Here, a subtitle text item may be played within its bound video time period.
As an example, referring to FIG. 3, the subtitle text item "Cats are aliens that came to Earth" may be bound to the video time period from the start of the video (00:00) to the 10th second (00:10), and may be displayed within the video time period 00:00-00:10. Optionally, all the characters of this item may be displayed together, or it may be displayed in several parts; as an example, "Cats" may be displayed first, then "are aliens", and finally "that came to Earth".
It should be noted that obtaining candidate subtitles through speech recognition and binding video time periods to the subtitle text items in the candidate subtitles can improve the speed and accuracy of binding subtitle text items to video time periods.
In some embodiments, the method further includes: in response to determining that the target video is in a paused state, enabling the response function of the subtitle editing area to subtitle editing operations.
Here, the response function of the subtitle editing area to editing operations may include the subtitle editing area being able to detect subtitle editing operations and update the subtitles displayed in the subtitle editing area according to the detected subtitle editing operations.
In some application scenarios, when the target video is in the playing state, the response function of the subtitle editing area to editing operations may be disabled; when the target video is in the paused state, the response function of the subtitle editing area to subtitle editing operations is enabled. This improves the orderliness of user operations, thereby improving the user's operation efficiency and the accuracy of subtitles.
Specifically, if video playing and subtitle editing could proceed simultaneously, the video in the playing area might advance too quickly for the user's attention to cover both the subtitle being edited and the subtitle corresponding to the frame being played, causing the user to miss checking the subtitles of some video frames and to replay the video repeatedly for checking. This could lead to more user operations and lower accuracy.
Therefore, enabling the response function to editing operations while the video is paused can improve the user's operation efficiency and the accuracy of subtitles.
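One way to realize this gating, sketched with a hypothetical editing-area object (the disclosure only requires that editing operations take effect in the paused state):

    class SubtitleEditingArea:
        def __init__(self):
            self.accepts_edits = False  # response function initially disabled

        def on_playback_state_changed(self, paused):
            # Enable the response to editing operations only while paused.
            self.accepts_edits = paused

        def on_edit_operation(self, apply_edit):
            if self.accepts_edits:
                apply_edit()  # update the subtitles shown in the editing area
            # otherwise the edit is ignored while the target video is playing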
In some embodiments, the manner of pausing the target video may include: a triggering operation on a subtitle text item in the subtitle editing area.
In some embodiments, the triggering operation on a subtitle text item in the subtitle editing area may include the above-mentioned selection operation on a subtitle text item in the subtitle editing area.
Here, a subtitle text item is bound to a subtitle editing sub-area and to a playback time period. The subtitle text in the bound subtitle text item is displayed (in the video playing area) within the bound playback time period.
It should be noted that pausing the target video when the user performs a triggering operation (e.g., a selection operation) on a subtitle text item in the subtitle editing area can speed up the user's entry into subtitle editing, thereby improving the user's subtitle editing speed.
In some embodiments, the manner of pausing the target video may include: a preset triggering operation on the video playing area.
In some embodiments, the triggering operation on the video playing area may include an operation performed at a position within the video playing area.
As an example, the preset triggering operation on the video playing area may include a click operation in the video playing area, in response to which video playback may be paused.
It should be noted that pausing video playback in response to the user's preset triggering operation in the video playing area and enabling the response function to editing operations can speed up the user's entry into the subtitle editing process.
In some embodiments, the method further includes: updating the candidate subtitles displayed in the subtitle editing area according to a subtitle editing operation in the subtitle editing area.
Here, the subtitle editing operation may include a text editing operation for subtitle text and a unit editing operation for subtitle text items. The unit editing operation for subtitle text items may include modifying the relationship between subtitle text items, such as splitting or merging subtitle text items.
It should be noted that updating the candidate subtitles displayed in the subtitle editing area according to the subtitle editing operation can promptly show the effect of the editing while the user edits the subtitles, making it convenient for the user to confirm whether the editing is correct. This improves the user's subtitle editing efficiency.
In some embodiments, the subtitle editing operation may include a text editing operation. The method further includes: in response to detecting a text editing operation on subtitle text, updating the subtitle text in the subtitle editing area, and keeping the time period bound to the subtitle text item targeted by the text editing operation unchanged.
Here, when a text editing operation modifies a subtitle text item, the subtitle text in the item is changed without changing the video time period bound to the item.
Thus, in the process of modifying the subtitle text in a subtitle text item, the video time period bound to the item remains relatively stable, ensuring consistency between the subtitle text item and the corresponding video frames.
Specifically, in general, the sentence segmentation between subtitle text items is accurate, i.e., the words of sentence A are not confused with the words of sentence B. Since the start and end time points of sentence A are relatively accurate, additions or deletions of words within sentence A all occur within the video time period bound to sentence A. This ensures that, while its text is being modified, a subtitle text item remains consistent with the corresponding video frames, avoiding as much as possible a mismatch between video frames and subtitles.
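As a sketch of this invariant (again with the illustrative SubtitleItem), a text editing operation replaces the item's text in place and deliberately leaves the bound time period untouched:

    def edit_text(item, new_text):
        # Change the subtitle text of one item; item.start and item.end,
        # i.e., the bound video time period, are intentionally not modified.
        item.text = new_text
        return item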
Here, the editing manners in a text editing operation may include, but are not limited to, adding, deleting, modifying, and the like.
In some application scenarios, the user may add words and the like to the candidate subtitles.
In some application scenarios, the user may delete words and the like from the candidate subtitles.
In some application scenarios, the user may modify words and the like in the candidate subtitles.
In some application scenarios, when the user performs text editing operations in the subtitle editing area, the subtitle editing area can be understood as a text box. It can be understood that operations that can be performed in a general text box can also be performed in the subtitle editing area.
It should be noted that, through text editing operations, the user can promptly modify the text of the candidate subtitles, improving the accuracy of the subtitles corresponding to the target video. In some application scenarios, the candidate subtitles may be obtained through speech recognition, and through text editing operations the user can correct the speech recognition results, improving the accuracy of the subtitles corresponding to the target video.
In some embodiments, the subtitle editing operation includes a unit editing operation.
In some embodiments, the method may further include: splitting or merging subtitle text items according to a unit editing operation for the subtitle text items, and binding video time periods to newly generated subtitle text items.
Here, splitting one subtitle text item can generate at least two subtitle text items.
Here, at least two subtitle text items can be merged into one subtitle text item.
Here, the unit editing operation may take the subtitle text item as the unit to split or merge subtitle text items.
It should be noted that, by binding subtitle text items to video time periods, the execution body can automatically bind video time periods to newly generated subtitle text items during splitting and merging. This eliminates the manual step of adjusting and binding video time periods for subtitles, reduces the difficulty of subtitle editing, and improves subtitle editing efficiency. In addition, convenient splitting or merging of subtitle text items can effectively compensate for the shortcomings of speech recognition in sentence segmentation and improve the overall accuracy of the candidate subtitles.
In some embodiments, the unit editing operation may include a splitting operation.
Here, the splitting operation may be used to split a subtitle text item into at least two subtitle text items.
Here, the specific implementation of the splitting operation may be set according to the actual application scenario and is not limited here.
As an example, the splitting operation may include an operation of segmenting a subtitle text item.
In some application scenarios, in response to detecting a splitting operation on a subtitle text item, the execution body splits the subtitle text item to generate at least two subtitle text items, where the generated subtitle text items are displayed at different times or in different display areas. Here, time-shared display may mean not being displayed at the same time; colloquially, the two subtitle text items are two different subtitle lines, where one line is displayed only after the other finishes, or the two lines are displayed in different display areas, for example one above the other.
In some embodiments, splitting or merging subtitle text items according to the unit editing operation for the subtitle text items may include: in response to detecting the splitting operation, dividing the video time period of the pre-split subtitle text item according to the proportion that each post-split subtitle text item occupies in the pre-split subtitle text item, and binding each divided video time period to the corresponding post-split subtitle text item.
It should be noted that, in response to the splitting operation, dividing the video time period bound to the pre-split subtitle text item according to the post-split subtitle text items can increase the speed of binding video time periods to newly generated subtitle text items while keeping the video time periods bound to the post-split items matched to the speech as much as possible.
In some application scenarios, referring to FIG. 4A and FIG. 4B, which show an exemplary application scenario of the splitting operation.
In FIG. 4A, the user may perform a line-break operation in the middle (e.g., at the comma) of the subtitle text item "因为很萌的外表而获得人类的信任，成为少数与人类平等交往的朋友" ("They gained humans' trust with their adorable appearance and became one of the few friends that interact with humans as equals"); the position of the line-break operation is indicated by reference numeral 401. Then, referring to FIG. 4B, in response to the line-break operation, the original subtitle text item is split into two subtitle text items, which are shown in indication box 402 in FIG. 4B.
In some embodiments, the proportion that a post-split subtitle text item occupies in the pre-split subtitle text item includes: the ratio between the number of text words in the post-split subtitle text item and the total number of text words in the pre-split subtitle text item.
In some application scenarios, the proportion may be computed from the subtitle text. Since "因为很萌的外表而获得人类的信任" has 15 characters and "成为少数与人类平等交往的朋友" has 14 characters, the line-break position indicated by reference numeral 401 in FIG. 4A divides the pre-split subtitle text item at 15:14 (i.e., 15 to 14), and the time period bound to the pre-split subtitle text item is divided at 15:14.
As an example, if the video time period bound to the pre-split subtitle text item is from 1 minute 1 second to 1 minute 29 seconds, then the post-split subtitle text item "因为很萌的外表而获得人类的信任" may be bound to the video time period from 1 minute 1 second to 1 minute 15 seconds, and the post-split subtitle text item "成为少数与人类平等交往的朋友" may be bound to the video time period from 1 minute 16 seconds to 1 minute 29 seconds.
It should be noted that computing the time-period proportion based on the proportion of text word counts allows the proportion to be determined and the split performed quickly, improving splitting speed.
In some embodiments, the proportion that a post-split subtitle text item occupies in the pre-split subtitle text item includes: the ratio between the speech duration corresponding to the post-split subtitle text item and the total speech duration corresponding to the pre-split subtitle text item.
In some application scenarios, the proportion may be computed from speech durations. As an example, the speech duration of "因为很萌的外表而获得人类的信任" is 12 seconds and the speech duration of "成为少数与人类平等交往的朋友" is 8 seconds, i.e., the first half accounts for 60% and the second half for 40%. The line-break position indicated by reference numeral 401 in FIG. 4A divides the pre-split subtitle text item at 3:2 (i.e., 3 to 2), and the time period bound to the pre-split subtitle text item is divided at 3:2.
As an example, if the video time period bound to the pre-split subtitle text item is from 1 minute 1 second to 1 minute 30 seconds, then the post-split subtitle text item "因为很萌的外表而获得人类的信任" may be bound to the video time period from 1 minute 1 second to 1 minute 18 seconds, and the post-split subtitle text item "成为少数与人类平等交往的朋友" may be bound to the video time period from 1 minute 19 seconds to 1 minute 30 seconds.
It should be noted that computing the proportion based on speech duration can fully account for pauses in speech in real scenarios and for differences such as long and short sounds of different words. This improves the accuracy of dividing the video time period in subtitle-splitting scenarios and thus improves the degree of synchronization between video and subtitles.
In some embodiments, the unit editing operation may include a merging operation.
Here, the merging operation may be used to merge at least two subtitle text items.
Here, the specific implementation form of the merging operation may be set according to the actual application scenario and is not limited here.
In some embodiments, splitting or merging subtitle text items according to the unit editing operation for the subtitle text items may include: in response to detecting a merging operation, merging the at least two subtitle text items targeted by the merging operation, and merging the video time periods bound to the at least two subtitle text items.
As an example, the merging operation may include an operation of deleting the segmentation (e.g., a segment marker) between two subtitle text items.
In some application scenarios, in response to detecting a merging operation on subtitle text items, the execution body may merge at least two subtitle text items to generate a new subtitle text item, where the generated new subtitle text item is played within the video time period bound to the new subtitle text item.
In some application scenarios, referring to FIG. 5A and FIG. 5B, which show an exemplary application scenario of the merging operation.
In FIG. 5A, the user may perform a segment-deletion operation at the position indicated by deletion symbol 501; here the segment-deletion operation can be understood as a merging operation. Then, referring to FIG. 5B, in response to the merging operation, the original two subtitle text items are merged into one subtitle text item, shown in indication box 502 in FIG. 5B.
Here, the video time period bound to the pre-merge subtitle text item "Cats cannot drink milk" is from 1 minute 31 seconds to 1 minute 40 seconds, and the time period bound to the pre-merge subtitle text item "Because cats lack the enzyme to digest milk" is from 1 minute 41 seconds to 1 minute 50 seconds. The time period bound to the merged subtitle text item "Cats cannot drink milk. Because cats lack the enzyme to digest milk" may be from 1 minute 31 seconds to 1 minute 50 seconds.
It should be noted that, in response to the merging operation, merging the at least two subtitle text items targeted by the operation and automatically binding a video time period to the merged subtitle text item can shorten the overall time of subtitle editing for the candidate subtitles, i.e., improve subtitle editing efficiency.
In some embodiments, the subtitle text in a single subtitle text item is stored in the same storage unit, and different subtitle text items are stored in different storage units.
In some application scenarios, a storage unit may be a physical storage location or a unit under a preset data structure.
As an example, the storage unit may be an array. In other words, the subtitle text in a single subtitle text item may be stored in the same array, and different subtitle text items may be stored in different arrays.
It should be noted that using isolated storage units to store different subtitle text items makes it possible to distinguish between subtitle text items, facilitating the setting of corresponding own attributes (e.g., video time period, text control) for each subtitle text item.
In some embodiments, the method further includes: setting a corresponding own attribute for the storage unit in which a single subtitle text item is stored. Here, the own attribute is used to indicate the characteristics of the subtitle text item itself.
As an example, the own attributes may include, but are not limited to, at least one of the following: the bound multimedia time period (which may include, for example, a video time period and/or an audio time period), the bound text control, the bound editing trigger control (when the editing trigger control or area of a subtitle text item is triggered, editing of that subtitle text item can be triggered), and so on.
Here, the own attribute may be set in association with the storage unit. As an example, subtitle text item A may be stored in storage unit B, and the video time period bound to subtitle text item A may be bound to storage unit B, thereby setting the corresponding own attribute for subtitle text item A. In this way, when operating on a subtitle text item, even if the subtitle text item changes, the changed subtitle text resides in an independent storage unit, preserving the integrity of the subtitle text item and hence the stability of its own attributes. In other words, when the text content of a subtitle text item is edited, the item's own attributes can remain unchanged; for example, the time period bound to the item stays the same. As an example, the time period bound to the subtitle text item "Cats cannot drink milk. Because cats lack the enzyme to digest milk" may be from 1 minute 31 seconds to 1 minute 50 seconds; if "cats" in this item is changed to "dogs", i.e., "Dogs cannot drink milk. Because dogs lack the enzyme to digest milk", the bound time period remains unchanged and may still be from 1 minute 31 seconds to 1 minute 50 seconds.
It should be noted that, if only a text-content editing operation occurs without a splitting or merging operation, keeping the time period of the subtitle text item unchanged can both improve the accuracy of the subtitle text content and avoid erroneously modifying the video or audio time period corresponding to the subtitle text as a result of editing its content. Specifically, the division between different text items is generally made according to pauses and semantic sense groups; this division is relatively accurate, so the bound time periods can remain unchanged, ensuring the temporal accuracy of each sense group as a whole with the subtitle text item as the unit. The recognition of specific words within a sense group may be wrong, but this does not affect the division of the sense group's time period, so text-content editing operations can be provided to edit the text inside a text item and improve the content accuracy of the resulting subtitle text.
It should be noted that storing subtitle text items in an array data format allows the characters in the array to be modified quickly, minimizes the occupied storage space, and improves subtitle editing speed.
In some application scenarios, the at least two subtitle text items generated by a splitting operation are stored in different storage units.
Here, storing the at least two subtitle text items generated by splitting in different storage units makes it possible to effectively set its own attributes for each newly generated subtitle text item. Subtitle separation can thus be realized quickly in subtitle-splitting scenarios while ensuring consistency between subtitles and video frames.
In some application scenarios, the subtitle text item generated by a merging operation is stored in the same storage unit.
Here, storing the subtitle text item generated by the merging operation in the same storage unit makes it possible to set its own attributes for the newly generated subtitle text item. Subtitle merging can thus be realized quickly in subtitle-merging scenarios while ensuring consistency between subtitles and video frames.
In some embodiments, the subtitle editing area includes text controls bound to subtitle text items.
In some application scenarios, the subtitle editing area may include at least two subtitle editing sub-areas. A subtitle editing sub-area is used to display one subtitle line and may be bound to a subtitle text item. The size of a subtitle editing sub-area may be adjusted according to the number of words in the subtitle text item.
In some application scenarios, a subtitle editing sub-area can be understood as a text control, and the display position of the text control may be adjusted according to the video playback progress. Visually, a subtitle text item may move within the subtitle editing area accordingly.
It can be understood that different text controls may visually correspond to different text paragraphs. In other words, visually distinct paragraphs correspond to different text controls.
Here, a text control is used to display the subtitle text in the subtitle text item bound to it.
It should be noted that displaying each subtitle text item in a separate text control prompts the user to distinguish between subtitle text items when editing, and allows the content of one subtitle text item to be modified quickly without disturbing other subtitle text items. This ensures that the video time periods bound to the respective subtitle text items do not become confused.
In some application scenarios, the text control may be a text box.
It should be noted that using a text box as the text control allows subtitle editing to use the text box's existing text editing functions, reducing development difficulty and improving development efficiency. Moreover, users' familiarity with text boxes can be leveraged to lower the difficulty of operation and improve the user's subtitle editing efficiency.
In some embodiments, the method further includes: based on the splitting operation, generating a new text control and binding the newly generated text control to the newly generated subtitle text item.
As an example, when one subtitle text item is split into two, the first subtitle text item obtained from the split may be bound to the original text control, and the second may be bound to the newly generated text control.
In some embodiments, the method further includes: based on the merging operation, deleting the text control bound to a merged subtitle text item.
In some application scenarios, a new text control may be generated for the merged subtitle text item, and the merged subtitle text item displayed in the new text control. Generating a new text control reduces the probability of text-control mix-ups; in other words, keeping one text control while deleting another involves more operation types and may lead to operational errors.
In some application scenarios, only one of the text controls bound to the pre-merge subtitle text items may be retained, and the merged subtitle text item displayed in the retained text control. This saves the computation of generating a new text control and improves the display speed after merging.
In some embodiments, the method may further include: displaying, according to the subtitle editing operation, the video frame corresponding to the updated subtitle text item.
Here, a text editing operation or a unit editing operation may be performed on a subtitle text item; after the user operates on the subtitle text item, the updated subtitle text item may be displayed in the subtitle editing area. Moreover, the execution body can quickly apply the updated subtitle text item to the corresponding video frame in the video playing area.
In this way, the user can promptly preview the display effect of the video frame after the subtitle update, making it convenient to adjust in time according to the preview and improving the user's overall subtitle editing efficiency.
请继续参考图6,其示出了根据本公开的字幕编辑方法的一个实施例的流程。如图6所示该字幕编辑方法,包括以下步骤:
步骤601,获取候选字幕。
在这里,所述候选字幕包括至少一个字幕文本项,字幕文本项与视频时间段绑定,字幕文本项所绑定的视频时间段用于字幕与视频的联动展示。
步骤602,根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段。
需要说明的是,图6所示实施例的相关实现细节和技术效果,可以参考本公开中其它部分的说明,在此不再赘述。
需要说明的是,通过将字幕文本项与视频时间段绑定,字幕文本项的拆分和合并过程中,执行主体可以自动为新生成的字幕文本项绑定视频时间段,从而,减免了人工调整为字幕绑定视频时间段的环节,降低了字幕编辑的难度,提高了字幕编辑效率。
在一些实施例中,所述为新生成的字幕文本项绑定视频时间段,包括:基于所述拆分操作或合并操作针对的字幕文本项绑定的视频时间段,为新生成的字幕文本项绑定视频时间段。
在一些实施例中,所述单元编辑操作包括拆分操作;以及所述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段,包括:响应于检测到拆分操作,根据拆分得到的各个字幕文本项在拆分前的字幕文本项中的占比,对拆分前的字幕文本项的视频时间段进行划分;将划分得到的各个视频时间段,与拆分得到的各个字幕文本项分别进行绑定。
在一些实施例中,所述单元编辑操作包括合并操作;以及所述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段,包括:响应于检测到合并操作,将合并操作针对的至少两个字幕文本项进行合并,以及将所述至少两个字幕文本项所绑定的视频时间段进行合并。
在这里,字幕编辑操作,可以包括针对字幕文本的文本编辑操作和对于字幕文本项的单元编辑操作。对于字幕文本项的编辑操作,可以包括修改字幕文本项之间的关系,例如拆分字幕文本项或者合并字幕文本项。
需要说明的是,根据字幕编辑操作,更新字幕编辑区域所显示的候选字幕,可以在用户对字幕进行编辑的时候,及时展示编辑之后的效果,便于用户确定是否编辑正确。由此,可以提高用户进行字幕编辑的效率。
在一些实施例中,字幕编辑操作可以包括文本编辑操作。所述方法还包括:响应于检测到针对字幕文本的文本编辑操作,更新字幕编辑区域中的字幕文本。
在这里,文本编辑操作中的编辑方式,可以包括但是不限于增加、删除、修改等方式。
在一些应用场景中,用户可以向候选字幕中增加字词等。
在一些应用场景中,用户可以删除候选字幕中的字词等。
在一些应用场景中,用户可以修改候选字幕中的字词等。
在一些应用场景中,用户在字幕编辑区域进行文本编辑操作的时候,字幕编辑区域可以理解为文本框。可以理解,一般文本框中可以进行的操作,字幕编辑区域中也可以进行。
需要说明的是,通过文本编辑操作,用户可以及时对候选字幕的文本进行修改,提高目标视频对应的字幕的准确度。在一些应用场景中,候选字幕可以是通过语音识别得到的,通过文本编辑操作,用户可以对语音识别得到的结果进行纠正,提高目标视频对应的字幕的准确性。
在一些实施例中,所述字幕编辑操作包括单元编辑操作。
在一些实施例中,所述方法还可以包括:根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段。
在这里,一个字幕文本项拆分后可以生成至少两个字幕文本项。
在这里,至少两个字幕文本项可以合并为一个字幕文本项。
在这里,单元编辑操作可以以字幕文本项为单元,进行字幕文本项的拆分或者合并。
需要说明的是,通过将字幕文本项与视频时间段绑定,字幕文本项的拆分和合并过程中,执行主体可以自动为新生成的字幕文本项绑定视频时间段,从而,减免了人工调整为字幕绑定视频时间段的环节,降低了字幕编辑的难度,提高了字幕编辑效率。另外,便捷的拆分或者合并字幕文本项,可以有效弥补语音识别在断句方面的短板,提高候选字幕整体的准确性。
在一些实施例中,上述单元编辑操作可以包括拆分操作。
在这里,拆分操作可以用于将字幕文本项拆分为至少两个字幕文本项。
在这里,拆分操作的具体实现方式,可以根据实际应用场景设置,在此不做限定。
作为示例,上述拆分操作可以包括将一段字幕文本项进行分段的操作。
在一些应用场景中,上述执行主体响应于检测到针对字幕文本项的拆分操作,拆分字幕文本项生成至少两个字幕文本项,其中,所生成的至少两个字幕文本项分时显示。
在一些实施例中,上述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,可以包括:响应于检测到拆分操作,根据拆分得到的各个字幕文本项在拆分前的字幕文本项中的占比,对拆分前的字幕文本项的视频时间段进行划分;将划分得到的各个视频时间段,与拆分得到的各个字幕文本项分别进行绑定。
需要说明的是,响应于拆分操作,根据拆分后的字幕文本项对拆分前的字幕文本项所绑定的视频时间段进行划分,可以在尽量保证拆分后的字幕文本项所绑定的视频时间段与语音匹配的情况下,提高为新生成的字幕文本项绑定视频时间段的速度。
在一些实施例中,拆分得到的字幕文本项在拆分前的字幕文本项中的占比,包括:拆分得到的字幕文本项中的文本字数,与拆分前的字幕文本项中的总文本字数,之间的比值。
需要说明的是,基于文本字数统计占比,可以快速确定占比并进行拆分,提高拆分速度。
在一些实施例中,拆分得到的字幕文本项在拆分前的字幕文本项中的占比,包括:拆分得到的字幕文本项对应的语音时长,与拆分前的字幕文本项对应的总语音时长,之间的比值。
需要说明的是,基于语音时长统计占比,可以充分考虑实际场景中语音的停顿、不同字词可能具有长短音等的差别,从而可以提高字幕文本项拆分场景中对视频时间段的拆分的准确性,提高视频与字幕的同步程度。
在一些实施例中,上述单元编辑操作可以包括合并操作。
在这里,上述合并操作可以用于合并至少两个字幕文本项。
在这里,上述合并操作的具体实现形式,可以根据实际应用场景设置,在此不做限定。
在一些实施例中,上述步骤根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,可以包括:响应于检测到合并操作,将合并操作针对的至少两个字幕文本项进行合并,以及将所述至少两个字幕文本项所绑定的视频时间段进行合并。
作为示例,上述合并操作可以包括删除两段字幕文本项的分段的操作。
在一些应用场景中,上述执行主体可以响应于检测到针对字幕文本项的合并操作,合并至少两个字幕文本项生成新的字幕文本项,其中,所生成的新的字幕文本项在新的字幕文本项所绑定的视频时间段内播放。
需要说明的是,响应于合并操作,将合并操作所针对的至少两个字幕文本项进行合并,并且自动为合并后的字幕文本项绑定视频时间段,可以缩短为候选字幕进行字幕编辑的整体时间,即提高字幕编辑效率。
在一些实施例中,单个字幕文本项中的字幕文本存储于同一存储单元,不同字幕文本项存储于不同存储单元。
在一些应用场景中,存储单元可以是物理的存储位置,也可以是预设数据结构下的单元。
作为示例,存储单元可以为数组。换句话说,可以将单个字幕文本项中的字幕文本存储于同一数组中,将不同字幕文本项存储于不同的数组中。
需要说明的是,通过采用隔离的存储单元存储不同的字幕文本项,可以对字幕文本项之间进行区分,便于为字幕文本项设置相应的自身属性(例如视频时间段、文本控件等)。
需要说明的是,采用数组的数据格式存储字幕文本项,可以实现快速对数组中的字符进行修改,并且尽可能减少所占的存储空间,提高字幕编辑速度。
在一些应用场景中,根据拆分操作生成的至少两个字幕文本项存储于不同的存储单元。
在这里,将拆分后所生成的至少两个字幕文本项存储于不同的存储单元,可以有效为新生成的字幕文本项设置自身的属性。从而,可以在拆分字幕场景中快速实现字幕分离,并且保证字幕与视频帧的一致性。
在一些应用场景中,根据合并操作生成的字幕文本项存储于同一存储单元。
在这里,将合并操作后生成的字幕文本项存储于同一存储单元,可以为新生成的字幕文本项设置自身的属性。从而,可以在合并字幕场景中快速实现字幕合并,并且保证字幕与视频帧的一致性。
在一些实施例中,所述字幕编辑区域包括与字幕文本项绑定的文本控件。
在一些应用场景中,字幕编辑区域可以包括至少两个字幕编辑子区域。字幕编辑子区域用于显示一条字幕,字幕编辑子区域可以与字幕文本项绑定。字幕编辑子区域的尺寸可以根据字幕文本项中的字数进行调整。
在一些应用场景中,字幕编辑子区域可以理解为文本控件,文本控件的显示位置可以根据视频播放进度进行调整。从视觉上看,字幕文本项可以随视频播放进度在字幕编辑区域中移动。
可以理解,不同的文本控件,视觉上,可以对应不同的文字段落。换句话说,视觉上不同的段落,对应不同的文本控件。
在这里,文本控件用于展示所绑定的字幕文本项中的字幕文本。
需要说明的是,通过采用单独的文本控件展示各字幕文本项,可以在对字幕文本项进行编辑操作的时候,提示用户区分各个字幕文本项,快速修改此字幕文本项中的内容而不对其它的字幕文本项造成干扰。由此,可以保证各个字幕文本项所绑定的视频时间段不发生混乱。
在一些应用场景中,文本控件可以为文本框。
需要说明的是,采用文本框作为文本控件,可以利用文本框自身已有的文本编辑功能进行字幕编辑,由此,可以减少开发难度,提高开发效率。并且,可以利用用户对文本框较为熟悉的特点,降低用户的操作难度,提高用户进行字幕编辑的效率。
在一些实施例中,所述方法还包括:基于拆分操作,生成新的文本控件,以及将新生成的文本控件与新生成的字幕文本项绑定。
作为示例,将一个字幕文本项拆分为两个,拆分得到的第一个字幕文本项可以与原文本控件绑定,拆分得到的第二个字幕文本项可以与新生成的文本控件绑定。
在一些实施例中,所述方法还包括:基于合并操作,删除被合并的字幕文本项所绑定的文本控件。
在一些应用场景中,可以为合并后的字幕文本项生成新的文本控件,在该新的文本控件中显示合并后的字幕文本项。由此,可以采用新生成文本控件的方式,降低文本控件出现错乱的概率。换句话说,如果保留一个文本控件并删除另一个文本控件,由于涉及的操作类型较多,可能会出现操作错误。
在一些应用场景中,也可以是仅保留合并前各个字幕文本项绑定的文本控件中的一个,将合并后的字幕文本项显示在该保留的文本控件中。由此,可以节省新生成文本控件的计算量,提高合并后进行显示的速度。
在一些实施例中,上述方法还可以包括:根据字幕编辑操作,展示经更新的字幕文本项对应的视频帧。
在这里,字幕文本项可以进行文本编辑操作或者单元编辑操作,在用户对字幕文本项进行操作之后,可以在字幕编辑区域展示经更新的字幕文本项。并且,上述执行主体可以快速将经更新的字幕文本项的字幕文本,更新到视频播放区域对应的视频帧中。
由此,用户可以及时对字幕更新后的视频帧的展示效果进行预览,便于用户根据预览效果及时进行调整,提高用户编辑字幕的整体效率。
请继续参考图7,其示出了根据本公开的字幕编辑方法的一个实施例的流程。如图7所示该字幕编辑方法,包括以下步骤:
步骤701,展示视频播放界面。
在这里,所述视频播放界面用于播放视频和展示与视频帧对应的字幕。
步骤702,响应于检测到针对所述字幕的触发操作,展示视频播放区域和字幕编辑区域。
在这里,所述字幕编辑区域用于编辑字幕,所述视频播放区域用于播放目标视频。
需要说明的是,图7所示实施例的相关实现细节和技术效果,可以参考本公开中其它部分的说明,在此不再赘述。
在一些实施例中,所述响应于检测到针对所述字幕的触发操作,展示视频播放区域和字幕编辑区域,包括:响应于检测到针对所述字幕的触发操作,缩小所述视频播放界面中的视频播放区域的尺寸,在缩小尺寸后的视频播放区域播放视频,以及展示字幕编辑区域。
在这里,在缩小尺寸的视频播放区域中播放视频,能够减少视频播放所占用的界面,便于对字幕编辑区域的显示,提高界面的利用率,提高字幕编辑效率。
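作为一种可能的界面处理方式,下面给出缩小视频播放区域并展示字幕编辑区域的简化草图(Layout、enterSubtitleEditing等命名以及0.5的缩放系数均为本示例假设,并非对实现方式的限定)。

```typescript
// 示意:检测到针对字幕的触发操作后,缩小视频播放区域并展示字幕编辑区域
interface Layout {
  videoWidth: number;
  videoHeight: number;
  editorVisible: boolean;
}

function enterSubtitleEditing(layout: Layout): Layout {
  return {
    videoWidth: layout.videoWidth * 0.5,   // 缩放系数仅为示意
    videoHeight: layout.videoHeight * 0.5,
    editorVisible: true, // 腾出的界面空间用于展示字幕编辑区域
  };
}
```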
在一些实施例中,所述字幕编辑区域能够展示至少两个字幕文本项。
在一些场景中,字幕文本项与视频时间段绑定,其中,在所绑定的视频时间段内,字幕文本项指示的语音与视频播放区域展示的视频帧同步播放。
换句话说,候选字幕可以包括一个或者至少两个字幕文本项。 字幕文本项可以理解为字幕显示的计量单位。通俗来讲,字幕文本项可以理解为一条字幕。
需要说明的是,上述字幕编辑区域能够显示至少两个字幕文本项,用户可以批量对字幕文本项进行编辑,提高操作效率。
在一些实施例中,所述方法还包括:响应于在所述字幕编辑区域检测到第一触发操作,采用自由浏览模式,展示所述字幕编辑区域中的字幕。
作为示例,上述第一触发操作可以是字幕浏览操作,用于触发在字幕编辑区域采用自由浏览模式显示字幕(或者说字幕文本项)。字幕浏览操作的具体实现方式,可以根据实际的应用场景设置,在此不做限定。
作为示例,上述字幕浏览操作可以是翻页操作,也可以是在字幕编辑区域的滑动操作。
在这里,自由浏览模式可以包括用户可在字幕编辑区域进行字幕浏览,并且没有字幕文本项被选中的模式。这种模式可以类比在文档中进行鼠标滚轮滑动,文档根据用户操作进行展示的展示模式。
需要说明的是,在目标视频的播放状态,用户可以在字幕编辑区域自由浏览字幕文本,可以方便用户查看候选字幕的未显示的部分。使得字幕编辑区域在区域大小相对固定的情况下,可以根据用户操作及时更新,提高信息显示效率,方便用户查看以及提高字幕编辑效率。
在一些实施例中,所述方法还包括:响应于在所述字幕编辑区域检测到第二触发操作,选中字幕编辑区域的字幕文本项。
在一些应用场景中,可以将选中的字幕文本项确定为字幕编辑区域中的待编辑字幕,并将光标置于该字幕的展示区域。由此,可以节省用户再查找待编辑的字幕的时间,提高字幕编辑效率。
在一些实施例中,响应于检测到针对所述字幕的触发操作包括:识别触发操作的触点位置,当所述触点位置位于所述字幕的显示区域范围内时,确定检测到针对所述字幕的触发操作。
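触点位置与字幕显示区域的判断,可以参考如下示意性草图(Point、Rect、isSubtitleTriggered等命名为本示例假设):当触点坐标落在字幕显示区域的矩形范围内时,判定检测到针对该字幕的触发操作。

```typescript
// 示意:判断触发操作的触点是否位于字幕的显示区域范围内
interface Point { x: number; y: number }
interface Rect { x: number; y: number; width: number; height: number }

function isSubtitleTriggered(touch: Point, subtitleRect: Rect): boolean {
  return (
    touch.x >= subtitleRect.x &&
    touch.x <= subtitleRect.x + subtitleRect.width &&
    touch.y >= subtitleRect.y &&
    touch.y <= subtitleRect.y + subtitleRect.height
  );
}
```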
在一些应用场景中,也可以设置与字幕关联的编辑控件,用户触发编辑控件,也可以作为对该字幕的触发操作。
进一步参考图8,作为对上述各图所示方法的实现,本公开提供了一种字幕编辑装置的一个实施例,该装置实施例与图1所示的方法实施例相对应,该装置具体可以应用于各种电子设备中。该装置包括用于执行相应步骤或操作的单元,可以采用软件、硬件或者其结合的方式实现上述单元。
如图8所示,本实施例的字幕编辑装置包括:第一展示单元801和第二展示单元802。其中,第一展示单元,用于展示视频播放区域和字幕编辑区域,其中,所述视频播放区域用于播放目标视频,所述字幕编辑区域用于编辑所述目标视频对应的候选字幕;第二展示单元,用于将视频播放区域当前展示的视频帧与所述字幕编辑区域展示的字幕,进行联动展示。
在本实施例中,字幕编辑装置的第一展示单元801和第二展示单元802的具体处理及其所带来的技术效果可分别参考图1对应实施例中步骤101和步骤102的相关说明,在此不再赘述。
在一些实施例中,所述将视频播放区域当前展示的视频帧与所述字幕编辑区域展示的字幕,进行联动展示,包括:采用预定义的进度指示方式,在所述字幕编辑区域中,指示所述视频播放区域中当前播放的视频帧对应的字幕。
在一些实施例中,所述将视频播放区域当前展示的视频帧与所述字幕编辑区域展示的字幕,进行联动展示,包括:响应于检测到针对字幕编辑区域中字幕的选中操作,在所述视频播放区域中,展示与选中的字幕对应的视频帧。
在一些实施例中,所述装置还用于:播放与视频播放区域当前展示的视频帧对应的音频。
在一些实施例中,所述装置还用于:在视频播放界面中播放目标视频,以及响应于检测到预定义的字幕编辑发起操作,展示字幕编辑区域。
在一些实施例中,所述字幕编辑发起操作包括:针对所述视频播放界面中展示的字幕的触发操作。
在一些实施例中,所述装置还用于:将在视频播放界面中所触发的字幕,确定为字幕编辑区域中的待编辑字幕。
在一些实施例中,所述字幕编辑发起操作包括:针对预设的字幕编辑发起控件的触发操作。
在一些实施例中,所述装置还用于:在目标视频的播放状态,根据检测到针对所述字幕编辑区域的字幕浏览操作,采用自由浏览模式,展示所述字幕编辑区域中的字幕。
在一些实施例中,所述字幕编辑区域能够显示所述候选字幕的至少两个字幕文本项,字幕文本项与视频时间段绑定,其中,在所绑定的视频时间段内,字幕文本项指示的语音与视频播放区域展示的视频帧同步播放。
在一些实施例中,所述候选字幕基于对目标视频对应的语音进行语音识别得到,并且,对目标视频时间段内的语音进行识别得到的字幕文本项,与所述目标视频时间段绑定。
在一些实施例中,所述装置还用于:响应于确定目标视频处于暂停状态,开启所述字幕编辑区域对编辑操作的响应功能。
在一些实施例中,暂停所述目标视频的方式包括:针对所述字幕编辑区域中字幕文本项的选中操作。
在一些实施例中,暂停所述目标视频的方式包括:针对所述视频播放区域的预设触发操作。
在一些实施例中,所述装置还用于:根据在字幕编辑区域的字幕编辑操作,更新字幕编辑区域中显示的候选字幕。
在一些实施例中,字幕编辑操作可以包括文本编辑操作;以及所述装置还用于:响应于检测到针对字幕文本的文本编辑操作,更新字幕编辑区域中的字幕文本,以及保持所述文本编辑操作针对的字幕文本项所绑定的时间段不变。
在一些实施例中,所述字幕编辑操作包括单元编辑操作;以及所述装置还用于根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段。
在一些实施例中,所述单元编辑操作包括拆分操作;以及
所述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段,包括:响应于检测到拆分操作,根据拆分得到的各个字幕文本项在拆分前的字幕文本项中的占比,对拆分前的字幕文本项的视频时间段进行划分;将划分得到的各个视频时间段,与拆分得到的各个字幕文本项分别进行绑定。
在一些实施例中,拆分得到的字幕文本项在拆分前的字幕文本项中的占比,包括以下至少一项:拆分得到的字幕文本项中的文本字数,与拆分前的字幕文本项中的总文本字数,之间的比值;拆分得到的字幕文本项对应的语音时长,与拆分前的字幕文本项对应的总语音时长,之间的比值。
在一些实施例中,所述单元编辑操作包括合并操作;以及所述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段,包括:响应于检测到合并操作,将合并操作针对的至少两个字幕文本项进行合并,以及将所述至少两个字幕文本项所绑定的视频时间段进行合并。
在一些实施例中,单个字幕文本项中的字幕文本存储于同一存储单元,不同字幕文本项存储于不同存储单元。
在一些实施例中,为单个字幕文本项所存储于的存储单元设置相应的自身属性,其中,所述自身属性用于指示字幕文本项的自身特征。
在一些实施例中,根据拆分操作生成的至少两个字幕文本项存储于不同的存储单元;根据合并操作生成的字幕文本项存储于同一存储单元。
在一些实施例中,所述字幕编辑区域包括与字幕文本项绑定的文本控件,其中,文本控件用于展示所绑定的字幕文本项中的字幕文本。
在一些实施例中,所述装置还用于以下至少一项:基于拆分操作,生成新的文本控件,以及将新生成的文本控件与新生成的字幕文本项绑定;基于合并操作,删除被合并的字幕文本项所绑定的文本控件。
在一些实施例中,所述装置还用于:在视频播放区域,展示经更新的字幕文本项所对应的视频帧。
进一步参考图9,作为对上述各图所示方法的实现,本公开提供了一种字幕编辑装置的一个实施例,该装置实施例与图1所示的方法实施例相对应,该装置具体可以应用于各种电子设备中。该装置包括用于执行相应步骤或操作的单元,可以采用软件、硬件或者其结合的方式实现上述单元。
如图9所示,本实施例的字幕编辑装置包括:获取单元901和绑定单元902。其中,获取单元,用于获取候选字幕,其中,所述候选字幕包括至少一个字幕文本项,字幕文本项与视频时间段绑定,字幕文本项所绑定的视频时间段用于字幕与视频的联动展示;绑定单元,用于根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段。
在本实施例中,字幕编辑装置的获取单元901和绑定单元902的具体处理及其所带来的技术效果可分别参考图6对应实施例中步骤601和步骤602的相关说明,在此不再赘述。
在一些实施例中,所述为新生成的字幕文本项绑定视频时间段,包括:基于所述拆分操作或合并操作针对的字幕文本项绑定的视频时间段,为新生成的字幕文本项绑定视频时间段。
在一些实施例中,所述单元编辑操作包括拆分操作;以及所述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段,包括:响应于检测到拆分操作,根据拆分得到的各个字幕文本项在拆分前的字幕文本项中的占比,对拆分前的字幕文本项的视频时间段进行划分;将划分得到的各个视频时间段,与拆分得到的各个字幕文本项分别进行绑定。
在一些实施例中,所述单元编辑操作包括合并操作;以及所述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段,包括:响应于检测到合并操作,将合并操作针对的至少两个字幕文本项进行合并,以及将所述至少两个字幕文本项所绑定的视频时间段进行合并。
进一步参考图10,作为对上述各图所示方法的实现,本公开提供了一种字幕编辑装置的一个实施例,该装置实施例与图1所示的方法实施例相对应,该装置具体可以应用于各种电子设备中。该装置包括用于执行相应步骤或操作的单元,可以采用软件、硬件或者其结合的方式实现上述单元。
如图10所示,本实施例的字幕编辑装置包括:第三展示单元1001和第四展示单元1002。其中,第三展示单元,用于展示视频播放界面,其中,所述视频播放界面用于播放视频和展示与视频帧对应的字幕;第四展示单元,用于响应于检测到针对所述字幕的触发操作,展示视频播放区域和字幕编辑区域,其中,所述字幕编辑区域用于编辑字幕,所述视频播放区域用于播放目标视频。
在本实施例中,字幕编辑装置的第三展示单元1001和第四展示单元1002的具体处理及其所带来的技术效果可分别参考图7对应实施例中步骤701和步骤702的相关说明,在此不再赘述。
在一些实施例中,所述响应于检测到针对所述字幕的触发操作,展示视频播放区域和字幕编辑区域,包括:响应于检测到针对所述字幕的触发操作,缩小所述视频播放界面中的视频播放区域的尺寸,在缩小尺寸后的视频播放区域播放视频,以及展示字幕编辑区域。
在一些实施例中,所述字幕编辑区域能够展示至少两个字幕文本项。
在一些实施例中,所述装置还用于:响应于在所述字幕编辑区域检测到第一触发操作,采用自由浏览模式,展示所述字幕编辑区域中的字幕。
在一些实施例中,所述装置还用于:响应于在所述字幕编辑区域检测到第二触发操作,选中字幕编辑区域的字幕文本项。
在一些实施例中,响应于检测到针对所述字幕的触发操作包括:识别触发操作的触点位置,当所述触点位置位于所述字幕的显示区域范围内时,确定检测到针对所述字幕的触发操作。
请参考图11,图11示出了本公开的一个实施例的字幕编辑方法可以应用于其中的示例性系统架构。
如图11所示,系统架构可以包括终端设备1101、1102、1103,网络1104,服务器1105。网络1104用以在终端设备1101、1102、1103和服务器1105之间提供通信链路的介质。网络1104可以包括各种连接类型,例如有线、无线通信链路或者光纤电缆等等。
终端设备1101、1102、1103可以通过网络1104与服务器1105交互,以接收或发送消息等。终端设备1101、1102、1103上可以安装有各种客户端应用,例如网页浏览器应用、搜索类应用、新闻资讯类应用。终端设备1101、1102、1103中的客户端应用可以接收用户的指令,并根据用户的指令完成相应的功能,例如根据用户的指令在信息中添加相应信息。
终端设备1101、1102、1103可以是硬件,也可以是软件。当终端设备1101、1102、1103为硬件时,可以是具有显示屏并且支持网页浏览的各种电子设备,包括但不限于智能手机、平板电脑、电子书阅读器、MP3播放器(Moving Picture Experts Group Audio Layer III,动态影像专家压缩标准音频层面3)、MP4(Moving Picture Experts Group Audio Layer IV,动态影像专家压缩标准音频层面4)播放器、膝上型便携计算机和台式计算机等等。当终端设备1101、1102、1103为软件时,可以安装在上述所列举的电子设备中。其可以实现成多个软件或软件模块(例如用来提供分布式服务的软件或软件模块),也可以实现成单个软件或软件模块。在此不做具体限定。
服务器1105可以是提供各种服务的服务器,例如接收终端设备1101、1102、1103发送的信息获取请求,根据信息获取请求通过各种方式获取信息获取请求对应的展示信息,并将展示信息的相关数据发送给终端设备1101、1102、1103。
需要说明的是,本公开实施例所提供的字幕编辑方法可以由终端设备执行,相应地,字幕编辑装置可以设置在终端设备1101、1102、1103中。此外,本公开实施例所提供的字幕编辑方法还可以由服务器1105执行,相应地,字幕编辑装置可以设置于服务器1105中。
应该理解,图11中的终端设备、网络和服务器的数目仅仅是示意性的。根据实现需要,可以具有任意数目的终端设备、网络和服务器。
下面参考图12,其示出了适于用来实现本公开实施例的电子设备(例如图11中的终端设备或服务器)的结构示意图。本公开实施例中的终端设备可以包括但不限于诸如移动电话、笔记本电脑、数字广播接收器、PDA(个人数字助理)、PAD(平板电脑)、PMP(便携式多媒体播放器)、车载终端(例如车载导航终端)等等的移动终端以及诸如数字TV、台式计算机等等的固定终端。图12示出的电子设备仅仅是一个示例,不应对本公开实施例的功能和使用范围带来任何限制。
如图12所示,电子设备可以包括处理装置(例如中央处理器、图形处理器等)1201,其可以根据存储在只读存储器(ROM)1202中的程序或者从存储装置1208加载到随机访问存储器(RAM)1203中的程序而执行各种适当的动作和处理。在RAM 1203中,还存储有电子设备1200操作所需的各种程序和数据。处理装置1201、ROM 1202以及RAM 1203通过总线1204彼此相连。输入/输出(I/O)接口1205也连接至总线1204。
通常,以下装置可以连接至I/O接口1205:包括例如触摸屏、触摸板、键盘、鼠标、摄像头、麦克风、加速度计、陀螺仪等的输入装置1206;包括例如液晶显示器(LCD)、扬声器、振动器等的输出装置1207;包括例如磁带、硬盘等的存储装置1208;以及通信装置1209。通信装置1209可以允许电子设备与其他设备进行无线或有线通信以交换数据。虽然图12示出了具有各种装置的电子设备,但是应理解的是,并不要求实施或具备所有示出的装置。可以替代地实施或具备更多或更少的装置。
特别地,根据本公开的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在非暂态计算机可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信装置1209从网络上被下载和安装,或者从存储装置1208被安装,或者从ROM 1202被安装。在该计算机程序被处理装置1201执行时,执行本公开实施例的方法中限定的上述功能。
需要说明的是,本公开上述的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本公开中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。而在本公开中,计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读信号介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:电线、光缆、RF(射频)等等,或者上述的任意合适的组合。
在一些实施方式中,客户端、服务器可以利用诸如HTTP(HyperText Transfer Protocol,超文本传输协议)之类的任何当前已知或未来研发的网络协议进行通信,并且可以与任意形式或介质的数字数据通信(例如,通信网络)互连。通信网络的示例包括局域网(“LAN”),广域网(“WAN”),网际网(例如,互联网)以及端对端网络(例如,ad hoc端对端网络),以及任何当前已知或未来研发的网络。
上述计算机可读介质可以是上述电子设备中所包含的;也可以是单独存在,而未装配入该电子设备中。
上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该电子设备执行时,使得该电子设备:展示视频播放区域和字幕编辑区域,其中,所述视频播放区域用于播放目标视频,所述字幕编辑区域用于编辑所述目标视频对应的候选字幕;将视频播放区域当前展示的视频帧与所述字幕编辑区域展示的字幕,进行联动展示。
上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该电子设备执行时,使得该电子设备:获取候选字幕,其中,所述候选字幕包括至少一个字幕文本项,字幕文本项与视频时间段绑定,字幕文本项所绑定的视频时间段用于字幕与视频的联动展示;根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视 频时间段。
上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该电子设备执行时,使得该电子设备:展示视频播放界面,其中,所述视频播放界面用于播放视频和展示与视频帧对应的字幕;响应于检测到针对所述字幕的触发操作,展示视频播放区域和字幕编辑区域,其中,所述字幕编辑区域用于编辑字幕,所述视频播放区域用于播放目标视频。
可以以一种或多种程序设计语言或其组合来编写用于执行本公开的操作的计算机程序代码,上述程序设计语言包括但不限于面向对象的程序设计语言—诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(LAN)或广域网(WAN)——连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。
附图中的流程图和框图,图示了按照本公开各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
描述于本公开实施例中所涉及到的单元可以通过软件的方式实现,也可以通过硬件的方式来实现。其中,单元的名称在某种情况下并不构成对该单元本身的限定,例如,第一展示单元还可以被描述为“展示目标视频指示信息的单元”。
本文中以上描述的功能可以至少部分地由一个或多个硬件逻辑部件来执行。例如,非限制性地,可以使用的示范类型的硬件逻辑部件包括:现场可编程门阵列(FPGA)、专用集成电路(ASIC)、专用标准产品(ASSP)、片上系统(SOC)、复杂可编程逻辑设备(CPLD)等等。
在本公开的上下文中,机器可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。机器可读介质可以是机器可读信号介质或机器可读储存介质。机器可读介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。机器可读存储介质的更具体示例会包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。
以上描述仅为本公开的较佳实施例以及对所运用技术原理的说明。本领域技术人员应当理解,本公开中所涉及的公开范围,并不限于上述技术特征的特定组合而成的技术方案,同时也应涵盖在不脱离上述公开构思的情况下,由上述技术特征或其等同特征进行任意组合而形成的其它技术方案。例如上述特征与本公开中公开的(但不限于)具有类似功能的技术特征进行互相替换而形成的技术方案。
此外,虽然采用特定次序描绘了各操作,但是这不应当理解为要求这些操作以所示出的特定次序或以顺序次序执行。在一定环境下,多任务和并行处理可能是有利的。同样地,虽然在上面论述中包含了若干具体实现细节,但是这些不应当被解释为对本公开的范围的限制。在单独的实施例的上下文中描述的某些特征还可以组合地实现在单个实施例中。相反地,在单个实施例的上下文中描述的各种特征也可以单独地或以任何合适的子组合的方式实现在多个实施例中。
尽管已经采用特定于结构特征和/或方法逻辑动作的语言描述了本主题,但是应当理解所附权利要求书中所限定的主题未必局限于上面描述的特定特征或动作。相反,上面所描述的特定特征和动作仅仅是实现权利要求书的示例形式。

Claims (46)

  1. 一种字幕编辑方法,其特征在于,包括:
    展示视频播放区域和字幕编辑区域,其中,所述视频播放区域用于播放目标视频,所述字幕编辑区域用于编辑所述目标视频对应的候选字幕;
    将视频播放区域当前展示的视频帧与所述字幕编辑区域展示的字幕,进行联动展示。
  2. 根据权利要求1所述的方法,其特征在于,所述将视频播放区域当前展示的视频帧与所述字幕编辑区域展示的字幕,进行联动展示,包括:
    采用预定义的进度指示方式,在所述字幕编辑区域中,指示所述视频播放区域中当前播放的视频帧对应的字幕。
  3. 根据权利要求1所述的方法,其特征在于,所述将视频播放区域当前展示的视频帧与所述字幕编辑区域展示的字幕,进行联动展示,包括:
    响应于检测到针对字幕编辑区域中字幕的选中操作,在所述视频播放区域中,展示与选中的字幕对应的视频帧。
  4. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    播放与视频播放区域当前展示的视频帧对应的音频。
  5. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    在视频播放界面中播放目标视频,以及响应于检测到预定义的字幕编辑发起操作,展示字幕编辑区域。
  6. 根据权利要求5所述的方法,其特征在于,所述字幕编辑发起操作包括:
    针对所述视频播放界面中展示的字幕的触发操作。
  7. 根据权利要求6所述的方法,其特征在于,所述方法还包括:
    将在视频播放界面中所触发的字幕,确定为字幕编辑区域中的待编辑字幕。
  8. 根据权利要求5所述的方法,其特征在于,所述字幕编辑发起操作包括:
    针对预设的字幕编辑发起控件的触发操作。
  9. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    在目标视频的播放状态,根据检测到针对所述字幕编辑区域的字幕浏览操作,采用自由浏览模式,展示所述字幕编辑区域中的字幕。
  10. 根据权利要求1所述的方法,其特征在于,所述字幕编辑区域能够显示所述候选字幕的至少两个字幕文本项,字幕文本项与视频时间段绑定,其中,在所绑定的视频时间段内,字幕文本项指示的语音与视频播放区域展示的视频帧同步播放。
  11. 根据权利要求1所述的方法,其特征在于,所述候选字幕基于对目标视频对应的语音进行语音识别得到,并且,对目标视频时间段内的语音进行识别得到的字幕文本项,与所述目标视频时间段绑定。
  12. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    响应于确定目标视频处于暂停状态,开启所述字幕编辑区域对编辑操作的响应功能。
  13. 根据权利要求12所述的方法,其特征在于,暂停所述目标视频的方式包括:
    针对所述字幕编辑区域中字幕文本项的选中操作。
  14. 根据权利要求12所述的方法,其特征在于,暂停所述目标视频的方式包括:
    针对所述视频播放区域的预设触发操作。
  15. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    根据在字幕编辑区域的字幕编辑操作,更新字幕编辑区域中显示的候选字幕。
  16. 根据权利要求15所述的方法,其特征在于,字幕编辑操作可以包括文本编辑操作;以及
    所述方法还包括:
    响应于检测到针对字幕文本的文本编辑操作,更新字幕编辑区域中的字幕文本,以及保持所述文本编辑操作针对的字幕文本项所绑定的时间段不变。
  17. 根据权利要求15所述的方法,其特征在于,所述字幕编辑操作包括单元编辑操作;以及
    所述方法还包括:
    根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段。
  18. 根据权利要求17所述的方法,其特征在于,所述单元编辑操作包括拆分操作;以及
    所述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段,包括:
    响应于检测到拆分操作,根据拆分得到的各个字幕文本项在拆分前的字幕文本项中的占比,对拆分前的字幕文本项的视频时间段进行划分;
    将划分得到的各个视频时间段,与拆分得到的各个字幕文本项分别进行绑定。
  19. 根据权利要求18所述的方法,其特征在于,拆分得到的字幕文本项在拆分前的字幕文本项中的占比,包括以下至少一项:
    拆分得到的字幕文本项中的文本字数,与拆分前的字幕文本项中的总文本字数,之间的比值;
    拆分得到的字幕文本项对应的语音时长,与拆分前的字幕文本项对应的总语音时长,之间的比值。
  20. 根据权利要求17所述的方法,其特征在于,所述单元编辑操作包括合并操作;以及
    所述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定视频时间段,包括:
    响应于检测到合并操作,将合并操作针对的至少两个字幕文本项进行合并,以及将所述至少两个字幕文本项所绑定的视频时间段进行合并。
  21. 根据权利要求1或12所述的方法,其特征在于,单个字幕文本项中的字幕文本存储于同一存储单元,不同字幕文本项存储于不同存储单元。
  22. 根据权利要求21所述的方法,其特征在于,为单个字幕文本项所存储于的存储单元设置相应的自身属性,其中,所述自身属性用于指示字幕文本项的自身特征。
  23. 根据权利要求21所述的方法,其特征在于,
    根据拆分操作生成的至少两个字幕文本项存储于不同的存储单元;
    根据合并操作生成的字幕文本项存储于同一存储单元。
  24. 根据权利要求12所述的方法,其特征在于,所述字幕编辑区域包括与字幕文本项绑定的文本控件,其中,文本控件用于展示所绑定的字幕文本项中的字幕文本。
  25. 根据权利要求24所述的方法,其特征在于,所述方法还包括以下至少一项:
    基于拆分操作,生成新的文本控件,以及将新生成的文本控件与新生成的字幕文本项绑定;
    基于合并操作,删除被合并的字幕文本项所绑定的文本控件。
  26. 根据权利要求12所述的方法,其特征在于,所述方法还包括:
    在视频播放区域,展示经更新的字幕文本项所对应的视频帧。
  27. 根据权利要求21所述的方法,其特征在于,
    所述存储单元包括数组;和/或
    存储单元的自身属性,包括以下至少一项:所绑定的多媒体时间段、所绑定的文本控件和所绑定的编辑触发控件。
  28. 根据权利要求27所述的方法,其特征在于,
    所述多媒体时间段包括以下至少一项:视频时间段、音频时间段。
  29. 根据权利要求21所述的方法,其特征在于,所述方法还包括:
    如果对字幕文本项中的字幕文本进行文本内容编辑,则该字幕文本项的自身属性保持不变。
  30. 一种字幕编辑方法,其特征在于,包括:
    获取候选字幕,其中,所述候选字幕包括至少一个字幕文本项,字幕文本项与多媒体时间段绑定,字幕文本项所绑定的多媒体时间段用于字幕与视频的联动展示;
    根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定多媒体时间段。
  31. 根据权利要求30所述的方法,其特征在于,所述为新生成的字幕文本项绑定多媒体时间段,包括:
    基于所述拆分操作或合并操作针对的字幕文本项绑定的多媒体时间段,为新生成的字幕文本项绑定多媒体时间段。
  32. 根据权利要求31所述的方法,其特征在于,所述单元编辑操作包括拆分操作;以及
    所述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定多媒体时间段,包括:
    响应于检测到拆分操作,根据拆分得到的各个字幕文本项在拆分前的字幕文本项中的占比,对拆分前的字幕文本项的多媒体时间段进行划分;
    将划分得到的各个多媒体时间段,与拆分得到的各个字幕文本项分别进行绑定。
  33. 根据权利要求31所述的方法,其特征在于,所述单元编辑操作包括合并操作;以及
    所述根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定多媒体时间段,包括:
    响应于检测到合并操作,将合并操作针对的至少两个字幕文本项进行合并,以及将所述至少两个字幕文本项所绑定的多媒体时间段进行合并。
  34. 根据权利要求30所述的方法,其特征在于,所述多媒体时间段包括以下至少一项:视频时间段、音频时间段。
  35. 根据权利要求30所述的方法,其特征在于,所述方法还包括:
    如果对字幕文本项中的字幕文本进行文本内容编辑,则该字幕文本项所绑定的多媒体时间段保持不变。
  36. 一种字幕编辑方法,其特征在于,包括:
    展示视频播放界面,其中,所述视频播放界面用于播放视频和展示与视频帧对应的字幕;
    响应于检测到针对所述字幕的触发操作,展示视频播放区域和字幕编辑区域,其中,所述字幕编辑区域用于编辑字幕,所述视频播放区域用于播放目标视频。
  37. 根据权利要求36所述的方法,其特征在于,所述响应于检测到针对所述字幕的触发操作,展示视频播放区域和字幕编辑区域,包括:
    响应于检测到针对所述字幕的触发操作,缩小所述视频播放界面中的视频播放区域的尺寸,在缩小尺寸后的视频播放区域播放视频,以及展示字幕编辑区域。
  38. 根据权利要求36所述的方法,其特征在于,所述字幕编辑区域能够展示至少两个字幕文本项。
  39. 根据权利要求36所述的方法,其特征在于,所述方法还包括:
    响应于在所述字幕编辑区域检测到第一触发操作,采用自由浏览模式,展示所述字幕编辑区域中的字幕。
  40. 根据权利要求36所述的方法,其特征在于,所述方法还包括:
    响应于在所述字幕编辑区域检测到第二触发操作,选中字幕编辑区域的字幕文本项。
  41. 根据权利要求36所述的方法,其特征在于,响应于检测到针对所述字幕的触发操作包括:
    识别触发操作的触点位置,当所述触点位置位于所述字幕的显示区域范围内时,确定检测到针对所述字幕的触发操作。
  42. 一种字幕编辑装置,其特征在于,包括:
    第一展示单元,用于展示视频播放区域和字幕编辑区域,其中,所述视频播放区域用于播放目标视频,所述字幕编辑区域用于编辑所述目标视频对应的候选字幕;
    第二展示单元,用于将视频播放区域当前展示的视频帧与所述字幕编辑区域展示的字幕,进行联动展示。
  43. 一种字幕编辑装置,其特征在于,包括:
    获取单元,用于获取候选字幕,其中,所述候选字幕包括至少一个字幕文本项,字幕文本项与多媒体时间段绑定,字幕文本项所绑定的多媒体时间段用于字幕与视频的联动展示;
    绑定单元,用于根据针对字幕文本项的单元编辑操作,对字幕文本项进行拆分或者合并,以及为新生成的字幕文本项绑定多媒体时间段。
  44. 一种字幕编辑装置,其特征在于,包括:
    第三展示单元,用于展示视频播放界面,其中,所述视频播放界面用于播放视频和展示与视频帧对应的字幕;
    第四展示单元,用于响应于检测到针对所述字幕的触发操作,展示视频播放区域和字幕编辑区域,其中,所述字幕编辑区域用于编辑字幕,所述视频播放区域用于播放目标视频。
  45. 一种电子设备,其特征在于,包括:
    一个或多个处理器;
    存储装置,用于存储一个或多个程序,
    当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如权利要求1-29中任一所述的方法或者如权利要求30-35中任一所述的方法或者如权利要求36-41中任一所述的方法。
  46. 一种计算机可读介质,其上存储有计算机程序,其特征在于,该程序被处理器执行时实现如权利要求1-29中任一所述的方法或者如权利要求30-35中任一所述的方法或者如权利要求36-41中任一所述的方法。
PCT/CN2021/114504 2020-08-25 2021-08-25 字幕编辑方法、装置和电子设备 WO2022042593A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/023,711 US20230308730A1 (en) 2020-08-25 2021-08-25 Subtitle editing method and apparatus, and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010868161.8 2020-08-25
CN202010868161.8A CN111970577B (zh) 2020-08-25 2020-08-25 字幕编辑方法、装置和电子设备

Publications (1)

Publication Number Publication Date
WO2022042593A1 true WO2022042593A1 (zh) 2022-03-03

Family

ID=73390966

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/114504 WO2022042593A1 (zh) 2020-08-25 2021-08-25 字幕编辑方法、装置和电子设备

Country Status (3)

Country Link
US (1) US20230308730A1 (zh)
CN (1) CN111970577B (zh)
WO (1) WO2022042593A1 (zh)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111970577B (zh) * 2020-08-25 2023-07-25 北京字节跳动网络技术有限公司 字幕编辑方法、装置和电子设备
CN113886612A (zh) * 2020-11-18 2022-01-04 北京字跳网络技术有限公司 一种多媒体浏览方法、装置、设备及介质
CN115086691A (zh) * 2021-03-16 2022-09-20 北京有竹居网络技术有限公司 字幕优化方法、装置、电子设备和存储介质
CN113422996B (zh) * 2021-05-10 2023-01-20 北京达佳互联信息技术有限公司 字幕信息编辑方法、装置及存储介质
CN113438532B (zh) * 2021-05-31 2022-12-27 北京达佳互联信息技术有限公司 视频处理、视频播放方法、装置、电子设备及存储介质
CN113596557B (zh) * 2021-07-08 2023-03-21 大连三通科技发展有限公司 一种视频生成方法及装置
CN113905267B (zh) * 2021-08-27 2023-06-20 北京达佳互联信息技术有限公司 一种字幕编辑方法、装置、电子设备及存储介质
CN113891108A (zh) * 2021-10-19 2022-01-04 北京有竹居网络技术有限公司 字幕优化方法、装置、电子设备和存储介质
CN113891168B (zh) * 2021-10-19 2023-12-19 北京有竹居网络技术有限公司 字幕处理方法、装置、电子设备和存储介质
CN114268829B (zh) * 2021-12-22 2024-01-16 中电金信软件有限公司 视频处理方法、装置、电子设备及计算机可读存储介质

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104378692A (zh) * 2014-11-17 2015-02-25 天脉聚源(北京)传媒科技有限公司 一种处理视频字幕的方法及装置
CN108833991A (zh) * 2018-06-29 2018-11-16 北京优酷科技有限公司 视频字幕显示方法及装置
CN111147948A (zh) * 2018-11-02 2020-05-12 北京快如科技有限公司 信息处理方法、装置及电子设备
CN109257659A (zh) * 2018-11-16 2019-01-22 北京微播视界科技有限公司 字幕添加方法、装置、电子设备及计算机可读存储介质
CN110781649B (zh) * 2019-10-30 2023-09-15 中央电视台 一种字幕编辑方法、装置及计算机存储介质、电子设备
CN111107422B (zh) * 2019-12-26 2021-08-24 腾讯科技(深圳)有限公司 图像处理方法及装置、电子设备和计算机可读存储介质
CN111565330A (zh) * 2020-07-13 2020-08-21 北京美摄网络科技有限公司 一种同步字幕的添加方法及装置、电子设备、存储介质

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101071618A (zh) * 2006-05-09 2007-11-14 上海乐金广电电子有限公司 影像字幕编辑方法
KR20090124240A (ko) * 2008-05-29 2009-12-03 주식회사 케이티테크 자막 편집 장치 및 그 방법
CN104104986A (zh) * 2014-07-29 2014-10-15 小米科技有限责任公司 音频与字幕的同步方法和装置
CN106792071A (zh) * 2016-12-19 2017-05-31 北京小米移动软件有限公司 字幕处理方法及装置
CN108259971A (zh) * 2018-01-31 2018-07-06 百度在线网络技术(北京)有限公司 字幕添加方法、装置、服务器及存储介质
CN108924622A (zh) * 2018-07-24 2018-11-30 腾讯科技(深圳)有限公司 一种视频处理方法及其设备、存储介质、电子设备
CN109819301A (zh) * 2019-02-20 2019-05-28 广东小天才科技有限公司 视频的播放方法及装置、终端设备、计算机可读存储介质
CN111970577A (zh) * 2020-08-25 2020-11-20 北京字节跳动网络技术有限公司 字幕编辑方法、装置和电子设备

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115278356A (zh) * 2022-06-23 2022-11-01 上海高顿教育科技有限公司 一种智能化的课程视频剪辑控制方法
CN115209206A (zh) * 2022-07-11 2022-10-18 北京达佳互联信息技术有限公司 视频编辑方法、装置、设备、存储介质和计算机程序产品

Also Published As

Publication number Publication date
CN111970577B (zh) 2023-07-25
CN111970577A (zh) 2020-11-20
US20230308730A1 (en) 2023-09-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21860428

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23/06/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21860428

Country of ref document: EP

Kind code of ref document: A1