WO2022217944A1 - Method for binding a subtitle with an audio source, and apparatus - Google Patents

Method for binding a subtitle with an audio source, and apparatus

Info

Publication number
WO2022217944A1
WO2022217944A1 (PCT/CN2021/135470)
Authority
WO
WIPO (PCT)
Prior art keywords
target
segment
subtitle
audio source
positional relationship
Prior art date
Application number
PCT/CN2021/135470
Other languages
English (en)
Chinese (zh)
Inventor
陈圣宾
Original Assignee
北京达佳互联信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京达佳互联信息技术有限公司
Publication of WO2022217944A1

Classifications

    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
            • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
              • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
                • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
                • H04N21/439 Processing of audio elementary streams
                • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
                  • H04N21/44012 Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
              • H04N21/47 End-user applications
                • H04N21/488 Data services, e.g. news ticker
                  • H04N21/4884 Data services for displaying subtitles
    • G PHYSICS
      • G11 INFORMATION STORAGE
        • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
          • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
            • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
              • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
            • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel

Definitions

  • the embodiments of the present application relate to the field of computer technologies, and in particular, to a method, apparatus, electronic device, computer storage medium, and computer program product for binding subtitles and audio sources.
  • a video object usually contains multiple audio clips, such as the original sound source of the video's main track and imported dubbing audio sources; a corresponding subtitle clip can also be configured for each audio clip, so as to achieve a clearer expressive effect for the video.
  • a video object has a video main track, and the video main track can reflect the playback timing of the content of the entire video object.
  • in the related art, the subtitle segment is bound to the video main track: the subtitle header of the subtitle segment is bound to the moment corresponding to the subtitle header on the video main track.
  • Embodiments of the present application provide a method, apparatus, electronic device, computer storage medium, and computer program product for binding subtitles and audio sources.
  • an embodiment of the present application provides a method for binding subtitles and audio sources, and the method includes:
  • determining a target audio source segment in a target video, and identifying a target subtitle segment from the target audio source segment; determining the relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source segment; and, in the case of performing an adjustment operation on the target audio source segment and/or the target subtitle segment, keeping the relative positional relationship unchanged.
  • maintaining the relative positional relationship unchanged in the case of performing an adjustment operation on the target audio source segment and/or the target subtitle segment includes:
  • the target subtitle segment is bound to the main track of the target video in response to a subtitle processing operation for retained subtitles.
  • the case where there is no overlap between the adjusted target audio source segment and the target subtitle segment includes: the case of deleting the target audio source segment; the case of changing the subtitle header position and/or the audio source header position; or the case of dividing the target audio source segment into multiple audio source sub-segments and deleting the audio source sub-segment located at the head of the multiple audio source sub-segments.
  • the method further includes:
  • the relative positional relationship is updated according to the new subtitle header position and the audio source header position, and the updated relative positional relationship is kept unchanged.
  • the method further includes:
  • the portion of the target subtitle segment that exceeds the boundary is deleted.
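The boundary rule above can be sketched as follows. This is an illustrative assumption, not the patent's implementation: the function name and the decision to clamp both ends of the segment are my own.

```python
def clip_to_main_track(sub_head, sub_duration, track_duration):
    """Delete the part of a subtitle segment that extends beyond the main
    track boundary. Times are in seconds on the shared timeline. Returns the
    new (head, duration), or None if the whole segment falls off the track."""
    start = max(sub_head, 0.0)
    end = min(sub_head + sub_duration, track_duration)
    if end <= start:
        return None  # nothing of the segment remains on the main track
    return (start, end - start)

# A 20 s subtitle moved so that 5 s hangs past the end of a 60 s main track:
print(clip_to_main_track(45.0, 20.0, 60.0))  # (45.0, 15.0)
```

Only the overhanging portion is deleted; the part still inside the track boundary keeps its position.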
  • the target audio source segment is displayed on an audio source track; the target subtitle segment is displayed on a subtitle track; the audio source track, the subtitle track, and the main track of the target video use the same timing; and the target audio source segment is bound to the main track.
  • maintaining the relative positional relationship unchanged in the case of performing an adjustment operation on the target audio source segment and/or the target subtitle segment includes:
  • the relative positional relationship is updated according to the changed subtitle header position and audio source header position, and the updated relative positional relationship is kept unchanged.
  • maintaining the relative positional relationship unchanged in the case of performing an adjustment operation on the target sound source segment includes:
  • the position of the target subtitle segment is moved as a whole following the target audio source segment, and the relative positional relationship is kept unchanged.
  • maintaining the relative positional relationship unchanged in the case of performing an adjustment operation on the target sound source segment includes:
  • maintaining the relative positional relationship unchanged in the case of performing an adjustment operation on the target sound source segment includes:
  • the target subtitle segment is subjected to a speed change operation according to the speed change value, and the relative positional relationship is kept unchanged.
  • performing a speed change operation on the target subtitle segment according to the speed change value includes:
  • the speed change operation is performed on the target subtitle segment according to the preset speed change value.
  • the speed change operation is performed on the part of the target subtitle segment with the first duration according to the preset speed change value.
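The two duration cases above can be sketched as one function. This is an interpretation of the rule, not the patent's code: the handling of the equal-duration case is my assumption (folded into the whole-subtitle branch), and the function name is illustrative.

```python
def speed_change_subtitle(audio_duration, subtitle_duration, factor):
    """Return the subtitle's new duration after a speed change by `factor`
    (e.g. 2.0 for 2x), following the two cases in the text:

    - if the audio segment's pre-change (first) duration covers the whole
      subtitle (second) duration, the whole subtitle is speed-changed;
    - otherwise only the leading part of the subtitle equal in length to the
      first duration is speed-changed, and the remainder keeps its speed.
    """
    if audio_duration >= subtitle_duration:
        return subtitle_duration / factor
    changed = audio_duration / factor               # sped-up leading part
    unchanged = subtitle_duration - audio_duration  # trailing part, as-is
    return changed + unchanged

print(speed_change_subtitle(30.0, 20.0, 2.0))  # 10.0: whole subtitle at 2x
print(speed_change_subtitle(10.0, 20.0, 2.0))  # 15.0: only first 10 s at 2x
```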
  • maintaining the relative positional relationship unchanged in the case of performing an adjustment operation on the target sound source segment includes:
  • an embodiment of the present application provides an apparatus for binding subtitles and audio sources, and the apparatus includes:
  • an identification module configured to determine a target audio source segment in the target video, and identify a target subtitle segment from the target audio source segment;
  • a binding module configured to determine the relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source segment;
  • the maintaining module is configured to maintain the relative positional relationship unchanged when the adjustment operation on the target audio source segment and/or the target subtitle segment is performed.
  • the retention module includes:
  • a triggering submodule configured to trigger a subtitle processing operation when there is no overlap between the adjusted target audio source segment and the target subtitle segment;
  • the binding submodule is configured to bind the target subtitle segment with the main track of the target video in response to the subtitle processing operation of the retained subtitle.
  • the case where there is no overlap between the adjusted target audio source segment and the target subtitle segment includes: the case of deleting the target audio source segment; the case of changing the subtitle header position and/or the audio source header position; or the case of dividing the target audio source segment into multiple audio source sub-segments and deleting the audio source sub-segment located at the head of the multiple audio source sub-segments.
  • the apparatus further includes:
  • the detection module is configured to detect the adjusted new subtitle head position and the audio source head position when there is an overlap between the adjusted target audio source segment and the target subtitle segment;
  • the updating module is configured to update the relative positional relationship according to the new subtitle header position and the audio source header position, and keep the updated relative positional relationship unchanged.
  • the apparatus further includes:
  • the deletion module is configured to delete the part of the target subtitle segment beyond the boundary in the case of moving the target subtitle segment beyond the boundary of the main track of the target video.
  • the target audio source segment is displayed on an audio source track; the target subtitle segment is displayed on a subtitle track; the audio source track, the subtitle track, and the main track of the target video use the same timing; and the target audio source segment is bound to the main track.
  • the retention module includes:
  • the update sub-module is configured to, when the adjustment operation of changing the subtitle header position and/or the audio source header position is performed, update the relative positional relationship according to the changed subtitle header position and audio source header position, and keep the updated relative positional relationship unchanged.
  • the retention module includes:
  • the moving sub-module is configured to, when the adjustment operation of moving the position of the target audio source segment as a whole is performed, move the position of the target subtitle segment as a whole following the target audio source segment, and keep the relative positional relationship unchanged.
  • the retention module includes:
  • the replacement submodule is configured to, when the adjustment operation of replacing the target audio source fragment is performed, determine a new relative positional relationship between the subtitle header position of the target subtitle fragment and the audio source header position of the replaced target audio source fragment, and keep the new relative positional relationship unchanged.
  • the retention module includes:
  • the speed change sub-module is configured to, when a speed change operation is performed on the target audio source segment according to a preset speed change value, perform a speed change operation on the target subtitle segment according to the speed change value, and keep the relative positional relationship unchanged.
  • the speed change sub-module includes:
  • the first speed change unit is configured to, when the first duration of the target audio source segment before the speed change is greater than the second duration of the target subtitle segment, perform the speed change operation on the target subtitle segment according to the preset speed change value;
  • the second speed change unit is configured to, when the first duration of the target audio source segment before the speed change is smaller than the second duration of the target subtitle segment, perform the speed change operation on the part of the target subtitle segment within the first duration according to the preset speed change value.
  • the retention module includes:
  • a segmentation sub-module configured to obtain multiple divided audio source sub-segments when the adjustment operation of segmenting the target audio source segment is performed;
  • the cropping submodule is configured to establish a new relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source subsegment, and keep the new relative positional relationship unchanged.
  • the sound source sub-segment is the sound source sub-segment at the head position among all the sound source sub-segments.
  • an embodiment of the present application further provides an electronic device, including a processor and a memory for storing instructions executable by the processor; the processor is configured to execute the instructions to implement the above method for binding subtitles and audio sources.
  • an embodiment of the present application further provides a computer-readable storage medium; when the instructions in the computer-readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the above binding between subtitles and audio sources.
  • an embodiment of the present application further provides a computer program product, including a computer program, which implements the binding of subtitles and audio sources when executed by a processor.
  • the present application includes: determining the target audio source segment in the target video, and identifying the target subtitle segment from the target audio source segment; determining the relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source segment; and, in the case of performing an adjustment operation on the target audio source segment and/or the target subtitle segment, keeping the relative positional relationship unchanged.
  • the present application can bind the relative positional relationship between the header position of the target subtitle segment and that of the target audio source segment, so that the editing process of the main track and the editing process of subtitles and audio sources are isolated from each other: editing operations on the main track will not affect the alignment between the subtitles and the audio source, thereby reducing the chance of misalignment between the audio source and the subtitles.
  • FIG. 1 is a flowchart of steps of a method for binding subtitles and audio sources provided by an embodiment of the present application;
  • FIG. 2 is a binding interface diagram of a subtitle and audio source provided by an embodiment of the present application.
  • FIG. 3 is another binding interface diagram of subtitles and audio sources provided by an embodiment of the present application;
  • FIG. 4 is a flowchart of steps of another method for binding subtitles and audio sources provided by an embodiment of the present application;
  • FIG. 5 is another binding interface diagram of subtitles and audio sources provided by an embodiment of the present application;
  • FIG. 6 is another binding interface diagram of subtitles and audio sources provided by an embodiment of the present application;
  • FIG. 7 is another binding interface diagram of subtitles and audio sources provided by an embodiment of the present application;
  • FIG. 8 is an interface diagram of a subtitle processing operation provided by an embodiment of the present application.
  • FIG. 9 is another binding interface diagram of subtitles and audio sources provided by an embodiment of the present application;
  • FIG. 10 is another interface diagram of binding between subtitles and audio sources provided by an embodiment of the present application.
  • FIG. 11 is another interface diagram for binding subtitles and audio sources provided by an embodiment of the present application;
  • FIG. 12 is another binding interface diagram of subtitles and audio sources provided by an embodiment of the present application;
  • FIG. 13 is a block diagram of an apparatus for binding subtitles and audio sources provided by an embodiment of the present application.
  • FIG. 14 is a logical block diagram of an electronic device according to an embodiment of the present application.
  • FIG. 15 is a logical block diagram of an electronic device according to another embodiment of the present application.
  • FIG. 1 is a flowchart of steps of a method for binding subtitles and audio sources provided by an embodiment of the present application.
  • the method may be executed by a server, a processor, a vehicle-mounted device, a mobile device, a computing device, and the like. As shown in Figure 1, the method may include:
  • Step 101 Determine the target audio source segment in the target video, and identify the target subtitle segment from the target audio source segment.
  • a video can usually include one or more audio clips.
  • Audio clips refer to the sound clips that appear in the video and are a kind of timbre resource.
  • the types of audio clips can include the main track original sound source, the picture-in-picture original sound source, and inserted music/recording/dubbing sound sources, etc. The main track original sound source is the original sound content of the video; the picture-in-picture original sound source is the original sound content of a picture-in-picture video inserted into the video; and an inserted music/recording/dubbing sound source refers to music, recording, or dubbing additionally inserted into the video.
  • the target audio clip can be extracted by analyzing the content of the target video, and the target audio clip can be any clip of all the audio clips included in the target video.
  • the target sound source segment can be subjected to speech recognition to obtain the corresponding text, and the recognized text content can be used as the target subtitle segment identified by the target audio source segment.
  • the target subtitle segment can be used as the display subtitles of the target audio source segment.
  • for example, the subtitle clips corresponding to the original soundtrack, as well as those corresponding to the narration dubbing, can be obtained through speech recognition.
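A minimal sketch of this identification step follows, assuming a hypothetical `transcribe(audio)` speech-recognition function (any real ASR engine could stand in for it); all names here are illustrative, not the patent's implementation. The recognized text becomes the subtitle segment, initially sharing the audio segment's header position.

```python
def transcribe(audio_segment):
    # Hypothetical ASR stand-in; a real engine would decode the waveform.
    return audio_segment.get("demo_text", "")

def identify_subtitle_segment(audio_segment):
    """Identify a target subtitle segment from a target audio source segment:
    run speech recognition and attach the text at the audio's header position."""
    return {
        "text": transcribe(audio_segment),
        "head": audio_segment["head"],          # initial subtitle header
        "duration": audio_segment["duration"],  # initial subtitle length
    }

segment = {"head": 10.0, "duration": 30.0, "demo_text": "hello world"}
print(identify_subtitle_segment(segment)["text"])  # hello world
```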
  • Step 102 Determine the relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source segment.
  • the video has a fixed playback duration, and the corresponding playback timing can be obtained from the playback duration.
  • the video is played from the starting point 0 minutes and 0 seconds to the end point of 10 minutes and 30 seconds.
  • the main video track is composed of multiple video frames of the video, and can display the video in the form of a frame-sequence stream according to the video's playback timing. Users can operate on the main track of the video, so as to conveniently view, select, and edit content at different positions in the video.
  • the target audio source segment and the corresponding target subtitle segment need to be aligned.
  • a common situation in which the audio source and subtitles are not aligned is that the audio source and the subtitles in the picture are misaligned, such as lyrics being out of sync with the music.
  • here, alignment does not only mean that the pronunciation moment of a word in the audio source must completely overlap with the display moment of the corresponding text in the subtitle.
  • in the related art, both the audio source segment and the subtitle segment are bound to the main track of the video; therefore, as the user performs normal editing operations on the main track, the subtitle segment is inevitably affected, and the possibility of misalignment between the audio source and the subtitle is greatly increased.
  • in the embodiments of the present application, the target subtitle segment and the corresponding target audio source segment can be bound, so that operations on the main track will not affect the alignment between the subtitles and the audio source, thereby reducing the chance of misalignment between the audio source and the subtitles.
  • the target subtitle segment has a subtitle header position
  • the target audio source segment has an audio source header position.
  • the position of the subtitle head can be understood as the time corresponding to the starting point of the target subtitle segment on the main track
  • the position of the audio source head can be understood as the time corresponding to the starting point of the target audio segment on the main track
  • binding the target subtitle clip with the corresponding target audio clip can be achieved by binding the subtitle header position of the target subtitle clip with the audio source header position of the target audio clip, and keeping the relative positional relationship between the two header positions unchanged.
  • FIG. 2 shows a binding interface diagram of a subtitle and an audio source provided by an embodiment of the present application
  • for example, speech recognition is performed on the two target audio source segments A and B to obtain a target subtitle segment a corresponding to the target audio source segment A and a target subtitle segment b corresponding to the target audio source segment B.
  • the target audio source segments A and B can be displayed in the audio source track 10, and the target subtitle segments a and b can be displayed in the subtitle track 20.
  • the audio source track, subtitle track and the main track of the target video use the same timing sequence.
  • for example, the target subtitle segment a can be displayed as soon as the target video starts playing. The binding between the target audio source segment A and the target subtitle segment a can then be: determine the relative positional relationship between the audio source header position (00:10) of the target audio source segment A and the subtitle header position (00:00) of the target subtitle segment a, and keep that relative positional relationship unchanged, so as to align the two. For the requirement of strict phonetic alignment between the target audio source segment B and the target subtitle segment b, the binding can be: determine the relative positional relationship between the audio source header position (00:50) of the target audio source segment B and the subtitle header position (00:50) of the target subtitle segment b, and keep that relative positional relationship unchanged, so as to align the two.
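The head-position binding described above can be sketched as a simple data structure. The names (`Segment`, `SubtitleBinding`) are illustrative assumptions, not the patent's actual implementation; the key idea is that the binding stores only the signed offset between the two header positions, so alignment can always be restored from that one number.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    head: float      # header position on the shared timeline, in seconds
    duration: float  # segment length in seconds

@dataclass
class SubtitleBinding:
    """Binds a subtitle segment to an audio source segment by the relative
    positional relationship between their header positions."""
    audio: Segment
    subtitle: Segment

    def __post_init__(self):
        # Offset = subtitle head - audio head; kept unchanged by adjustments.
        self.offset = self.subtitle.head - self.audio.head

    def restore_alignment(self):
        """Re-derive the subtitle head from the audio head and stored offset."""
        self.subtitle.head = self.audio.head + self.offset

# FIG. 2 example: subtitle a leads audio A by 10 s; subtitle b is strictly
# aligned with audio B (both headers at 00:50).
binding_a = SubtitleBinding(Segment(10.0, 30.0), Segment(0.0, 30.0))
binding_b = SubtitleBinding(Segment(50.0, 20.0), Segment(50.0, 20.0))
print(binding_a.offset, binding_b.offset)  # -10.0 0.0
```

If the audio segment is later moved as a whole, calling `restore_alignment()` moves the subtitle with it, which is exactly the "relative positional relationship kept unchanged" behavior.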
  • Step 103 In the case of performing the adjustment operation on the target audio source segment and/or the target subtitle segment, keep the relative positional relationship unchanged.
  • subsequent adjustment operations on the target audio source segment and the target subtitle segment, as long as they do not involve adjusting the subtitle header position or the audio source header position, will not affect the binding of the above relative positional relationship; apart from the header positions, the target audio source clip and the target subtitle clip can be adjusted according to actual needs.
  • the above relative positional relationship can also be kept unchanged, so as to achieve the purpose of aligning subtitles and audio sources.
  • in the related art, when an audio source segment is not the original sound source of the main track and operations such as speed change and trimming are performed on the main track, the subtitle segment corresponding to the audio source segment will undergo the speed change or trimming along with the main track, while the non-main-track audio source segment itself will not follow the change, resulting in serious misalignment between the audio source and its subtitles.
  • FIG. 2 and FIG. 3 show another interface diagram for binding subtitles and audio sources provided by an embodiment of the present application.
  • in FIG. 3, the target audio source segments A and B are both non-main-track original audio sources. When the user performs a speed change on the main track, since the target audio clip A is a non-main-track original sound source, the target audio clip A will not follow the speed change; and because the target subtitle clip a is not bound to the main track but has its header position bound to that of the target audio clip A, the target subtitle clip a will not follow the speed change either. In this way the relative positional relationship between the header positions remains unchanged, and alignment is achieved.
  • in some embodiments of the present application, in response to the user performing a 2x speed change on the area 31 in the main track 30 of the video, the target audio source segment A will also follow the 2x speed change.
  • in this case, the target subtitle segment a can be synchronously speed-changed according to the 2x speed change value, so as to achieve alignment.
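The synchronized speed change can be sketched as follows. This is a hedged illustration, not the patent's code: it assumes that applying the same speed-change value to both segments (dividing head positions and durations by the factor, with heads measured from the start of the speed-changed region) is what preserves the relative positional relationship in the new timeline.

```python
def apply_speed_change(audio, subtitle, factor):
    """Speed-change the audio segment by `factor` (e.g. 2.0 for 2x) and
    propagate the same change to its bound subtitle segment. Segments are
    (head, duration) tuples in seconds; heads are assumed measured from the
    start of the speed-changed region, so both scale by the same factor."""
    new_audio = (audio[0] / factor, audio[1] / factor)
    new_subtitle = (subtitle[0] / factor, subtitle[1] / factor)
    return new_audio, new_subtitle

# 2x speed change on audio segment A (head 10 s, 30 s long) and subtitle a.
a2, s2 = apply_speed_change((10.0, 30.0), (0.0, 30.0), 2.0)
print(a2, s2)  # (5.0, 15.0) (0.0, 15.0)
```

Because both segments are scaled identically, the head offset between them scales by the same factor, so neither drifts relative to the other.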
  • when the user has trimmed the area 32 in the main track 30 of the video, since the target audio clip B is a non-main-track original sound source, the target audio clip B will not follow the trimming; and because the target subtitle clip b is not bound to the main track but has its header position bound to that of the target audio clip B, the target subtitle clip b will not follow the trimming either. In this way the relative positional relationship between the header positions remains unchanged, and alignment is achieved.
  • by contrast, if the target audio source segment B is the main track original sound source, the part of the target subtitle segment b corresponding to the region 32 will, in the related art, also be trimmed, resulting in part of the subtitle information being missing.
  • in this embodiment of the present application, even though the part of the target audio source segment B corresponding to the region 32 is cropped, the part of the target subtitle segment b corresponding to the region 32 is not cropped, thereby avoiding the loss of part of the subtitle information and ensuring the integrity of the subtitles.
  • a method for binding subtitles and audio sources includes: determining a target audio source segment in a target video, and identifying the target subtitle segment from the target audio source segment; determining the relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source segment; and, in the case of performing an adjustment operation on the target audio source segment and/or the target subtitle segment, keeping the relative positional relationship unchanged.
  • the present application can bind the relative positional relationship between the header position of the target subtitle segment and that of the target audio source segment, so that the editing process of the main track and the editing process of subtitles and audio sources are isolated from each other: editing operations on the main track will not affect the alignment between the subtitles and the audio source, thereby reducing the chance of misalignment between the audio source and the subtitles.
  • FIG. 4 is a flowchart of steps of a method for binding subtitles and audio sources provided by an embodiment of the present application, and the method may be executed by a server, a processor, a vehicle-mounted device, a mobile device, a computing device, and the like. As shown in Figure 4, the method may include:
  • Step 201 Determine the target audio source segment in the target video, and identify the target subtitle segment from the target audio source segment.
  • Step 202 Determine the relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source segment.
  • Step 203 In the case of performing the adjustment operation on the target audio source segment and/or the target subtitle segment, keep the relative positional relationship unchanged.
  • step 203 may include:
  • Sub-step 2031 In the case that there is no overlap between the adjusted target audio source segment and the target subtitle segment, trigger a subtitle processing operation.
  • the user can adjust the position, length, etc. of the target audio source segment and/or the target subtitle segment based on the alignment requirements of the subtitle and audio source, thereby changing the relationship between the entire target audio source segment and the entire target subtitle segment.
• The subtitle processing operation is used either to delete the target subtitle segment at this time, so as to avoid the influence of a misplaced subtitle, or to retain the target subtitle segment so that it can be reused by subsequent editing operations.
• The case where there is no overlap between the adjusted target audio source segment and the target subtitle segment includes: the case where the target audio source segment is deleted; the case where the adjustment operation of changing the subtitle header position and/or the audio source header position is performed; or the case where the target audio source segment is divided into multiple audio source sub-segments and the audio source sub-segment located at the head of the multiple audio source sub-segments is deleted.
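• The "no overlap" condition above can be checked with a half-open interval intersection test. The sketch below is illustrative only; the `trigger_subtitle_processing` hook and the 30-second subtitle duration are assumptions, and all times are expressed in seconds.

```python
def overlaps(head_a, dur_a, head_b, dur_b):
    """True when two clips share any playback time (half-open intervals)."""
    return head_a < head_b + dur_b and head_b < head_a + dur_a

def trigger_subtitle_processing():
    """Hypothetical hook standing in for the keep-or-delete subtitle dialog."""
    return "subtitle processing triggered"

# After the adjustment in FIG. 6: audio head at 01:20 (80 s),
# subtitle head at 00:40 (40 s), with an assumed 30 s subtitle duration.
result = None
if not overlaps(80.0, 50.0, 40.0, 30.0):
    result = trigger_subtitle_processing()
```

Because the subtitle ends at 01:10, before the audio begins, the test fails and the processing operation is triggered.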
• FIG. 5 shows another interface diagram for binding subtitles and audio sources provided by the embodiment of the present application.
• Suppose the deletion of the target audio source segment A is performed. After the deletion, there is no overlap between the target subtitle segment a and the target audio source segment A, so the subtitle processing operation for the target subtitle segment a can be triggered, thereby realizing the processing of the misplaced subtitle segment after the entire audio source segment is deleted.
  • FIG. 6 shows another interface diagram for binding subtitles and audio sources provided by the embodiment of the present application.
• Suppose the head position of the target audio source segment B is adjusted to the position corresponding to the time 01:20, and the head position of the target subtitle segment b is adjusted to the position corresponding to the time 00:40. After the adjustment, there is no overlap between the target subtitle segment b and the target audio source segment B, so the subtitle processing operation for the target subtitle segment b can be triggered. In this way, when the subtitle and the audio source become completely misaligned after their head positions are adjusted, the misplaced subtitle segment is processed.
• Referring to FIG. 7, FIG. 7 shows another interface diagram of the binding interface between subtitles and audio sources. Based on the state shown in FIG. 2, the target audio source segment A is first divided into three audio source sub-segments, and the audio source sub-segment at the head position is then deleted. After this adjustment, the target subtitle segment a lacks the audio source head position with which its binding relationship was established, so there is no overlap between the adjusted target audio source segment A and the target subtitle segment a. At this time, the subtitle processing operation for the target subtitle segment a can be triggered, thereby realizing the processing of the misplaced subtitle segment when the subtitle and the audio source are completely misaligned.
• Sub-step 2032 In response to the subtitle processing operation of retaining the subtitle, bind the target subtitle segment to the main track of the target video.
• In the case of responding to the subtitle processing operation of retaining the subtitle, the target subtitle segment can be bound to the main track of the target video for subsequent processing of the retained target subtitle segment.
  • the subtitle header position of the target subtitle segment can be bound with the corresponding moment of the subtitle header position on the main track of the target video, thereby achieving the purpose of temporarily retaining the target subtitle segment.
  • the position of the subtitle header (00:40) of the adjusted target subtitle segment b can be bound to the position corresponding to the time 00:40 on the main track.
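• One way to model this binding of a retained subtitle to a main-track moment (sub-step 2032) is a simple mapping from subtitle to anchored time; the dictionary and function below are illustrative assumptions, not part of the disclosed embodiments.

```python
# Illustrative mapping from a subtitle identifier to the main-track moment
# (in seconds) that its subtitle header is anchored to.
main_track_bindings = {}

def bind_to_main_track(subtitle_id, subtitle_head):
    """Anchor a retained subtitle's head to the same moment on the main track
    (sub-step 2032), so later audio-track edits cannot move it."""
    main_track_bindings[subtitle_id] = subtitle_head

# Target subtitle segment b: its header at 00:40 is bound to main-track
# time 00:40 (= 40 seconds).
bind_to_main_track("b", 40.0)
```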
• Subsequent processing operations performed on the target subtitle segment may include the following scenarios. For example, in one scenario, the user wants to replace a male voice audio source clip with a machine voice audio source clip. The user can delete the entire male voice audio source clip and bind the subtitle clip corresponding to it to the main track, then wait for the machine voice audio source clip to be generated and insert it into the position of the original male voice audio source clip, so that the subtitle clip re-establishes the alignment relationship with the machine voice audio source clip.
  • the audio source segment can also be deleted, and only the corresponding subtitle segment is kept bound to the main track, so that only the content of the segment is described in the form of text during playback.
  • the target subtitle segment may be deleted. That is, when the user thinks that the misplaced target subtitle segment is of no use value, the subtitle processing operation of deleting subtitles can also be performed, so as to delete the target subtitle segment and avoid the interference caused by the misplaced target subtitle segment.
  • the triggered subtitle processing operation may be provided in the form of an interface.
• FIG. 8 shows an interface diagram of a subtitle processing operation provided by the embodiment of the present application, including controls for realizing the subtitle processing operation: the reminder text "This audio recognizes subtitles, delete them together?", a "Remove recognized subtitles and audio" button, and a "Remove audio only" button.
• If the user triggers the "Remove recognized subtitles and audio" button, the audio and the corresponding subtitles will be deleted together. If the user triggers the "Remove audio only" button, only the audio will be deleted, and the subtitles corresponding to the audio will be bound to the main track.
• The subsequent subtitle processing algorithm is divided into a processing part for speech-recognized subtitles and a processing part for manually added subtitles. Therefore, in order to avoid conflicts between the two parts of the algorithm, in the embodiment of the present application, the attribute of the subtitle segment recognized from the audio source segment can be set to the speech-recognition subtitle attribute by default, so that the speech-recognition processing part of the algorithm only processes subtitle segments with the speech-recognition subtitle attribute. After the target subtitle segment is bound to the main track, the attribute of the subtitle clip can be changed to the manually-added subtitle attribute; that is, the subtitle clip is regarded as a subtitle manually added by the user, so that the processing part for manually added subtitles in the algorithm only processes subtitle segments with the manually-added subtitle attribute.
• The user can also save the adjustment state of the subtitle clips and audio source clips at the current moment as an old draft. In that case, all target subtitle clips can be bound to the main track and an old draft file can be created. Since the subtitle clips are bound to the main track, the old draft file only contains the subtitle information and the main track information, and does not contain the audio source information with a large file size, thus saving storage resources.
  • step 203 may further include:
  • Sub-step 2033 In the case that there is an overlap between the adjusted target audio source segment and the target subtitle segment, detect the adjusted new subtitle header position and audio source header position.
  • the user can adjust the position, length, etc. of the target audio source segment and/or the target subtitle segment based on the alignment requirements of the subtitle and audio source, thereby changing the relationship between the entire target audio source segment and the entire target subtitle segment.
• After the adjustment operation, if there is an overlap between the target audio source segment and the target subtitle segment, it can be considered that the target audio source segment and the target subtitle segment are in the user-adjusted alignment state, and the adjusted new subtitle head position and audio source head position can be detected at this time.
  • Sub-step 2034 Update the relative positional relationship according to the new subtitle header position and audio source header position, and keep the updated relative positional relationship unchanged.
• After the detection, the original relative positional relationship can be updated, and the updated relative positional relationship can be kept unchanged, so as to satisfy the user's need to adjust the alignment state of the subtitles and the audio source.
• For example, suppose that before the adjustment, the head positions of the target audio source segment A and the target subtitle segment a both coincide with the time 00:00. After the adjustment operation, the audio source head position of the target audio source segment A is at the time 00:10 and the subtitle head position of the target subtitle segment a is at the time 00:00. According to the adjusted result, the new relative positional relationship between the new audio source header position (00:10) of the target audio source segment A and the subtitle header position (00:00) of the target subtitle segment a is determined, and this relative positional relationship is kept unchanged.
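• Sub-steps 2031 to 2034 amount to a single dispatch after each adjustment: if the clips still overlap, the relative relationship is re-derived from the new head positions; otherwise a subtitle processing operation is triggered. A sketch under the assumption of seconds-based positions (the function name is illustrative):

```python
def after_adjustment(audio_head, audio_dur, sub_head, sub_dur):
    """Return the updated relative offset (sub-steps 2033/2034), or None when
    the clips no longer overlap and a subtitle processing operation should be
    triggered instead (sub-step 2031)."""
    still_overlap = (audio_head < sub_head + sub_dur
                     and sub_head < audio_head + audio_dur)
    if not still_overlap:
        return None  # trigger the subtitle processing operation
    return sub_head - audio_head  # freeze the new relative relationship

# Example above: audio head moved to 00:10, subtitle head left at 00:00.
offset = after_adjustment(10.0, 50.0, 0.0, 40.0)
```

The clips still overlap, so a new offset of -10 seconds is frozen.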
• The target audio source segment is displayed on an audio source track; the target subtitle segment is displayed on a subtitle track; the audio source track, the subtitle track and the main track of the target video use the same timing; and the target audio source segment is bound to the main track.
  • the editing interface may include three operable adjustment tracks: a main track 30 , an audio source track 10 and a subtitle track 20 of the target video.
  • the main track 30 can display the target video in the form of a frame sequence stream according to the playback sequence of the target video, and the user can operate the main track 30 to conveniently view, select and edit the content of the target video at different positions
  • the audio source track 10 is used to carry and display audio clips, and the user can adjust the audio clips on the audio track 10
  • the subtitle track 20 is used to carry and display subtitle clips, and the user can adjust the subtitle clips on the subtitle track 20.
• The target sound source segment can be bound to the main track 30. Specifically, the audio source head position of the target sound source segment can be bound to the corresponding moment of that head position on the main track 30 of the target video, so as to achieve the purpose of binding the target audio source clip and the main track.
  • the adjustment operation on the main track will also affect the length and position of the target sound source clip.
• If the target sound source clip is the original sound source of the main track, the target sound source clip will also change accordingly when the main track is adjusted.
• If the target sound source clip is not the original sound source of the main track, then when trimming, shifting, deleting and other adjustment operations are performed in the area corresponding to the target sound source clip on the main track, the target sound source clip will not change as long as the adjustment operation does not change the position of the corresponding time of the head position of the target sound source clip on the main track.
• For example, if the middle or tail of the region corresponding to the target sound source clip on the main track is trimmed, the target sound source clip will not change; however, if the head of that region is cropped, the target sound source clip will be deleted.
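• The behaviour described above, where a bound audio source clip survives main-track edits unless the moment its head is bound to is removed, might be modelled as follows. This is a simplified sketch; the function name and the half-open trim interval are assumptions, not the disclosed implementation.

```python
def audio_clip_after_trim(bound_head_time, trim_start, trim_end):
    """Trim [trim_start, trim_end) out of the main track and report what
    happens to an audio clip whose head is bound at bound_head_time: the clip
    is deleted only when the bound moment itself is removed; a cut before it
    shifts the bound moment earlier; a cut after it leaves it untouched."""
    if trim_start <= bound_head_time < trim_end:
        return None  # the bound moment was cropped: delete the audio clip
    shift = (trim_end - trim_start) if trim_end <= bound_head_time else 0.0
    return bound_head_time - shift

# Cut before the bound head shifts it; cut covering it deletes the clip;
# cut after it changes nothing.
shifted = audio_clip_after_trim(30.0, 0.0, 10.0)
deleted = audio_clip_after_trim(5.0, 0.0, 10.0)
untouched = audio_clip_after_trim(5.0, 20.0, 30.0)
```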
  • step 203 may include:
  • Sub-step 2035 in the case of performing the adjustment operation of changing the position of the subtitle header and/or the position of the audio source header, update the relative positional relationship according to the changed subtitle header position and the audio source header position, And keep the updated relative position relationship unchanged.
• The user may adjust the head position of the target audio source segment and/or the target subtitle segment based on the alignment requirements of the subtitle and the audio source, thereby changing the alignment between the entire target audio source segment and the entire target subtitle segment. After the adjustment operation of changing the subtitle head position and/or the audio source head position is completed, the relative positional relationship can be updated according to the changed positions, and the updated relative positional relationship can be kept unchanged, so as to satisfy the user's requirements for adjusting the alignment state of the subtitles and the audio source.
• For example, suppose that before the adjustment, the head positions of the target audio source segment A and the target subtitle segment a both coincide with the time 00:00. After the adjustment operation, the audio source head position of the target audio source segment A is at the time 00:10 and the subtitle head position of the target subtitle segment a is at the time 00:00. According to the adjusted result, the new relative positional relationship between the new audio source header position (00:10) of the target audio source segment A and the subtitle header position (00:00) of the target subtitle segment a is determined, and this relative positional relationship is kept unchanged.
  • step 203 may include:
  • Sub-step 2036 in the case of performing the adjustment operation of moving the position of the target audio source segment as a whole, move the position of the target subtitle segment along with the target audio source segment as a whole, and keep the relative positional relationship unchanged.
  • FIG. 9 shows another interface diagram for binding subtitles and audio sources provided by the embodiment of the present application.
• Suppose the adjustment operation exchanges the positions of the target audio source segments A and B in FIG. 2; FIG. 9 shows the state after the exchange. The positions of the target subtitle clips a and b move as a whole with their corresponding target audio source clips, keeping the relative positional relationship unchanged. This achieves the effect that, when the user moves an audio source segment, the corresponding subtitle segment also moves along with it, which saves the user time otherwise spent re-aligning the audio source segment and the subtitle segment.
  • step 203 may include:
• Sub-step 2037 In the case of performing the adjustment operation of replacing the target audio source segment, determine a new relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the replaced target audio source segment, and keep the new relative positional relationship unchanged.
• The entire target audio source segment may be replaced, and a new relative positional relationship is determined between the subtitle header position of the target subtitle segment and the audio source header position of the replaced target audio source segment, thereby ensuring alignment between the subtitles and the replaced audio source.
  • the replaced target sound source segment may be consistent with the position and duration of the pre-replacement target sound source segment.
  • the replaced target sound source segment may also be inconsistent with the position and/or duration of the pre-replacement target sound source segment.
  • this embodiment of the present application does not limit this.
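• Sub-step 2037 can be reduced to re-deriving the frozen offset against the replacement clip's head; the helper below is an illustrative assumption, not the disclosed implementation, with positions in seconds.

```python
def rebind_after_replacement(subtitle_head, new_audio_head):
    """Sub-step 2037: derive the new relative positional relationship between
    the subtitle header and the replacement audio clip's header, which is then
    kept unchanged from this point on."""
    return subtitle_head - new_audio_head

# A replacement at the same position keeps the original offset of 0;
# a replacement whose head starts 5 s later yields a new frozen offset of -5.
same_position_offset = rebind_after_replacement(0.0, 0.0)
shifted_offset = rebind_after_replacement(0.0, 5.0)
```

Either way, the new offset is frozen, so the patent's point that the replacement may or may not match the original position and duration does not disturb subsequent alignment.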
  • step 203 may include:
  • Sub-step 2038 In the case of performing a speed change operation on the target audio source segment according to a preset speed change value, perform a speed change operation on the target subtitle segment according to the speed change value, and keep the relative positional relationship unchanged .
  • Video speed change is a common function in video/audio editing scenarios.
• Through the speed change, the user can speed up or slow down the video/audio according to the ratio corresponding to the speed change value.
  • the video/audio may be fast-forwarded at 2 times the speed, so that the duration of the video/audio is shortened by half.
• In the case of performing a variable speed operation on the target audio clip according to a preset variable speed value, the target subtitle clip can also be subjected to a variable speed operation according to the same variable speed value, and the relative positional relationship between the head position of the target audio clip and the head position of the target subtitle clip is kept unchanged. For example, when the target audio clip is played at a 2x speed change value, the target subtitle clip can also be played at a 2x speed change value, so as to keep the subtitles aligned with the audio source after the speed change adjustment operation.
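• The speed change of sub-step 2038 scales durations while the head positions, and hence the relative positional relationship, stay put. A sketch (the seconds-based helper is an assumption):

```python
def apply_speed(duration, speed):
    """Playback duration after a speed change; speed 2.0 means twice as fast,
    so the clip occupies half its original time on the track."""
    return duration / speed

# Audio and subtitle follow the same preset speed value, so their head
# positions, and therefore the relative positional relationship, are unchanged.
audio_duration = apply_speed(60.0, 2.0)     # 60 s of audio plays in 30 s
subtitle_duration = apply_speed(60.0, 2.0)  # the subtitle follows the same value
```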
  • the target sound source clip is bound to the main track of the target video.
• In this case, a variable speed operation can be performed on the area corresponding to the target sound source clip on the main track, so that the target sound source clip achieves the variable speed. Alternatively, the sound source track has a corresponding variable speed effect, so the variable speed operation can also be performed directly on the target sound source clip on the sound source track.
  • sub-step 2038 may include:
  • Sub-step A1 In the case that the first duration of the target audio source segment before shifting is greater than the second duration of the target subtitle segment, perform a shifting operation on the target subtitle segment according to the preset shifting value.
  • the speed change logic may be optimized by further comparing the size of the first duration of the target audio source segment before the speed change with the second duration of the target subtitle segment.
• When the first duration of the target audio clip is greater than the second duration of the target subtitle clip, whether the target audio clip and the target subtitle clip are sped up or slowed down according to the same variable speed value, the duration of the target audio clip after the speed change remains longer than the duration of the target subtitle clip after the speed change. Keeping the audio source duration longer than the subtitle duration provides a better playback effect and better matches the user's viewing habits. Therefore, in the case where the first duration is greater than the second duration, the target subtitle segment may be subjected to a speed change operation according to the preset speed change value.
• FIG. 10 shows another interface diagram for binding subtitles and audio sources provided by an embodiment of the present application, which shows the target audio source segments C and D before the speed change and the target subtitle segments c and d before the speed change. For the target audio source segment C and the target subtitle segment c, since the first duration of the target audio source segment C before the speed change is greater than the second duration of the target subtitle segment c, after the target audio source segment C is changed according to a 2x speed change value, the target subtitle segment c can also follow the 2x speed change.
  • Sub-step A2 In the case where the first duration of the target audio source segment before the speed change is less than the second duration of the target subtitle segment, the part of the first duration in the target subtitle segment is adjusted according to the preset speed change value. Perform a shifting operation.
• Otherwise, the duration of the target subtitle segment after the speed change would be greater than the duration of the target audio clip after the speed change, which greatly increases the probability that the target subtitle segment is too long; an overly long target subtitle segment in turn increases the probability that it overlaps with other audio source segments. Such an overlap between the target subtitle segment and other audio source segments would reduce the playback effect and conflict with the user's viewing habits.
• Therefore, in the case where the first duration is less than the second duration, only the part of the target subtitle segment within the first duration is subjected to the speed change. For example, FIG. 10 shows the target audio source segments C and D before the speed change, and the target subtitle segments c and d before the speed change. Since the first duration of the target audio source segment D is less than the second duration of the target subtitle segment d, after the target audio source segment D is changed according to the 2x speed change value, the part of the target subtitle segment d corresponding to the first duration (00:50-01:30) also follows the 2x speed change, while the part of the target subtitle segment d beyond the first duration (01:30-01:40) keeps the original playback speed (1x) unchanged.
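• Sub-steps A1 and A2 together determine the subtitle's duration after the speed change. The sketch below reproduces the FIG. 10 numbers (audio D spans 00:50-01:30, i.e. 40 s; subtitle d spans 00:50-01:40, i.e. 50 s; 2x speed); the function name is an assumption.

```python
def subtitle_duration_after_speed(audio_dur, sub_dur, speed):
    """Sub-step A1: when the audio's first duration covers the subtitle,
    speed-change the whole subtitle. Sub-step A2: otherwise speed-change only
    the part of the subtitle within the first duration; the remainder keeps
    its original 1x speed."""
    if audio_dur >= sub_dur:              # A1: whole subtitle follows the value
        return sub_dur / speed
    # A2: only the first `audio_dur` seconds of the subtitle are speed-changed
    return audio_dur / speed + (sub_dur - audio_dur)

# FIG. 10 example: 40 s of subtitle d at 2x (-> 20 s) plus 10 s left at 1x.
d_after = subtitle_duration_after_speed(40.0, 50.0, 2.0)
```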
  • step 203 may include:
  • Sub-step 2039 In the case of performing the adjustment operation of dividing the target sound source segment, obtain a plurality of divided sound source sub-segments.
• Sub-step 20310 Establish a new relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source sub-segment, and keep the new relative positional relationship unchanged, where the target audio source sub-segment is the audio source sub-segment at the head position among all the audio source sub-segments.
• The user can also divide the target audio source segment according to actual needs to obtain multiple audio source sub-segments. After the segmentation operation, the subtitle header position of the target subtitle segment is bound to the audio source header position of the target audio source sub-segment located at the head after the division. Since the target subtitle segment is obtained by speech recognition of the entire target audio source segment, even if the target audio source segment is divided, the integrity of the target subtitle segment will not be destroyed.
• The embodiment of the present application can bind the subtitle header position of the target subtitle segment with the audio source header position of the target audio source sub-segment at the head position after the segmentation, so as to maintain, after the segmentation operation, the head binding relationship between the target subtitle segment and the divided audio source sub-segments. For example, referring to FIG. 11, FIG. 11 shows another interface diagram for binding subtitles and audio sources provided by an embodiment of the present application. After the segmentation operation, the target audio source segment A is divided into audio source sub-segment 1, audio source sub-segment 2 and audio source sub-segment 3; after the segmentation process, the target subtitle segment a is bound to the audio source header position of the audio source sub-segment 1 at the head, that is, the target subtitle segment a remains bound to the audio source sub-segment 1 at the head.
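• The segmentation of sub-steps 2039 and 20310 can be sketched as splitting the clip at offsets and rebinding the subtitle to the head sub-clip. The representation of clips as `(head, duration)` pairs in seconds is an assumption.

```python
def split_audio(head, duration, cut_offsets):
    """Divide an audio clip at the given offsets (seconds from its head) and
    return the resulting sub-clips as (head, duration) pairs, in order."""
    bounds = [0.0] + sorted(cut_offsets) + [duration]
    return [(head + a, b - a) for a, b in zip(bounds, bounds[1:])]

# FIG. 11 style example: segment A split into sub-clips 1, 2 and 3; the
# target subtitle then rebinds to the head of the first sub-clip.
sub_clips = split_audio(0.0, 60.0, [20.0, 40.0])
head_sub_clip = sub_clips[0]  # subtitle a binds to this sub-clip's header
```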
  • the method may further include:
  • Step 204 In the case that the target subtitle segment is moved beyond the boundary of the main track of the target video, delete the part of the target subtitle segment that exceeds the boundary.
  • FIG. 12 shows another interface diagram for binding subtitles and audio sources provided by the embodiment of the present application.
• In response to an overall movement of the target subtitle segment a on the subtitle track 20, if part of the target subtitle segment a is moved beyond the boundary of the main track 30 (time 00:00), the part of the target subtitle segment a beyond the boundary can be deleted; if the entire target subtitle segment a is moved beyond the boundary, the entire target subtitle segment a can be deleted.
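• Step 204 can be sketched as clipping the moved subtitle against the main track's start boundary; the seconds-based helper below is illustrative only.

```python
def clip_to_main_track(sub_head, sub_dur, track_start=0.0):
    """Step 204: after a move, drop whatever part of the subtitle lies before
    the main track's boundary; return None when nothing of it remains."""
    end = sub_head + sub_dur
    if end <= track_start:
        return None                    # moved entirely past the boundary: delete
    head = max(sub_head, track_start)  # trim only the out-of-bounds part
    return head, end - head

partly_out = clip_to_main_track(-10.0, 30.0)   # 10 s trimmed, 20 s kept
fully_out = clip_to_main_track(-30.0, 20.0)    # whole subtitle deleted
inside = clip_to_main_track(5.0, 10.0)         # unchanged
```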
  • the embodiments of the present application provide a convenient method for deleting subtitles, which improves user experience.
• A method for binding subtitles and audio sources includes: determining a target audio source segment in a target video, and identifying a target subtitle segment from the target audio source segment; determining a relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source segment; and, in the case of performing an adjustment operation on the target audio source segment and/or the target subtitle segment, keeping the relative positional relationship unchanged.
• The present application binds the relative positional relationship between the head positions of the target subtitle segment and the target audio source segment, so that the editing process of the main track and the editing process of the subtitles and audio sources are isolated from each other. An editing operation on the main track therefore does not affect the alignment between the subtitles and the audio source, thereby reducing the chance of misalignment between the audio source and the subtitles.
  • FIG. 13 is a block diagram of an apparatus for binding subtitles and audio sources provided by an embodiment of the present application. As shown in FIG. 13 , the apparatus includes an identification module 301 , a binding module 302 , and a holding module 303 .
  • the identification module 301 is configured to determine the target audio clip in the target video, and identify the target subtitle clip from the target audio clip;
  • the binding module 302 is configured to determine the relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source segment;
  • the maintaining module 303 is configured to maintain the relative positional relationship unchanged when the adjustment operation on the target audio source segment and/or the target subtitle segment is performed.
  • the holding module includes:
  • a triggering submodule configured to trigger a subtitle processing operation when there is no overlap between the adjusted target audio source segment and the target subtitle segment
  • the binding submodule is configured to bind the target subtitle segment with the main track of the target video in response to the subtitle processing operation of the retained subtitle.
• The case where there is no overlap between the adjusted target audio source segment and the target subtitle segment includes: the case where the target audio source segment is deleted; the case where the adjustment operation of changing the subtitle header position and/or the audio source header position is performed; or the case where the target audio source segment is divided into multiple audio source sub-segments and the audio source sub-segment located at the head of the multiple audio source sub-segments is deleted.
  • the apparatus further includes:
  • the detection module is configured to detect the adjusted new subtitle head position and the audio source head position when there is an overlap between the adjusted target audio source segment and the target subtitle segment;
  • the updating module is configured to update the relative positional relationship according to the new subtitle header position and the audio source header position, and keep the updated relative positional relationship unchanged.
  • the apparatus further includes:
  • the deletion module is configured to delete the part of the target subtitle segment beyond the boundary in the case of moving the target subtitle segment beyond the boundary of the main track of the target video.
  • the target audio source segment is placed on the audio source track for display; the target subtitle segment is placed on the subtitle track for display, and the audio source track, the subtitle track and the main track of the target video are displayed using The same timing; the target audio clip is bound to the main track.
  • the holding module includes:
• The update sub-module is configured to, when the adjustment operation of changing the subtitle head position and/or the audio source head position is performed, update the relative positional relationship according to the changed subtitle head position and audio source head position, and keep the updated relative positional relationship unchanged.
  • the holding module includes:
• The moving sub-module is configured to, in the case of performing an adjustment operation of moving the position of the target audio source segment as a whole, move the position of the target subtitle segment along with the target audio source segment as a whole, and keep the relative positional relationship unchanged.
  • the holding module includes:
• The replacement submodule is configured to, when performing the adjustment operation of replacing the target audio source segment, determine a new relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the replaced target audio source segment, and keep the new relative positional relationship unchanged.
  • the holding module includes:
  • variable speed sub-module is configured to perform a variable speed operation on the target subtitle segment according to the variable speed value and maintain the relative position in the case of performing a variable speed operation on the target audio source segment according to a preset variable speed value The relationship remains unchanged.
• the variable speed sub-module includes:
• The first speed-change unit is configured to perform a speed change operation on the target subtitle segment according to the preset speed change value when the first duration of the target audio source segment before the speed change is greater than the second duration of the target subtitle segment;
• The second speed-change unit is configured to, when the first duration of the target audio source segment before the speed change is less than the second duration of the target subtitle segment, perform the speed change operation on the part of the target subtitle segment within the first duration according to the preset speed change value.
  • the holding module includes:
  • a segmentation sub-module configured to obtain a plurality of divided audio sub-segments when performing an adjustment operation of segmenting the target audio segment
• The cropping submodule is configured to establish a new relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source sub-segment, and keep the new relative positional relationship unchanged, where the target audio source sub-segment is the audio source sub-segment at the head position among all the audio source sub-segments.
• An apparatus for binding subtitles and audio sources includes: an identification module configured to determine a target audio source segment in a target video and identify the target subtitle segment from the target audio source segment; a determining module configured to determine the relative positional relationship between the subtitle header position of the target subtitle segment and the audio source header position of the target audio source segment; and a maintaining module configured to, in the case of performing the adjustment operation on the target audio source segment and/or the target subtitle segment, keep the relative positional relationship unchanged.
  • the present application can bind the relative positional relationship between the head position of the target subtitle segment and that of the target audio source segment, so that the editing process of the main track and the editing of the subtitle and audio source are isolated from each other; an editing operation on the main track will not affect the alignment between the subtitle and the audio source, thereby reducing the chance of misalignment between the audio source and the subtitle.
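The core binding idea above can be condensed into a minimal sketch. The `Segment` class and function names are hypothetical; the sketch assumes the apparatus stores only the head-to-head offset, so any timeline move of the audio segment (for example, one rippled in by a main-track edit) carries the subtitle along with it.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float      # head position on the timeline, in seconds
    duration: float   # length of the segment, in seconds

def bind(subtitle: Segment, audio: Segment) -> float:
    """Record the relative positional relationship between the subtitle
    head position and the audio source head position."""
    return subtitle.start - audio.start

def move_audio(audio: Segment, subtitle: Segment, offset: float,
               new_start: float) -> None:
    """Adjust the audio source segment's position; the bound subtitle
    follows so the stored offset, not any absolute position, is kept."""
    audio.start = new_start
    subtitle.start = audio.start + offset
```

Because only the offset is fixed, main-track edits never need to touch the subtitle directly, which is exactly how the isolation described above avoids misalignment.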
  • FIG. 14 is a block diagram of an electronic device 600 according to an exemplary embodiment.
  • electronic device 600 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, personal digital assistant, and the like.
  • the electronic device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
  • the processing component 602 generally controls the overall operation of the electronic device 600, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 602 may include one or more processors 620 to execute instructions to perform all or some of the steps of the methods described above. Additionally, processing component 602 may include one or more modules that facilitate interaction between processing component 602 and other components. For example, processing component 602 may include a multimedia module to facilitate interaction between multimedia component 608 and processing component 602.
  • Memory 604 is used to store various types of data to support operation at electronic device 600. Examples of such data include instructions for any application or method operating on electronic device 600, contact data, phonebook data, messages, pictures, multimedia, and the like. Memory 604 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • Power supply assembly 606 provides power to various components of electronic device 600 .
  • Power supply components 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to electronic device 600 .
  • Multimedia component 608 includes a screen that provides an output interface between the electronic device 600 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP).
  • the screen may be implemented as a touch screen to receive input signals from a user.
  • the touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action.
  • the multimedia component 608 includes a front-facing camera and/or a rear-facing camera. When the electronic device 600 is in an operation mode, such as a shooting mode or a multimedia mode, the front camera and/or the rear camera may receive external multimedia data.
  • Each of the front and rear cameras can be a fixed optical lens system or have focus and optical zoom capability.
  • the audio component 610 is used for outputting and/or inputting audio signals.
  • the audio component 610 includes a microphone (MIC) for receiving external audio signals when the electronic device 600 is in an operation mode, such as a calling mode, a recording mode, or a voice recognition mode.
  • the received audio source signal may be further stored in memory 604 or transmitted via communication component 616 .
  • the audio component 610 further includes a speaker for outputting audio signals.
  • the I/O interface 612 provides an interface between the processing component 602 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to: home button, volume buttons, start button, and lock button.
  • Sensor assembly 614 includes one or more sensors for providing status assessment of various aspects of electronic device 600 .
  • the sensor assembly 614 can detect the open/closed state of the electronic device 600 and the relative positioning of components (for example, the display and keypad of the electronic device 600); the sensor assembly 614 can also detect a change in position of the electronic device 600 or a component thereof, the presence or absence of user contact with the electronic device 600, the orientation or acceleration/deceleration of the electronic device 600, and a change in the temperature of the electronic device 600.
  • Sensor assembly 614 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
  • Sensor assembly 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor assembly 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 616 is used to facilitate wired or wireless communication between electronic device 600 and other devices.
  • Electronic device 600 may access wireless networks based on communication standards, such as WiFi, carrier networks (eg, 2G, 3G, 4G, or 5G), or a combination thereof.
  • the communication component 616 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 616 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • electronic device 600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the method for binding a subtitle and an audio source provided by the embodiments of the present application.
  • a non-transitory computer-readable storage medium including instructions, such as the memory 604 including instructions, is provided; the instructions are executable by the processor 620 of the electronic device 600 to perform the method described above.
  • the non-transitory storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.
  • FIG. 15 is a block diagram of an electronic device 700 according to an exemplary embodiment.
  • the electronic device 700 may be provided as a server.
  • electronic device 700 includes processing component 722, which further includes one or more processors, and a memory resource, represented by memory 732, for storing instructions executable by processing component 722, such as applications.
  • An application program stored in memory 732 may include one or more modules, each corresponding to a set of instructions.
  • the processing component 722 is configured to execute instructions to perform the method for binding subtitles and audio sources provided by the embodiments of the present application.
  • the electronic device 700 may also include a power supply assembly 726 configured to perform power management of the electronic device 700, a wired or wireless network interface 750 configured to connect the electronic device 700 to a network, and an input/output (I/O) interface 758.
  • Electronic device 700 may operate based on an operating system stored in memory 732, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, or the like.
  • Embodiments of the present application further provide a computer program product, including a computer program, which implements the method for binding subtitles and audio sources when the computer program is executed by a processor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Circuits (AREA)

Abstract

The present application relates to a method for binding a subtitle to an audio source, an apparatus, an electronic device, a computer storage medium, and a computer program product, including: determining a target audio source segment in a target video, and identifying and obtaining a target subtitle segment from the target audio source segment; determining a relative positional relationship between a subtitle head position of the target subtitle segment and an audio source head position of the target audio source segment; and, if an adjustment operation is performed on the target audio source segment and/or the target subtitle segment, keeping the relative positional relationship unchanged.
PCT/CN2021/135470 2021-04-14 2021-12-03 Procédé de liaison de sous-titre avec une source audio, et appareil WO2022217944A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110402833.0A CN113259776B (zh) 2021-04-14 2021-04-14 字幕与音源的绑定方法及装置
CN202110402833.0 2021-04-14

Publications (1)

Publication Number Publication Date
WO2022217944A1 true WO2022217944A1 (fr) 2022-10-20

Family

ID=77220790

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/135470 WO2022217944A1 (fr) 2021-04-14 2021-12-03 Procédé de liaison de sous-titre avec une source audio, et appareil

Country Status (2)

Country Link
CN (1) CN113259776B (fr)
WO (1) WO2022217944A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113259776B (zh) * 2021-04-14 2022-11-22 北京达佳互联信息技术有限公司 字幕与音源的绑定方法及装置
CN116193195A (zh) * 2023-02-23 2023-05-30 北京奇艺世纪科技有限公司 视频的处理方法、装置、处理设备及存储介质

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104104986A (zh) * 2014-07-29 2014-10-15 小米科技有限责任公司 音频与字幕的同步方法和装置
WO2017191397A1 (fr) * 2016-05-03 2017-11-09 Orange Procédé et dispositif de synchronisation de sous-titres
CN108401192A (zh) * 2018-04-25 2018-08-14 腾讯科技(深圳)有限公司 视频流处理方法、装置、计算机设备及存储介质
CN109246472A (zh) * 2018-08-01 2019-01-18 平安科技(深圳)有限公司 视频播放方法、装置、终端设备及存储介质
CN109413475A (zh) * 2017-05-09 2019-03-01 北京嘀嘀无限科技发展有限公司 一种视频中字幕的调整方法、装置和服务器
US20190250803A1 (en) * 2018-02-09 2019-08-15 Nedelco, Inc. Caption rate control
CN111836062A (zh) * 2020-06-30 2020-10-27 北京小米松果电子有限公司 视频播放方法、装置及计算机可读存储介质
CN111901538A (zh) * 2020-07-23 2020-11-06 北京字节跳动网络技术有限公司 一种字幕生成方法、装置、设备及存储介质
CN113259776A (zh) * 2021-04-14 2021-08-13 北京达佳互联信息技术有限公司 字幕与音源的绑定方法及装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101808202B (zh) * 2009-02-18 2013-09-04 联想(北京)有限公司 实现影音文件中声音与字幕同步的方法、设备和计算机
CN102630017B (zh) * 2012-04-10 2014-03-19 中兴通讯股份有限公司 一种移动多媒体广播字幕同步的方法和系统
CN112287128B (zh) * 2020-10-23 2024-01-12 北京百度网讯科技有限公司 多媒体文件编辑方法、装置、电子设备和存储介质

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104104986A (zh) * 2014-07-29 2014-10-15 小米科技有限责任公司 音频与字幕的同步方法和装置
WO2017191397A1 (fr) * 2016-05-03 2017-11-09 Orange Procédé et dispositif de synchronisation de sous-titres
CN109413475A (zh) * 2017-05-09 2019-03-01 北京嘀嘀无限科技发展有限公司 一种视频中字幕的调整方法、装置和服务器
US20190250803A1 (en) * 2018-02-09 2019-08-15 Nedelco, Inc. Caption rate control
CN108401192A (zh) * 2018-04-25 2018-08-14 腾讯科技(深圳)有限公司 视频流处理方法、装置、计算机设备及存储介质
CN109246472A (zh) * 2018-08-01 2019-01-18 平安科技(深圳)有限公司 视频播放方法、装置、终端设备及存储介质
CN111836062A (zh) * 2020-06-30 2020-10-27 北京小米松果电子有限公司 视频播放方法、装置及计算机可读存储介质
CN111901538A (zh) * 2020-07-23 2020-11-06 北京字节跳动网络技术有限公司 一种字幕生成方法、装置、设备及存储介质
CN113259776A (zh) * 2021-04-14 2021-08-13 北京达佳互联信息技术有限公司 字幕与音源的绑定方法及装置

Also Published As

Publication number Publication date
CN113259776A (zh) 2021-08-13
CN113259776B (zh) 2022-11-22

Similar Documents

Publication Publication Date Title
EP3817395A1 (fr) Procédé et appareil d'enregistrement vidéo, dispositif et support d'enregistrement lisible
US9786326B2 (en) Method and device of playing multimedia and medium
CN107396177B (zh) 视频播放方法、装置及存储介质
WO2022217944A1 (fr) Procédé de liaison de sous-titre avec une source audio, et appareil
WO2020015334A1 (fr) Procédé et appareil de traitement vidéo, dispositif terminal et support d'informations
US10212386B2 (en) Method, device, terminal device, and storage medium for video effect processing
CN108259991B (zh) 视频处理方法及装置
CN104639977B (zh) 节目播放的方法及装置
WO2022142871A1 (fr) Procédé et appareil d'enregistrement vidéo
WO2018095252A1 (fr) Procédé et dispositif d'enregistrement vidéo
WO2019206243A1 (fr) Procédé d'affichage de matériel, terminal et support de stockage informatique
US20220248083A1 (en) Method and apparatus for video playing
US20220084313A1 (en) Video processing methods and apparatuses, electronic devices, storage mediums and computer programs
US11580742B2 (en) Target character video clip playing method, system and apparatus, and storage medium
WO2017054354A1 (fr) Procédé et dispositif de traitement d'informations
CN112584208B (zh) 一种基于人工智能的视频浏览编辑方法和系统
CN108769769B (zh) 视频的播放方法、装置及计算机可读存储介质
CN111147942A (zh) 视频播放方法、装置、电子设备及存储介质
WO2022160699A1 (fr) Procédé de traitement vidéo et appareil de traitement vidéo
WO2022105341A1 (fr) Procédé et appareil de traitement de données vidéo, support de stockage informatique et dispositif électronique
CN113905192A (zh) 一种字幕编辑方法、装置、电子设备及存储介质
CN106060253B (zh) 信息呈现的方法及装置
CN113364999B (zh) 视频生成方法、装置、电子设备及存储介质
CN110809184A (zh) 视频处理方法、装置及存储介质
EP3846447A1 (fr) Procédé d'acquisition d'images, dispositif d'acquisition d'images, dispositif électronique et support d'enregistrement

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21936810

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21936810

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.03.2024)