CN115776588A - Video processing method, video processing apparatus, electronic device, video processing medium, and program product - Google Patents


Info

Publication number
CN115776588A
Authority
CN
China
Prior art keywords
video
processed
target
original
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111057789.0A
Other languages
Chinese (zh)
Inventor
王若师
包红来
林枫
陈子豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202111057789.0A priority Critical patent/CN115776588A/en
Priority to PCT/CN2022/117971 priority patent/WO2023036275A1/en
Publication of CN115776588A publication Critical patent/CN115776588A/en
Current legal status: Pending


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 17/00 Diagnosis, testing or measuring for television systems or their details
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/433 Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N 21/439 Processing of audio elementary streams
    • H04N 21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs

Abstract

The application relates to a video processing method, a video processing apparatus, an electronic device, a medium, and a program product in the technical field of video processing. The method comprises the following steps: acquiring an original video whose sound and picture are synchronized, and a to-be-processed video generated by performing video processing on the basis of the original video; determining, based on a target video frame (or target audio spectrum point) that is the same in the original video and the to-be-processed video, a first moment at which that target video frame (or target audio spectrum point) appears in the to-be-processed video; determining, according to the audio spectrum point (or video frame) corresponding to the target video frame (or target audio spectrum point) in the original video, a second moment at which that audio spectrum point (or video frame) appears in the to-be-processed video; and judging whether the sound and picture of the to-be-processed video are synchronized according to the difference between the second moment and the first moment. The method and device can improve both the efficiency and the accuracy of sound-picture synchronization judgment.

Description

Video processing method, video processing device, electronic apparatus, video processing medium, and program product
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a video processing method, an apparatus, an electronic device, a medium, and a program product.
Background
With the development of the internet, more and more users make videos and share them on social platforms. During video production, operations such as rendering may cause the audio and the picture to fall out of sync, so users have to check synchronization manually. However, manual judgment is both inefficient and inaccurate.
Disclosure of Invention
To solve the above technical problem or at least partially solve the above technical problem, the present application provides a video processing method, an apparatus, an electronic device, a medium, and a program product.
According to a first aspect of the present application, there is provided a video processing method comprising:
acquiring an original video with synchronous sound and pictures and a to-be-processed video generated by performing video processing on the basis of the original video;
determining a first moment of the target video frame appearing in the video to be processed based on the target video frame with the same picture in the original video and the video to be processed; determining a second moment when the audio spectrum point corresponding to the target video frame appears in the video to be processed according to the audio spectrum point corresponding to the target video frame in the original video;
judging whether the video to be processed is synchronous in sound and picture according to the difference value between the second moment and the first moment; or,
determining a third moment when the target audio frequency spectrum point appears in the video to be processed based on the target audio frequency spectrum point with the same spectrum point in the original video and the video to be processed; determining a fourth moment when the video frame corresponding to the target audio frequency spectrum point appears in the video to be processed according to the video frame corresponding to the target audio frequency spectrum point in the original video;
and judging whether the video to be processed is synchronous in sound and picture according to the difference value between the fourth moment and the third moment.
Optionally, the method further includes:
if the audio and video pictures of the video to be processed are determined to be asynchronous, adjusting the time of the audio frequency spectrum point corresponding to the target video frame appearing in the video to be processed from the second time to the first time, or adjusting the time of the video frame corresponding to the target audio frequency spectrum point appearing in the video to be processed from the fourth time to the third time so as to synchronize the audio and video pictures of the video to be processed.
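The timing adjustment described above can be sketched as a sample-level shift of the audio track by the measured offset. The following is an illustrative helper under assumed names, not the patent's implementation:

```python
import numpy as np

def realign_audio(audio, sample_rate, offset_seconds):
    """Shift an audio sample array so that a spectrum point found at the
    second moment moves to the first moment.

    A positive offset means the audio lags the picture and is advanced;
    a negative offset means it leads and is delayed with leading silence.
    (Hypothetical helper for illustration only.)
    """
    shift = int(round(offset_seconds * sample_rate))
    out = np.zeros_like(audio)
    if shift > 0:            # audio lags: drop the first `shift` samples
        out[:len(audio) - shift] = audio[shift:]
    elif shift < 0:          # audio leads: pad silence at the front
        out[-shift:] = audio[:len(audio) + shift]
    else:
        out[:] = audio
    return out
```

In practice the shift would be applied during remuxing rather than by zero-padding raw samples, but the arithmetic is the same.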
Optionally, the method for generating a to-be-processed video based on the original video includes:
acquiring video data and audio data of the original video;
and synthesizing the video data and the audio data to generate a video to be processed.
Optionally, the method for generating a video to be processed based on the original video includes:
acquiring audio data of the original video and target video data shot synchronously with the original video;
and synthesizing the target video data and the audio data to generate a video to be processed.
Optionally, if the video to be processed is a screen recording video generated when the original video is played by a player, and the sound and the picture of the video to be processed are not synchronous, it is determined that the player plays abnormally.
Optionally, after acquiring the original video synchronized with the sound and the picture and the video to be processed, the method further includes:
responding to a playing operation, and playing the original video and the video to be processed in the same picture;
and displaying the spectrogram of the audio track of the original video and the spectrogram of the audio track of the video to be processed.
According to a second aspect of the present application, there is provided a video processing apparatus comprising:
the video acquisition module is used for acquiring an original video with synchronous sound and picture and a to-be-processed video generated by performing video processing on the basis of the original video;
the first time determining module is used for determining a first time when the target video frame appears in the video to be processed based on the target video frame with the same picture in the original video and the video to be processed;
a second time determining module, configured to determine, according to an audio spectrum point corresponding to the target video frame in the original video, a second time at which the audio spectrum point corresponding to the target video frame appears in the video to be processed; or,
a third time determining module, configured to determine, based on a target audio spectrum point where any spectrum point in the original video and the video to be processed is the same, a third time at which the target audio spectrum point appears in the video to be processed;
a fourth moment determining module, configured to determine, according to a video frame corresponding to the target audio frequency spectrum point in an original video, a fourth moment occurring in the video to be processed by the video frame corresponding to the target audio frequency spectrum point;
the sound and picture synchronization judging module is used for judging whether the video to be processed is sound and picture synchronized according to the difference value between the second moment and the first moment; or judging whether the video to be processed is synchronous in sound and picture according to the difference value between the fourth time and the third time.
Optionally, the video processing apparatus further includes:
and the sound and picture synchronous processing module is used for adjusting the time when the audio frequency spectrum point corresponding to the target video frame appears in the video to be processed from the second time to the first time or adjusting the time when the video frame corresponding to the target audio frequency spectrum point appears in the video to be processed from the fourth time to the third time so as to synchronize the sound and picture of the video to be processed if the sound and picture of the video to be processed are determined to be asynchronous.
Optionally, the video processing apparatus further includes:
the first generation module of the video to be processed is used for acquiring the video data and the audio data of the original video; and synthesizing the video data and the audio data to generate a video to be processed.
Optionally, the video processing apparatus further includes:
the second generation module of the video to be processed is used for acquiring the audio data of the original video and the target video data shot synchronously with the original video; and synthesizing the target video data and the audio data to generate a video to be processed.
Optionally, if the video to be processed is a screen recording video generated when the original video is played by a player, and the sound and the picture of the video to be processed are not synchronous, it is determined that the player plays abnormally.
Optionally, the video processing apparatus further includes:
the video playing module is used for responding to playing operation and playing the original video and the video to be processed in the same picture;
and the audio track spectrogram display module is used for displaying the audio track spectrogram of the original video and the audio track spectrogram of the video to be processed.
According to a third aspect of the present application, there is provided an electronic device comprising: a processor for executing a computer program stored in a memory, the computer program, when executed by the processor, implementing the method of the first aspect.
According to a fourth aspect of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method of the first aspect.
According to a fifth aspect of the present application, there is provided a computer program product which, when run on a computer, causes the computer to perform the method of the first aspect.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages:
after an original video and a to-be-processed video generated by video processing on the basis of the original video are obtained, a first moment of a target video frame appearing in the to-be-processed video can be determined on the basis of the target video frame with the same picture in the original video and the to-be-processed video; determining a second moment when the audio frequency spectrum point appears in the video to be processed according to the audio frequency spectrum point corresponding to the target video frame in the original video; and judging whether the sound and the picture of the video to be processed are synchronous or not according to the difference value between the second moment and the first moment. Or, a third moment when the target audio spectrum point appears in the video to be processed can be determined based on the target audio spectrum point with the same spectrum point in the original video and the video to be processed; and determining a fourth moment when the video frame corresponding to the target audio frequency spectrum point appears in the video to be processed according to the video frame corresponding to the target audio frequency spectrum point in the original video, and judging whether the sound and the picture of the video to be processed are synchronous according to a difference value between the fourth moment and the third moment. The method and the device use the original video as a reference to judge whether the video to be processed is synchronous in sound and picture, and can improve the accuracy of sound and picture synchronous judgment. Meanwhile, manual judgment is avoided, and the efficiency of sound and picture synchronous judgment is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the technical solutions in the embodiments of the present application or in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below; it is obvious that those skilled in the art can obtain other drawings from these drawings without inventive effort.
Fig. 1 shows a schematic diagram of a system architecture of an exemplary application environment to which the video processing method of the embodiment of the present application can be applied;
FIG. 2 is a flow chart of a video processing method according to an embodiment of the present application;
FIG. 3A is a diagram illustrating audio-visual synchronization according to an embodiment of the present application;
FIG. 3B is a schematic diagram of audio-visual synchronization according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating a video processing method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an interface of audio-visual synchronization according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an embodiment of a video processing apparatus;
fig. 7 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order that the above-mentioned objects, features and advantages of the present application may be more clearly understood, the solution of the present application will be further described below. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, but the present application may be practiced in other ways than those described herein; it is to be understood that the embodiments described in this specification are only some embodiments of the present application and not all embodiments.
Fig. 1 is a schematic diagram showing a system architecture of an exemplary application environment to which a video processing method according to an embodiment of the present application can be applied.
As shown in fig. 1, system architecture 100 may include one or more of terminal device 101, terminal device 102, terminal device 103, network 104, and server 105. Network 104 is the medium used to provide communication links between terminal device 101, terminal device 102, terminal device 103, and server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others. The terminal devices 101, 102, 103 may be various electronic devices having a display screen, including but not limited to desktop computers, portable computers, smart phones, tablet computers, and the like. It should be understood that the number of terminal devices, networks, and servers in fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, server 105 may be a server cluster comprised of multiple servers, or the like.
The video processing method provided by the embodiment of the present application is generally executed by the server 105, and accordingly, the video processing apparatus may be disposed in the server 105. However, it is easily understood by those skilled in the art that the video processing method provided in the embodiment of the present application may also be executed by the terminal device 101, the terminal device 102, and the terminal device 103. For example, the terminal device 101, the terminal device 102, and the terminal device 103 upload an original video and a to-be-processed video generated based on the original video to the server 105, and the server 105 determines whether the to-be-processed video is synchronized with sound and pictures based on the video processing method of the embodiment of the present application, and sends the determination result (i.e., the sound and picture synchronization or the sound and picture non-synchronization) to the terminal device 101, the terminal device 102, and the terminal device 103.
Referring to fig. 2, fig. 2 is a flowchart of a video processing method in an embodiment of the present application, which may include the following steps:
step S210, obtaining an original video synchronized with sound and picture, and performing video processing on the basis of the original video to generate a to-be-processed video.
In the embodiment of the application, the original video is a video with synchronous sound and picture, can be a video which is directly shot by a user through a terminal device and is not processed by the user, and can also be a video downloaded by the user from the internet. In order to make the original video have better display effect, the original video can be rendered to generate a new video, i.e. a video to be processed. That is, the to-be-processed video may be generated after video editing of the original video. In addition, in the process of playing the original video, the video to be processed can also be synchronously generated. Alternatively, a video to be processed is generated using audio data of an original video in combination with video data (containing no audio) shot in synchronization with the original video, and so on. As can be seen, the original video may contain video data and audio data in the video to be processed.
Since the problem of audio-picture asynchronism of the video to be processed can be caused in the video processing process, the video to be processed can be the video with asynchronism of audio-picture. In order to accurately judge whether the video to be processed is synchronous with the sound and the picture, the application provides a sound and picture synchronization tool, and a user can add the original video and the video to be processed to the sound and picture synchronization tool to judge the sound and the picture synchronization. Specifically, the user may perform a video adding operation in the audio and video synchronization tool, for example, after clicking an "add" button, add the original video and the video to be processed, and the server corresponding to the audio and video synchronization tool obtains the original video and the video to be processed in response to the video adding operation.
Step S220, determining a first time when the target video frame appears in the video to be processed based on the target video frame with the same picture in the original video and the video to be processed.
After the original video and the video to be processed are obtained, the sound-picture synchronization judgment may be performed directly; alternatively, a button for triggering the judgment may be provided, and the user clicks the button to start the sound-picture synchronization judgment.
Specifically, when performing sound-picture synchronization judgment, video frame alignment may be performed on an original video and a video to be processed first, a target video frame with the same picture in the original video and the video to be processed is found, and a first time when the target video frame appears in the video to be processed is determined.
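The frame alignment step can be illustrated with a coarse frame-signature search. The signature scheme below (grid of mean intensities) is an assumption standing in for whatever perceptual matching the patent's "video frame alignment algorithm" uses; all names are hypothetical:

```python
import numpy as np

def frame_signature(frame, grid=4):
    """Coarse signature: mean intensity over a grid x grid tiling.
    Mildly robust to re-encoding noise; a stand-in for a real
    perceptual-hash comparison."""
    h, w = frame.shape[:2]
    sig = [frame[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid].mean()
           for i in range(grid) for j in range(grid)]
    return np.array(sig)

def find_matching_frame(target_frame, frames, timestamps, tol=1.0):
    """Return the first timestamp at which `target_frame` appears in the
    to-be-processed video's `frames`, i.e. the first moment T1, or None."""
    target = frame_signature(target_frame)
    for ts, f in zip(timestamps, frames):
        if np.abs(frame_signature(f) - target).max() <= tol:
            return ts
    return None
```

A real system would decode frames with a video library and use a more discriminative hash, but the search structure is the same.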
Step S230, determining a second time when the audio spectrum point corresponding to the target video frame appears in the video to be processed according to the audio spectrum point corresponding to the target video frame in the original video.
Under the condition that the original video and the video to be processed are aligned in terms of video frames, because the audio frequency spectrum point corresponding to the target video frame in the original video is the correct audio frequency spectrum point, the second moment when the audio frequency spectrum point corresponding to the target video frame appears in the video to be processed can be searched based on the audio frequency spectrum point corresponding to the target video frame.
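Locating the second moment can be sketched as a windowed spectrum search: take the magnitude spectrum at time T0 in the original audio, then scan the to-be-processed audio for the best-matching window. The matching metric (Euclidean distance) is an assumption; the patent does not specify one:

```python
import numpy as np

def spectrum_at(audio, sample_rate, t, win=256):
    """Magnitude spectrum of the window starting at time t (seconds)."""
    start = int(t * sample_rate)
    return np.abs(np.fft.rfft(audio[start:start + win]))

def find_spectrum_time(ref_spec, audio, sample_rate, hop=64, win=256):
    """Scan `audio` window by window and return the time whose spectrum
    best matches `ref_spec` (nearest-match search; illustrative only)."""
    best_t, best_d = None, np.inf
    for start in range(0, len(audio) - win + 1, hop):
        spec = np.abs(np.fft.rfft(audio[start:start + win]))
        d = np.linalg.norm(spec - ref_spec)
        if d < best_d:
            best_t, best_d = start / sample_rate, d
    return best_t
```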
Step S240, determining whether the video to be processed is synchronized with sound and picture according to the difference between the second time and the first time.
If the second time is the same as the first time, the audio frequency spectrum point corresponding to the target video frame is also the correct audio frequency spectrum point in the video to be processed, and the audio and video images of the video to be processed are synchronous. If the second time is different from the first time, the audio frequency spectrum point corresponding to the target video frame is not the correct audio frequency spectrum point in the video to be processed, and the audio and video pictures of the video to be processed are not synchronous.
Step S221, determining a third time when the target audio spectrum point appears in the video to be processed based on the target audio spectrum point where any one of the spectrum points in the original video and the video to be processed is the same.
In the embodiment of the application, the original video and the video to be processed may instead be audio-aligned first: a target audio spectrum point that is the same in the original video and the video to be processed is found, and a third moment at which the target audio spectrum point appears in the video to be processed is determined.
Step S231, determining a fourth time when the video frame corresponding to the target audio frequency spectrum point appears in the video to be processed according to the video frame corresponding to the target audio frequency spectrum point in the original video.
Similarly, under the condition that the original video and the video to be processed are subjected to audio alignment in advance, since the video frame corresponding to the target audio frequency spectrum point in the original video is a correct video frame, the fourth moment when the video frame corresponding to the target audio frequency spectrum point appears in the video to be processed can be searched based on the video frame corresponding to the target audio frequency spectrum point.
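The audio-first alignment path can also be approximated globally with cross-correlation, a common alignment technique (the patent itself only requires locating one matching spectrum point, so this is a broader sketch under assumed names):

```python
import numpy as np

def estimate_audio_offset(ref, test, sample_rate):
    """Estimate how many seconds `test` is delayed relative to `ref`.
    Positive result: the to-be-processed audio starts later than the
    original's audio."""
    corr = np.correlate(test, ref, mode="full")
    # index (len(ref) - 1) corresponds to zero lag in full mode
    lag = int(np.argmax(corr)) - (len(ref) - 1)
    return lag / sample_rate
```

A chirp-like, non-periodic signal gives an unambiguous peak; periodic audio can alias to a neighboring period.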
And step S241, judging whether the video to be processed is synchronous in sound and picture according to the difference value between the fourth time and the third time.
If the fourth time is the same as the third time, the video frame corresponding to the target audio frequency spectrum point in the video to be processed is also a correct video frame, and the sound and picture of the video to be processed are synchronous. If the fourth time is different from the third time, the video frame corresponding to the target audio frequency spectrum point in the video to be processed is not a correct video frame, and the sound and picture of the video to be processed are not synchronous.
It should be noted that, if the second time is different from the first time, but an absolute value of a difference between the second time and the first time is smaller than a preset time difference, and the user hardly senses the problem of audio-video asynchronization under the preset time difference, at this time, it may also be considered that the video to be processed is audio-video synchronized. Namely, under the condition that the absolute value of the difference value between the second moment and the first moment is smaller than the preset time difference, the sound and the picture of the video frame to be processed are considered to be synchronous; and under the condition that the absolute value of the difference value between the second moment and the first moment is not less than the preset time difference, the sound and the picture of the video frame to be processed are considered to be asynchronous.
It is understood that if the difference between the second time and the first time is greater than 0, the audio of the video to be processed lags behind the original video; if the difference between the second time and the first time is less than 0, the audio of the video to be processed is ahead of the original video.
Similarly, in the case that the absolute value of the difference between the fourth time and the third time is smaller than the preset time difference, the video frame to be processed is considered to be synchronized with sound and picture; and under the condition that the absolute value of the difference value between the fourth moment and the third moment is not less than the preset time difference, the sound and the picture of the video frame to be processed are considered to be asynchronous.
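The threshold comparison of the preceding paragraphs reduces to a few lines. The 50 ms tolerance below is an illustrative value for the "preset time difference"; the patent does not fix a number:

```python
def judge_sync(t_first, t_second, max_diff=0.05):
    """Classify sound-picture sync from the first and second moments.

    In sync if |t_second - t_first| is below the preset time difference;
    otherwise report whether the audio lags or leads the picture.
    """
    diff = t_second - t_first
    if abs(diff) < max_diff:
        return "in sync"
    return "audio lags picture" if diff > 0 else "audio leads picture"
```

The same function applies to the third/fourth moments of the audio-first path.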
Referring to fig. 3A, fig. 3A is a schematic diagram of audio-video synchronization in an embodiment of the present application. A video frame alignment algorithm may first determine a video frame (i.e., the target video frame) that is the same in the original video and the video to be processed, namely the video frame at the intersection of the video track and the dotted line. It can be seen that the target video frame appears in the original video at time T0 and in the video to be processed at the first time T1. The correct audio spectrum point corresponding to the target video frame is the one at time T0 in the original video, and this audio spectrum point appears in the video to be processed at the second time T2. The audio spectrum point at the first time T1 in the video to be processed is therefore wrong, and the video to be processed has a sound-picture asynchronism problem.
Referring to fig. 3B, fig. 3B is a further schematic diagram of audio-video synchronization in the embodiment of the present application, and whether the video to be processed is audio-video synchronized can be determined in the same manner as in fig. 3A. It can be seen that, in fig. 3A, the difference between the second time T2 and the first time T1 is greater than 0, i.e. the audio time of the video to be processed lags behind the original video; in fig. 3B, the difference between the second time T2 and the first time T1 is less than 0, i.e. the audio time of the video to be processed is ahead of the original video.
According to the video processing method, after the original video and the to-be-processed video generated by video processing on the basis of the original video are obtained, the first moment when the target video frame appears in the to-be-processed video can be determined on the basis of the target video frame with the same picture in the original video and the to-be-processed video; determining a second moment when the audio frequency spectrum point appears in the video to be processed according to the audio frequency spectrum point corresponding to the target video frame in the original video; and judging whether the sound and the picture of the video to be processed are synchronous or not according to the difference value between the second moment and the first moment. Or, a third moment when the target audio spectrum point appears in the video to be processed can be determined based on the target audio spectrum point with the same spectrum point in the original video and the video to be processed; and determining a fourth moment when the video frame corresponding to the target audio frequency spectrum point appears in the video to be processed according to the video frame corresponding to the target audio frequency spectrum point in the original video, and judging whether the sound and the picture of the video to be processed are synchronous according to a difference value between the fourth moment and the third moment. The method and the device use the original video as a reference to judge whether the video to be processed is synchronous with the sound and the picture, and can improve the accuracy of sound and picture synchronous judgment. Meanwhile, manual judgment is avoided, and the efficiency of sound and picture synchronous judgment is improved.
Referring to fig. 4, fig. 4 is a flowchart of a video processing method in an embodiment of the present application, which may include the following steps:
step S410, acquiring an original video synchronized with sound and picture, and performing video processing on the basis of the original video to generate a to-be-processed video.
The video processing method of the embodiment of the application is applicable to a variety of scenarios, including: sound-picture asynchronism caused by synthesizing the audio and video of an original video; sound-picture asynchronism caused by recording video data (without audio) and then adding audio data from the original audio file; sound-picture asynchronism caused by a player playing the original video; and the like.
Accordingly, the video to be processed may be generated by: and acquiring video data and audio data of the original video, and synthesizing the video data and the audio data to generate the video to be processed. Of course, the video data may also be rendered, etc., prior to compositing.
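As a sketch of this synthesis step, muxing video-only data with audio data is commonly done with ffmpeg by copying both streams without re-encoding. The helper below, which only builds the command line, is an illustrative assumption rather than the implementation used in the embodiment:

```python
def build_mux_command(video_path, audio_path, out_path):
    """Build an ffmpeg command that muxes a video-only stream with an
    audio stream into the video to be processed, copying both streams."""
    return [
        "ffmpeg",
        "-i", video_path,   # video data of the original video (no audio)
        "-i", audio_path,   # audio data extracted from the original video
        "-c", "copy",       # no re-encoding, so frame/sample timing is preserved
        "-map", "0:v:0",    # take the video stream from the first input
        "-map", "1:a:0",    # take the audio stream from the second input
        out_path,
    ]
```

The command could then be executed with, for example, `subprocess.run(build_mux_command("v.mp4", "a.aac", "out.mp4"), check=True)`.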
Optionally, the video to be processed may also be generated in the following manner: first, a target object is shot by two different terminal devices (a first terminal device and a second terminal device) respectively; the first terminal device shoots an original video with synchronized sound and picture, while the second terminal device synchronously shoots target video data without audio data. The second terminal device may obtain the audio data of the original video from the first terminal device and synthesize it with the local target video data, thereby generating the video to be processed. In this case, the audio spectrum of the video to be processed completely coincides with the audio spectrum of the original video.
The method and the device are also applicable to scenarios in which multiple video segments are spliced, i.e., the video to be processed may be spliced from a plurality of video segments. As with the aforementioned single video clip, the audio of the multiple video segments can also be extracted from the original video. Unlike the single video clip, however, the need to frequently start and stop while recording multiple video segments causes the sound-picture gap to widen progressively.
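The progressively widening gap can be made concrete with a small sketch: if each recorded segment contributes its own start/stop latency, the per-segment offset between the picture moment and the audio moment grows from segment to segment. The function name and the example numbers below are illustrative assumptions:

```python
def segment_offsets(frame_times, audio_times):
    """Per-segment sound-picture offsets (audio moment minus picture
    moment); offsets whose absolute value keeps growing indicate drift
    accumulated across spliced video segments."""
    return [a - f for f, a in zip(frame_times, audio_times)]

# Hypothetical moments at which the same content appears in picture vs. audio:
offsets = segment_offsets([0.0, 5.0, 10.0], [0.02, 5.07, 10.15])
```

A monotonically increasing sequence of offsets (here roughly 20 ms, 70 ms, 150 ms) is the signature of drift introduced by repeated start/stop during recording.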
Optionally, the video to be processed may also be generated in the following manner: the original video is played through a player, and a screen recording operation is performed while the original video is played, so as to generate the video to be processed. If the video to be processed is a screen recording video generated while the player plays the original video, and the sound and picture of the video to be processed are not synchronized, it is determined that the player plays abnormally. Therefore, the method and the device can also check whether the player has a sound-picture asynchronism problem when playing a video.
Step S420, in response to the playing operation, playing the original video and the video to be processed with the same frame, and displaying the audio track spectrogram of the original video and the audio track spectrogram of the video to be processed.
According to the method and the device, after the original video and the video to be processed are obtained, the two videos can be played, and at the same time the audio track spectra of the two videos can be displayed. Referring to fig. 5, fig. 5 is an interface schematic diagram of sound-picture synchronization in an embodiment of the present application: the pictures of the two videos are displayed at the top, a user can click a play button, the two videos are played side by side from left to right, and the audio track spectrograms corresponding to the two videos are arranged one above the other. When the two videos are played, the two video pictures are played synchronously, and whether the audio spectrum points are synchronized is further judged on the basis of the synchronized video pictures. Based on this interface, a user can also intuitively judge whether the sound and the picture of the video to be processed are synchronized, which improves the flexibility of the sound-picture synchronization judgment.
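An audio track spectrogram of the kind displayed in fig. 5 can be derived from a short-time Fourier transform. The following NumPy sketch (an assumption for illustration, not the embodiment's renderer) computes the magnitude spectrogram that such a display would plot:

```python
import numpy as np

def stft_magnitude(signal, frame_len=512, hop=256):
    """Magnitude spectrogram of an audio track:
    rows are time frames, columns are frequency bins."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

sr = 8000
t = np.arange(sr) / sr                               # one second of audio
spec = stft_magnitude(np.sin(2 * np.pi * 440 * t))   # a 440 Hz test tone
```

Rendering the original video's spectrogram above the to-be-processed video's spectrogram, as in fig. 5, is then a plotting concern (e.g. two stacked image plots of `spec.T`).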
According to the method and the device, after the original video and the video to be processed are obtained, the sound and picture synchronization can be directly judged, and under the condition that a user visually judges that the sound and picture are not synchronized, the user can manually click a 'calculation result' button to further judge whether the sound and picture are not synchronized or not in the video to be processed. The application can determine whether the video to be processed is out of synchronization of sound and pictures by executing step S430 or executing step S431.
Step S430, based on the target video frame with the same picture in the original video and the video to be processed, determining the first moment of the target video frame appearing in the video to be processed.
Step S440, determining a second time when the audio spectrum point corresponding to the target video frame appears in the video to be processed according to the audio spectrum point corresponding to the target video frame in the original video.
And step S450, judging whether the video to be processed is synchronous in sound and picture according to the difference value between the second moment and the first moment.
Step S460, if the audio and video to be processed are determined to be asynchronous, the time when the audio frequency spectrum point corresponding to the target video frame appears in the video to be processed is adjusted from the second time to the first time, so that the audio and video to be processed are synchronous.
Step S431, determining a third moment at which a target audio spectrum point appears in the video to be processed, based on a target audio spectrum point having the same spectrum point in the original video and the video to be processed.
Step S441, determining a fourth moment when the video frame corresponding to the target audio frequency spectrum point appears in the video to be processed according to the video frame corresponding to the target audio frequency spectrum point in the original video.
And step S451, judging whether the video to be processed is synchronous with sound and pictures according to the difference value between the fourth time and the third time.
The steps S430 to S451 are the same as the steps S220 to S241 in the embodiment of fig. 2, and refer to the description in the embodiment of fig. 2 for details, which are not described herein again.
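A minimal sketch of the frame-matching part of these steps — locating the first moment at which the target video frame appears in the video to be processed — could compare decoded frames against the target by mean absolute pixel difference. The function, the threshold, and the frame representation (NumPy arrays) are assumptions for illustration:

```python
import numpy as np

def find_frame_time(target_frame, frames, fps, max_mean_diff=2.0):
    """Return the timestamp in seconds of the first frame whose mean
    absolute pixel difference from `target_frame` is within the
    threshold, or None if no frame matches."""
    target = target_frame.astype(np.int16)
    for idx, frame in enumerate(frames):
        if np.mean(np.abs(frame.astype(np.int16) - target)) <= max_mean_diff:
            return idx / fps
    return None
```

Applying the same search to the original video and to the video to be processed yields the pair of moments whose difference drives the synchronization judgment.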
It should be noted that, after the sound and picture are judged to be synchronized or asynchronous, the judgment result can be displayed. When the video to be processed is determined to be sound-picture asynchronous, the judgment result can be displayed in the following manner: the difference between the second moment and the first moment is displayed in the interface, where the difference indicates the degree of asynchronism. The larger the absolute value of the difference, the greater the degree to which the sound and picture of the video to be processed are out of sync.
Step S461, the time of the video frame corresponding to the target audio frequency spectrum point appearing in the video to be processed is adjusted from the fourth time to the third time, so as to synchronize the audio and video of the video to be processed.
When the sound and picture of the video to be processed are not synchronized, the user can also adjust the audio track or the video frames of the video to be processed so as to synchronize its sound and picture.
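The adjustment of steps S460/S461 amounts to shifting one track relative to the other by the measured difference. A sketch on raw audio samples (assuming NumPy arrays and a simple trim-or-pad strategy, not the embodiment's editing mechanism) could look like:

```python
import numpy as np

def shift_audio(samples, offset_seconds, sr):
    """Shift the audio track by the measured sound-picture offset.
    A positive offset (audio lags) trims samples from the head and pads
    the tail; a negative offset (audio leads) delays the track by
    padding silence at the head. The track length is preserved."""
    n = int(round(offset_seconds * sr))
    if n > 0:
        return np.pad(samples[n:], (0, n))
    if n < 0:
        return np.pad(samples, (-n, 0))[:len(samples)]
    return samples.copy()
```

After shifting, the audio spectrum point corresponding to the target video frame appears at the first moment rather than the second, which is exactly the adjustment step S460 describes.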
According to the video processing method, a sound-picture synchronization tool is provided for the user, so that the user can judge the sound-picture synchronization of the video to be processed through human-computer interaction, which improves the efficiency and accuracy of the judgment. Moreover, in the case of sound-picture asynchronism, the video to be processed can be accurately adjusted to restore synchronization. The application is suitable for a variety of scenarios, which improves the practicability of the sound-picture synchronization tool.
Corresponding to the above method embodiment, the present application embodiment further provides a video processing apparatus, and referring to fig. 6, the video processing apparatus 600 includes:
the video acquiring module 610 is configured to acquire an original video synchronized with sound and pictures and a to-be-processed video generated by performing video processing on the basis of the original video;
a first time determining module 620, configured to determine a first moment at which a target video frame appears in the video to be processed, based on a target video frame having the same picture in the original video and the video to be processed;
a second time determining module 630, configured to determine, according to the audio spectrum point corresponding to the target video frame in the original video, a second time at which the audio spectrum point corresponding to the target video frame appears in the video to be processed; or;
a third time determining module 621, configured to determine, based on a target audio spectrum point having the same spectrum point in the original video and the video to be processed, a third moment at which the target audio spectrum point appears in the video to be processed;
a fourth time determining module 631, configured to determine, according to a video frame corresponding to a target audio frequency spectrum point in an original video, a fourth time at which a video frame corresponding to the target audio frequency spectrum point appears in a video to be processed;
the sound-picture synchronization judging module 640 is configured to judge whether the video to be processed is sound-picture synchronized according to a difference between the second time and the first time; or judging whether the video to be processed is synchronous according to the difference value between the fourth time and the third time.
Optionally, the video processing apparatus 600 further includes:
and the sound and picture synchronous processing module is used for adjusting the time when the audio frequency spectrum point corresponding to the target video frame appears in the video to be processed from the second time to the first time or adjusting the time when the video frame corresponding to the target audio frequency spectrum point appears in the video to be processed from the fourth time to the third time so as to synchronize the sound and picture of the video to be processed if the sound and picture of the video to be processed are determined to be asynchronous.
Optionally, the video processing apparatus 600 further includes:
the first generation module of the video to be processed is used for acquiring video data and audio data of an original video; and synthesizing the video data and the audio data to generate a video to be processed.
Optionally, the video processing apparatus 600 further includes:
the second generation module of the video to be processed is used for acquiring the audio data of the original video and the target video data shot synchronously with the original video; and synthesizing the target video data and the audio data to generate a video to be processed.
Optionally, if the video to be processed is a screen recording video generated when the player plays the original video, and the sound and the picture of the video to be processed are not synchronous, it is determined that the player plays abnormally.
Optionally, the video processing apparatus 600 further includes:
the video playing module is used for responding to the playing operation and playing the original video and the video to be processed in the same picture;
and the audio track spectrogram display module is used for displaying the audio track spectrogram of the original video and the audio track spectrogram of the video to be processed.
The details of each module or unit in the above device have been described in detail in the corresponding method, and therefore are not described herein again.
It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
In an exemplary embodiment of the present application, there is also provided an electronic device including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the video processing method described above in the present exemplary embodiment.
Fig. 7 is a schematic structural diagram of an electronic device in an embodiment of the present application. It should be noted that the electronic device 700 shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 7, the electronic apparatus 700 includes a Central Processing Unit (CPU) 701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for system operation are also stored. The central processing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including components such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a Local Area Network (LAN) card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that the computer program read out therefrom is mounted in the storage section 708 as necessary.
In particular, according to embodiments of the present application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. When the computer program is executed by the central processing unit 701, various functions defined in the apparatus of the present application are executed.
In an embodiment of the present application, a computer-readable storage medium is further provided, on which a computer program is stored, and when the computer program is executed by a processor, the computer program implements the video processing method.
It should be noted that the computer readable storage medium shown in the present application can be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory, a read-only memory, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, radio frequency, etc., or any suitable combination of the foregoing.
In the embodiment of the present application, a computer program product is further provided, which, when running on a computer, causes the computer to execute the video processing method described above.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a/an…" does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of video processing, the method comprising:
acquiring an original video with synchronous sound and pictures and a to-be-processed video generated by performing video processing on the basis of the original video;
determining a first moment of the target video frame appearing in the video to be processed based on the target video frame with the same picture in the original video and the video to be processed; determining a second moment when the audio spectrum point corresponding to the target video frame appears in the video to be processed according to the audio spectrum point corresponding to the target video frame in the original video;
judging whether the video to be processed is synchronous in sound and picture according to the difference value between the second moment and the first moment; or,
determining a third moment when the target audio frequency spectrum point appears in the video to be processed based on a target audio frequency spectrum point with the same spectrum point in the original video and the video to be processed; determining a fourth moment when the video frame corresponding to the target audio frequency spectrum point appears in the video to be processed according to the video frame corresponding to the target audio frequency spectrum point in the original video;
and judging whether the video to be processed is synchronous in sound and picture according to the difference value between the fourth moment and the third moment.
2. The method of claim 1, further comprising:
if the audio and video pictures of the video to be processed are determined to be asynchronous, adjusting the time of the audio frequency spectrum point corresponding to the target video frame appearing in the video to be processed from the second time to the first time, or adjusting the time of the video frame corresponding to the target audio frequency spectrum point appearing in the video to be processed from the fourth time to the third time so as to synchronize the audio and video pictures of the video to be processed.
3. The method according to claim 1 or 2, wherein the manner of generating the video to be processed based on the original video comprises:
acquiring video data and audio data of the original video;
and synthesizing the video data and the audio data to generate a video to be processed.
4. The method according to claim 1 or 2, wherein the manner of generating the video to be processed based on the original video comprises:
acquiring audio data of the original video and target video data shot synchronously with the original video;
and synthesizing the target video data and the audio data to generate a video to be processed.
5. The method according to claim 1 or 2, wherein if the to-be-processed video is a screen recording video generated when the original video is played by a player, and the to-be-processed video sound and picture are not synchronized, it is determined that the player plays abnormally.
6. The method of claim 1, wherein after obtaining the original video synchronized with the sound and picture and the video to be processed, the method further comprises:
responding to a playing operation, and playing the original video and the video to be processed in the same picture;
and displaying the spectrogram of the audio track of the original video and the spectrogram of the audio track of the video to be processed.
7. A video processing apparatus, characterized in that the apparatus comprises:
the video acquisition module is used for acquiring an original video with synchronous sound and picture and a to-be-processed video generated by performing video processing on the basis of the original video;
the first moment determining module is used for determining a first moment of the target video frame appearing in the video to be processed based on the target video frame with the same picture in the original video and the video to be processed;
a second time determining module, configured to determine, according to an audio spectrum point corresponding to the target video frame in the original video, a second time at which the audio spectrum point corresponding to the target video frame appears in the video to be processed; or;
a third time determining module, configured to determine, based on a target audio spectrum point having the same spectrum point in the original video and the video to be processed, a third moment when the target audio spectrum point appears in the video to be processed;
a fourth time determining module, configured to determine, according to a video frame corresponding to the target audio frequency spectrum point in an original video, a fourth time at which the video frame corresponding to the target audio frequency spectrum point appears in the video to be processed;
the sound and picture synchronization judging module is used for judging whether the video to be processed is sound and picture synchronous or not according to the difference value between the second moment and the first moment; or judging whether the video to be processed is synchronous with sound and pictures according to the difference value between the fourth time and the third time.
8. An electronic device, comprising: a processor for executing a computer program stored in a memory, the computer program, when executed by the processor, implementing the steps of the method of any of claims 1-6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
10. A computer program product, characterized in that it causes a computer to carry out the steps of the method according to any one of claims 1 to 6, when said computer program product is run on the computer.
CN202111057789.0A 2021-09-09 2021-09-09 Video processing method, video processing apparatus, electronic device, video processing medium, and program product Pending CN115776588A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111057789.0A CN115776588A (en) 2021-09-09 2021-09-09 Video processing method, video processing apparatus, electronic device, video processing medium, and program product
PCT/CN2022/117971 WO2023036275A1 (en) 2021-09-09 2022-09-09 Video processing method and apparatus, electronic device, medium, and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111057789.0A CN115776588A (en) 2021-09-09 2021-09-09 Video processing method, video processing apparatus, electronic device, video processing medium, and program product

Publications (1)

Publication Number Publication Date
CN115776588A (en) 2023-03-10

Family

ID=85387858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111057789.0A Pending CN115776588A (en) 2021-09-09 2021-09-09 Video processing method, video processing apparatus, electronic device, video processing medium, and program product

Country Status (2)

Country Link
CN (1) CN115776588A (en)
WO (1) WO2023036275A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3698376B2 (en) * 1996-08-19 2005-09-21 松下電器産業株式会社 Synchronous playback device
CN101616331B (en) * 2009-07-27 2011-07-20 北京汉邦高科数字技术有限公司 Method for testing video frequency frame rate and audio-video synchronous performance
US9392144B2 (en) * 2014-06-23 2016-07-12 Adobe Systems Incorporated Video synchronization based on an audio cue
CN109842795B (en) * 2019-02-28 2020-08-11 苏州科达科技股份有限公司 Audio and video synchronization performance testing method and device, electronic equipment and storage medium
CN112929654B (en) * 2021-03-16 2022-03-29 腾讯音乐娱乐科技(深圳)有限公司 Method, device and equipment for detecting sound and picture synchronization and storage medium

Also Published As

Publication number Publication date
WO2023036275A1 (en) 2023-03-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination