CN103458271A - Audio-video file splicing method and audio-video file splicing device - Google Patents

Publication number
CN103458271A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012101731344A
Other languages
Chinese (zh)
Inventor
王智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sumavision Technologies Co Ltd
Original Assignee
Sumavision Technologies Co Ltd
Application filed by Sumavision Technologies Co Ltd
Priority to CN2012101731344A
Publication of CN103458271A

Landscapes

  • Television Signal Processing For Recording (AREA)

Abstract

The invention discloses an audio-video file splicing method and an audio-video file splicing device. The method includes: acquiring a video out-point and an audio out-point of a first TS (transport stream) audio-video file indicated by a splicing command; acquiring a video in-point and an audio in-point of a second TS audio-video file indicated by the splicing command; judging whether the video out-point satisfies a first preset condition and whether the video in-point satisfies a second preset condition; and, if so, performing video splicing with the video out-point as the video file splice-out point and the video in-point as the video file splice-in point. The method and device solve the problem in the related art that image quality is lowered because original programs are decoded and re-encoded in digital program insertion, thereby improving the quality of spliced images and also saving cost.

Description

Audio-video file splicing method and device
Technical field
The present invention relates to the field of audio and video processing, and in particular to an audio-video file splicing method and device.
Background
Digital program insertion is a digital splicing technique that splices one MPEG (Moving Picture Experts Group) program into another MPEG program, and it is widely used in the field of audio and video processing.
Existing digital program insertion generally adopts a "decoder + audio/video inserter + encoder" implementation, as shown in Fig. 1. When a digital program is inserted, an SPTS (Single Program Transport Stream) is used as the input source and passes in turn through a cascade of several devices (decoder, audio/video inserter, encoder). This approach requires the program stream to be fully decoded and fully re-encoded. The process is complicated, the encoding side in particular is highly complex, the resource investment is high, and the cost-effectiveness is low. In addition, decoding and re-encoding the original program affects subjective picture quality and causes the picture quality to decline.
No effective solution to the above problems in the related art has yet been proposed.
Summary of the invention
The invention provides an audio-video file splicing method and device, to solve the problem in the related art that, in digital program insertion, decoding and re-encoding the original program causes image quality to decline.
According to one aspect of the present invention, an audio-video file splicing method is provided. The method comprises: obtaining the video out point and the audio out point of a first TS stream audio-video file indicated by a splicing instruction; obtaining the video in point and the audio in point of a second TS stream audio-video file indicated by the splicing instruction; judging whether the video out point satisfies a first predetermined condition and the video in point satisfies a second predetermined condition, and if so, performing video splicing with the video out point as the video file splice-out point and the video in point as the video file splice-in point. The first predetermined condition comprises: decoding of the images before the video out point does not depend on image information after the video out point, and all images before the video out point can be displayed. The second predetermined condition comprises: decoding of the images after the video in point does not depend on image information before the video in point, and all images after the video in point can be displayed.
Preferably, after the audio in point and the audio out point are obtained, the method further comprises: judging whether the audio out point satisfies a third predetermined condition and the audio in point satisfies a fourth predetermined condition, and if so, performing audio splicing with the audio out point as the audio file splice-out point and the audio in point as the audio file splice-in point. The third predetermined condition comprises: the PTS of the audio frame in the first TS stream audio-video file is greater than or equal to the PTS of the video out point. The fourth predetermined condition comprises: the PTS of the audio frame in the second TS stream audio-video file is greater than or equal to the PTS of the video in point.
Preferably, performing video splicing with the video out point as the video file splice-out point and the video in point as the video file splice-in point comprises: judging whether the frame count of the video file after splicing matches the frame count of the video file before splicing; and, if not, outputting video filler frames so that the frame count of the spliced video file matches the frame count of the video file before splicing, wherein the video filler frames comprise black frames and/or still frames.
Preferably, performing audio splicing with the audio out point as the audio file splice-out point and the audio in point as the audio file splice-in point comprises: judging whether the frame count of the audio file after splicing matches the frame count of the audio file before splicing; and, if not, outputting audio filler frames so that the frame count of the spliced audio file matches the frame count of the audio file before splicing.
Preferably, performing audio splicing with the audio out point as the audio file splice-out point and the audio in point as the audio file splice-in point further comprises: adjusting the video file PTS and the audio file PTS so that they are synchronized.
According to another aspect of the present invention, an audio-video file splicing device is provided. The device comprises: a first acquiring unit, for obtaining the video out point and the audio out point of the first TS stream audio-video file indicated by a splicing instruction; a second acquiring unit, for obtaining the video in point and the audio in point of the second TS stream audio-video file indicated by the splicing instruction; a first judging unit, for judging whether the video out point satisfies the first predetermined condition and the video in point satisfies the second predetermined condition, wherein the first predetermined condition comprises: decoding of the images before the video out point does not depend on image information after the video out point, and all images before the video out point can be displayed, and the second predetermined condition comprises: decoding of the images after the video in point does not depend on image information before the video in point, and all images after the video in point can be displayed; and a first splicing unit, for performing video splicing, when the first judging unit judges yes, with the video out point as the video file splice-out point and the video in point as the video file splice-in point.
Preferably, the device further comprises: a second judging unit, for judging whether the audio out point satisfies the third predetermined condition and the audio in point satisfies the fourth predetermined condition, wherein the third predetermined condition comprises: the PTS of the audio frame in the first TS stream audio-video file is greater than or equal to the PTS of the video out point, and the fourth predetermined condition comprises: the PTS of the audio frame in the second TS stream audio-video file is greater than or equal to the PTS of the video in point; and a second splicing unit, for performing audio splicing, when the second judging unit judges yes, with the audio out point as the audio file splice-out point and the audio in point as the audio file splice-in point.
Preferably, the first splicing unit comprises: a first judging module, for judging whether the frame count of the video file after splicing matches the frame count of the video file before splicing; and a first filling module, for outputting video filler frames when the first judging module judges no, so that the frame count of the spliced video file matches the frame count of the video file before splicing, wherein the video filler frames comprise black frames and/or still frames.
Preferably, the second splicing unit comprises: a second judging module, for judging whether the frame count of the audio file after splicing matches the frame count of the audio file before splicing; and a second filling module, for outputting audio filler frames when the second judging module judges no, so that the frame count of the spliced audio file matches the frame count of the audio file before splicing.
Preferably, the second splicing unit further comprises: a synchronization unit, for adjusting the video file PTS and the audio file PTS so that they are synchronized.
The present invention uses self-defined splice-in and splice-out points. During video splicing, after the splicing instruction is obtained, it is judged whether the video in point and the video out point indicated by the instruction satisfy the conditions required of the self-defined splice-in and splice-out points. When they do, the TS stream audio-video files are spliced directly, without decoding and re-encoding the original program. This solves the problem in the related art that decoding and re-encoding the original program in digital program insertion causes image quality to decline, improves picture quality after splicing, and also saves cost.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of audio-video file splicing according to the related art;
Fig. 2 is a preferred flow chart of the audio-video file splicing method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the splice-in point and splice-out point in audio-video file splicing according to an embodiment of the present invention;
Fig. 4 is a preferred structure diagram of the audio-video file splicing device according to an embodiment of the present invention;
Fig. 5 is another preferred structure diagram of the audio-video file splicing device according to an embodiment of the present invention;
Fig. 6 is a further preferred structure diagram of the audio-video file splicing device according to an embodiment of the present invention;
Fig. 7 is a further preferred structure diagram of the audio-video file splicing device according to an embodiment of the present invention;
Fig. 8 is a preferred structural diagram of the audio-video file splicing system according to an embodiment of the present invention;
Fig. 9 is a schematic diagram of the transmission order of the frames of an open GOP structure according to an embodiment of the present invention;
Fig. 10 is a schematic diagram of the transmission order of the frames of a closed GOP structure according to an embodiment of the present invention;
Fig. 11 is a preferred flow chart of seamless video splicing in the audio-video file splicing system according to an embodiment of the present invention; and
Fig. 12 is a preferred flow chart of seamless audio splicing in the audio-video file splicing system according to an embodiment of the present invention.
Detailed description
The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments. It should be noted that, where there is no conflict, the embodiments in this application and the features in those embodiments may be combined with one another.
Embodiment 1
The invention provides an audio-video file splicing method. Preferably, as shown in Fig. 2, the method comprises the following steps:
S202: obtain the video out point and the audio out point of the first TS (Transport Stream) audio-video file indicated by the splicing instruction.
Preferably, the first TS stream audio-video file is the TS stream data of the main audio-video file and may also be called TS1. Preferably, during audio and video output, TS1 is de-encapsulated down to the ES (Elementary Stream) layer, and the input ES data of TS1 are stored in the main video PIP (short for PIPE, a pipeline) and the main audio PIP.
S204: obtain the video in point and the audio in point of the second TS stream audio-video file indicated by the splicing instruction.
Preferably, the second TS stream audio-video file is the TS stream data of the video file to be spliced in and may also be called TS2. Preferably, during audio and video output, TS2 is de-encapsulated down to the ES layer, and the input ES data of TS2 are stored in the splicing video PIP and the splicing audio PIP.
Preferably, when TS2 is inserted into TS1 during audio and video splicing, there is a critical point, as shown in Fig. 3. This critical point is called the splice-in point of TS2 and is at the same time the splice-out point of TS1.
S206: judge whether the video out point satisfies the first predetermined condition and the video in point satisfies the second predetermined condition, and if so, perform video splicing with the video out point as the video file splice-out point and the video in point as the video file splice-in point. The first predetermined condition comprises: decoding of the images before the video out point does not depend on image information after the video out point, and all images before the video out point can be displayed. The second predetermined condition comprises: decoding of the images after the video in point does not depend on image information before the video in point, and all images after the video in point can be displayed.
Specifically, if decoding all images before the video out point indicated by the splicing instruction requires no image information after that out point, and all images before the out point can be displayed, the video out point is judged to satisfy the first predetermined condition. If decoding all images after the video in point indicated by the splicing instruction requires no image information before that in point, and all images after the in point can be displayed, the video in point is judged to satisfy the second predetermined condition. When both conditions are satisfied, video splicing is performed with the video out point as the video file splice-out point and the video in point as the video file splice-in point.
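The two video conditions can be illustrated with a minimal Python sketch. The `Frame` model, the `refs` field, and the function names are hypothetical and not taken from the patent; the sketch only covers the decoding-dependency half of each condition and does not model the requirement that every retained image can be displayed.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    kind: str                                  # "I", "P" or "B" (illustrative only)
    refs: list = field(default_factory=list)   # transmission-order indices needed for decoding

def out_point_ok(frames, cut):
    """Dependency part of the first condition: frames[:cut] need nothing at or after cut."""
    return all(r < cut for f in frames[:cut] for r in f.refs)

def in_point_ok(frames, cut):
    """Dependency part of the second condition: frames[cut:] need nothing before cut."""
    return all(r >= cut for f in frames[cut:] for r in f.refs)

# Transmission order: I0 P1 B2 B3 | I4 B5 B6 P7, where B5 and B6 also reference P1,
# i.e. the second group behaves like an open GOP.
stream = [
    Frame("I"), Frame("P", [0]), Frame("B", [0, 1]), Frame("B", [0, 1]),
    Frame("I"), Frame("B", [1, 4]), Frame("B", [1, 4]), Frame("P", [4]),
]
```

In this example a cut at index 4 is acceptable as an out point but not as an in point, because the B frames after index 4 still reference a frame before it.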
In the above preferred embodiment, self-defined splice-in and splice-out points are used. During video splicing, after the splicing instruction is obtained, it is judged whether the video in point and the video out point indicated by the instruction satisfy the conditions required of the self-defined splice-in and splice-out points. When they do, the TS stream audio-video files are spliced directly, without decoding and re-encoding the original program. This solves the problem in the related art that decoding and re-encoding the original program in digital program insertion causes image quality to decline, improves picture quality after splicing, and also saves cost.
The present invention also improves the above method. Specifically, after the audio in point and the audio out point are obtained, the method further comprises: judging whether the audio out point satisfies the third predetermined condition and the audio in point satisfies the fourth predetermined condition, and if so, performing audio splicing with the audio out point as the audio file splice-out point and the audio in point as the audio file splice-in point. The third predetermined condition comprises: the PTS (Presentation Time Stamp) of the audio frame in the first TS stream audio-video file is greater than or equal to the PTS of the video out point. The fourth predetermined condition comprises: the PTS of the audio frame in the second TS stream audio-video file is greater than or equal to the PTS of the video in point. This preferred embodiment achieves splicing of both the audio and the video in the audio-video file.
Preferably, if the audio frame in the first TS stream audio-video file does not satisfy the splice-out condition, the audio frames in the first TS stream audio-video file whose PTS is less than the PTS of the video out point are discarded, so that the splicing condition is met. Preferably, if the PTS of the audio frame in the first TS stream audio-video file is much greater than the PTS of the video out point, the main audio buffering is increased and the processing of audio frames is slowed down, so as to match the video file.
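The discard step for the third condition might look like the sketch below. The function name is hypothetical, PTS values are assumed to be plain integers in 90 kHz ticks, and 33-bit PTS wrap-around is deliberately ignored to keep the example short.

```python
def align_audio_to_video(audio_pts, video_pts):
    """Discard leading audio frames whose PTS is below video_pts.

    Returns (discarded, remaining). The first remaining frame, if any,
    satisfies PTS >= video_pts as the third condition requires.
    """
    for i, pts in enumerate(audio_pts):
        if pts >= video_pts:
            return list(audio_pts[:i]), list(audio_pts[i:])
    return list(audio_pts), []

# Audio frames 24 ms apart (2160 ticks at 90 kHz); video out point at PTS 3600.
discarded, remaining = align_audio_to_video([0, 2160, 4320, 6480], 3600)
```

The same comparison applies to the fourth condition, with the second stream's audio frames and the PTS of the video in point.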
The present invention further optimizes the above method. Specifically, performing video splicing with the video out point as the video file splice-out point and the video in point as the video file splice-in point comprises: judging whether the frame count of the video file after splicing matches the frame count of the video file before splicing; and, if not, outputting video filler frames so that the frame count of the spliced video file matches the frame count of the video file before splicing, wherein the video filler frames comprise black frames and/or still frames.
Further, performing audio splicing with the audio out point as the audio file splice-out point and the audio in point as the audio file splice-in point comprises: judging whether the frame count of the audio file after splicing matches the frame count of the audio file before splicing; and, if not, outputting audio filler frames so that the frame count of the spliced audio file matches the frame count of the audio file before splicing.
Specifically, if the audio and video frame counts before and after splicing do not match, the main program's timing will be advanced or delayed once splicing ends. Frame count matching means that the number of frames the main video discards during the insertion equals the number of frames inserted by the spliced video. Preferably, if frame count matching does not hold, the spliced video is placed in a buffer to wait while video filler frames are output, and the spliced video is output once the main video satisfies frame matching.
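As a rough illustration of frame count matching, the sketch below pads an inserted clip with filler frames until it occupies as many frames as the main program gives up. The function name, the `"FILL"` marker, and the choice to emit filler before the clip (mirroring the clip waiting in a buffer) are assumptions, not details fixed by the patent.

```python
def fill_slot(spliced, slot_len, filler="FILL"):
    """Return the frames to output for a slot of slot_len frames.

    spliced  : frames of the inserted clip
    slot_len : number of frames the main video discards
    filler   : placeholder for a black or still frame (hypothetical marker)
    """
    missing = slot_len - len(spliced)
    if missing <= 0:
        # Already matching (or longer); this sketch does not trim.
        return list(spliced)
    return [filler] * missing + list(spliced)

out = fill_slot(["s0", "s1", "s2"], 5)
```

With a five-frame slot and a three-frame clip, two filler frames are emitted so that the main program resumes at its original timing.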
In addition, frame fragments may be produced at both the video splice-out point and the video splice-in point, and these fragments may affect the sound. Preferably, the audio frames are padded according to the timing relationship, as follows: based on the time difference between the video out point and the video in point, calculate the time difference between the current audio out point and audio in point, and pad the corresponding audio frames according to this time difference. An audio frame fragment at the splice position can be padded by repeating the data of the previous complete audio frame.
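A small sketch of time-based audio padding follows. It assumes the gap to fill and the audio frame duration are both known in 90 kHz ticks, rounds the number of padding frames up, and repeats the last complete frame; the names and the rounding choice are illustrative assumptions.

```python
import math

def pad_audio(frames, gap_ticks, frame_ticks):
    """Append copies of the last complete frame to cover gap_ticks.

    frames      : complete audio frames kept so far
    gap_ticks   : remaining time difference to fill, in 90 kHz ticks
    frame_ticks : duration of one audio frame, in 90 kHz ticks
    """
    if not frames or gap_ticks <= 0:
        return list(frames)
    count = math.ceil(gap_ticks / frame_ticks)
    return list(frames) + [frames[-1]] * count

padded = pad_audio(["a0", "a1"], 5000, 2160)
```

Here a 5000-tick gap with 2160-tick frames needs three repeated frames.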
The present invention further optimizes the above method. Preferably, performing audio splicing with the audio out point as the audio file splice-out point and the audio in point as the audio file splice-in point further comprises: adjusting the video file PTS and the audio file PTS so that they are synchronized. Specifically, audio and video synchronization means that the sound currently playing should be the sound belonging to the image currently displayed. It can be achieved by synchronizing the video PTS and the audio PTS, and the adjustment only needs to ensure that the two remain synchronized. An audio frame at the splice position may repeat the data of the previous frame and therefore may not match the video, but since one frame lasts only a few tens of milliseconds, this can be disregarded.
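One way to keep audio and video PTS synchronized after splicing is to shift both streams of the inserted clip by the same offset. The sketch below does this with 33-bit wrap-around, which is the PTS width in MPEG-2 systems; the function name and the idea of anchoring on the first video PTS are assumptions made for illustration.

```python
PTS_MOD = 1 << 33  # PTS is a 33-bit counter at 90 kHz

def restamp_av(video_pts, audio_pts, target_video_pts):
    """Shift video and audio PTS by one common offset.

    The first video PTS lands on target_video_pts, and the audio-to-video
    relationship of the inserted clip is preserved.
    """
    offset = (target_video_pts - video_pts[0]) % PTS_MOD
    new_video = [(p + offset) % PTS_MOD for p in video_pts]
    new_audio = [(p + offset) % PTS_MOD for p in audio_pts]
    return new_video, new_audio

v, a = restamp_av([1000, 4600], [1200, 3360], 900000)
```

Because one offset is applied to both lists, the 200-tick lead of the first audio frame over the first video frame is unchanged after restamping.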
Embodiment 2
On the basis of Embodiment 1, this embodiment provides an audio-video file splicing device. Specifically, as shown in Fig. 4, the device comprises:
A first acquiring unit 402, for obtaining the video out point and the audio out point of the first TS stream audio-video file indicated by the splicing instruction. Preferably, the first TS stream audio-video file is the TS stream data of the main audio-video file and may also be called TS1. Preferably, during audio and video output, TS1 is de-encapsulated down to the ES layer, and the input ES data of TS1 are stored in the main video PIP and the main audio PIP.
A second acquiring unit 404, for obtaining the video in point and the audio in point of the second TS stream audio-video file indicated by the splicing instruction. Preferably, the second TS stream audio-video file is the TS stream data of the video file to be spliced in and may also be called TS2. Preferably, during audio and video output, TS2 is de-encapsulated down to the ES layer, and the input ES data of TS2 are stored in the splicing video PIP and the splicing audio PIP. Preferably, when TS2 is inserted into TS1 during audio and video splicing, there is a critical point, as shown in Fig. 3. This critical point is called the splice-in point of TS2 and is at the same time the splice-out point of TS1.
A first judging unit 406, for judging whether the video out point satisfies the first predetermined condition and the video in point satisfies the second predetermined condition. The first predetermined condition comprises: decoding of the images before the video out point does not depend on image information after the video out point, and all images before the video out point can be displayed. The second predetermined condition comprises: decoding of the images after the video in point does not depend on image information before the video in point, and all images after the video in point can be displayed.
A first splicing unit 408, for performing video splicing, when the first judging unit 406 judges yes, with the video out point as the video file splice-out point and the video in point as the video file splice-in point.
Specifically, if decoding all images before the video out point indicated by the splicing instruction requires no image information after that out point, and all images before the out point can be displayed, the video out point is judged to satisfy the first predetermined condition. If decoding all images after the video in point indicated by the splicing instruction requires no image information before that in point, and all images after the in point can be displayed, the video in point is judged to satisfy the second predetermined condition. When both conditions are satisfied, video splicing is performed with the video out point as the video file splice-out point and the video in point as the video file splice-in point.
In the above preferred embodiment, self-defined splice-in and splice-out points are used. During video splicing, after the splicing instruction is obtained, it is judged whether the video in point and the video out point indicated by the instruction satisfy the conditions required of the self-defined splice-in and splice-out points. When they do, the TS stream audio-video files are spliced directly, without decoding and re-encoding the original program. This solves the problem in the related art that decoding and re-encoding the original program in digital program insertion causes image quality to decline, improves picture quality after splicing, and also saves cost.
This embodiment also improves the above device. Specifically, as shown in Fig. 5, the device further comprises: a second judging unit 502, for judging whether the audio out point satisfies the third predetermined condition and the audio in point satisfies the fourth predetermined condition, wherein the third predetermined condition comprises: the PTS (Presentation Time Stamp) of the audio frame in the first TS stream audio-video file is greater than or equal to the PTS of the video out point, and the fourth predetermined condition comprises: the PTS of the audio frame in the second TS stream audio-video file is greater than or equal to the PTS of the video in point; and a second splicing unit 504, for performing audio splicing, when the second judging unit 502 judges yes, with the audio out point as the audio file splice-out point and the audio in point as the audio file splice-in point.
Preferably, if the audio frame in the first TS stream audio-video file does not satisfy the splice-out condition, the audio frames in the first TS stream audio-video file whose PTS is less than the PTS of the video out point are discarded, so that the splicing condition is met. Preferably, if the PTS of the audio frame in the first TS stream audio-video file is much greater than the PTS of the video out point, the main audio buffering is increased and the processing of audio frames is slowed down, so as to match the video file.
The present invention further optimizes the above first splicing unit 408. Specifically, as shown in Fig. 6, the first splicing unit 408 comprises: a first judging module 602, for judging whether the frame count of the video file after splicing matches the frame count of the video file before splicing; and a first filling module 604, for outputting video filler frames when the first judging module 602 judges no, so that the frame count of the spliced video file matches the frame count of the video file before splicing, wherein the video filler frames comprise black frames and/or still frames.
The present invention also optimizes the above second splicing unit 504. Specifically, as shown in Fig. 7, the second splicing unit 504 comprises: a second judging module 702, for judging whether the frame count of the audio file after splicing matches the frame count of the audio file before splicing; and a second filling module 704, for outputting audio filler frames when the second judging module 702 judges no, so that the frame count of the spliced audio file matches the frame count of the audio file before splicing.
Specifically, if the audio and video frame counts before and after splicing do not match, the main program's timing will be advanced or delayed once splicing ends. Frame count matching means that the number of frames the main video discards during the insertion equals the number of frames inserted by the spliced video. Preferably, if frame count matching does not hold, the spliced video is placed in a buffer to wait while video filler frames are output, and the spliced video is output once the main video satisfies frame matching.
In addition, frame fragments may be produced at both the video splice-out point and the video splice-in point, and these fragments may affect the sound. Preferably, the audio frames are padded according to the timing relationship, as follows: based on the time difference between the video out point and the video in point, calculate the time difference between the current audio out point and audio in point, and pad the corresponding audio frames according to this time difference. An audio frame fragment at the splice position can be padded by repeating the data of the previous complete audio frame.
Further, the second splicing unit 504 also comprises: a synchronization unit, for adjusting the video file PTS and the audio file PTS so that they are synchronized. Specifically, audio and video synchronization means that the sound currently playing should be the sound belonging to the image currently displayed. It can be achieved by synchronizing the video PTS and the audio PTS, and the adjustment only needs to ensure that the two remain synchronized. An audio frame at the splice position may repeat the data of the previous frame and therefore may not match the video, but since one frame lasts only a few tens of milliseconds, this can be disregarded.
Embodiment 3
On the basis of Embodiments 1 and 2 above, this embodiment also provides an audio-video file splicing system. Specifically, as shown in Figure 8, the audio-video source files to be spliced are first processed by a to-be-spliced audio-video material preparation device, for example by modifying the content and format of the source files so that they match the format of the main program, and are output as target video files. A to-be-spliced audio-video player then packages the output target video files into a playable TS stream (TS1); the TS stream carries the target video and the target audio, which are distinguished by different PIDs (Packet Identifiers). A splicing controller sets the splicing parameters, such as the time parameters (start time, stop time, duration) and the program parameters. According to the control parameters from the splicing controller, a seamless splicer splices the packaged TS stream with another TS stream (TS2) and outputs the spliced TS stream.
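As an illustration of how the video and audio carried in one TS stream can be told apart by PID, the following Python sketch reads the 13-bit PID from the header of each 188-byte TS packet. The function names and the PID values chosen by a caller are assumptions made for illustration and do not come from the patent; only the packet length, the 0x47 sync byte and the position of the PID field follow the MPEG-2 TS packet header layout.

```python
TS_PACKET_SIZE = 188  # fixed length of an MPEG-2 TS packet in bytes
SYNC_BYTE = 0x47      # first byte of every TS packet


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID carried in the header of one TS packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    # The PID is the low 5 bits of byte 1 followed by all 8 bits of byte 2.
    return ((packet[1] & 0x1F) << 8) | packet[2]


def split_by_pid(stream: bytes, video_pid: int, audio_pid: int):
    """Separate the video and audio packets of a TS byte string by PID.

    video_pid and audio_pid are hypothetical values supplied by the caller.
    Packets with any other PID are ignored in this sketch.
    """
    video, audio = [], []
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        pid = packet_pid(packet)
        if pid == video_pid:
            video.append(packet)
        elif pid == audio_pid:
            audio.append(packet)
    return video, audio
```

A real demultiplexer would normally learn the video and audio PIDs from the program tables of the stream rather than receive them as fixed arguments; that step is omitted here.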
During seamless splicing, for example when TS2 is inserted into TS1, there is a critical point; this point is called the splice in-point of TS2 and is also the splice out-point of TS1.
A splice out-point must satisfy both of the following conditions:
(1) decoding any image before the splice out-point does not require image information after that point;
(2) there is no image that cannot be displayed.
A splice in-point must satisfy both of the following conditions:
(1) decoding any image after the splice in-point does not require image information before that point;
(2) there is no image that cannot be displayed.
In seamless splicing at the video layer, when a GOP (Group of Pictures) contains B frames (bidirectionally predictive-coded pictures), the non-B frames are displayed with a delay. Specifically, Figure 9 shows the transmission order of the frames of an open GOP structure and Figure 10 shows the transmission order of the frames of a closed GOP structure; the figures show that when B frames are present, the non-B frames (I frames and P frames) are displayed with a delay. The following conclusion can therefore be drawn: in transmission order, the frame immediately before a P frame (predictive-coded picture) or an I frame (intra-coded picture) within a GOP is a splice out-point, and the splice in-point lies at the start of a GOP. Preferably, it should be noted that the sequence header of a video sequence carries parameters such as height, width, frame rate, bit rate and video format, and these parameters strongly affect decoding. Seamless joining therefore requires some processing at the sequence level: video sequences with different parameters are either not spliced at all or are spliced at a sequence header. In addition, the target video sequence can be modified in the to-be-inserted audio-video material preparation device so that it matches the main video.
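The conclusion about candidate splice points can be sketched in Python as follows. The sketch assumes that the frame types are already known as a list of the letters I, P and B in transmission order, and it treats every I frame as the start of a GOP; both the representation and the function names are assumptions made for illustration.

```python
def splice_out_candidates(frame_types):
    """Indices of frames that immediately precede an I or P frame.

    frame_types lists the picture types in transmission order.
    """
    return [
        index
        for index in range(len(frame_types) - 1)
        if frame_types[index + 1] in ("I", "P")
    ]


def splice_in_candidates(frame_types):
    """Indices at which a GOP starts, taking each I frame as a GOP start."""
    return [index for index, kind in enumerate(frame_types) if kind == "I"]
```

For the transmission order I P B B P B B I B B P, the first function selects the frames at indices 0, 3, 6 and 9, and the second selects the I frames at indices 0 and 7.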
Seamless splicing at the audio layer is comparatively simple, because audio frames are coded independently of one another and there is no dependency between frames. Frame fragments may be produced at both the video splice out-point and the video splice in-point, and these fragments may affect the sound. The audio frames can be padded according to the timing relationship, as follows:
(1) from the time difference between the video out-point and in-point, calculate the time difference between the current audio out-point and in-point, and pad the corresponding audio frames according to this time difference;
(2) an audio frame fragment at the splice position can be padded by repeating the data of the previous complete audio frame.
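The two padding steps can be sketched in Python as follows. The sketch assumes that the time difference to be covered and the duration of one audio frame are both given in ticks of the same clock, and that the gap is rounded up to whole frames; the rounding rule and the function names are assumptions, since the text does not state them.

```python
def filler_frame_count(gap_ticks: int, frame_ticks: int) -> int:
    """Number of whole audio frames needed to cover gap_ticks, rounded up."""
    if frame_ticks <= 0:
        raise ValueError("frame_ticks must be positive")
    if gap_ticks <= 0:
        return 0
    return -(-gap_ticks // frame_ticks)  # ceiling division on integers


def pad_audio(complete_frames, gap_ticks, frame_ticks):
    """Return filler frames that repeat the last complete audio frame."""
    if not complete_frames:
        raise ValueError("need at least one complete audio frame to repeat")
    count = filler_frame_count(gap_ticks, frame_ticks)
    return [complete_frames[-1]] * count
```

As an example of the units, an audio frame of 1152 samples at 48 kHz lasts 24 ms, which is 2160 ticks of a 90 kHz clock, so a gap of 5000 ticks would be covered by three repeated frames under the rounding assumed here.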
In addition, regarding audio-video synchronization, that is, the requirement that the sound currently being played be the sound that belongs to the image currently being shown, the following scheme is adopted: audio and video are synchronized by synchronizing the video PTS (Presentation Time Stamp) with the audio PTS; specifically, it is only necessary to ensure that the audio PTS and the video PTS are adjusted in step. The audio frame at the splice position may repeat the data of the previous frame and may therefore not match the video, but one frame lasts only a few tens of milliseconds, so this is ignored.
At the TS level, the time difference between the splice out-point and the splice in-point is adjusted so that the images before the out-point have enough time to be displayed and the incoming images are displayed at the proper time. The video standard specifies that every coded image must be displayed at the proper time at the decoding and display end. The time base it relies on is the PCR (Program Clock Reference), and each image carries a DTS (Decoding Time Stamp) and a PTS, which specify the decoding time and the display time of the image respectively. At the splice in-point, the DTS and PTS must be modified to keep the TS stream playing continuously.
The internal flow of seamless splicing is described below:
TS1 and TS2 are depacketized down to the ES (Elementary Stream) layer. The input ES data of TS1 is stored in the main video PIP (PIP is short for pipeline, PIPE) and the main audio PIP. The input ES data of TS2 is stored in the splice video PIP and the splice audio PIP.
Figure 11 shows a flow chart of seamless video splicing, as follows:
After the main video and the splice video fetch ES data from their pipelines, if no external splice command has been received, the main video data is output normally and the splice video data is discarded.
After an external splice command is received, the main video judges whether the splice out-point condition is satisfied. If it is not, the main video data is output normally. If it is, the main video performs the frame-count matching judgment.
After an external splice command is received, the splice video judges whether the splice in-point condition is satisfied. If it is not, the splice video data is discarded. If it is, the splice video performs the frame-count matching judgment.
Video transition frame filling refers to the black frames or still frames inserted during splicing to guarantee frame-count matching. If the frame counts do not match, the main program's timing will run early or late once the splice ends. Frame-count matching means that the number of frames the main video discards during the insertion equals the number of frames inserted by the splice video. If the frame counts do not match, the splice video is placed in a buffer to wait while video filler frames are output; the splice video is output once the main video satisfies the frame-count match.
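The frame-count matching rule can be illustrated with a short Python sketch. It assumes a simplified case in which the number of main-video frames discarded during the insertion and the number of frames supplied by the splice video are both known in advance, and in which filler frames make up any shortfall of the splice video; this simplification and the function names are assumptions and do not reproduce the buffering behaviour of Figure 11.

```python
def video_filler_frames(dropped_main_frames: int, inserted_splice_frames: int) -> int:
    """Filler frames needed so that inserted frames equal dropped frames.

    Returns 0 when the splice video already supplies enough frames.
    """
    if dropped_main_frames < 0 or inserted_splice_frames < 0:
        raise ValueError("frame counts must not be negative")
    return max(0, dropped_main_frames - inserted_splice_frames)


def frames_matched(dropped_main_frames: int, inserted_splice_frames: int,
                   filler_frames: int) -> bool:
    """True when the frames put in (splice plus filler) equal the frames dropped."""
    return dropped_main_frames == inserted_splice_frames + filler_frames
```

In this sketch, a splice video that supplies more frames than the main video discards is reported as unmatched rather than trimmed, because the text does not say how that case is handled.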
In addition, because B frames are bidirectionally predicted, the B frames between the I frame and the first P frame in the first GOP of the splice video must be removed; otherwise those B frames would be predicted from the wrong forward reference frame, and mosaic artifacts would appear at the splice point.
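Removing these B frames can be sketched in Python as follows. The sketch assumes that the first GOP of the splice video is given as a list of picture-type letters in transmission order, starting with its I frame; the list representation and the function name are assumptions made for illustration.

```python
def drop_leading_b_frames(frame_types):
    """Remove the B frames that follow the first I frame and precede the
    first P frame of the first GOP (frames given in transmission order)."""
    if not frame_types or frame_types[0] != "I":
        raise ValueError("the first GOP must start with an I frame")
    index = 1
    # Skip every B frame that directly follows the I frame.
    while index < len(frame_types) and frame_types[index] == "B":
        index += 1
    return [frame_types[0]] + list(frame_types[index:])
```

For an open GOP sent as I B B P B B P, the result is I P B B P; a GOP sent as I P B B P is returned unchanged.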
Figure 12 shows a flow chart of seamless audio splicing, as follows:
After the main audio and the splice audio fetch ES data from their pipelines, if no video-splice-completed command has been received, the main audio data is output normally and the splice audio data is discarded.
After the video-splice-completed command is received, the main audio judges whether the splice out-point condition is satisfied, and the splice audio judges whether the splice in-point condition is satisfied.
The main-audio splice out-point condition holds when: PTS of the main audio frame >= Video_pts_outpoint, where Video_pts_outpoint is the PTS of the video out-point;
Preferably, the best main-audio splice out-point is the main audio frame that satisfies the following condition:
the PTS of the main audio frame and Video_pts_outpoint differ by no more than the PTS interval of a single audio frame;
Preferably, if the main audio frame does not satisfy the splice out-point condition, main audio frames with PTS < Video_pts_outpoint are discarded until the splice condition is satisfied. If the PTS of the main audio frame is much greater than Video_pts_outpoint, the buffering of the main audio must be increased to slow down the processing of audio frames so that the audio matches the video.
The splice-audio splice in-point condition holds when: PTS of the splice audio frame >= Video_pts_inpoint, where Video_pts_inpoint is the PTS of the video in-point;
Preferably, the best splice-audio splice in-point is the splice audio frame whose PTS differs from Video_pts_inpoint by no more than the PTS interval of a single audio frame;
If the splice audio frame does not satisfy the splice in-point condition, splice audio frames with PTS < Video_pts_inpoint are discarded until the splice condition is satisfied. If the PTS of the splice audio frame is much greater than Video_pts_inpoint, the buffering of the splice audio must be increased to slow down the processing of audio frames so that the audio matches the video.
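The audio out-point and in-point conditions can be sketched in Python as follows. The first function applies the stated condition that the audio frame PTS be greater than or equal to the video point PTS. The second function encodes one assumed reading of the preferred case, namely that the frame starts at or after the video point and within one audio frame interval of it. The function names are hypothetical, and wrap-around of the 33-bit PTS counter is not handled.

```python
def first_frame_at_or_after(audio_pts, video_point_pts):
    """Index of the first audio frame with PTS >= video_point_pts.

    audio_pts lists frame PTS values in increasing order. Frames before the
    returned index are the ones discarded in this sketch. Returns None when
    no frame satisfies the condition yet.
    """
    for index, pts in enumerate(audio_pts):
        if pts >= video_point_pts:
            return index
    return None


def is_best_point(frame_pts, video_point_pts, frame_interval):
    """Assumed reading of the preferred case: the frame starts at or after
    the video point and no more than one audio frame interval later."""
    return video_point_pts <= frame_pts <= video_point_pts + frame_interval
```

The same two functions serve both the main audio, with Video_pts_outpoint as the video point, and the splice audio, with Video_pts_inpoint as the video point.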
When the video is filled, the audio must be filled as well; the start and end of the audio fill are synchronized with the video fill according to PTS time.
TS-layer processing:
The TS layer of the main video carries a PCR, denoted PCR1. The TS layer of the splice video carries a PCR, denoted PCR2. Both during and outside the seamless splicing process of the present invention, PCR1 is used as the program clock reference. The DTS and PTS values are all generated with reference to the PCR, so while the splice video is being played, its DTS and PTS must be modified to fit PCR1. In addition, because the B frames between the I frame and the first P frame in the first GOP of the splice video are removed, the decoding time (DTS) of the splice video must be corrected. The DTS and PTS are carried in PES (Packetized Elementary Stream) packets, and these two times are modified when the PES packets are adjusted.
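Adapting the timestamps of the splice video to the main program's time base can be sketched in Python as follows. The sketch assumes that a single constant offset is applied to every frame so that the first splice frame is decoded at a chosen DTS on the main timeline; the dictionary representation of a frame, the function name and the choice of target value are assumptions. The 33-bit wrap-around follows the size of the PTS and DTS fields in PES packets.

```python
PTS_MODULUS = 1 << 33  # PTS and DTS are 33-bit counters of a 90 kHz clock


def restamp(frames, first_target_dts):
    """Shift the PTS and DTS of every frame by one constant offset.

    frames is a list of dicts with integer "pts" and "dts" entries.
    first_target_dts is the DTS, on the main program's timeline, at which
    the first splice frame should be decoded (a value chosen by the caller).
    """
    if not frames:
        return []
    offset = (first_target_dts - frames[0]["dts"]) % PTS_MODULUS
    return [
        {
            **frame,
            "pts": (frame["pts"] + offset) % PTS_MODULUS,
            "dts": (frame["dts"] + offset) % PTS_MODULUS,
        }
        for frame in frames
    ]
```

Because the same offset is added to both fields, the distance between the decoding time and the display time of each frame is preserved; any further DTS correction needed after B frames are removed is outside this sketch.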
As can be seen from the above description, the present invention uses self-defined splice in-points and splice out-points. During video splicing, after a splicing instruction is obtained, it is judged whether the video in-point and the video out-point indicated by the instruction satisfy the conditions required of the self-defined splice in-point and splice out-point. When they do, the TS-stream audio-video files are spliced directly, without decoding and re-encoding the original program. This solves the problem in the related art that digital program insertion decodes and re-encodes the original program and thereby degrades image quality, improves the image quality after splicing, and also saves cost.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described can be performed in an order different from the one given here. Alternatively, they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is thus not restricted to any specific combination of hardware and software.
The foregoing is only the preferred embodiments of the present invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. An audio-video file splicing method, characterized by comprising:
obtaining a video out-point and an audio out-point of a first TS-stream audio-video file indicated by a splicing instruction;
obtaining a video in-point and an audio in-point of a second TS-stream audio-video file indicated by the splicing instruction;
judging whether the video out-point satisfies a first predetermined condition and the video in-point satisfies a second predetermined condition, and if so, performing video splicing with the video out-point as the video file splice out-point and the video in-point as the video file splice in-point, wherein the first predetermined condition comprises: the decoding of the images before the video out-point does not depend on image information after the video out-point, and all images before the video out-point can be displayed; and the second predetermined condition comprises: the decoding of the images after the video in-point does not depend on image information before the video in-point, and all images after the video in-point can be displayed.
2. The method according to claim 1, characterized in that, after the audio in-point and the audio out-point are obtained, the method further comprises:
judging whether the audio out-point satisfies a third predetermined condition and the audio in-point satisfies a fourth predetermined condition, and if so, performing audio splicing with the audio out-point as the audio file splice out-point and the audio in-point as the audio file splice in-point, wherein the third predetermined condition comprises: the PTS of the frame of the audio file in the first TS-stream audio-video file is greater than or equal to the PTS of the video out-point; and the fourth predetermined condition comprises: the PTS of the frame of the audio file in the second TS-stream audio-video file is greater than or equal to the PTS of the video in-point.
3. method according to claim 1 and 2, is characterized in that, the described video of usining goes out a little as video file to splice a little, and the described video of usining enters a little to spell as video file the step that access point carries out video-splicing and comprise:
Whether the frame number that judges spliced video file mates with the frame number that splices front video file;
If the determination result is NO, the output video infilled frame, so that the frame number of spliced video file is complementary with the frame number that splices front video file, wherein said video infilled frame comprises blank screen frame and/or quiet frame.
4. The method according to claim 2, characterized in that performing audio splicing with the audio out-point as the audio file splice out-point and the audio in-point as the audio file splice in-point comprises:
judging whether the frame count of the spliced audio file matches the frame count of the audio file before splicing;
if the judgment result is no, outputting audio filler frames so that the frame count of the spliced audio file matches the frame count of the audio file before splicing.
5. The method according to claim 2, characterized in that performing audio splicing with the audio out-point as the audio file splice out-point and the audio in-point as the audio file splice in-point further comprises:
synchronizing the video file PTS with the audio file PTS.
6. An audio-video file splicing device, characterized by comprising:
a first acquiring unit, configured to obtain a video out-point and an audio out-point of a first TS-stream audio-video file indicated by a splicing instruction;
a second acquiring unit, configured to obtain a video in-point and an audio in-point of a second TS-stream audio-video file indicated by the splicing instruction;
a first judging unit, configured to judge whether the video out-point satisfies a first predetermined condition and the video in-point satisfies a second predetermined condition, wherein the first predetermined condition comprises: the decoding of the images before the video out-point does not depend on image information after the video out-point, and all images before the video out-point can be displayed; and the second predetermined condition comprises: the decoding of the images after the video in-point does not depend on image information before the video in-point, and all images after the video in-point can be displayed;
a first splicing unit, configured to perform video splicing, when the judgment of the first judging unit is yes, with the video out-point as the video file splice out-point and the video in-point as the video file splice in-point.
7. The device according to claim 6, characterized by further comprising:
a second judging unit, configured to judge whether the audio out-point satisfies a third predetermined condition and the audio in-point satisfies a fourth predetermined condition, wherein the third predetermined condition comprises: the PTS of the frame of the audio file in the first TS-stream audio-video file is greater than or equal to the PTS of the video out-point; and the fourth predetermined condition comprises: the PTS of the frame of the audio file in the second TS-stream audio-video file is greater than or equal to the PTS of the video in-point;
a second splicing unit, configured to perform audio splicing, when the judgment result of the second judging unit is yes, with the audio out-point as the audio file splice out-point and the audio in-point as the audio file splice in-point.
8. The device according to claim 6, characterized in that the first splicing unit comprises:
a first judging module, configured to judge whether the frame count of the spliced video file matches the frame count of the video file before splicing;
a first filling module, configured to output video filler frames, when the judgment result of the first judging module is no, so that the frame count of the spliced video file matches the frame count of the video file before splicing, wherein the video filler frames comprise black frames and/or still frames.
9. The device according to claim 7, characterized in that the second splicing unit comprises:
a second judging module, configured to judge whether the frame count of the spliced audio file matches the frame count of the audio file before splicing;
a second filling module, configured to output audio filler frames, when the judgment result of the second judging module is no, so that the frame count of the spliced audio file matches the frame count of the audio file before splicing.
10. The device according to claim 7, characterized in that the second splicing unit further comprises:
a synchronization unit, configured to synchronize the video file PTS with the audio file PTS.
CN2012101731344A 2012-05-29 2012-05-29 Audio-video file splicing method and audio-video file splicing device Pending CN103458271A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012101731344A CN103458271A (en) 2012-05-29 2012-05-29 Audio-video file splicing method and audio-video file splicing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012101731344A CN103458271A (en) 2012-05-29 2012-05-29 Audio-video file splicing method and audio-video file splicing device

Publications (1)

Publication Number Publication Date
CN103458271A true CN103458271A (en) 2013-12-18

Family

ID=49740160

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012101731344A Pending CN103458271A (en) 2012-05-29 2012-05-29 Audio-video file splicing method and audio-video file splicing device

Country Status (1)

Country Link
CN (1) CN103458271A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6181383B1 (en) * 1996-05-29 2001-01-30 Sarnoff Corporation Method and apparatus for preserving synchronization of audio and video presentation when splicing transport streams
CN101409831A (en) * 2008-07-10 2009-04-15 浙江师范大学 Method for processing multimedia video object
WO2011012909A2 (en) * 2009-07-31 2011-02-03 British Sky Broadcasting Limited Media insertion system
CN102349307A (en) * 2009-05-13 2012-02-08 Nds有限公司 Splicing system

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778217A (en) * 2015-03-20 2015-07-15 广东欧珀移动通信有限公司 Music splicing algorithm and device
CN104980794A (en) * 2015-06-30 2015-10-14 北京金山安全软件有限公司 Video splicing method and device
CN105578260A (en) * 2015-12-18 2016-05-11 无锡天脉聚源传媒科技有限公司 Video editing method and device
CN105592321A (en) * 2015-12-18 2016-05-18 无锡天脉聚源传媒科技有限公司 Method and device for clipping video
CN107046624A (en) * 2016-02-05 2017-08-15 亚洲光学股份有限公司 Image stitching method and image processing device
CN108282670A (en) * 2017-01-05 2018-07-13 纳宝株式会社 Code converter for real-time imaging synthesis
WO2019169682A1 (en) * 2018-03-05 2019-09-12 网宿科技股份有限公司 Audio-video synthesis method and system
CN109068163A (en) * 2018-08-28 2018-12-21 哈尔滨市舍科技有限公司 A kind of audio-video synthesis system and its synthetic method
CN109068163B (en) * 2018-08-28 2021-01-29 青岛一舍科技有限公司 Audio and video synthesis system and synthesis method thereof
CN111263220A (en) * 2020-01-15 2020-06-09 北京字节跳动网络技术有限公司 Video processing method and device, electronic equipment and computer readable storage medium
CN113141536A (en) * 2020-01-17 2021-07-20 北京达佳互联信息技术有限公司 Video cover adding method and device, electronic equipment and storage medium
CN111741376A (en) * 2020-07-31 2020-10-02 南斗六星系统集成有限公司 Method for synchronizing audio and video lip sounds of multimedia file splicing
CN111741376B (en) * 2020-07-31 2020-12-01 南斗六星系统集成有限公司 Method for synchronizing audio and video lip sounds of multimedia file splicing

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20131218
