CN103458271A

CN103458271A - Audio-video file splicing method and audio-video file splicing device

Info

Publication number: CN103458271A
Application number: CN2012101731344A
Authority: CN
Inventors: 王智
Original assignee: Sumavision Technologies Co Ltd
Current assignee: Sumavision Technologies Co Ltd
Priority date: 2012-05-29
Filing date: 2012-05-29
Publication date: 2013-12-18

Abstract

The invention discloses an audio-video file splicing method and an audio-video file splicing device. The method includes: acquiring a video out-point and an audio out-point of a first TS (transport stream) audio-video file indicated by a splicing command; acquiring a video in-point and an audio in-point of a second TS audio-video file indicated by the splicing command; judging whether the video out-point satisfies a first preset condition and whether the video in-point satisfies a second preset condition; and, if so, performing video splicing with the video out-point as the video file splice-out point and the video in-point as the video file splice-in point. The method and device solve the problem in the related art that image quality is lowered because original programs are decoded and re-encoded in digital program insertion, thereby improving the quality of spliced images and, further, saving cost.

Description

Audio-video file splicing method and device

Technical field

The present invention relates to the field of audio and video processing, and in particular to an audio-video file splicing method and device.

Background

Digital program insertion is a digital splicing technique that splices one MPEG (Moving Picture Experts Group) program into another MPEG program. It is widely used in the field of audio and video processing.

Existing digital program insertion techniques generally adopt a "decoder + audio/video inserter + encoder" implementation. As shown in Fig. 1, an SPTS (Single Program Transport Stream) serves as the input source and passes in turn through a decoder, an audio/video inserter, and an encoder, which are cascaded as separate devices, to achieve digital program insertion. This approach must fully decode and then fully re-encode the program stream. The process is complicated, the encoding side in particular is highly complex, the resource investment is large, and the cost-effectiveness is low. In addition, decoding and re-encoding the original program affects subjective picture quality and degrades the image.

No effective solution to the above problems in the related art has yet been proposed.

Summary of the invention

The present invention provides an audio-video file splicing method and device, to solve the problem in the related art that image quality is degraded because the original program is decoded and re-encoded in digital program insertion.

According to one aspect of the present invention, an audio-video file splicing method is provided. The method comprises: obtaining the video out-point and the audio out-point of a first TS audio-video file indicated by a splicing instruction; obtaining the video in-point and the audio in-point of a second TS audio-video file indicated by the splicing instruction; and judging whether the video out-point satisfies a first predetermined condition and the video in-point satisfies a second predetermined condition, and if so, performing video splicing with the video out-point as the video file splice-out point and the video in-point as the video file splice-in point. The first predetermined condition comprises: decoding the images before the video out-point does not rely on image information after the video out-point, and all images before the video out-point can be displayed. The second predetermined condition comprises: decoding the images after the video in-point does not rely on image information before the video in-point, and all images after the video in-point can be displayed.

Preferably, after the audio in-point and the audio out-point are obtained, the method further comprises: judging whether the audio out-point satisfies a third predetermined condition and the audio in-point satisfies a fourth predetermined condition, and if so, performing audio splicing with the audio out-point as the audio file splice-out point and the audio in-point as the audio file splice-in point. The third predetermined condition comprises: the PTS of the frame of the audio file in the first TS audio-video file is greater than or equal to the PTS of the video out-point. The fourth predetermined condition comprises: the PTS of the frame of the audio file in the second TS audio-video file is greater than or equal to the PTS of the video in-point.

Preferably, performing video splicing with the video out-point as the video file splice-out point and the video in-point as the video file splice-in point comprises: judging whether the frame count of the video file after splicing matches the frame count of the video file before splicing; and if not, outputting video filler frames so that the frame count of the video file after splicing matches the frame count of the video file before splicing, wherein the video filler frames comprise black frames and/or still frames.

Preferably, performing audio splicing with the audio out-point as the audio file splice-out point and the audio in-point as the audio file splice-in point comprises: judging whether the frame count of the audio file after splicing matches the frame count of the audio file before splicing; and if not, outputting audio filler frames so that the frame count of the audio file after splicing matches the frame count of the audio file before splicing.

Preferably, performing audio splicing with the audio out-point as the audio file splice-out point and the audio in-point as the audio file splice-in point further comprises: adjusting the video file PTS and the audio file PTS into synchronization.

According to another aspect of the present invention, an audio-video file splicing device is provided. The device comprises: a first acquiring unit, configured to obtain the video out-point and the audio out-point of a first TS audio-video file indicated by a splicing instruction; a second acquiring unit, configured to obtain the video in-point and the audio in-point of a second TS audio-video file indicated by the splicing instruction; a first judging unit, configured to judge whether the video out-point satisfies a first predetermined condition and the video in-point satisfies a second predetermined condition, wherein the first predetermined condition comprises: decoding the images before the video out-point does not rely on image information after the video out-point, and all images before the video out-point can be displayed, and the second predetermined condition comprises: decoding the images after the video in-point does not rely on image information before the video in-point, and all images after the video in-point can be displayed; and a first splicing unit, configured to, when the first judging unit judges yes, perform video splicing with the video out-point as the video file splice-out point and the video in-point as the video file splice-in point.

Preferably, the device further comprises: a second judging unit, configured to judge whether the audio out-point satisfies a third predetermined condition and the audio in-point satisfies a fourth predetermined condition, wherein the third predetermined condition comprises: the PTS of the frame of the audio file in the first TS audio-video file is greater than or equal to the PTS of the video out-point, and the fourth predetermined condition comprises: the PTS of the frame of the audio file in the second TS audio-video file is greater than or equal to the PTS of the video in-point; and a second splicing unit, configured to, when the second judging unit judges yes, perform audio splicing with the audio out-point as the audio file splice-out point and the audio in-point as the audio file splice-in point.

Preferably, the first splicing unit comprises: a first judging module, configured to judge whether the frame count of the video file after splicing matches the frame count of the video file before splicing; and a first filling module, configured to, when the first judging module judges no, output video filler frames so that the frame count of the video file after splicing matches the frame count of the video file before splicing, wherein the video filler frames comprise black frames and/or still frames.

Preferably, the second splicing unit comprises: a second judging module, configured to judge whether the frame count of the audio file after splicing matches the frame count of the audio file before splicing; and a second filling module, configured to, when the second judging module judges no, output audio filler frames so that the frame count of the audio file after splicing matches the frame count of the audio file before splicing.

Preferably, the second splicing unit further comprises: a synchronization unit, configured to adjust the video file PTS and the audio file PTS into synchronization.

The present invention uses self-defined splice-in and splice-out points. During video splicing, after a splicing instruction is received, it is judged whether the video in-point and video out-point indicated by the instruction satisfy the conditions required of the self-defined splice-in and splice-out points. When they do, the TS audio-video files are spliced directly, without decoding and re-encoding the original program. This solves the problem in the related art that image quality is degraded because the original program is decoded and re-encoded in digital program insertion, improves the image quality after splicing, and further saves cost.

Brief description of the drawings

The accompanying drawings described herein provide a further understanding of the present invention and form a part of this application. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:

Fig. 1 is a preferred schematic diagram of audio-video file splicing according to the related art;

Fig. 2 is a preferred flow chart of the audio-video file splicing method according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of the splice-in point and the splice-out point in audio-video file splicing according to an embodiment of the present invention;

Fig. 4 is a preferred structure diagram of the audio-video file splicing device according to an embodiment of the present invention;

Fig. 5 is another preferred structure diagram of the audio-video file splicing device according to an embodiment of the present invention;

Fig. 6 is yet another preferred structure diagram of the audio-video file splicing device according to an embodiment of the present invention;

Fig. 7 is still another preferred structure diagram of the audio-video file splicing device according to an embodiment of the present invention;

Fig. 8 is a preferred structural schematic diagram of the audio-video file splicing system according to an embodiment of the present invention;

Fig. 9 is a schematic diagram of the transmission order of the frames of an open GOP structure according to an embodiment of the present invention;

Fig. 10 is a schematic diagram of the transmission order of the frames of a closed GOP structure according to an embodiment of the present invention;

Fig. 11 is a preferred flow chart of seamless video splicing in the audio-video file splicing system according to an embodiment of the present invention; and

Fig. 12 is a preferred flow chart of seamless audio splicing in the audio-video file splicing system according to an embodiment of the present invention.

Detailed description of the embodiments

The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. Note that, provided there is no conflict, the embodiments in this application and the features in them may be combined with one another.

Embodiment 1

The present invention provides an audio-video file splicing method. Preferably, as shown in Fig. 2, the method comprises the following steps:

S202: obtain the video out-point and the audio out-point of a first TS (Transport Stream) audio-video file indicated by a splicing instruction;

Preferably, the first TS audio-video file is the TS data of the main audio-video file and may also be called TS1. Preferably, during audio/video output, TS1 is depacketized to the ES (Elementary Stream) layer, and the input ES data of TS1 is stored in the main video PIP (short for pipeline, PIPE) and the main audio PIP.

S204: obtain the video in-point and the audio in-point of a second TS audio-video file indicated by the splicing instruction;

Preferably, the second TS audio-video file is the TS data of the video file to be spliced in and may also be called TS2. Preferably, during audio/video output, TS2 is depacketized to the ES layer, and the input ES data of TS2 is stored in the splice video PIP and the splice audio PIP.

Preferably, when TS2 is inserted into TS1 during audio/video splicing, there is a critical point, as shown in Fig. 3. This critical point is called the splice-in point of TS2 and is at the same time the splice-out point of TS1.

S206: judge whether the video out-point satisfies a first predetermined condition and the video in-point satisfies a second predetermined condition, and if so, perform video splicing with the video out-point as the video file splice-out point and the video in-point as the video file splice-in point. The first predetermined condition comprises: decoding the images before the video out-point does not rely on image information after the video out-point, and all images before the video out-point can be displayed. The second predetermined condition comprises: decoding the images after the video in-point does not rely on image information before the video in-point, and all images after the video in-point can be displayed.

Specifically, if decoding all the images before the video out-point indicated by the splicing instruction requires no image information after that out-point, and all images before the out-point can be displayed, the video out-point is judged to satisfy the first predetermined condition. If decoding all the images after the video in-point indicated by the splicing instruction requires no image information before that in-point, and all images after the in-point can be displayed, the video in-point is judged to satisfy the second predetermined condition. When the video out-point satisfies the first predetermined condition and the video in-point satisfies the second predetermined condition, video splicing is performed with the video out-point as the video file splice-out point and the video in-point as the video file splice-in point.
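The two video conditions can be illustrated with a minimal sketch. It is not the patented implementation: the `Frame` record, the `refs` field, and the reading of "can be displayed" as "the display positions form a gap-free run" are assumptions made for illustration only. Frames are listed in transmission order, and `cut` is the index of the first frame after the candidate point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    kind: str          # "I", "P" or "B"
    display: int       # position of the frame in display order
    refs: tuple = ()   # transmission-order indices of the frames it is predicted from


def _contiguous(frames):
    # Assumed reading of "all images can be displayed": no gap in display order.
    shown = sorted(f.display for f in frames)
    return shown == list(range(shown[0], shown[0] + len(shown)))


def out_point_ok(frames, cut):
    """First condition: frames before `cut` need nothing at or after `cut`."""
    head = frames[:cut]
    if not head:
        return False
    # References normally point backwards in transmission order, so the
    # display check is usually the deciding one for an out-point.
    independent = all(r < cut for f in head for r in f.refs)
    return independent and _contiguous(head)


def in_point_ok(frames, cut):
    """Second condition: frames from `cut` onward need nothing before `cut`."""
    tail = frames[cut:]
    if not tail:
        return False
    independent = all(r >= cut for f in tail for r in f.refs)
    return independent and _contiguous(tail)


# Hypothetical stream in transmission order: I0 P3 B1 B2 P6 B4 B5 | I7 P10 B8 B9
STREAM = [
    Frame("I", 0), Frame("P", 3, (0,)), Frame("B", 1, (0, 1)), Frame("B", 2, (0, 1)),
    Frame("P", 6, (1,)), Frame("B", 4, (1, 4)), Frame("B", 5, (1, 4)),
    Frame("I", 7), Frame("P", 10, (7,)), Frame("B", 8, (7, 8)), Frame("B", 9, (7, 8)),
]
```

In this hypothetical stream, cutting before index 4 or index 7 leaves a displayable head, whereas cutting before index 2 would leave P3 without the B frames that are displayed before it.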

The above preferred embodiment uses self-defined splice-in and splice-out points. During video splicing, after a splicing instruction is received, it is judged whether the video in-point and video out-point indicated by the instruction satisfy the conditions required of the self-defined splice-in and splice-out points. When they do, the TS audio-video files are spliced directly, without decoding and re-encoding the original program. This solves the problem in the related art that image quality is degraded because the original program is decoded and re-encoded in digital program insertion, improves the image quality after splicing, and further saves cost.

The present invention also improves the above method. Specifically, after the audio in-point and the audio out-point are obtained, the method further comprises: judging whether the audio out-point satisfies a third predetermined condition and the audio in-point satisfies a fourth predetermined condition, and if so, performing audio splicing with the audio out-point as the audio file splice-out point and the audio in-point as the audio file splice-in point. The third predetermined condition comprises: the PTS (Presentation Time Stamp) of the frame of the audio file in the first TS audio-video file is greater than or equal to the PTS of the video out-point. The fourth predetermined condition comprises: the PTS of the frame of the audio file in the second TS audio-video file is greater than or equal to the PTS of the video in-point. The above preferred embodiment achieves splicing of both the audio and the video in the audio-video file.

Preferably, if a frame of the audio file in the first TS audio-video file does not satisfy the splice-out point condition, the audio frames in the first TS audio-video file whose PTS is less than the PTS of the video out-point are discarded, so that the splicing condition is satisfied. Preferably, if the PTS of the audio frame in the first TS audio-video file is much greater than the PTS of the video out-point, the main audio buffer is enlarged and the processing of audio frames is slowed down, so as to match the video file.
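The third and fourth conditions amount to finding the first audio frame whose PTS is not earlier than the PTS of the corresponding video point. A minimal sketch follows, assuming the audio PTS values are available as an ascending list and that no timestamp wrap-around occurs within it; the function name is hypothetical.

```python
from bisect import bisect_left


def first_frame_at_or_after(audio_pts, video_point_pts):
    """Return the index of the first audio frame with PTS >= video_point_pts.

    audio_pts must be sorted in ascending order (assumption: no wrap-around).
    Frames before the returned index have a smaller PTS and do not satisfy
    the condition. Returns None if no frame qualifies yet.
    """
    i = bisect_left(audio_pts, video_point_pts)
    return i if i < len(audio_pts) else None
```

For example, with audio frames 2160 ticks apart starting at 0, a video out-point PTS of 4000 selects the frame at index 2 (PTS 4320).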

The present invention further optimizes the above method. Specifically, performing video splicing with the video out-point as the video file splice-out point and the video in-point as the video file splice-in point comprises: judging whether the frame count of the video file after splicing matches the frame count of the video file before splicing; and if not, outputting video filler frames so that the frame count of the video file after splicing matches the frame count of the video file before splicing, wherein the video filler frames comprise black frames and/or still frames.

Further, performing audio splicing with the audio out-point as the audio file splice-out point and the audio in-point as the audio file splice-in point comprises: judging whether the frame count of the audio file after splicing matches the frame count of the audio file before splicing; and if not, outputting audio filler frames so that the frame count of the audio file after splicing matches the frame count of the audio file before splicing.

Specifically, in audio/video splicing, if the audio/video frame counts before and after splicing do not match, the main program time will run ahead or lag behind after splicing ends. Frame count matching means that the number of frames the main video discards while the insertion takes place is the same as the number of frames the splice video inserts. Preferably, if the frame counts do not match, the splice video is placed in a buffer to wait and video filler frames are output; the splice video is output once the main video satisfies the frame match.
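The frame count matching rule can be written down as a small sketch. The function names are hypothetical, and the sketch only covers the case described in the text, where filler frames make up a shortfall of inserted frames.

```python
def frames_match(dropped_main_frames, inserted_splice_frames):
    """Frame counts match when the main video drops as many frames as are inserted."""
    return dropped_main_frames == inserted_splice_frames


def filler_frames_needed(dropped_main_frames, inserted_splice_frames):
    """Number of black or still filler frames needed to make up a shortfall.

    Returns 0 when the inserted material already covers the dropped frames.
    """
    return max(0, dropped_main_frames - inserted_splice_frames)
```

If the main video drops 250 frames and the splice video supplies only 248, two filler frames restore the match.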

In addition, frame fragments may be produced at both the video splice-out point and the video splice-in point, and these fragments may affect the sound. Preferably, the audio frames can be padded according to the time relationship, with the following steps: from the time difference between the video out-point and the video in-point, calculate the time difference between the current audio out-point and audio in-point, and pad the corresponding audio frames according to this time difference. An audio frame fragment at the splice position can be padded by repeating the data of the previous complete audio frame.
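The time-based padding step can be sketched as follows. The source does not state how a partial frame is rounded, so rounding up is an assumption here, as are the function names. Durations are expressed in ticks of the 90 kHz clock used for PTS values.

```python
def audio_frame_ticks(samples_per_frame, sample_rate, clock_hz=90_000):
    """Duration of one audio frame in 90 kHz ticks (integer division)."""
    return samples_per_frame * clock_hz // sample_rate


def padding_frames(video_span_ticks, audio_span_ticks, frame_ticks):
    """Audio frames to add so the audio span covers the video span.

    Assumption: a remaining partial frame is rounded up to a whole frame.
    """
    gap = video_span_ticks - audio_span_ticks
    if gap <= 0:
        return 0
    return -(-gap // frame_ticks)  # ceiling division on integers
```

With 1152-sample frames at 48 kHz, one frame lasts 2160 ticks, so a 3600-tick shortfall is padded with two frames.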

The present invention further optimizes the above method. Preferably, performing audio splicing with the audio out-point as the audio file splice-out point and the audio in-point as the audio file splice-in point further comprises: adjusting the video file PTS and the audio file PTS into synchronization. Specifically, audio-video synchronization means that the sound currently playing should be the sound that belongs to the image currently displayed. Audio-video synchronization can be achieved by synchronizing the video PTS and the audio PTS; it suffices to keep the PTS of the audio and the PTS of the video adjusted in step. An audio frame at the splice position may repeat the data of the previous frame and therefore may not correspond to the video, but one frame of data lasts only a few tens of milliseconds, so this can be ignored.

Embodiment 2

On the basis of Embodiment 1, this embodiment provides an audio-video file splicing device. Specifically, as shown in Fig. 4, the device comprises:

A first acquiring unit 402, configured to obtain the video out-point and the audio out-point of a first TS audio-video file indicated by a splicing instruction. Preferably, the first TS audio-video file is the TS data of the main audio-video file and may also be called TS1. Preferably, during audio/video output, TS1 is depacketized to the ES layer, and the input ES data of TS1 is stored in the main video PIP and the main audio PIP.

A second acquiring unit 404, configured to obtain the video in-point and the audio in-point of a second TS audio-video file indicated by the splicing instruction. Preferably, the second TS audio-video file is the TS data of the video file to be spliced in and may also be called TS2. Preferably, during audio/video output, TS2 is depacketized to the ES layer, and the input ES data of TS2 is stored in the splice video PIP and the splice audio PIP. Preferably, when TS2 is inserted into TS1 during audio/video splicing, there is a critical point, as shown in Fig. 3. This critical point is called the splice-in point of TS2 and is at the same time the splice-out point of TS1.

A first judging unit 406, configured to judge whether the video out-point satisfies a first predetermined condition and the video in-point satisfies a second predetermined condition, wherein the first predetermined condition comprises: decoding the images before the video out-point does not rely on image information after the video out-point, and all images before the video out-point can be displayed, and the second predetermined condition comprises: decoding the images after the video in-point does not rely on image information before the video in-point, and all images after the video in-point can be displayed;

A first splicing unit 408, configured to, when the first judging unit 406 judges yes, perform video splicing with the video out-point as the video file splice-out point and the video in-point as the video file splice-in point.

This embodiment also improves the above device. Specifically, as shown in Fig. 5, the device further comprises: a second judging unit 502, configured to judge whether the audio out-point satisfies a third predetermined condition and the audio in-point satisfies a fourth predetermined condition, wherein the third predetermined condition comprises: the PTS (Presentation Time Stamp) of the frame of the audio file in the first TS audio-video file is greater than or equal to the PTS of the video out-point, and the fourth predetermined condition comprises: the PTS of the frame of the audio file in the second TS audio-video file is greater than or equal to the PTS of the video in-point; and a second splicing unit 504, configured to, when the second judging unit judges yes, perform audio splicing with the audio out-point as the audio file splice-out point and the audio in-point as the audio file splice-in point.

The present invention further optimizes the above device. Specifically, as shown in Fig. 6, the first splicing unit 408 comprises: a first judging module 602, configured to judge whether the frame count of the video file after splicing matches the frame count of the video file before splicing; and a first filling module 604, configured to, when the first judging module 602 judges no, output video filler frames so that the frame count of the video file after splicing matches the frame count of the video file before splicing, wherein the video filler frames comprise black frames and/or still frames.

The present invention also optimizes the above second splicing unit 504. Specifically, as shown in Fig. 7, the second splicing unit 504 comprises: a second judging module 702, configured to judge whether the frame count of the audio file after splicing matches the frame count of the audio file before splicing; and a second filling module 704, configured to, when the second judging module judges no, output audio filler frames so that the frame count of the audio file after splicing matches the frame count of the audio file before splicing.

Further, the second splicing unit 504 also comprises: a synchronization unit, configured to adjust the video file PTS and the audio file PTS into synchronization. Specifically, audio-video synchronization means that the sound currently playing should be the sound that belongs to the image currently displayed. Audio-video synchronization can be achieved by synchronizing the video PTS and the audio PTS; it suffices to keep the PTS of the audio and the PTS of the video adjusted in step. An audio frame at the splice position may repeat the data of the previous frame and therefore may not correspond to the video, but one frame of data lasts only a few tens of milliseconds, so this can be ignored.

Embodiment 3

On the basis of Embodiment 1 and Embodiment 2 above, this embodiment further provides an audio-video file splicing system. Specifically, as shown in Fig. 8, in this system the audio/video source file to be spliced is processed by the audio/video material processor for the material to be spliced, for example by modifying the content and format of the source file to ensure that it matches the format of the main program, and is output as a target video file. The audio/video player for the material to be spliced packages the output target video file into a playable TS stream (TS1). The TS stream carries the target video and the target audio, which are distinguished by different PIDs (Packet Identifiers). A splicing controller controls the splicing parameters, such as time parameters (start time, stop time, duration) and program parameters. According to the control parameters of the splicing controller, the seamless splicer splices the packaged TS stream with another TS stream (TS2) and outputs the spliced TS stream.
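Distinguishing video and audio by PID relies on the fixed layout of the transport packet header: packets are 188 bytes long, start with the sync byte 0x47, and carry a 13-bit PID in the low 5 bits of the second byte and all of the third byte. The sketch below reads that field; the function names and the grouping helper are illustrative and not part of the described system.

```python
TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47


def packet_pid(packet: bytes) -> int:
    """Extract the 13-bit PID from one 188-byte transport stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != TS_SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def group_by_pid(data: bytes) -> dict:
    """Group the packets of a packet-aligned TS buffer by PID."""
    groups = {}
    for offset in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = data[offset:offset + TS_PACKET_SIZE]
        groups.setdefault(packet_pid(packet), []).append(packet)
    return groups
```

A caller would look up the video and audio PIDs of the program (normally announced in the program map table, which this sketch does not parse) and route each group to the matching video or audio pipeline.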

During seamless splicing, for example when TS2 is inserted into TS1, there is a critical point. This critical point is called the splice-in point of TS2 and is also the splice-out point of TS1.

A splice-out point must satisfy both of the following conditions:

(1) decoding all the images before the splice-out point requires no image information after it;

(2) there is no image that cannot be displayed.

A splice-in point must satisfy both of the following conditions:

(1) decoding all the images after the splice-in point requires no image information before it;

(2) there is no image that cannot be displayed.

In seamless splicing at the video layer, when a GOP (Group of Pictures) contains B frames (bidirectionally predictive-coded pictures), the display of non-B frames is delayed. Specifically, Fig. 9 shows the transmission order of the frames of an open GOP structure and Fig. 10 shows the transmission order of the frames of a closed GOP structure. As the figures show, when B frames are present, the non-B frames (I frames and P frames) are displayed with a delay. The following conclusion can therefore be drawn: in transmission order, the frame before a P frame (predictive-coded picture) or an I frame (intra-coded picture) in a GOP is a splice-out point, and a splice-in point lies at the start of a GOP. Preferably, the sequence header of a video sequence carries parameters such as height, width, frame rate, bit rate, and video format. These parameters strongly affect decoding, so achieving a seamless join requires some processing at the sequence level: video sequences with different parameters are either not considered for splicing or are spliced at a sequence header. In addition, the target video sequence can be modified in the audio/video material processor for the material to be inserted so that it matches the main video.
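The conclusion about frame types can be turned into a simple scan over the frame types in transmission order. This is a sketch under two assumptions that the text does not state: every I frame marks the start of a GOP, and the GOP that starts at a splice-in point is closed. In an open GOP, the B frames that follow the I frame in transmission order may depend on the previous GOP and would violate the splice-in conditions.

```python
def out_point_candidates(kinds):
    """Indices k such that cutting just before frame k ends on a splice-out point.

    kinds is a string of frame types in transmission order, e.g. "IPBBPBB".
    The frame before each I or P frame is a splice-out point.
    """
    return [i for i, k in enumerate(kinds) if i > 0 and k in ("I", "P")]


def in_point_candidates(kinds):
    """Indices where a GOP starts (assumption: each I frame starts a GOP)."""
    return [i for i, k in enumerate(kinds) if k == "I"]
```

For the hypothetical transmission order "IPBBPBBIPBB", the stream can be left just before indices 1, 4, 7, and 8, and entered at indices 0 and 7.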

In seamless splicing at the audio layer, audio frames are intra-coded and there is no dependency between frames, so processing is comparatively simple. Frame fragments may be produced at both the video splice-out point and the video splice-in point, and these fragments may affect the sound. The audio frames can be padded according to the time relationship, as follows:

(1) from the time difference between the video out-point and the video in-point, calculate the time difference between the current audio out-point and audio in-point, and pad the corresponding audio frames according to this time difference;

(2) an audio frame fragment at the splice position can be padded by repeating the data of the previous complete audio frame.
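Step (2) can be sketched as follows, under the simplifying assumption that every complete audio frame has the same byte length, so that a shorter frame can be recognized as a fragment. The function name and the decision to drop a fragment that has no complete predecessor are illustrative choices, not taken from the source.

```python
def repair_fragments(frames, frame_size):
    """Replace each truncated audio frame with a copy of the last complete frame.

    frames: list of bytes objects in stream order.
    frame_size: byte length of a complete frame (assumed constant).
    A fragment with no preceding complete frame is dropped.
    """
    repaired = []
    last_full = None
    for frame in frames:
        if len(frame) == frame_size:
            last_full = frame
            repaired.append(frame)
        elif last_full is not None:
            repaired.append(last_full)
    return repaired
```

Because one frame lasts only a few tens of milliseconds, the repeated frame is unlikely to be noticed, which matches the reasoning given for ignoring the mismatch with the video.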

In addition, synchronous about Voice & Video,, the current sound playing should be currently to broadcast the sound same period of image, adopt following scheme: Voice & Video synchronously can pass through video PTS (Presentation Time Stamp, displaying time stamp) and synchronously realizing between audio frequency PTS, specifically, as long as guarantee synchronous adjustment of PTS of PTS and the video of audio frequency.For the audio frame of stitching position, because likely repeat the previous frame data, may not be inconsistent with video, still the time of frame data only has a few tens of milliseconds, does not consider.

At the TS level, the time difference between the splice out-point and the splice in-point is adjusted so that the images before the out-point have enough time to be displayed and the spliced-in images can be displayed at the appropriate time. The video standard specifies that every coded image must be displayed at the appropriate time at the decoding and display end. The time base it relies on is the PCR (Program Clock Reference), and each image carries a DTS (Decoding Time Stamp) and a PTS, which specify the decoding time and the display time of the image respectively. At the splice in-point, the DTS and PTS must be modified to keep the TS stream playing continuously.

The internal flow of seamless splicing is described below:

TS1 and TS2 are depacketized down to the ES (Elementary Stream) layer. The input ES data of TS1 is stored in the main video PIP (short for pipeline, PIPE) and the main audio PIP. The input ES data of TS2 is stored in the splice video PIP and the splice audio PIP.

Figure 11 shows a flow chart of seamless video splicing, as follows:

After the main video and the splice video obtain ES data from their pipelines, if no external splice command has been received, the main video data is output normally and the splice video data is discarded.

After the external splice command is received, the main video determines whether the splice out-point condition is met. If the out-point condition is not met, the main video data is output normally. If the out-point condition is met, the main video proceeds to the frame-count matching check.

After the external splice command is received, the splice video determines whether the splice in-point condition is met. If the in-point condition is not met, the splice video data is discarded. If the in-point condition is met, the splice video proceeds to the frame-count matching check.
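The per-stream decisions described above can be summarized in a small Python sketch. The function `video_action` and its return labels are hypothetical names chosen for illustration; the sketch assumes the out-point or in-point check has already been evaluated and is passed in as `condition_met`.

```python
def video_action(stream, splice_command, condition_met):
    """Decide what to do with the next ES data of the 'main' or 'splice' video."""
    if stream not in ("main", "splice"):
        raise ValueError("stream must be 'main' or 'splice'")
    if splice_command and condition_met:
        # Both streams move on to the frame-count matching check.
        return "frame_count_check"
    # Otherwise the main video keeps playing and the splice video is dropped.
    return "output" if stream == "main" else "discard"


print(video_action("main", False, False))
print(video_action("splice", True, False))
print(video_action("splice", True, True))
```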

Video transition frame filling refers to the black frames or still frames inserted during splicing to ensure that the frame counts match. If the frame counts do not match, the main program time will run ahead or lag behind after the splice ends. Frame-count matching means that the number of frames the main video discards during the insertion equals the number of frames inserted from the splice video. If the frame counts do not match, the splice video is placed in a buffer to wait and video fill frames are output; the splice video is output once the main video satisfies the frame-count match.
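A minimal Python sketch of frame-count matching follows. It assumes the only mismatch handled is a splice clip shorter than the number of main frames it replaces, and that the fill frames are output before the buffered splice video, as the paragraph above suggests. The name `pad_to_match` and the `"FILL"` placeholder are hypothetical.

```python
def pad_to_match(splice_frames, main_dropped, fill_frame="FILL"):
    """Prepend fill frames so the inserted frame count equals the dropped count."""
    needed = max(main_dropped - len(splice_frames), 0)
    # Fill frames go out first while the splice video waits in its buffer.
    return [fill_frame] * needed + list(splice_frames)


output = pad_to_match(["s1", "s2"], main_dropped=4)
print(output)
print(len(output))
```

In a real system the fill frame would be a pre-encoded black or still picture rather than a string.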

In addition, because B frames use bidirectional prediction, the B frames between the I frame and the first P frame in the first GOP of the splice video must be removed. Otherwise these B frames would be predicted from the wrong forward reference frame, and mosaic artifacts would appear at the splice point.
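The removal of these B frames can be sketched in Python as follows. The sketch assumes the frame types of the first GOP of the splice video are given in transmission order and start with the I frame; `drop_leading_b_frames` is a hypothetical name.

```python
def drop_leading_b_frames(frame_types):
    """Remove the B frames between the first I frame and the first non-B frame."""
    if not frame_types or frame_types[0] != "I":
        raise ValueError("expected the sequence to start with an I frame")
    kept = ["I"]
    leading = True
    for t in frame_types[1:]:
        if leading and t == "B":
            continue  # this B frame would reference a frame of the other program
        leading = False
        kept.append(t)
    return kept


print("".join(drop_leading_b_frames(list("IBBPBBPBB"))))
```

Because frames are removed, the timestamps of the remaining frames have to be corrected afterwards, which the TS layer processing covers.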

Figure 12 shows a flow chart of seamless audio splicing, as follows:

After the main audio and the splice audio obtain ES data from their pipelines, if the video-splice-complete command has not been received, the main audio data is output normally and the splice audio data is discarded.

After the video-splice-complete command is received, the main audio determines whether the splice out-point condition is met, and the splice audio determines whether the splice in-point condition is met.

The condition for the main audio splice out-point is: PTS of the main audio frame >= Video_pts_outpoint, where Video_pts_outpoint is the PTS of the video out-point;

Preferably, the best main audio splice out-point is the main audio frame that satisfies the following condition:

Video_pts_outpoint + PTS interval of a single audio frame >= PTS of the main audio frame >= Video_pts_outpoint;

Preferably, if the main audio frame does not meet the splice out-point condition, main audio frames with PTS < Video_pts_outpoint are discarded until the splice condition is met. If the PTS of the main audio frame is much greater than Video_pts_outpoint, the buffering of the main audio must be increased and the processing of audio frames slowed down so that the audio matches the video.

The condition for the splice audio splice in-point is: PTS of the splice audio frame >= Video_pts_inpoint, where Video_pts_inpoint is the PTS of the video in-point;

Preferably, the best splice audio splice in-point is the splice audio frame that satisfies the following condition: Video_pts_inpoint + PTS interval of a single audio frame >= PTS of the splice audio frame >= Video_pts_inpoint;

If the splice audio frame does not meet the splice in-point condition, splice audio frames with PTS < Video_pts_inpoint are discarded until the splice condition is met. If the PTS of the splice audio frame is much greater than Video_pts_inpoint, the buffering of the splice audio must be increased and the processing of audio frames slowed down so that the audio matches the video.
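The audio splice-point search can be sketched in Python. The same rule is applied to the main audio against the video out-point PTS and to the splice audio against the video in-point PTS: frames with a smaller PTS are skipped, and the first frame with PTS >= the video PTS is chosen. The function name `find_audio_splice_frame` is hypothetical, and the "best point" flag reflects one reading of the preferred condition, namely that the chosen frame lies within one audio frame interval of the video PTS.

```python
def find_audio_splice_frame(audio_pts, video_pts, frame_interval):
    """Return (index, is_best) of the first audio frame with PTS >= video_pts.

    Frames before the returned index are the ones that would be discarded.
    is_best is True when the frame is within one frame interval of video_pts
    (an assumed reading of the preferred condition).
    """
    for i, pts in enumerate(audio_pts):
        if pts >= video_pts:
            return i, (pts - video_pts) <= frame_interval
    return None, False  # no frame satisfies the condition yet


# Audio frames every 2160 ticks; the video point is at tick 3000.
print(find_audio_splice_frame([0, 2160, 4320, 6480], 3000, 2160))
```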

When the video is filled, the audio must also be filled, and the start and end of the audio filling are synchronized with the video filling according to the PTS time.

Processing at the TS layer:

The TS layer of the main video carries a PCR, denoted PCR1. The TS layer of the splice video carries a PCR, denoted PCR2. Both during and outside the seamless splicing process of the present invention, PCR1 is used as the program clock reference. The values of DTS and PTS are generated with reference to the PCR, so while the splice video is being played, its DTS and PTS must be modified to fit PCR1. At the same time, because the B frames between the I frame and the first P frame in the first GOP of the splice video are removed, the decoding time DTS of the splice video needs to be corrected. DTS and PTS are carried in PES (Packetized Elementary Stream) packets, and these two times are modified when the PES packets are adjusted.
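As a minimal sketch of the timestamp adjustment described above, the following Python shifts the PTS and DTS of the splice stream by one common offset so that they continue on the main program's timeline (PCR1). The function name `restamp`, the tuple layout and the parameter `target_first_dts` are assumptions made for illustration. Timestamps are treated as 33-bit counts of a 90 kHz clock, as in MPEG-2 systems, and the additional DTS correction for the removed B frames is not modeled.

```python
PTS_MODULO = 1 << 33  # PTS and DTS are 33-bit values that wrap around


def restamp(packets, target_first_dts):
    """Shift (pts, dts) pairs so that the first DTS equals target_first_dts.

    packets: list of (pts, dts) tuples of the splice stream in transmission order.
    """
    if not packets:
        return []
    offset = (target_first_dts - packets[0][1]) % PTS_MODULO
    return [((pts + offset) % PTS_MODULO, (dts + offset) % PTS_MODULO)
            for pts, dts in packets]


print(restamp([(1000, 900), (4600, 4500)], 90900))
```

Applying the same offset to the audio and the video of the splice stream leaves their relative timing unchanged, which is what keeps them synchronized.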

As can be seen from the above description, the present invention uses self-defined splice in-points and splice out-points. During video splicing, after a splicing instruction is obtained, it is determined whether the video in-point and the video out-point indicated by the instruction satisfy the conditions required of the self-defined splice in-point and splice out-point. When these conditions are satisfied, the TS stream audio-video files are spliced directly, without decoding and re-encoding the original program. This solves the problem in the related art of digital program insertion that decoding and re-encoding the original program degrades image quality, thereby improving the image quality after splicing and, further, saving cost.

Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described can be performed in an order different from the one given here, or the modules or steps can each be made into an individual integrated circuit module, or several of them can be made into a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims

1. An audio-video file splicing method, characterized by comprising:

obtaining the video out-point and the audio out-point of a first TS stream audio-video file indicated by a splicing instruction;

obtaining the video in-point and the audio in-point of a second TS stream audio-video file indicated by the splicing instruction;

determining whether the video out-point satisfies a first predetermined condition and the video in-point satisfies a second predetermined condition, and if so, performing video splicing with the video out-point as the video file splice out-point and the video in-point as the video file splice in-point, wherein the first predetermined condition comprises: the decoding of the images before the video out-point does not depend on image information after the video out-point, and all images before the video out-point can be displayed; and the second predetermined condition comprises: the decoding of the images after the video in-point does not depend on image information before the video in-point, and all images after the video in-point can be displayed.

2. The method according to claim 1, characterized in that, after the audio in-point and the audio out-point are obtained, the method further comprises:

determining whether the audio out-point satisfies a third predetermined condition and the audio in-point satisfies a fourth predetermined condition, and if so, performing audio splicing with the audio out-point as the audio file splice out-point and the audio in-point as the audio file splice in-point, wherein the third predetermined condition comprises: the PTS of a frame of the audio file in the first TS stream audio-video file is greater than or equal to the PTS of the video out-point; and the fourth predetermined condition comprises: the PTS of a frame of the audio file in the second TS stream audio-video file is greater than or equal to the PTS of the video in-point.

3. The method according to claim 1 or 2, characterized in that the step of performing video splicing with the video out-point as the video file splice out-point and the video in-point as the video file splice in-point comprises:

determining whether the number of frames of the video file after splicing matches the number of frames of the video file before splicing;

if the determination result is no, outputting video fill frames so that the number of frames of the video file after splicing matches the number of frames of the video file before splicing, wherein the video fill frames comprise black frames and/or still frames.

4. The method according to claim 2, characterized in that performing audio splicing with the audio out-point as the audio file splice out-point and the audio in-point as the audio file splice in-point comprises:

determining whether the number of frames of the audio file after splicing matches the number of frames of the audio file before splicing;

if the determination result is no, outputting audio fill frames so that the number of frames of the audio file after splicing matches the number of frames of the audio file before splicing.

5. The method according to claim 2, characterized in that performing audio splicing with the audio out-point as the audio file splice out-point and the audio in-point as the audio file splice in-point further comprises:

adjusting the video file PTS and the audio file PTS in synchronization.

6. An audio-video file splicing device, characterized by comprising:

a first acquiring unit, configured to obtain the video out-point and the audio out-point of a first TS stream audio-video file indicated by a splicing instruction;

a second acquiring unit, configured to obtain the video in-point and the audio in-point of a second TS stream audio-video file indicated by the splicing instruction;

a first judging unit, configured to determine whether the video out-point satisfies a first predetermined condition and the video in-point satisfies a second predetermined condition, wherein the first predetermined condition comprises: the decoding of the images before the video out-point does not depend on image information after the video out-point, and all images before the video out-point can be displayed; and the second predetermined condition comprises: the decoding of the images after the video in-point does not depend on image information before the video in-point, and all images after the video in-point can be displayed;

a first splicing unit, configured to, when the determination of the first judging unit is yes, perform video splicing with the video out-point as the video file splice out-point and the video in-point as the video file splice in-point.

7. The device according to claim 6, characterized by further comprising:

a second judging unit, configured to determine whether the audio out-point satisfies a third predetermined condition and the audio in-point satisfies a fourth predetermined condition, wherein the third predetermined condition comprises: the PTS of a frame of the audio file in the first TS stream audio-video file is greater than or equal to the PTS of the video out-point; and the fourth predetermined condition comprises: the PTS of a frame of the audio file in the second TS stream audio-video file is greater than or equal to the PTS of the video in-point;

a second splicing unit, configured to, when the determination result of the second judging unit is yes, perform audio splicing with the audio out-point as the audio file splice out-point and the audio in-point as the audio file splice in-point.

8. The device according to claim 6, characterized in that the first splicing unit comprises:

a first judging module, configured to determine whether the number of frames of the video file after splicing matches the number of frames of the video file before splicing;

a first filling module, configured to, when the determination result of the first judging module is no, output video fill frames so that the number of frames of the video file after splicing matches the number of frames of the video file before splicing, wherein the video fill frames comprise black frames and/or still frames.

9. The device according to claim 7, characterized in that the second splicing unit comprises:

a second judging module, configured to determine whether the number of frames of the audio file after splicing matches the number of frames of the audio file before splicing;

a second filling module, configured to, when the determination result of the second judging module is no, output audio fill frames so that the number of frames of the audio file after splicing matches the number of frames of the audio file before splicing.

10. The device according to claim 7, characterized in that the second splicing unit further comprises:

a synchronization unit, configured to adjust the video file PTS and the audio file PTS in synchronization.