Background art
A digital broadcasting system processes the entire signal chain digitally, so the picture is very sharp. In addition, high sound quality can be achieved because the audio signal is divided into bands from the low range to the high range. Digital television broadcasting has recently been adopted and is spreading rapidly over a variety of media such as terrestrial broadcasting, satellite, and cable TV.
To receive such digital broadcasts, a digital television receiver must be tested for function and performance using signals of the kind transmitted by a broadcast system. Testing with an actual broadcast system can verify operation, but a real broadcast does not provide the special signals needed to test a specific function or to tune receiver performance. For this reason, companies that develop and manufacture digital television receivers usually build their own virtual broadcast system and use it for functional testing and performance evaluation.
Fig. 1 is a system diagram of digital television receiver testing with a virtual broadcast system. A transport stream generator 101 generates a predefined audio/video test stream for evaluating the function or performance of the digital television receiver; a channel encoder 102 applies channel coding to the test stream; a channel up-converter 103 modulates the channel-coded test stream, up-converts it to radio frequency (RF), and outputs it. The digital television receiver, i.e. a set-top box 104, receives this RF signal, and the received test stream is then evaluated for function or performance with a measuring instrument such as an oscilloscope 105 or with an HDTV/PC (high-definition television/personal computer) monitor 106.
Ideally, the test streams needed when producing digital television receivers would be as long as the streams used in actual broadcasting. In addition, evaluating receiver performance requires many types of test stream, which means many channels and a large volume of data. Because the storage medium of the computer that holds the test streams has limited capacity, each transport stream must be kept as short as possible, and the limited-length stream is played back repeatedly. When such a finite-length test stream is looped, the playback duration of the video/audio elementary streams (ES) must match the playback duration of the transport stream (TS) exactly. If the two do not match, video and audio become discontinuous at the loop point, and a product under test on the production line may wrongly be judged defective. A means of making the elementary streams consistent with the transport stream is therefore needed.
A digital television receiver also receives an audio and video resource in the form of a digital stream, decodes it, processes the audio and video signals, and then outputs them (audio is played, video is displayed). If a time difference arises between the audio signal and the video signal at this point, picture and sound do not match, and the result looks and sounds unnatural.
Making the audio and video signals coincide in this way is called lip synchronization (lip-sync). Some digital television receivers have a built-in lip-sync function (hardware and/or software). One known method for receivers that need lip synchronization uses the time information of the audio and video signals. The time information of both signals is recorded and their difference is computed; if an audio frame leads or lags the video frame by more than a preset time, the audio frame at that point is repeated or deleted to restore lip synchronization.
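The correction rule of this known method can be sketched as follows. This is only an illustration, not a receiver implementation: the function name, the sign convention (a positive offset means audio leads video), and the threshold used in the usage example are assumptions, and the pairing of "repeat" with early audio and "delete" with late audio is inferred from the effect each action has on the audio timeline.

```python
def audio_frame_action(offset_ms: float, threshold_ms: float) -> str:
    """Decide how to treat the current audio frame.

    offset_ms: audio time minus video time (assumed convention: > 0 means audio leads).
    threshold_ms: preset tolerance; the text does not give a value.
    """
    if offset_ms > threshold_ms:
        return "repeat"  # audio is early: repeating a frame delays the audio that follows
    if offset_ms < -threshold_ms:
        return "delete"  # audio is late: dropping a frame lets the audio catch up
    return "keep"        # within tolerance: leave the frame untouched
```

With a hypothetical threshold of 32 ms, `audio_frame_action(50, 32)` returns `"repeat"` and `audio_frame_action(10, 32)` returns `"keep"`.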
However, whether or not a manufactured digital television receiver has such a lip-sync control function, it must still be tested with test equipment of the kind described above to confirm that its audio and video streams are in step. Lip synchronization is an important issue that must be considered carefully in a digital television receiver system. It is verified by measuring, at the receiver, the time difference between the audio signal and the video signal.
For example, if audio leads video by no more than 25 ms or lags video by no more than 40 ms, audio and video playback can be regarded as matched, and the picture and sound are perceived as natural.
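The tolerance window can be expressed as a simple range check. The sketch below assumes the sign convention used in the document's own reference range (+25 ms to -40 ms), where a positive offset means audio leads video; the function and parameter names are illustrative.

```python
def lip_sync_ok(offset_ms: float,
                max_lead_ms: float = 25.0,
                max_lag_ms: float = 40.0) -> bool:
    """Return True if the audio/video offset is inside the tolerance window.

    offset_ms > 0: audio leads video; offset_ms < 0: audio lags video.
    """
    return -max_lag_ms <= offset_ms <= max_lead_ms
```

For instance, an audio lag of 23 ms (`lip_sync_ok(-23)`) is accepted, while an audio lead of 30 ms (`lip_sync_ok(30)`) is rejected.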
At present, lip-sync testing measures the audio/video time difference in the digital television receiver using a specially constructed video sequence and audio sequence. The video signal consists of a black part, a white part, and a short line part; a synchronization signal for the video is inserted in the short line part. The audio test signal is a single tone that starts at the synchronization position and decays exponentially.
The problem with this method is that a dedicated video sequence and audio sequence must be constructed specifically for the lip-sync test. Moreover, testing the lip synchronization of production receivers requires separate additional equipment and procedures, which makes testing inconvenient and lowers production efficiency.
Summary of the invention
An object of the present invention is to provide a method and a device for testing the lip synchronization of a digital television receiver using an audio signal carrying time tags and a video signal carrying time tags.
In particular, an object of the invention is to provide a lip-sync test method and device for a digital television receiver in which time tag signals are inserted into the video and audio signals and the waveforms of the time-tagged audio and video signals are measured with a measuring instrument such as an oscilloscope, so that the lip synchronization of the receiver can be tested very conveniently.
Another object of the invention is to provide a lip-sync test method and device for a digital television receiver in which time tag signals are inserted into the video and audio signals, the time difference between the two is measured with an instrument such as an oscilloscope, and the audio/video time difference is calculated using the field/frame numbers obtained from the time tags, so that the lip synchronization of the receiver can be tested very conveniently.
A further object of the invention is to provide a lip-sync test method and device for a digital television receiver in which an audio and video resource is prepared, a frame tag signal and a TATS are inserted into the audio and video signals respectively, and the signals are compressed into digital bitstreams; the digital television receiver decodes the digital audio and video streams, and their waveforms are measured with an instrument such as an oscilloscope, so that the lip synchronization of the receiver can be tested very conveniently.
To achieve the above objects, the method of the invention for lip-sync testing a digital television receiver using time-tagged audio and video signals comprises the following steps: generating digital audio and video streams into which time tags have been inserted; comparing the waveforms of the time-tagged audio and video signals; and measuring and calculating the audio/video time difference using the time tags of the time-tagged audio and video signals.
In the lip-sync test method of the invention, the time $t_a$ of audio frame No. n, identified from the time-tagged audio frame waveform, is calculated by the following formula:
$t_a = (\text{duration of one audio frame}) \times n$ [s]
In the lip-sync test method of the invention, the time $t_v$ of video field No. m, identified from the time-tagged video frame waveform, is calculated by the following formula:
$t_v = m / (\text{field rate in fields/s})$ [s]
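The two time formulas translate directly into code. In this minimal sketch the function names are illustrative, and the default values (32 ms audio frames, 60 fields/s) are the example values used in the embodiment rather than fixed parameters of the method.

```python
def audio_frame_time(n: int, frame_duration_s: float = 0.032) -> float:
    """Time t_a in seconds of audio frame number n."""
    return n * frame_duration_s


def video_field_time(m: int, field_rate_hz: float = 60.0) -> float:
    """Time t_v in seconds of video field number m."""
    return m / field_rate_hz
```

For audio frame No. 91 and video field No. 164 these give 2.912 s and about 2.7333 s, matching the values worked out in the embodiment.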
In the lip-sync test method of the invention, the time tags of the audio frames and video fields loop back (wrap around) at a predetermined interval.
In the lip-sync test method of the invention, the audio/video time difference $d_t$ can be determined. With $t_a$ the time of audio frame No. n, $t_v$ the time of the corresponding video field No. m, $t_{dav}$ the audio/video time difference measured with the measuring instrument, and $t_{DTSoffset}$ the DTS initial value, it is calculated by the following formula:
$d_t = t_a - t_v - t_{dav} - t_{DTSoffset}$
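Substituting the frame-time and field-time formulas into this expression gives the time difference directly in terms of the decoded numbers n and m. The symbols $T_a$ (duration of one audio frame) and $f_v$ (video field rate) are shorthand introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
t_a &= n \, T_a, \qquad t_v = \frac{m}{f_v}, \\
d_t &= t_a - t_v - t_{dav} - t_{DTSoffset}
     = n \, T_a - \frac{m}{f_v} - t_{dav} - t_{DTSoffset}.
\end{aligned}
```

Written this way, the only quantity that has to be measured on the instrument is $t_{dav}$; n and m are read from the time tags, and the remaining terms are known constants of the test stream.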
In the lip-sync test method of the invention, the audio frame tag consists of a predetermined number of waveform cycles inserted at a predetermined time in each audio frame.
In the lip-sync test method of the invention, the video time tag (TATS) is a field tag encoded with four signal levels and inserted in the lines of each video frame that belong to the TA area (the first 8 lines).
To achieve the above objects, the lip-sync test device of the invention for a digital television receiver comprises: an audio time calculation unit that detects the time tag signal contained in the audio signal, identifies the audio frame number, and calculates the corresponding audio frame time $t_a$ from that number; a video time calculation unit that detects the time tag signal contained in the video signal, identifies the video field number, and calculates the corresponding video field time $t_v$ from that number; a measurement unit that measures the time difference between the audio and video signals; and a lip-sync determination unit that calculates the lip-sync time $d_t$ from the measured audio/video time difference and the calculated audio frame time and video field time.
The audio time calculation unit of the lip-sync test device comprises: an audio time tag detection unit that detects the time tag signal contained in the audio signal; an audio frame number determination unit that decodes the detected audio time tag signal and determines the audio frame number; and an arithmetic unit that calculates the corresponding audio frame time $t_a$ from the audio frame number and the duration of one audio frame.
The video time calculation unit of the lip-sync test device comprises: a video time tag detection unit that detects the time tag signal contained in the video signal; a video field number determination unit that decodes the detected video time tag signal and determines the video field number; and an arithmetic unit that calculates the corresponding video field time $t_v$ from the video field number and the video field rate.
The invention captures the waveforms of the time-tagged audio and video signals to obtain their time information and, together with the measured audio/video time difference, tests lip synchronization; the lip-sync test of a digital television receiver can therefore be carried out very conveniently.
In addition, because the invention performs the lip-sync test using the video TATS and an additional audio sub-channel, it has the advantage of not requiring a separate, dedicated lip-sync test program. A particular strength of the method is that the measurement does not depend on the start time: the test can be performed on audio and video waveforms captured at any moment.
Embodiment
An embodiment of the lip-sync test method of the invention is described below with reference to the drawings.
Fig. 2 illustrates the steps of the lip-sync test method of the invention for a digital television receiver.
The audio and video resource generation step S201 generates the audio signal and the video signal. The audio signal is an additional audio sub-channel generated by computer simulation; the video signal is generated as a predetermined test pattern.
The time tag insertion step S202 inserts a frame tag into the generated audio signal to form the time-tagged audio signal (see Fig. 6) and inserts a field tag into the generated video signal to form the time-tagged video signal (see Figs. 4 and 5). The formats of the time-tagged audio and video signals are described in detail below with reference to Figs. 4 to 8.
The audio and video stream generation step S203 compresses the time-tagged audio and video signals into audio and video streams. The resulting streams are multiplexed into a transport stream by a device such as an MPEG (video compression standard) analyzer.
The audio and video streams constructed in this way are input to the digital television receiver under test and decoded in the audio and video decoding step S204.
As is generally known, audio and video decoding means that the receiver receives and parses the transport stream, separates it into audio and video signals, decodes each, applies audio and video signal processing, and then outputs sound and picture.
The audio and video signal waveform measurement step S205 captures the waveforms of the decoded audio and video signals with a measuring instrument such as an oscilloscope. In this step the time difference $t_{dav}$ between the audio and video signals is obtained.
The audio and video time information calculation step S206 uses the time tag information (waveforms) of the time-tagged audio and video signals to calculate the audio time $t_a$ and the video time $t_v$. The audio time $t_a$ is explained in detail later; in brief, the frame number n is first obtained from the time tag information in the audio waveform, and $t_a$ is then computed from n and the audio frame duration. The video time $t_v$ is computed from the number m of the video field corresponding to audio frame n and the video field rate (fields/s).
The audio and video time difference ($t_a - t_v$) calculation step S207 obtains the final lip-sync time $d_t$ from the audio time, the video time, the measured audio/video time difference, and the initial value $t_{DTSoffset}$ of the DTS (Decoding Time Stamp). The DTS is the value obtained by decoding the time information contained in the video PES (Packetized Elementary Stream) packets.
The audio/video time difference $d_t$ is calculated from the time $t_a$ of audio frame No. n, the time $t_v$ of the corresponding video field No. m, the audio/video time difference $t_{dav}$ measured with the measuring instrument, and the DTS initial value $t_{DTSoffset}$, using the following expression:
$d_t = t_a - t_v - t_{dav} - t_{DTSoffset}$
As described above, the waveforms of the time-tagged audio and video signals are measured with an instrument such as an oscilloscope, and the audio/video time difference is obtained from the measured time tag information; in this way the lip-sync test can be carried out.
The audio time tag, the video time tag, and the lip-sync test using measured waveforms are illustrated below with reference to Figs. 3 to 9.
First, Fig. 3 shows the 1920 × 1080 picture format. In this structure, the display area (clean aperture) 300 measures 1888 × 1062. A horizontal blanking interval 301 lies to the left of the display area and a vertical blanking interval 302 above it. TA areas (Transient Effect Area) 303 of 9 pixels lie along the top and bottom edges of the display area 300, and TA areas 304 of 16 pixels lie along its left and right edges.
In the invention, a TATS is inserted in the vertical TA area of each video frame, in the same way as VITS (Vertical Interval Test Signals) in analog television systems, and this serves as the tag of the field concerned. That is, the TATS is inserted in the first 8 lines of the video frame, with a different value in each field, to identify the field/frame.
Fig. 4 shows the TATS. As shown in Fig. 4, the video time tag used in the invention marks the video field number with four signal levels: the tag is a base-4 (quaternary) number, written with a tag waveform 401 whose four levels stand for 0, 1, 2, and 3. In interlaced mode, two fields make up one frame. If, as in the example of Fig. 4, the field tag value is 0213 (base 4), it denotes field No. 39, i.e. the even field of frame No. 19.
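Decoding a field tag is a plain base-4 conversion. The sketch below assumes the tag has already been read off the waveform as a string of level digits, most significant level first; the function name is illustrative, and the frame number is taken as the field number divided by two, which agrees with the examples given in the text.

```python
def decode_tats(levels: str) -> tuple[int, int]:
    """Return (field_number, frame_number) for a base-4 field tag such as "0213"."""
    if not levels or any(ch not in "0123" for ch in levels):
        raise ValueError("a TATS field tag may only contain the levels 0, 1, 2 and 3")
    field = int(levels, 4)  # interpret the level digits as a base-4 number
    frame = field // 2      # two fields per frame in interlaced mode
    return field, frame
```

`decode_tats("0213")` returns `(39, 19)`, the field and frame numbers of the Fig. 4 example.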
Fig. 5 shows a video sequence 501 that includes the TATS. As Fig. 5 shows, a video frame consists of two video fields, and starting from the first field the field tags are assigned consecutively as base-4 numbers using the four levels 0, 1, 2, and 3.
If one video frame lasts 33.3 ms, one field lasts about 16.66 ms.
Fig. 6 shows the time tag that the invention inserts into the audio frames. As shown in Fig. 6, a predetermined tag waveform is inserted in each audio frame to mark the frame. That is, the figure shows an additional audio sub-channel that carries a frame tag related to the frame number.
In the example of Fig. 6, one audio frame lasts 32 ms, and the audio frame tag signal inserted in each frame is defined by its timing 601 and its number of signal cycles 602.
For example, one cycle of a sine wave is inserted 1 ms after the start of the first audio frame, and two cycles of a sine wave are inserted 1 ms after the start of the second audio frame.
Summarizing the description so far, the video time tags and the audio time tags are related as shown in Table 1.
Table 1
Video field number | TATS (time tag value, base 4) | Audio frame number | Time tag delay | Number of tag cycles |
1 | 0000 | 1 | 1 ms | 1 cycle |
2 | 0001 | 2 | 1 ms | 2 cycles |
3 | 0002 | | | |
4 | 0003 | | | |
5 | 0010 | 10 | 1 ms | 10 cycles |
| | 11 | 2 ms | 1 cycle |
21 | 0110 | 12 | 2 ms | 2 cycles |
| | | | |
120 | 1313 | | | |
| | 90 | 9 ms | 10 cycles |
164 | 2210 | 91 | 10 ms | 1 cycle |
| | | | |
256 | 3333 | 125 | 12 ms | 5 cycles |
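The audio columns of Table 1 suggest a simple decoding rule: each 1 ms step of tag delay covers ten frame numbers, and the cycle count selects the frame within that group. The sketch below encodes that rule. It is an inference from the rows for frames 1 to 91, not something the text states, and the last row of the table as printed does not fit it, so treat the formula and the function name as assumptions.

```python
def audio_frame_number(delay_ms: int, cycles: int) -> int:
    """Map an audio frame tag (delay in ms, number of sine cycles) to a frame number.

    Assumed rule inferred from Table 1: n = 10 * (delay_ms - 1) + cycles.
    """
    if delay_ms < 1 or not 1 <= cycles <= 10:
        raise ValueError("expected delay_ms >= 1 and 1 <= cycles <= 10")
    return 10 * (delay_ms - 1) + cycles
```

Under this rule a tag of one cycle at 10 ms decodes to frame No. 91, and ten cycles at 9 ms decode to frame No. 90.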
The video field tags and audio frame tags loop back at a predetermined interval. Specifically, as shown in Fig. 7, the video frame number 701 and the audio frame number 702 loop back once every 4 seconds to avoid confusion in synchronization.
Fig. 8 shows an example of a video frame with the TATS inserted. The video time tag, i.e. the field tag described above, is inserted in the TA area 801. Because one frame is assumed to consist of two fields (odd and even), the tag appears as an odd field tag 802 and an even field tag 803. In Fig. 8 the TATS level value in the TA area of the odd field is 3230 (base 4), which denotes field No. 236, i.e. the tag of frame No. 118.
When the time-tagged video and audio signals of Table 1 are observed with a measuring instrument such as an oscilloscope, the audio time $t_a$ and the video time $t_v$ can be calculated from the time tags, i.e. the audio frame number and the video field (frame) number, together with the respective frame rate or frame duration. The lip-sync time of audio and video can then be calculated from these results.
Fig. 9 shows an example of the audio and video waveforms used to measure the lip synchronization of a digital television receiver according to the invention. Fig. 9 is a waveform captured with an oscilloscope, in which 901 is the audio signal and 902 is the video signal. In the audio signal 901, one cycle of a sine wave is detected at 10 ms, so according to Table 1 this time tag identifies audio frame No. 91. With an audio frame duration of 32 ms, the time $t_a$ of audio frame No. 91 is:
$t_a = 32\ \text{ms/frame} \times 91 = 2.912\ \text{s}$
The time tag TATS contained in the video signal 902 has the value 2210 (base 4), which denotes field No. 164, i.e. frame No. 82. With a field rate of, for example, 60 fields/s, the time of video field No. 164 is $t_v = 164 / 60 = 2.7333$ s.
The time difference $t_{dav}$ measured between audio frame No. 91 and video field No. 164 is 2 ms, as can be seen in the figure. Taking the DTS initial value $t_{DTSoffset}$ to be 0.2 s, the lip-sync time, i.e. the audio/video time difference $d_t$, is:
$d_t = t_a - t_v - t_{dav} - t_{DTSoffset} = 2.912 - 2.733 - 0.002 - 0.2 = -0.023$ s
This result shows that the audio signal lags the video signal by 23 ms, which is within the lip-sync reference range (+25 ms to -40 ms). The digital television receiver therefore passes the lip-sync test.
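The Fig. 9 example can be reproduced end to end in a few lines. The sketch below uses the values from the text (32 ms audio frames, 60 fields/s, a measured difference of 2 ms, and a DTS initial value of 0.2 s); the function and constant names are illustrative. Without intermediate rounding of $t_v$, the result is about -23.3 ms rather than exactly -23 ms.

```python
AUDIO_FRAME_S = 0.032  # duration of one audio frame (example value from the text)
FIELD_RATE_HZ = 60.0   # video field rate (example value from the text)


def lip_sync_time(n: int, m: int, t_dav_s: float, t_dts_offset_s: float) -> float:
    """Compute d_t = t_a - t_v - t_dav - t_DTSoffset in seconds."""
    t_a = n * AUDIO_FRAME_S   # time of audio frame n
    t_v = m / FIELD_RATE_HZ   # time of video field m
    return t_a - t_v - t_dav_s - t_dts_offset_s


def verdict(d_t_s: float, max_lead_s: float = 0.025, max_lag_s: float = 0.040) -> str:
    """Compare d_t with the reference range (+25 ms to -40 ms)."""
    return "pass" if -max_lag_s <= d_t_s <= max_lead_s else "fail"


d_t = lip_sync_time(n=91, m=164, t_dav_s=0.002, t_dts_offset_s=0.2)
print(f"{d_t * 1000:.1f} ms", verdict(d_t))  # prints "-23.3 ms pass"
```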
Fig. 10 shows the structure of the lip-sync test device of the invention for a digital television receiver; the device automatically performs the lip-sync test method described above.
The lip-sync test device of the invention comprises: an audio time calculation unit 11 that detects the time tag signal contained in the audio signal, determines the audio frame number, and calculates the time $t_a$ of that audio frame from the frame number; a video time calculation unit 12 that detects the time tag signal contained in the video signal, determines the video field number, and calculates the video field time $t_v$ from the field number; a measurement unit 13 that measures the time difference between the audio signal and the video signal; and a lip-sync determination unit 14 that calculates the lip-sync time $d_t$ from the measured audio/video time difference and the calculated audio frame time and video field time.
The audio time calculation unit 11 comprises: an audio time tag detection unit 111 that detects the time tag signal contained in the audio signal; an audio frame number determination unit 112 that decodes the detected audio time tag signal and determines the audio frame number; and an arithmetic unit 113 that calculates the audio frame time $t_a$ from the determined audio frame number and the duration of one audio frame.
The video time calculation unit 12 comprises: a video time tag detection unit 121 that detects the time tag signal contained in the video signal; a video field number determination unit 122 that decodes the detected video time tag signal and determines the video frame and field number; and an arithmetic unit 123 that calculates the video field time $t_v$ from the determined video field number and the video field rate.
The operation of the lip-sync test device with the above structure is described below.
The audio time tag detection unit 111 detects the inserted audio time tag signal, shown in Fig. 6, in the input audio signal. The audio time tag signal consists of n cycles of a sine wave inserted at a predetermined time in each audio frame on the additional audio sub-channel, and this is the signal that is detected.
The audio frame number determination unit 112 records the time from the start of the audio frame until the sine wave appears and counts the detected sine-wave cycles, from which it determines the audio frame number. In the waveform example of Fig. 9, one sine-wave cycle is counted starting at 10 ms, so the value decodes to audio frame No. 91.
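The two measurements made by this unit, the onset delay and the cycle count, can be sketched on sampled data. Everything beyond the 32 ms frame length is an assumption made for illustration: the 48 kHz sample rate, the 1 kHz tone frequency, the amplitude threshold, and the helper `make_frame`, which only synthesizes a test frame so that the detector has something to read.

```python
import math


def make_frame(delay_ms, cycles, tone_hz=1000.0, rate=48000, frame_ms=32):
    """Synthesize one silent audio frame containing a sine burst (test helper)."""
    samples = [0.0] * (rate * frame_ms // 1000)
    start = rate * delay_ms // 1000
    for i in range(int(round(cycles * rate / tone_hz))):
        samples[start + i] = math.sin(2 * math.pi * tone_hz * i / rate)
    return samples


def read_mark(samples, rate=48000, threshold=0.05):
    """Return (delay_ms, cycles) of the tag found in one audio frame."""
    onset = next((i for i, s in enumerate(samples) if abs(s) > threshold), None)
    if onset is None:
        raise ValueError("no tag signal found in this frame")
    # Each sine cycle crosses the threshold upward exactly once.
    cycles = sum(1 for prev, cur in zip(samples, samples[1:])
                 if prev <= threshold < cur)
    return round(onset * 1000 / rate), cycles
```

`read_mark(make_frame(10, 1))` returns `(10, 1)`, the tag that the text associates with audio frame No. 91.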
The arithmetic unit 113 uses the decoded audio frame number n and the audio frame duration, for example 32 ms, to calculate the audio frame time as $t_a = 32\ \text{ms} \times n$ (expressed in seconds), as described above, and outputs it to the lip-sync determination unit 14.
The video time tag detection unit 121 detects the inserted video time tag signal, shown in Figs. 4 and 5, in the input video signal. The video time tag signal is formed by inserting one of four levels in each video field, and it is this level signal that is detected. Because the detected video time tag is the one corresponding to the audio frame, the lip-sync time difference between the two can be measured correctly.
The video field number determination unit 122 decodes the levels of the detected video time tag signal and thereby determines the video field number. In the waveform example of Fig. 9, the value 2210 (base 4) decodes to video field No. 164 (video frame No. 82).
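Turning measured tag levels into base-4 digits is a nearest-level quantization. In the sketch below, the four levels are assumed to be evenly spaced between 0 and a full-scale voltage of 700 mV, with one sampled value per digit; the text specifies neither, and the function name and sample values are hypothetical.

```python
def quantize_levels(samples_mv, full_scale_mv=700.0):
    """Map measured tag voltages (mV) to a string of base-4 digits."""
    step = full_scale_mv / 3           # spacing between adjacent levels (assumed even)
    digits = []
    for mv in samples_mv:
        level = round(mv / step)       # nearest of the four nominal levels
        level = min(3, max(0, level))  # clamp overshoot and undershoot
        digits.append(str(level))
    return "".join(digits)


tag = quantize_levels([470, 460, 240, 5])  # hypothetical oscilloscope readings
field_number = int(tag, 4)                 # base-4 digits to field number
```

With these hypothetical readings, `tag` is `"2210"` and `field_number` is 164, the field of the Fig. 9 example.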
The arithmetic unit 123 uses the decoded video field number m and the video field rate, for example 60 fields/s, to calculate the video field time as $t_v = m / 60$ [s], as described above, and outputs it to the lip-sync determination unit 14.
The measurement unit 13 measures the time difference between the audio frame and the video field for which the audio time calculation unit 11 and the video time calculation unit 12 performed their calculations. For example, as shown in Fig. 9, it records the time from the start of the audio frame until the corresponding field tag is detected (2 ms in Fig. 9), which gives the audio/video time difference $t_{dav}$.
The lip-sync determination unit 14 calculates the lip-sync time from the time $t_a$ of audio frame No. n, the time $t_v$ of the corresponding video field No. m, the measured audio/video time difference $t_{dav}$, and the DTS initial value $t_{DTSoffset}$, using the expression $d_t = t_a - t_v - t_{dav} - t_{DTSoffset}$. It can output the numerical result, and it can also output a pass/fail judgment according to whether the result lies within the specified reference range.