WO2015059782A1 - Video inspection method and audio inspection method
- Publication number: WO2015059782A1
- Application number: PCT/JP2013/078660
- Authority: WO, WIPO (PCT)
- Prior art keywords
- video
- inspection method
- value
- error
- power value
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
- H04N17/004—Diagnosis, testing or measuring for television systems or their details for digital television systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/57—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/60—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/233—Processing of audio elementary streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
- H04N21/23418—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H04N21/4394—Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/442—Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
- H04N21/44209—Monitoring of downstream path of the transmission network originating from a server, e.g. bandwidth variations of a wireless network
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
- H04N2017/006—Diagnosis, testing or measuring for television systems or their details for television sound
Definitions
- the present invention relates to a video inspection method and an audio inspection method capable of detecting video and audio errors included in a digital video / audio signal.
- Patent Document 1 discloses a technique for mechanically detecting block noise by differentiating pixels in a predetermined rectangular block unit.
- The techniques of Patent Documents 1 and 2 apply only to video signals that have undergone compression and expansion processing. A method that detects errors caused by all kinds of noise, such as communication line faults, VTR defects, and other failures, has not yet been realized. Likewise, no technique has been realized for accurately detecting the "popping" sound caused by noise in an audio signal.
- the video inspection method of the first aspect of the present invention samples a continuous digital video signal in segments of 20 msec or less, extracts a high-frequency component from the sampled signal, and detects an error occurring in the video based on the extracted high-frequency component.
- Because the continuous digital video signal is sampled in very short segments of 20 msec or less and a high-frequency component is extracted from the sampled signal, image disturbance and block noise can be detected accurately, as distinct from the actual content, based on the extracted high-frequency component.
- Preferably, the error is image disturbance and the extracted high-frequency component is an activity, that is, the average of the variance values computed in units of blocks of the digital video signal.
- Preferably, the error is block noise, an orthogonal transform is applied to the pixel values in an inspection block of the video signal, and it is determined that block noise has occurred when the transform coefficients satisfy a predetermined condition.
- the audio inspection method samples a continuous digital audio signal in segments of 5 msec or less, extracts a high-frequency component from the sampled signal, and detects an error occurring in the audio based on the extracted high-frequency component.
- Because the continuous digital audio signal is sampled in very short segments of 5 msec or less and a high-frequency component is extracted from the sampled signal, audio noise can be detected accurately, as distinct from the actual content, based on the extracted high-frequency component.
- When the digital audio signal is recorded on a plurality of channels, it is preferable to detect the error for each channel.
- Sampling is performed at time t along the time axis, frequency conversion is applied to the sampled signal, and n power values Pn(t) and a total power value P(t) within a predetermined band are obtained.
- It is preferable to determine that an error has occurred when [1] the total power value P(t) exceeds a first threshold, [2] the value P(t)/P(t-T), obtained by dividing P(t) by the total power value at the preceding time (t-T), and the value P(t)/P(t+T), obtained by dividing P(t) by the total power value at the subsequent time (t+T), each exceed a second threshold, and [3] the value Pn(t)/P(t), obtained by dividing each power value Pn(t) by the total power value P(t), exceeds a third threshold.
- When three power values along the time axis are compared and the first power value Pn(t-T5) and the third power value Pn(t+T+T5) exceed a fourth threshold while the second power value sequence Pn(t), ..., Pn(t+T) falls below a fifth threshold, it is preferable to determine that sound skipping has occurred.
- When three power values along the time axis are compared and the first power value Pn(t-T5) and the third power value Pn(t+T+T5) are below a sixth threshold while the second power value sequence Pn(t), ..., Pn(t+T) exceeds a seventh threshold, it is preferable to determine that noise has occurred.
- According to the present invention, it is possible to provide a video inspection method that detects video errors due to noise arising from various causes in a digital video signal, and an audio inspection method that detects audio errors due to noise arising from various causes in a digital audio signal.
- FIG. 1 is a block diagram of a video / audio inspection device 10.
- FIG. (a) is a diagram showing the frame
- (b) is a diagram showing the area
- (a) is a diagram showing the frame
- (b) is a diagram showing the relationship between a test
- FIG. 1 is a block diagram of the video / audio inspection apparatus 10.
- The video/audio inspection apparatus 10 includes an input unit 11 that inputs a digital video/audio signal; an extraction unit 12 that extracts a high-frequency component from the input digital video/audio signal and performs calculations; a comparison/determination unit 13 that compares the extraction result of the extraction unit 12 with threshold values and determines whether an error has occurred in the video or audio; a control unit 14 that sets the threshold values and other parameters for the comparison/determination unit 13; and an alarm output unit 15 that outputs an alarm according to the determination result of the comparison/determination unit 13.
- Video disturbance detection: "video disturbance" refers to a phenomenon in which the content image momentarily disappears between frames and then returns, or shifts.
- Here, a video/audio signal conforming to the BTAS-001B standard for 1125/60 high-definition television (HDTV) broadcasting, standardized by the Association of Radio Industries and Businesses (ARIB), is described as an example.
- Such a video signal includes a luminance signal Y and color difference signals Pb and Pr.
- When a video/audio signal is input from the input unit 11, the extraction unit 12 divides one frame into four fields (areas spanning lines V1 to V2 and pixels H1 to H2) A, B, C, and D, as shown in FIG. 2A, and performs the calculation for each area. Specifically, a video level (Video Level) and a video activity (Video Activity) are calculated for each field.
- Video Level is an average value of pixel values included in an image frame, and is also referred to as a luminance signal level. Alternatively, the level of the color difference signal may be used.
- When calculating the activity, the variance value may be used.
- a small block of m lines and n pixels is formed in one field. That is, the luminance value of each pixel in the small block can be expressed by Y (m, n).
- The luminance signal Y is preferably divided into small blocks of 16 pixels × 8 lines. When the luminance signal Y is used, the number of small blocks in one field is 1914. When the color difference signals Pb and Pr are used, it is preferable to divide them into small blocks of 8 pixels × 8 lines.
- Equation (1) gives the average A(k) of the luminance signal Y in small block #k, and Equation (2) gives the variance V(k) of the luminance signal Y in small block #k.
- 12 is an equation for obtaining the S 22.
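The block average, block variance, and activity described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `block_stats` and `video_activity` are hypothetical, the population variance is assumed because Equations (1) and (2) are not reproduced in this text, and the 16 pixel × 8 line block size is the preferred size given for the luminance signal.

```python
def block_stats(block):
    """Return (mean, variance) of the luminance values in one small block.

    `block` is a list of rows of pixel values. The population variance is
    an assumption; the original Equations (1) and (2) are not shown here.
    """
    values = [y for row in block for y in row]
    mean = sum(values) / len(values)
    variance = sum((y - mean) ** 2 for y in values) / len(values)
    return mean, variance


def video_activity(field, block_w=16, block_h=8):
    """Average the per-block variances over all complete blocks of a field."""
    height, width = len(field), len(field[0])
    variances = []
    for top in range(0, height - block_h + 1, block_h):
        for left in range(0, width - block_w + 1, block_w):
            block = [row[left:left + block_w]
                     for row in field[top:top + block_h]]
            variances.append(block_stats(block)[1])
    return sum(variances) / len(variances)
```

A flat field yields an activity of zero, while any texture inside a block raises its variance and therefore the activity, which is why the activity serves as a high-frequency measure.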
- Let Vn(t) be the video activity at time t in the n-th block #n of one field, and consider its change over time.
- Vn(t-2) and Vn(t-1) at the preceding times (t-2) and (t-1), and Vn(t+1) and Vn(t+2) at the subsequent times (t+1) and (t+2), are calculated.
- The intervals between the times (t-2), (t-1), t, (t+1), and (t+2) are 20 msec or less and serve as the unit time.
- the first-order differential value at each time is obtained as follows.
- the second-order differential value at each time is obtained as follows.
- d2Vn(t)/dt2 = dVn(t)/dt - dVn(t-1)/dt (9)
- d2Vn(t+1)/dt2 = dVn(t+1)/dt - dVn(t)/dt (10)
- d2Vn(t+2)/dt2 = dVn(t+2)/dt - dVn(t+1)/dt (11)
- (d2Vn(t)/dt2)/Vn(t-1) is defined as the content acceleration AC at time t, which can take a positive or negative value.
- the acceleration AC is input from the extraction unit 12 to the comparison / determination unit 13.
- FIG. 3 shows an example in which the acceleration AC at the times (t-2), (t-1), t, (t+1), and (t+2) is indicated by arrows along the time axis.
- the comparison / determination unit 13 compares three accelerations AC that are continuous along the time axis.
- the acceleration AC is a positive value and exceeds the threshold value Th1.
- the acceleration AC is a negative value, which is lower than the threshold value Th2.
- Since the direction of the acceleration AC is the same at the times (t-2) and (t-1), it can be determined that the image is not disturbed.
- When the direction of the acceleration AC is negative at time t, there is a possibility that the image is disturbed.
- At time (t+1), the direction of the acceleration AC returns to a positive value and again exceeds the threshold value Th1. Therefore, at the times (t-1), t, and (t+1), the acceleration AC exceeds the threshold values and forms a positive, negative, positive sequence.
- When the acceleration AC changes greatly in this way, it can be determined that the image is disturbed in block #n at time t.
- Likewise, when the acceleration AC exceeds the threshold values and forms a negative, positive, negative sequence, it can be determined that the image is disturbed.
- At the times t, (t+1), and (t+2), the acceleration AC forms a negative, positive, negative sequence along the time axis, but because the threshold values are not exceeded, the motion of the content image is within the normal range, and it is determined that the video is not disturbed at time (t+1).
- the values of the threshold values Th1 and Th2 can be arbitrarily changed by input from the device control unit 14. The above calculation and comparison are performed for all small blocks.
- When the comparison/determination unit 13 determines that a video disturbance has occurred, information indicating in which field and in which small block the disturbance occurred is input to the alarm output unit 15.
- Based on the input information, the alarm output unit 15 displays an alarm on a monitor (not shown) that displays the video/audio under inspection.
- For example, the edge of the field in which the video disturbance was detected can be illuminated in red.
- Video block noise refers to a phenomenon in which an image of content is converted into another image in a block form.
- the inspection target frame is represented by 1920 pixels in the horizontal direction and 540 lines in the vertical direction.
- The pixel value of the luminance signal at pixel m, line n is represented by Y(m, n), and a pixel block (inspection block) of 8 pixels × 8 lines is defined with this pixel at its upper-left corner.
- the range of the inspection block is not limited to this.
- When a video/audio signal is input from the input unit 11, the extraction unit 12 performs a two-dimensional discrete Fourier transform, which is an orthogonal transform, on the pixel values in the inspection block.
- Other orthogonal transforms include the discrete cosine transform and the wavelet transform, and the corners of block noise can be detected in the same manner using any orthogonal transform.
- The comparison/determination unit 13 determines that the inspection block DB lies at one of the four corners of the block noise BN shown in the figure. Specifically: [1] When condition 1 is satisfied, the pixels Y(6,6), Y(7,6), Y(6,7), and Y(7,7) of the inspection block DB are inside the block noise and the other pixels are outside it, which means that the inspection block DB(1) shown in FIG. 4B is at the upper left of the block noise BN.
- the inspection target frame may be divided into four to detect whether block noise has occurred in each region.
- Wuv is the square root (√(A² + B²)) of the sum of the squares of the real part (A) and the imaginary part (B) of F(u, v).
- The inspection target region (or frame) is composed of N pixels (v1 to vN) × M lines (h1 to hM).
- Since the total number of corners Nc in the inspection target area equals both the total over the pixels in which corners occur and the total over the lines in which corners occur, it can be expressed by Equation (13). Further, the squared standard deviation (Dh)² of the corners occurring in the horizontal direction in the inspection target region is expressed by Equation (14), and the squared standard deviation (Dv)² of the corners occurring in the vertical direction is expressed by Equation (15).
- the comparison / determination unit 13 determines whether ⁇ is more than the threshold Th5 when it is determined that a corner is generated in the inspection target region, and when ⁇ ⁇ Th5, the inspection target is determined. It is determined that block noise has occurred in the area. Note that the values of the threshold values Th3 to Th5 can be arbitrarily changed by input from the device control unit 14.
- the comparison / determination unit 13 determines that block noise of the video has occurred, information including position information indicating a corner is input to the alarm output unit 15. Based on the input information, the alarm output unit 15 displays an alarm on a monitor (not shown) that displays video / audio to be inspected. At this time, it is preferable to display the position of the corner of the block noise so as to overlap the image displayed on the monitor.
- Audio error detection: one of the audio errors detected in this embodiment is a so-called "popping" sound that occurs instantaneously and then disappears. Since digital audio is input through, for example, four channels, errors are detected for each channel.
- The extraction unit 12 divides the digital audio along the time axis into 1 msec segments and samples, for example, 48 audio data points per segment. More detailed data than this is unnecessary because it exceeds the human audible range. Frequency conversion is then performed on the audio data by the discrete Fourier transform, which is an orthogonal transform.
- x (t) is a value of the sound level indicating the vibration amplitude of the sound at time t.
- the high frequency components fj (t) of the 23 sample data excluding the DC component are extracted as shown in the equation (16).
- sampling is performed while shifting every 0.5 msec, for example.
- the comparison / determination unit 13 determines that a popping sound has occurred when the following expressions (18) to (20) are satisfied.
- The condition of Equation (18) indicates that the audio signal is not zero, Equation (19) indicates that there is a relatively large change in sound before and after the popping sound, and Equation (20) indicates that the power is relatively constant within the sampling time. Note that the values of the threshold values Th6 to Th8, T, m1, m2, n1, and n2 can be arbitrarily changed by input from the device control unit 14.
- FIG. 7 is a diagram showing a change in power P n (t) with the time axis as the horizontal axis.
- the values of the threshold values Th9, Th10, T, and T5 can be arbitrarily changed by input from the device control unit 14.
- Pn(t+T+T5) ≧ Th9 (23)
- FIG. 7 is a diagram showing a change in power P n (t) with the time axis as the horizontal axis.
- the values of the thresholds Th11, Th12, T, and T5 can be arbitrarily changed by input from the device control unit 14.
- an audio alarm signal is input to the alarm output unit 15.
- the alarm output unit 15 displays an alarm on a monitor (not shown) displaying video / audio to be inspected.
Abstract
Description
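[1] when the total power value P(t) exceeds a first threshold, and
[2] when the value (P(t)/P(t-T)) obtained by dividing the total power value P(t) by the total power value P(t-T) at the preceding time (t-T), and the value (P(t)/P(t+T)) obtained by dividing the total power value P(t) by the total power value P(t+T) at the subsequent time (t+T), each exceed a second threshold, and
[3] when the value (Pn(t)/P(t)) obtained by dividing each individual power value Pn(t) by the total power value P(t) exceeds a third threshold, it is preferable to determine that an error has occurred.
"Video disturbance" refers to a phenomenon in which the content image momentarily disappears between frames and then returns, or shifts. Here, a video/audio signal conforming to the BTAS-001B standard for 1125/60 high-definition television (HDTV) broadcasting, standardized by the Association of Radio Industries and Businesses (ARIB), is described as an example. Such a video signal includes a luminance signal Y and color difference signals Pb and Pr.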
dVn(t-1)/dt=Vn(t-1)-Vn(t-2) (5)
dVn(t)/dt=Vn(t)-Vn(t-1) (6)
dVn(t+1)/dt=Vn(t+1)-Vn(t) (7)
dVn(t+2)/dt=Vn(t+2)-Vn(t+1) (8)
d2Vn(t)/dt2=dVn(t)/dt-dVn(t-1)/dt (9)
d2Vn(t+1)/dt2=dVn(t+1)/dt-dVn(t)/dt (10)
d2Vn(t+2)/dt2=dVn(t+2)/dt-dVn(t+1)/dt (11)
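The disturbance test built on these differences can be sketched as follows. Expanding the second difference gives d2Vn(t)/dt2 = Vn(t) - 2Vn(t-1) + Vn(t-2), and the acceleration AC divides that by Vn(t-1). The function names `accelerations` and `disturbed_times` are hypothetical, skipping samples where Vn(t-1) is zero is an assumption, and the thresholds `th1` (positive side) and `th2` (negative side) are illustrative parameters.

```python
def accelerations(v):
    """Return {i: AC(i)} for each index i >= 2 of the activity series v.

    AC(i) = (v[i] - 2*v[i-1] + v[i-2]) / v[i-1]. Indices whose previous
    activity is zero are skipped (an assumption made for this sketch).
    """
    ac = {}
    for i in range(2, len(v)):
        if v[i - 1] != 0:
            ac[i] = (v[i] - 2 * v[i - 1] + v[i - 2]) / v[i - 1]
    return ac


def disturbed_times(v, th1, th2):
    """Return the middle index of every run of three consecutive
    accelerations that alternate in sign (+,-,+ or -,+,-) while each one
    lies beyond its threshold (above th1 or below th2)."""
    ac = accelerations(v)

    def sign(a):
        if a > th1:
            return 1
        if a < th2:
            return -1
        return 0  # within the normal range of content motion

    hits = []
    for i in sorted(ac):
        if i - 1 in ac and i + 1 in ac:
            pattern = (sign(ac[i - 1]), sign(ac[i]), sign(ac[i + 1]))
            if pattern in ((1, -1, 1), (-1, 1, -1)):
                hits.append(i)
    return hits
```

For example, `disturbed_times([10, 10, 10, 1, 10, 10, 10], 0.5, -0.5)` flags one index, whereas a steadily rising series such as `[1, 2, 3, 4, 5]` has zero acceleration throughout and flags nothing.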
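"Video block noise" refers to a phenomenon in which the content image is replaced, block by block, with a different image. An HDTV video/audio signal is again used as the example. As shown in FIG. 4, when the input digital video signal is sampled in segments of 20 msec or less, the inspection target frame is represented by 1920 pixels in the horizontal direction and 540 lines in the vertical direction. The pixel value of the luminance signal at pixel m, line n is denoted Y(m,n), and a pixel block (inspection block) of 8 pixels × 8 lines is defined with this pixel at its upper-left corner. The range of the inspection block is not limited to this. When a video/audio signal is input from the input unit 11, the extraction unit 12 performs a two-dimensional discrete Fourier transform, which is an orthogonal transform, on the pixel values in the inspection block. Other orthogonal transforms include the discrete cosine transform and the wavelet transform, and the corners of block noise can be detected in the same manner using any of them.
[1] When condition 1 is satisfied, the pixels Y(6,6), Y(7,6), Y(6,7), and Y(7,7) of the inspection block DB are inside the block noise and the other pixels are outside it, which means that the inspection block DB(1) shown in FIG. 4(b) is at the upper left of the block noise BN.
[2] When condition 2 is satisfied, the pixels Y(0,6), Y(1,6), Y(0,7), and Y(1,7) of the inspection block DB are inside the block noise and the other pixels are outside it, which means that the inspection block DB(2) shown in FIG. 4(b) is at the upper right of the block noise BN.
[3] When condition 3 is satisfied, the pixels Y(6,0), Y(7,0), Y(6,1), and Y(7,1) of the inspection block DB are inside the block noise and the other pixels are outside it, which means that the inspection block DB(3) shown in FIG. 4(b) is at the lower left of the block noise BN.
[4] When condition 4 is satisfied, the pixels Y(0,0), Y(1,0), Y(0,1), and Y(1,1) of the inspection block DB are inside the block noise and the other pixels are outside it, which means that the inspection block DB(4) shown in FIG. 4(b) is at the lower right of the block noise BN.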
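The transform step behind these conditions can be sketched as follows. The code computes the amplitude of each two-dimensional DFT coefficient F(u, v) of an inspection block, that is, the square root of the sum of the squares of its real and imaginary parts. Conditions 1 to 4 themselves are not reproduced in this text, so the sketch stops at the amplitudes; the function name `dft2_amplitudes` and the `block[n][m]` indexing (line n, pixel m) are assumptions made for illustration.

```python
import cmath


def dft2_amplitudes(block):
    """Return W with W[v][u] = |F(u, v)| for a square pixel block.

    block[n][m] is the pixel at horizontal position m on line n, u is
    the horizontal frequency index and v the vertical frequency index.
    """
    size = len(block)
    amps = [[0.0] * size for _ in range(size)]
    for v in range(size):
        for u in range(size):
            total = 0j
            for n in range(size):
                for m in range(size):
                    angle = -2 * cmath.pi * (u * m + v * n) / size
                    total += block[n][m] * cmath.exp(1j * angle)
            amps[v][u] = abs(total)  # sqrt(real^2 + imag^2)
    return amps
```

A block crossed only by a vertical edge has non-zero amplitudes at horizontal frequencies alone, while a block whose bottom-right 2 × 2 pixels differ from the rest also has energy at mixed frequencies such as (u, v) = (1, 1). That difference is what lets transform coefficients separate a corner from a plain edge.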
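One of the audio errors detected in this embodiment is a so-called "popping" sound that occurs instantaneously and then disappears. Since digital audio is input through, for example, four channels, errors are detected for each channel.
The comparison/determination unit 13 obtains the power by calculating the sum of the squares of the real part and the imaginary part of the high-frequency component fj(t) at time t. The power is calculated in this way for all samples and is denoted Pn(t) (where n = 1 to 23).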
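The power calculation can be sketched as follows for one 1 msec window of 48 samples. The function names `bin_powers` and `total_power` are hypothetical. Two further assumptions are made: the 23 components are taken to be DFT bins 1 to 23 (leaving out the DC and Nyquist bins), and the total power P(t) is taken to be the sum of the bin powers over a band m1 to m2, since the defining equation is not reproduced in this text.

```python
import cmath


def bin_powers(samples):
    """Return {n: Pn} for DFT bins n = 1 .. N/2 - 1 of one window.

    For 48 samples this yields bins 1..23. Each power is the sum of the
    squares of the real and imaginary parts of the DFT coefficient.
    """
    size = len(samples)
    powers = {}
    for n in range(1, size // 2):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * n * k / size)
                    for k, x in enumerate(samples))
        powers[n] = coeff.real ** 2 + coeff.imag ** 2
    return powers


def total_power(powers, m1, m2):
    """Sum the bin powers for bins m1..m2 inclusive (assumed form of P(t))."""
    return sum(powers[n] for n in range(m1, m2 + 1))
```

Moving the window forward by half its length between calls would correspond to the 0.5 msec shift mentioned for the sampling.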
P(t)≧Th6 (18)
P(t)/P(t-T)≧Th7 and P(t)/P(t+T)≧Th7 (19)
Pn(t)/P(t)≧Th8 (where n ranges over any consecutive sample data n1 to n2 among sample data #1 to #23) (20)
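Conditions (18) to (20) can be combined into a single check, sketched below. The function name `is_pop` and its argument layout are hypothetical. The ratios in (19) and (20) are rewritten as multiplications, which is equivalent for positive powers and avoids dividing by zero when a neighbouring window is silent.

```python
def is_pop(p_prev, p_now, p_next, bins_now, th6, th7, th8, n1, n2):
    """Evaluate conditions (18)-(20) for one channel at time t.

    p_prev, p_now, p_next: total power P(t-T), P(t), P(t+T)
    bins_now: mapping n -> Pn(t) for the current window
    """
    # (18): the total power is large enough to count as a signal.
    if p_now < th6:
        return False
    # (19): P(t) stands out against the windows before and after it.
    if p_now < th7 * p_prev or p_now < th7 * p_next:
        return False
    # (20): every bin n1..n2 carries at least the fraction Th8 of P(t),
    # i.e. the energy is spread across the band rather than tonal.
    return all(bins_now[n] >= th8 * p_now for n in range(n1, n2 + 1))
```

A short click spreads its energy across all bins and passes (20), whereas a loud pure tone concentrates power in one bin and fails it.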
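FIG. 7 is a diagram showing the change in power Pn(t) with time on the horizontal axis. The comparison/determination unit 13 determines that sound skipping has occurred at time t when Equations (21) to (23) below are satisfied for all n = 1 to 23. This means that the audio power is below the threshold Th10 for the duration T from time t, while before and after that interval the power is above the threshold Th9. The values of the thresholds Th9 and Th10, T, and T5 can be arbitrarily changed by input from the device control unit 14.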
Pn(t-T5)≧Th9 (21)
Pn(t), Pn(t+1), ..., Pn(t+T)≦Th10 (22)
Pn(t+T+T5)≧Th9 (23)
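FIG. 7 is a diagram showing the change in power Pn(t) with time on the horizontal axis. The comparison/determination unit 13 determines that noise insertion has occurred at time t when Equations (24) to (26) below are satisfied for all n = 1 to 23. This means that the audio power is above the threshold Th12 for the duration T from time t, while before and after that interval the power is below the threshold Th11. The values of the thresholds Th11 and Th12, T, and T5 can be arbitrarily changed by input from the device control unit 14.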
Pn(t-T5)≦Th11 (24)
Pn(t), Pn(t+1), ..., Pn(t+T)≧Th12 (25)
Pn(t+T+T5)≦Th11 (26)
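The sound-skip and noise-insertion tests mirror each other, so both can be sketched with one shared helper. All names below are hypothetical. Time is treated as an integer step index, and the trailing sample is taken at t+T+T5, as in the claims, so that it lies after the inspected interval.

```python
def _pattern_holds(powers, t, T, T5, edge_ok, middle_ok):
    """Check one bin: both edge samples satisfy edge_ok and every
    sample in t..t+T satisfies middle_ok."""
    before, after = t - T5, t + T + T5
    if before < 0 or after >= len(powers):
        return False  # not enough context around the interval
    return (edge_ok(powers[before]) and edge_ok(powers[after])
            and all(middle_ok(p) for p in powers[t:t + T + 1]))


def is_sound_skip(series, t, T, T5, th9, th10):
    """Conditions (21)-(23): power drops to Th10 or less during t..t+T
    and is Th9 or more before and after, for every bin in `series`."""
    return all(
        _pattern_holds(p, t, T, T5, lambda x: x >= th9, lambda x: x <= th10)
        for p in series.values())


def is_noise_insertion(series, t, T, T5, th11, th12):
    """Conditions (24)-(26): power rises to Th12 or more during t..t+T
    and is Th11 or less before and after, for every bin in `series`."""
    return all(
        _pattern_holds(p, t, T, T5, lambda x: x <= th11, lambda x: x >= th12)
        for p in series.values())
```

Here `series` maps each bin index n to a list of Pn values over time; requiring the pattern in every bin matches the "for all n = 1 to 23" wording of the text.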
11 Input unit
12 Extraction unit
13 Comparison/determination unit
14 Control unit
15 Alarm output unit
Claims (12)
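- A video inspection method characterized by sampling a continuous digital video signal in segments of 20 msec or less, extracting a high-frequency component from the sampled signal, and detecting an error occurring in the video based on the extracted high-frequency component.
- The video inspection method according to claim 1, characterized in that one frame of the digital video signal is divided into a plurality of regions and the error detection is performed for each region.
- The video inspection method according to claim 1 or 2, characterized in that the error is a video disturbance and the extracted high-frequency component is an activity that is the average of the variance values in units of blocks of the digital video signal.
- The video inspection method according to claim 3, characterized in that, when the activity (Vn(t)) is differentiated twice with respect to time (t) to obtain d2Vn(t)/dt2, it is determined that a video disturbance has occurred if the acceleration (d2Vn(t)/dt2)/Vn(t-1) forms a positive, negative, positive sequence or a negative, positive, negative sequence along the time axis.
- The video inspection method according to claim 1 or 2, characterized in that the error is block noise, an orthogonal transform is applied to the pixel values in an inspection block of the video signal, and it is determined that block noise has occurred when the transform coefficients satisfy a predetermined condition.
- The video inspection method according to claim 5, characterized in that, when the transform coefficients satisfy the predetermined condition, it is determined that a corner has occurred in the content displayed by the video signal.
- The video inspection method according to claim 6, characterized in that, from the number and the distribution bias of the corners, the corners are classified into those caused by block noise and those caused by the content.
- An audio inspection method characterized by sampling a continuous digital audio signal in segments of 5 msec or less, extracting a high-frequency component from the sampled signal, and detecting an error occurring in the audio based on the extracted high-frequency component.
- The audio inspection method according to claim 8, characterized in that, when the digital audio signal is recorded on a plurality of channels, the error detection is performed for each channel.
- The audio inspection method according to claim 8 or 9, characterized in that, when sampling is performed at time t along the time axis, frequency conversion is applied to the sampled signal, and n power values Pn(t) and a total power value P(t) within a predetermined band are obtained, it is determined that an error has occurred
[1] when the total power value P(t) exceeds a first threshold, and
[2] when the value (P(t)/P(t-T)) obtained by dividing the total power value P(t) by the total power value P(t-T) at the preceding time (t-T), and the value (P(t)/P(t+T)) obtained by dividing the total power value P(t) by the total power value P(t+T) at the subsequent time (t+T), each exceed a second threshold, and
[3] when the value (Pn(t)/P(t)) obtained by dividing each individual power value Pn(t) by the total power value P(t) exceeds a third threshold.
- The audio inspection method according to any one of claims 8 to 10, characterized in that, when three power values along the time axis are compared, it is determined that sound skipping has occurred if the first power value Pn(t-T5) and the third power value Pn(t+T+T5) exceed a fourth threshold and the second power value sequence Pn(t), ..., Pn(t+T) falls below the fifth threshold.
- The audio inspection method according to any one of claims 8 to 10, characterized in that, when three power values Pn(t) along the time axis are compared, it is determined that noise has occurred if the first power value Pn(t-T5) and the third power value Pn(t+T+T5) are below a sixth threshold and the second power value sequence Pn(t), ..., Pn(t+T) exceeds the seventh threshold.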
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2015543639A JP6222854B2 (ja) | 2013-10-23 | 2013-10-23 | Video inspection method and audio inspection method |
PCT/JP2013/078660 WO2015059782A1 (ja) | 2013-10-23 | 2013-10-23 | Video inspection method and audio inspection method |
US15/031,200 US20160249047A1 (en) | 2013-10-23 | 2013-10-23 | Image inspection method and sound inspection method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2013/078660 WO2015059782A1 (ja) | 2013-10-23 | 2013-10-23 | Video inspection method and audio inspection method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015059782A1 (ja) | 2015-04-30 |
Family
ID=52992420
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2013/078660 WO2015059782A1 (ja) | Video inspection method and audio inspection method | 2013-10-23 | 2013-10-23 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20160249047A1 (ja) |
JP (1) | JP6222854B2 (ja) |
WO (1) | WO2015059782A1 (ja) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0937244A (ja) * | 1995-07-14 | 1997-02-07 | Oki Electric Ind Co Ltd | Moving image data error detection device |
JPH09503890A (ja) * | 1993-07-19 | 1997-04-15 | British Telecommunications Public Limited Company | Error detection in video images |
JP2009094892A (ja) * | 2007-10-10 | 2009-04-30 | Toshiba Corp | Moving image decoding device and moving image decoding method |
JP2011203500A (ja) * | 2010-03-25 | 2011-10-13 | Toshiba Corp | Sound information determination device and sound information determination method |
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2890740B2 (ja) * | 1990-08-09 | 1999-05-17 | Matsushita Electric Ind Co Ltd | Digital video signal reproducing device |
JPH04320160A (ja) * | 1991-04-19 | 1992-11-10 | Matsushita Electric Ind Co Ltd | Image signal compression/expansion device and region identification processing device |
EP0731601B2 (en) * | 1995-03-06 | 2006-10-18 | Matsushita Electric Industrial Co., Ltd. | Video signal noise reduction apparatus |
US6359929B1 (en) * | 1997-07-04 | 2002-03-19 | Matsushita Electric Industrial Co., Ltd. | Image predictive decoding method, image predictive decoding apparatus, image predictive coding apparatus, and data storage medium |
FI107108B (fi) * | 1998-11-05 | 2001-05-31 | Nokia Mobile Phones Ltd | Error detection in low bit-rate video transmission |
DE10024374B4 (de) * | 2000-05-17 | 2004-05-06 | Micronas Munich Gmbh | Method and device for measuring the noise contained in an image |
KR20050049064A (ko) * | 2003-11-21 | 2005-05-25 | Samsung Electronics Co., Ltd. | Apparatus for measuring noise in a video signal and measuring method thereof |
US20050207660A1 (en) * | 2004-03-16 | 2005-09-22 | Sozotek, Inc. | System and method for reduction of compressed image artifacts |
JP5044886B2 (ja) * | 2004-10-15 | 2012-10-10 | Panasonic Corporation | Block noise reduction device and image display device |
US7957467B2 (en) * | 2005-09-15 | 2011-06-07 | Samsung Electronics Co., Ltd. | Content-adaptive block artifact removal in spatial domain |
TWI466547B (zh) * | 2007-01-05 | 2014-12-21 | Marvell World Trade Ltd | Method and system for improving low-resolution video |
US20080260350A1 (en) * | 2007-04-18 | 2008-10-23 | Cooper J Carl | Audio Video Synchronization Stimulus and Measurement |
TWI404408B (zh) * | 2008-10-07 | 2013-08-01 | Realtek Semiconductor Corp | Image processing device and image processing method |
US8144253B2 (en) * | 2009-07-21 | 2012-03-27 | Sharp Laboratories Of America, Inc. | Multi-frame approach for image upscaling |
JP2011244085A (ja) * | 2010-05-14 | 2011-12-01 | Sony Corp | Signal processing device and signal processing method |
JP2012231389A (ja) * | 2011-04-27 | 2012-11-22 | Sony Corp | Image processing device, image processing method, and program |
JP2012244319A (ja) * | 2011-05-18 | 2012-12-10 | Funai Electric Co Ltd | Digital broadcast receiver |
-
2013
- 2013-10-23 JP JP2015543639A patent/JP6222854B2/ja active Active
- 2013-10-23 US US15/031,200 patent/US20160249047A1/en not_active Abandoned
- 2013-10-23 WO PCT/JP2013/078660 patent/WO2015059782A1/ja active Application Filing
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
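JP2019145974A (ja) * | 2018-02-20 | 2019-08-29 | Japan Broadcasting Corporation | Image quality evaluation device suited to ultra-high-definition video |
JP7154522B2 (ja) | 2018-02-20 | 2022-10-18 | Japan Broadcasting Corporation | Image quality evaluation device suited to ultra-high-definition video |
CN108877837A (zh) * | 2018-06-12 | 2018-11-23 | Beijing Xiaomi Mobile Software Co., Ltd. | Audio signal anomaly identification method, device, and storage medium |
CN108877837B (zh) * | 2018-06-12 | 2021-01-15 | Beijing Xiaomi Mobile Software Co., Ltd. | Audio signal anomaly identification method, device, and storage medium |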
Also Published As
Publication number | Publication date |
---|---|
US20160249047A1 (en) | 2016-08-25 |
JP6222854B2 (ja) | 2017-11-01 |
JPWO2015059782A1 (ja) | 2017-03-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13895857 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2015543639 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 15031200 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 13895857 Country of ref document: EP Kind code of ref document: A1 |