WO1999031886A1

WO1999031886A1 - Image encoding method and device

Info

Publication number: WO1999031886A1
Application number: PCT/JP1998/005626
Authority: WO
Inventors: Masaaki Isozaki; Atsuo Yada
Original assignee: Sony Corporation
Priority date: 1997-12-12
Filing date: 1998-12-11
Publication date: 1999-06-24
Also published as: US6275528B1; JP3740813B2; JPH11177978A

Abstract

The stability of a pull-down pattern is calculated in step S33 based on the encoding conditions and the encoding difficulty parameters of the material, which are input in steps S31 and S32, and the stability is judged in step S34. If it is judged that the pull-down pattern is not stable, a warning is displayed in step S35, and it is decided in step S36 whether or not encoding under the initial conditions is to be continued. If it is decided that the encoding is not to be continued, encoding without pull-down processing or the like is performed in step S44. If it is decided that the encoding is to be continued, or if it is judged in step S34 that the pull-down pattern is stable, scene change detection/processing, chapter boundary processing, interpolation/correction of the encoding difficulty, calculation of the target number of bits, and address calculation are performed in steps S37 to S41, the target rate is calculated in step S42, and an encoder control file is produced in step S43, which completes the processing. This configuration provides an image encoding method and device that can judge whether the encoding conditions for pull-down processing are proper before two-pass variable rate encoding is performed.

Description

TECHNICAL FIELD

The present invention relates to an image encoding method and apparatus for encoding video material, and more particularly to an image encoding method and apparatus that encode input video material that has been subjected to pull-down conversion on the basis of its pull-down pattern.

BACKGROUND ART

When video information is stored on package media such as a DVD (Digital Versatile Disk, Digital Video Disk) or a Video CD, an encoding system that compression-codes the video information generally first measures the encoding difficulty (difficulty) of the images of the video material and then, based on that difficulty, allocates bits to each frame of the video information so that the total fits within the number of bytes given within the storage capacity of the package media. Hereinafter, this encoding method is referred to as a two-pass encoding method.

FIG. 1 shows an example of the configuration of a conventional video encoding system used for authoring a DVD or the like by compressing and encoding video information. The supervisor 103 manages the entire video encoding system, gives encoding conditions to each encoding system such as video, audio, and menu, and receives a report of the encoding result. In this example, the video encoding conditions are specified by the file “v.enc”, and the video encoder reports the address “v.adr” on the RAID (Redundant Arrays of Inexpensive Disks) 104 where the encoded bitstream was written, together with the data “vxxx.aui” required for multiplexing the bitstreams.

The main controller 111 controls the operation of the entire video encoder system by data communication with the supervisor 103 connected via a network 102.

More specifically, the main controller 111 is controlled by the supervisor 103 and accepts the operator's operations through a graphical user interface (GUI) unit 114. The bit assignment unit 115, encoder control unit 116, and VTR control unit 117 managed by the GUI unit 114 control the operation of the encoder 112 and the video tape recorder (VTR) 110. Accordingly, the main controller 111 encodes the material to be processed in accordance with the encoding conditions notified from the supervisor 103 and notifies the supervisor 103 of the processing result. Further, the main controller 111 can accept settings from the operator via the GUI unit 114 and can change the detailed conditions of the encoding.

The GUI unit 114 of the main controller 111 manages three programs: the bit allocation program “BIT_ASSIGN” of the bit assignment unit 115, the encoder control program “CTRL_ENC” of the encoder control unit 116, and the VTR control program of the VTR control unit 117.

The bit assignment unit 115 determines the encoding conditions in frame units according to the encoding condition file “v.enc” notified from the supervisor 103, and notifies the encoder control unit 116 of control data based on these conditions in the form of the file “CTL file”. At this time, the bit assignment unit 115 sets the bit allocation (bit assignment) for the encoding process and further changes the conditions that have been set. In addition, when the compressed video data D2 is recorded on the RAID 104, the bit assignment unit 115 notifies the supervisor 103 of the address data “v.adr” of the write location on the RAID 104, together with the information “vxxx.aui”, such as the amount of data, required for the multiplexing process in the subsequent stage.

The encoder control unit 116 controls the operation of the encoder 112 in accordance with the control file “CTL file” notified from the bit assignment unit 115. Further, the encoder control unit 116 notifies the bit assignment unit 115, in frame units, of the encoding difficulty “difficulty” data required for the encoding processing and, when the video data D2 is recorded on the RAID 104, notifies the bit assignment unit 115 of the recording address data “v.adr” and the data “vxxx.aui” required for the subsequent multiplexing processing.

The VTR control unit 117 controls the operation of the video tape recorder (VTR) 110 in accordance with the edit list notified from the supervisor 103, and reproduces the desired material to be edited.

The video tape recorder (VTR) 110 reproduces the video data D1 recorded on the magnetic tape in accordance with the edit list notified from the supervisor 103 via the main controller 111, and outputs it to the encoder 112.

The encoder 112 switches its operation according to the conditions notified from the supervisor 103 via the main controller 111, and compression-codes the video data D1 output from the VTR 110 by the MPEG (Moving Picture Experts Group) method.

At this time, the encoder 112 notifies the main controller 111 of the result of the encoding process, and the main controller 111 controls the encoding conditions of the data compression and thereby controls the amount of bits generated. As a result, the main controller 111 can grasp the amount of bits generated by data compression in frame units.

In addition, during the preliminary condition-setting pass of the two-pass encoding (provisional encoding), the encoder 112 simply compresses the video data D1 and sends the processing result to the main controller 111. During the final data compression pass (actual encoding), it records the compressed video data D2 on the RAID 104 and notifies the main controller 111 of the address at which the data was recorded and of the amount of data.

The monitor device 113 is configured to monitor the video data D2 that has been compressed by the encoder 112. With this monitor device 113, the operator of the video encoding system can check the result of the data compression processing as necessary, that is, perform a so-called preview. The operator can then operate the main controller 111 based on the preview result to change the encoding conditions in detail.

As described above, the DVD employs MPEG (Moving Picture Experts Group) as the compression method for video.

MPEG compresses data by removing redundancy in the time direction through motion-compensated prediction. It uses three types of coded pictures: I (Intra) pictures, which are coded only within the frame; P (Predictive) pictures, which are coded by predicting the current picture from a past picture; and B (Bidirectionally Predictive) pictures, which are coded by predicting the current picture from past and future pictures.

In addition, these pictures are grouped into a GOP (Group of Pictures), which always includes an I picture.

FIG. 2 shows an example of the GOP structure.

In this example, the number N of pictures (frames) constituting one GOP is 15. The order in which the pictures of the GOP are displayed differs from the order in which they are coded. In display order, the first picture of the GOP is a B picture that comes before the I picture and after the preceding P picture or I picture, and the last picture of the GOP is the first P picture before the next I picture.
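The difference between display order and coding order can be sketched as follows. This is only an illustration: the anchor spacing `m = 3` (two B pictures between I/P pictures) and the function names are assumptions, since the text gives only N = 15 and the positions of the first and last pictures.

```python
def gop_display_types(n=15, m=3):
    """Picture types of one GOP in display order.

    n is the GOP length from the text; m (anchor spacing) is an assumed value.
    """
    types = []
    for i in range(n):
        if i % m == m - 1:
            # The first anchor of the GOP is the I picture, later anchors are P.
            types.append("I" if i == m - 1 else "P")
        else:
            types.append("B")
    return types


def gop_coding_order(types):
    """Display indices reordered so each I/P anchor precedes the B pictures shown before it."""
    order, pending_b = [], []
    for index, picture_type in enumerate(types):
        if picture_type == "B":
            pending_b.append(index)
        else:
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    return order + pending_b


types = gop_display_types()
print("display:", "".join(types))
print("coding :", "".join(types[i] for i in gop_coding_order(types)))
```

Under these assumptions the GOP starts with B pictures and ends with a P picture in display order, as described above, while the I picture comes first in coding order.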

Next, two-pass encoding will be described with reference to the configuration of the video encoding system illustrated in FIG. 1.

FIG. 3 shows a basic processing procedure of the two-pass encoding in the video encoding system described above. First, in step S51, the encoding conditions “v.enc”, such as the total amount of bits allocated to video information and the maximum rate, are given from the supervisor 103 via the network 102, and the encoder control unit 116 is set according to these encoding conditions. Next, in step S52, the encoder control unit 116 measures the encoding difficulty (difficulty) of the material to be encoded using the encoder 112. Here, the DC value of each pixel of the material and the motion vector amount ME are also measured. Then, a file is created based on these measurement results.

The actual measurement of the coding difficulty is performed as follows.

Video information to be used as an encoding material is reproduced by a VTR 110 from a digital video cassette, which is a master tape.

The encoder control unit 116 measures the encoding difficulty of the video information D1 reproduced by the VTR 110 via the encoder 112. Here, the number of generated bits is measured by setting the number of quantization steps to a fixed value during encoding. The amount of generated bits increases in images with many motions and many high frequency components, and decreases in still images and images with many flat parts. The amount of generated bits is defined as the encoding difficulty.

Next, in step S53, the encoder control unit 116 executes the bit allocation calculation program “BIT_ASSIGN” in the bit assignment unit 115 to distribute the allocated bit amount (target amount) according to the magnitude of the encoding difficulty of each picture measured in step S52, under the encoding conditions set in step S51. Then, in step S54, provisional encoding is performed using the result of this bit allocation calculation, and the operator decides whether or not to execute the actual encoding according to the image quality of the output of the local decoder built into the encoder 112.

In practice, the image quality is checked in a preview mode, in which the bit stream produced with the above bit allocation is not output to the RAID 104 and the operator can specify an arbitrary processing range. Then, in step S55, the image quality is evaluated. If there is a problem with the image quality (NG), the process proceeds to step S56, where customization work for image quality adjustment is performed, such as increasing the bit rate of the problematic part or adjusting the filter level, and the bit allocation is then recalculated in step S57.

Thereafter, the process returns to step S54, where the customized portion is previewed, and the image quality is confirmed in step S55. If the image quality of all parts is good, the process proceeds to step S58, and the encoder 112 encodes the entire material using the bit allocation recalculated in step S57.

On the other hand, if it is determined in step S55 on the first pass that there is no problem with the image quality, the process proceeds directly to step S58, where the encoder 112 executes encoding based on the bit allocation calculated in step S53.

Then, in step S59, post-processing is performed, such as writing the bit stream resulting from the encoding to the RAID 104 via SCSI (Small Computer System Interface) or the like, and the processing ends. After the encoding in step S58, the encoder control unit 116 reports the information on the encoding result described above to the supervisor 103 via the network 102.

Note that among the steps in FIG. 3, the processing of each step except for step S52, step S54, and step S58 is performed offline.

Next, in the above-described two-pass encoding, the bit allocation calculation performed by the bit assigning unit 115 will be further described. FIGS. 4A to 4G show examples of processing of surplus bits in the bit allocation calculation.

First, the supervisor 103 specifies the total bit amount “QTY_BYTES” (FIG. 4A) allocated to video information within the recording capacity of package media such as a digital video disc (DVD), and the maximum bit rate “MAXRATE”.

The encoder control unit 116 then executes the bit allocation calculation program “BIT_ASSIGN” in the bit assignment unit 115. It first calculates the total number of bits “USB_BYTES”, limited so that the bit rate does not exceed the maximum bit rate “MAXRATE” (FIG. 4B), and then subtracts from this value the number of bits “TOTAL_HEADER” required for the GOP headers (Header) to obtain “SUPPLY_BYTES”, the target value for the sum of the target bits of all frames (FIG. 4C).

Then, the bit amount (target amount) allocated to each picture is distributed so that the total fits within “SUPPLY_BYTES”. If the total amount of bits allocated to all pictures is “TARGET_BYTES”, then “SUPPLY_BYTES” minus “TARGET_BYTES” is “REMAIN_BYTES”, the remainder (Remain) of the bit allocation.

Then, as shown in FIG. 4G, the sum of “TARGET_BYTES” and the Header is “TARGET_OUT_BYTES”.
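The accounting of FIGS. 4A to 4G can be sketched as below. The function and parameter names are hypothetical; the `min` form of “USB_BYTES” follows equation (1) given later in the text, and integer division is used here only to keep the example exact.

```python
def video_byte_budget(qty_bytes, maxrate_bps, total_frames, total_header, frame_rate=30):
    """Return (USB_BYTES, SUPPLY_BYTES) for the whole material.

    maxrate_bps is in bits per second; dividing by 8 * frame_rate converts it
    to bytes per frame (the factor KT in the text, 30 Hz assumed for NTSC).
    """
    rate_limited_bytes = maxrate_bps * total_frames // (8 * frame_rate)
    usb_bytes = min(qty_bytes, rate_limited_bytes)
    supply_bytes = usb_bytes - total_header
    return usb_bytes, supply_bytes


def allocation_summary(supply_bytes, picture_targets, total_header):
    """Return (TARGET_BYTES, REMAIN_BYTES, TARGET_OUT_BYTES)."""
    target_bytes = sum(picture_targets)
    remain_bytes = supply_bytes - target_bytes
    target_out_bytes = target_bytes + total_header
    return target_bytes, remain_bytes, target_out_bytes


# Illustrative numbers only: one minute of video at a maximum rate of 8 Mbit/s.
usb, supply = video_byte_budget(
    qty_bytes=4_000_000_000, maxrate_bps=8_000_000,
    total_frames=1800, total_header=12_000)
print(usb, supply)
```

In this made-up case the maximum rate, not the disc capacity, is the binding limit, so “USB_BYTES” is smaller than “QTY_BYTES”.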

FIG. 5 shows a specific example of the procedure of the bit allocation calculation process in step S53 of FIG.

First, in step S61, as described above, the total bit amount “QTY_BYTES” and the maximum bit rate “MAXRATE” sent from the supervisor 103 are input.

Next, in step S62, the file containing the measurement results of the encoding difficulty (difficulty), created in step S52 of FIG. 3, is read as it is.

Then, in step S63, the points at which the scene changes are detected from the DC value of each image, measured together with the encoding difficulty, and from the amount of change in the motion vector amount ME.

The scene change detection and processing in step S63 can use the processing for detecting scene change points in the “video signal processing device” disclosed in the specification and drawings of Japanese Patent Application No. 8-274904, already filed by the present applicant. This “video signal processing device” detects the DC level of each frame of the video signal and detects scene change frames from the error value obtained when the DC level is approximated by a curve, thereby identifying the scene change points. At a point detected as a scene change, the P picture is changed to an I picture to improve the picture quality.

Next, in step S64, chapter (CHAPTER) boundary processing is performed. During a chapter search in a DVD playback device, playback jumps to the chapter from an unspecified picture. To prevent the reproduced image from being disturbed in this case, the chapter boundary processing changes the picture type or limits the GOP length so that the chapter position always falls at the head of a GOP.

In step S65, the values of the encoding difficulty (difficulty) are interpolated and corrected for the pictures whose picture type (I picture, P picture, or B picture) has been changed as a result of the above series of operations.

This is because the maximum number of fields displayed when decoding one GOP is limited in DVDs, and a change in the GOP structure caused by a change in picture type may make the length of one GOP exceed that limit. In such a case, GOP constraint processing is performed, in which a P picture is changed to an I picture to shorten the GOP length so as to satisfy the restriction.

In step S66, the target number of bits for each picture is calculated according to the encoding difficulty obtained by the interpolation/correction processing in step S65 and the number of bits “SUPPLY_BYTES” given to the entire material to be encoded.

Then, in step S67, the address (ADDRESS) on the RAID 104 to which the bit stream of the encoding result will be written is calculated. The process then proceeds to step S68, where the control file for the encoder is created, and the process ends.

By the above procedure, the target number of bits for each picture is calculated according to the encoding difficulty (difficulty) of the material and the number of bits “SUPPLY_BYTES” given to the entire material, and a control file for the encoder is created.

In the following, this series of bit allocation procedures is described in more detail. As an example of the bit allocation calculation, bits are first allocated in GOP units and then allocated within each GOP according to the encoding difficulty (difficulty) of each picture. The bit allocation amount “gop_target” for each GOP at the time of encoding is assigned according to “gop_diff”, the sum of the encoding difficulty within that GOP.

Figure 6 shows an example of the simplest function for converting the sum of the encoding difficulty of each GOP, “gop_diff”, into the bit allocation “gop_target” for that GOP at the time of encoding.

In this example, with “gop_target” as Y and “gop_diff” as X, the linear relation

$$Y = AX + B$$

is used.

The total number of bits “USB_BYTES”, limited so as not to exceed the maximum bit rate allowed, is

$$\mathrm{USB\_BYTES} = \min(\mathrm{QTY\_BYTES},\ \mathrm{MAXRATE} \times KT \times \mathrm{total\_frame\_number}) \tag{1}$$

Here, KT = 1/8 (bits) / 30 (Hz) for the NTSC system and KT = 1/8 (bits) / 25 (Hz) for the PAL system. “total_frame_number” is the total number of frames of the material to be encoded, and min(s, t) is a function that selects the smaller of s and t.

$$\mathrm{SUPPLY\_BYTES} = \mathrm{USB\_BYTES} - \mathrm{TOTAL\_HEADER} \tag{2}$$

$$\mathrm{DIFFICULTY\_SUM} = \sum \mathrm{difficulty} \tag{3}$$

“DIFFICULTY_SUM” is the sum of the encoding difficulty of all pictures.

$$B = \mathrm{GOP\_MINBYTES} \tag{4}$$

Summing Y = AX + B over all GOPs gives

$$\sum y = A \times \sum x + B \times n$$

Here, ∑y = SUPPLY_BYTES, ∑x = DIFFICULTY_SUM, and n is the total number of GOPs. Therefore

$$A = (\mathrm{SUPPLY\_BYTES} - B \times n) / \mathrm{DIFFICULTY\_SUM}$$

and the target amount of each GOP is

$$\mathrm{gop\_target} = A \times \mathrm{gop\_diff} + B \tag{5}$$

After that, within each GOP, bits are allocated according to the encoding difficulty of each picture. When the distribution among the pictures in a GOP is proportional to the magnitude of the encoding difficulty, the target amount of each picture is obtained by the following equation.

$$\mathrm{target}(k) = \mathrm{gop\_target} \times \mathrm{difficulty}(k) / \mathrm{gop\_diff} \tag{6}$$

(1 ≤ k ≤ number of pictures in the GOP)
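Equations (3) to (6) can be exercised with a short sketch. The function name and the nested-list input format are assumptions made for illustration, and the sketch assumes every GOP has a positive difficulty sum.

```python
import math


def allocate_picture_targets(gop_difficulties, supply_bytes, gop_minbytes):
    """Distribute supply_bytes over pictures following equations (3) to (6).

    gop_difficulties: one list of per-picture difficulty values per GOP
    (a hypothetical input layout; the text does not prescribe one).
    """
    n = len(gop_difficulties)                      # total number of GOPs
    gop_diff = [sum(gop) for gop in gop_difficulties]
    difficulty_sum = sum(gop_diff)                 # equation (3)
    b = gop_minbytes                               # equation (4)
    a = (supply_bytes - b * n) / difficulty_sum    # slope A of Y = AX + B

    targets = []
    for gop, diff in zip(gop_difficulties, gop_diff):
        gop_target = a * diff + b                  # equation (5)
        # Equation (6): share proportional to each picture's difficulty.
        targets.append([gop_target * d / diff for d in gop])
    return targets


targets = allocate_picture_targets([[10, 5, 5], [40, 20, 20]],
                                   supply_bytes=1000, gop_minbytes=100)
print(targets)
print(math.isclose(sum(map(sum, targets)), 1000))
```

Because A is derived from the sum relation, the per-picture targets add up to “SUPPLY_BYTES”, while each GOP still receives at least “GOP_MINBYTES”.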

Then, after this bit allocation calculation, the address on the RAID 104 to which the encoded bit stream is to be written is set, and an encoder control file is output. By performing the encoding process using the control file created in this way, variable bit rate encoding according to the difficulty of the material image is executed. The above is an outline of two-pass variable bit rate encoding. Next, the pull-down processing of movie (film) material will be described.

In order to convert a movie film composed of 24 frames / sec into an NTSC video signal composed of 30 frames / sec, a process of periodically repeating the same field image is performed. In the following, this process is called 2-3 pulldown conversion.

Figure 7 illustrates the principle of this 2-3 pulldown conversion.

The phase of the pull-down pattern is determined when converting film material to NTSC video material. In many cases, patterns are converted regularly.

One frame of video material is composed of two fields, of which the first field (1st field) is the top field (top_field) and the second field (2nd field) is the bottom field (bottom_field). A place where the same field image is repeated is called a repeat_first_field.

In such movie material, if the position where the same field is repeated is known, the encoding efficiency can be improved by processing the field so as not to be encoded.
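The 2-3 cadence can be illustrated with a small sketch that expands film frames into video fields. The function name and the choice to start the cadence with a two-field frame are assumptions; real material may start at any phase.

```python
def pulldown_2_3(film_frames):
    """Expand film frames into video frames of (top, bottom) field sources.

    Film frames alternately contribute 2 and 3 fields; field parity simply
    alternates top, bottom, top, ... along the output field sequence.
    """
    fields = []
    for index, frame in enumerate(film_frames):
        field_count = 2 if index % 2 == 0 else 3
        fields.extend([frame] * field_count)
    # Pair consecutive fields into video frames: (top field, bottom field).
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


print(pulldown_2_3(["A", "B", "C", "D"]))
print(len(pulldown_2_3(list(range(24)))))  # one second of film
```

Four film frames become five video frames, so 24 film frames per second become 30 video frames per second, and the third and fifth video frames of each cycle contain a repeated field that need not be encoded again.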

The following four combinations of 2-3 pull-down patterns are used for encoding.

0: bottom_field_first

1: bottom_field_first, repeat_first_field

2: top_field_first

3: top_field_first, repeat_first_field

Here, the combination of these patterns is defined as the “picture mode”.
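As a reading aid, the four picture modes can be written as a lookup table of two flags. The class and function names are hypothetical; treating bottom_field_first as "top_field_first is false" is an assumption made here for compactness.

```python
from typing import NamedTuple


class PictureMode(NamedTuple):
    top_field_first: bool
    repeat_first_field: bool


# Picture modes 0 to 3 as listed in the text.
PICTURE_MODES = {
    0: PictureMode(top_field_first=False, repeat_first_field=False),
    1: PictureMode(top_field_first=False, repeat_first_field=True),
    2: PictureMode(top_field_first=True, repeat_first_field=False),
    3: PictureMode(top_field_first=True, repeat_first_field=True),
}


def displayed_fields(mode):
    """A picture occupies three fields when its first field is repeated, otherwise two."""
    return 3 if PICTURE_MODES[mode].repeat_first_field else 2


# One regular cycle of four pictures covers ten fields, i.e. five video frames.
print(sum(displayed_fields(mode) for mode in (2, 3, 0, 1)))
```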

When encoding is performed at the same time as the pull-down conversion, or when information indicating the pull-down pattern is supplied, optimal encoding that takes the pull-down processing into account is possible: the repeated fields are not encoded, and only information indicating which fields were repeated is output. If no such information is supplied, however, encoding that takes the pull-down processing into account may not be performed properly.

Next, the relationship between the 2-3 pull-down pattern and the video frame number k counted from the beginning of the roll of NTSC material will be described.

Let p_mode[k] denote the picture mode at frame number k, where each picture is associated with the frame that contains its unrepeated top field. As the figure shows, some frame numbers then belong to none of the picture modes 0 to 3; for such a frame, the value of p_mode[k] is 4. FIG. 8 is a state transition diagram of p_mode[k] in the pull-down. If the pull-down is regular and continuous, the value of p_mode[k] increases by one modulo 5 (mod 5) as the frame number k increases.

$$\mathrm{p\_mode}[k+1] = (\mathrm{p\_mode}[k] + 1) \bmod 5 \tag{7}$$

If the pull-down pattern is disturbed, the value can be repeated, but only when p_mode[k] is 0 or 2:

$$\mathrm{p\_mode}[k+1] = \mathrm{p\_mode}[k] \quad (\text{only when } \mathrm{p\_mode}[k] \text{ is 0 or 2})$$

In two-pass encoding, the pull-down pattern is automatically detected when the encoding difficulty (difficulty) is measured. Bits are then allocated based on the measured pull-down pattern, and a control file is created.

Then, at the time of final encoding, encoding is performed according to the pull-down pattern described in the control file.
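The transition rule for p_mode[k] can be sketched as a small validity check. The function names are hypothetical, and the sketch reads "the value can be repeated" as p_mode[k+1] = p_mode[k], which is an interpretation of the text.

```python
def allowed_next_modes(p_mode):
    """Values p_mode[k+1] may take after p_mode[k]."""
    regular = (p_mode + 1) % 5          # equation (7): regular, continuous pull-down
    if p_mode in (0, 2):
        return {regular, p_mode}        # modes 0 and 2 may repeat when the pattern is disturbed
    return {regular}


def follows_state_diagram(sequence):
    """True if every consecutive pair in the sequence is an allowed transition."""
    return all(b in allowed_next_modes(a) for a, b in zip(sequence, sequence[1:]))


print(follows_state_diagram([2, 3, 4, 0, 1, 2, 3]))     # regular cadence
print(follows_state_diagram([2, 2, 2, 3, 4, 0, 0, 1]))  # pauses at modes 2 and 0
print(follows_state_diagram([3, 3, 4]))                 # mode 3 cannot repeat
```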

However, the automatic detection method described above is based on the differences of the top fields and of the bottom fields between the current frame and the previous frame, so if the source is close to a still image, the pull-down pattern may not be detected correctly.
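Why still images defeat this kind of detection can be seen in a toy sketch. This is not the detector used by the encoder, whose details the text does not give; it is an assumed minimal version that flags a field as repeated when its absolute difference from the same-parity field of the previous frame is at or below a threshold.

```python
def field_difference(field_a, field_b):
    """Sum of absolute pixel differences between two fields."""
    return sum(abs(x - y) for x, y in zip(field_a, field_b))


def repeated_field_flags(frames, threshold=0):
    """For each frame after the first: (top repeated?, bottom repeated?)."""
    flags = []
    for (prev_top, prev_bottom), (top, bottom) in zip(frames, frames[1:]):
        flags.append((field_difference(prev_top, top) <= threshold,
                      field_difference(prev_bottom, bottom) <= threshold))
    return flags


# Four film frames with distinct content, after 2-3 pull-down (toy two-pixel fields).
a, b, c, d = [0, 0], [10, 10], [20, 20], [30, 30]
moving = [(a, a), (b, b), (b, c), (c, d), (d, d)]
still = [(a, a)] * 5

print(repeated_field_flags(moving))  # one repeated top field, one repeated bottom field
print(repeated_field_flags(still))   # every field looks repeated: the phase is ambiguous
```

With moving content the repeated fields stand out, but with a still image every field matches its predecessor, so the cadence phase cannot be recovered from the differences alone.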

For example, consider the opening of a movie: the footage beginning with the first title typically fades in from black, shows the film company logo, and fades out to black. In portions with little movement, such as a fade-in from black or a fade-out to black, it is difficult to detect the pull-down phase accurately, and detection is often erroneous.

In addition, when an old device that uses an image pickup tube is used as the converter for converting film material to video material, the pull-down phase may not be detected correctly because of the afterimage between frames.

For the same reason, the pull-down phase may not be detected correctly even with a material using a noise reducer that reduces the random noise by adding the image of the previous frame and the image of the current frame.

In addition, in material edited together from many pieces of film material with different pull-down phases, an erroneous pull-down phase may be detected because of the delay in detecting the pull-down pattern at an editing point. If encoding is performed with the wrong phase in this way, the material is processed with the wrong pull-down pattern until detection locks to the correct phase, causing problems such as awkward movement in the encoded image. Whether encoding by pull-down processing can be performed without problems therefore depends largely on the stability of the pull-down pattern of the material.

However, since it is difficult to make this determination in advance, the operator has had to judge suitability by watching the motion of the image after encoding, and if the encoding conditions were judged inappropriate, the conditions had to be changed and the encoding redone from the beginning.

DISCLOSURE OF THE INVENTION

The present invention has been made in view of the above problems, and its object is to provide an image encoding method and apparatus that make it possible, in two-pass variable rate encoding, to determine whether or not the encoding conditions for pull-down processing are appropriate before the encoding is executed.

In order to solve the above-mentioned problem, an image encoding method according to the present invention is a two-pass image encoding method that encodes input video material subjected to pull-down conversion on the basis of its pull-down pattern, and has a measuring step of measuring the pull-down pattern of the input video material and a determining step of judging the stability of the measured pull-down pattern.

Here, the determining step preferably has: a selecting step of comparing the measured pull-down pattern with each of the pull-down patterns based on a plurality of assumed initial phases and selecting the initial phase that gives the pull-down pattern closest to the measured pull-down pattern; an error calculating step of calculating the error of the pull-down pattern based on the selected initial phase with respect to the total number of frames of the video material; and a stability determining step of determining, before encoding and on the basis of the calculated error, the stability of the pull-down pattern based on the selected initial phase.

Further, in order to solve the above problem, an image encoding apparatus according to the present invention is a two-pass image encoding apparatus that encodes input video material subjected to pull-down conversion on the basis of its pull-down pattern, and has: measuring means for measuring the pull-down pattern of the input video material; selecting means for comparing the measured pull-down pattern with each of the pull-down patterns based on a plurality of assumed initial phases and selecting the initial phase that gives the pull-down pattern closest to the measured pull-down pattern; error calculating means for calculating the error of the pull-down pattern based on the selected initial phase with respect to the total number of frames of the video material; and stability judging means for judging, before encoding and on the basis of the calculated error, the stability of the pull-down pattern based on the selected initial phase.

According to the present invention described above, in two-pass variable rate encoding, the pull-down pattern of the input video material is measured, and whether the encoding conditions for pull-down processing are appropriate can be determined from the stability of the measured pull-down pattern before the encoding is executed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram showing a configuration example of a conventional video encoding system.

FIG. 2 is a diagram for explaining the GOP structure.

FIG. 3 is a flowchart showing a basic processing procedure of two-pass encoding in a conventional video encoding system.

FIGS. 4A to 4G are diagrams for explaining examples of processing of surplus bits in the bit allocation calculation.

FIG. 5 is a flowchart showing a specific example of the procedure of the bit allocation calculation process.

FIG. 6 is a diagram showing an example of a function for converting the sum of the encoding difficulty of each GOP, “gop_diff”, into the bit allocation amount “gop_target” per GOP at the time of encoding.

FIG. 7 is a diagram for explaining 2-3 pull-down conversion.

FIG. 8 is a state transition diagram of p_mode[k] in the pull-down.

FIG. 9 is a diagram showing the initial phases of a 2-3 pull-down pattern.

FIG. 10 is a diagram showing an example of a comparison result between the measured pull-down pattern and the assumed pull-down pattern.

FIG. 11 is a flowchart illustrating an example of an algorithm for determining the stability of the pull-down pattern.

FIG. 12 is a flowchart showing an example of an algorithm for judging the stability of a pull-down pattern, which is subsequent to FIG. 11.

FIG. 13 is a flowchart showing a basic processing procedure of the video encoder according to the present invention.

FIG. 14 is a diagram showing a configuration example of a video encoding system according to the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

Preferred embodiments of the present invention will be described below with reference to the drawings.

In the embodiment of the present invention, in two-pass image encoding in which input video material is encoded in accordance with pull-down processing, the pull-down pattern of the input video material is measured and the stability of the measured pull-down pattern is determined. The stability is determined as follows: the measured pull-down pattern is compared with each of the pull-down patterns based on a plurality of assumed initial phases; the initial phase that gives the pull-down pattern closest to the measured pull-down pattern is selected; the error of the pull-down pattern based on the selected initial phase is calculated with respect to the total number of frames of the video material; and, based on the calculated error, the stability of the pull-down pattern based on the selected initial phase is judged before encoding. Based on this result, it is determined whether the input video material is suitable for encoding by pull-down processing. Note that encoding in accordance with pull-down processing, or encoding by pull-down processing, here means raising the encoding efficiency mainly by not encoding the repeated fields of the pull-down pattern.

Hereinafter, an embodiment of the image encoding method and image encoding apparatus according to the present invention will be described, taking as an example the encoding of video material that has been subjected to the 2-3 pull-down conversion described above.

FIG. 9 shows the initial phase of the 2-3 pull-down pattern described above.

The 2-3 pull-down applied when converting frames from film material to a video signal is almost always regular. Therefore, it is assumed that the pull-down pattern of the material to be encoded is regular. Also, considering that the beginning of the material is specified as a chapter and that encoding there must always be performed with p_mode set to 2, the initial phase of the pull-down pattern can be represented by one of the seven patterns shown in this figure.

The pull-down patterns of these pull-downs are as follows. If the mode of the initial phase is p-start (0≤p-start≤6), the picture mode p-mode [k] of the k-th frame is expressed by the following equation. You.

    pd_cycle[7][10] = {{2,2,2,2,2,2,2,2,2,2},
                       {2,3,4,0,0,0,0,0,0,0},
                       {2,3,4,0,1,2,3,4,0,1},
                       {2,2,3,4,0,1,2,3,4,0},
                       {2,2,2,3,4,0,1,2,3,4},
                       {2,2,2,2,3,4,0,1,2,3},
                       {2,2,2,2,2,3,4,0,1,2}}

    p_mode[k] = pd_cycle[pd_start][k mod 5]      (when k < 5)
    p_mode[k] = pd_cycle[pd_start][k mod 5 + 5]  (when k ≥ 5)

Therefore, in the image coding method according to the embodiment of the present invention, the pull-down patterns based on the above seven assumed initial phases are compared with the measured pull-down pattern for the frame numbers k (0 ≤ k ≤ kend), and the one with the least error among the seven is selected. Hereinafter, a flow of the video encoding process as an embodiment of the image encoding method according to the present invention will be described.
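The table lookup above can be sketched in C as follows. This is a minimal illustration, not the patented implementation: the function name expected_p_mode and the macros PD_PHASES and PD_COLS are hypothetical, and the table values are copied as printed in the text.

```c
#include <assert.h>

#define PD_PHASES 7   /* hypothetical name: number of assumed initial phases */
#define PD_COLS   10  /* hypothetical name: 5 start-up + 5 repeating entries */

/* Table copied as printed in the text; rows are indexed by pd_start. */
static const int pd_cycle[PD_PHASES][PD_COLS] = {
    {2, 2, 2, 2, 2, 2, 2, 2, 2, 2},
    {2, 3, 4, 0, 0, 0, 0, 0, 0, 0},
    {2, 3, 4, 0, 1, 2, 3, 4, 0, 1},
    {2, 2, 3, 4, 0, 1, 2, 3, 4, 0},
    {2, 2, 2, 3, 4, 0, 1, 2, 3, 4},
    {2, 2, 2, 2, 3, 4, 0, 1, 2, 3},
    {2, 2, 2, 2, 2, 3, 4, 0, 1, 2},
};

/* Expected picture mode of frame k for the initial phase pd_start. */
static int expected_p_mode(int pd_start, long k)
{
    assert(pd_start >= 0 && pd_start < PD_PHASES);
    assert(k >= 0);
    if (k < 5)
        return pd_cycle[pd_start][k % 5];      /* start-up columns 0..4 */
    return pd_cycle[pd_start][k % 5 + 5];      /* repeating columns 5..9 */
}
```

The first five columns of each row cover frames 0 to 4, and the last five columns are reused cyclically for every later frame, which is why the index is k mod 5 + 5 once k reaches 5.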

(1) First, the ps-th pull-down pattern is compared with the measured pull-down pattern for frame k. If they match, pd_match[ps] is incremented by one. If they differ, the number of points at which the phase starts to shift (rising edges of the error) is counted as pd_errnb[ps].

(2) The above process (1) is repeated over the range 0 ≤ k ≤ kend.

(3) The above processes (1) and (2) are repeated within the range of 0≤ps≤6.

(4) Find the maximum value pd_match_max of pd_match[ps] in the range 0 ≤ ps ≤ 6. Let pd_error_max be pd_errnb[ps], the number of points at which the phase started to shift, for the ps that gives this maximum.

(5) The pattern matching rate pd_match_ratio and the error occurrence rate pd_error_ratio with respect to the measured total number of frames frame_nb are calculated as follows.

    pd_error_ratio = pd_error_max / frame_nb * 100 (%)
    pd_match_ratio = pd_match_max / frame_nb * 100 (%)

(6) If the pattern matching rate pd_match_ratio is less than or equal to the threshold PD_MATCH_LIMIT, or if the total number of frames frame_nb is greater than or equal to the threshold PD_FRAME_LIMIT and the error occurrence rate pd_error_ratio is greater than or equal to the threshold PD_ERROR_LIMIT, the pull-down pattern of the encoded material is determined to be not stable (that is, image quality problems are likely if encoding is performed with pull-down processing specified), and a warning message is displayed immediately after the operation.

(7) The operator sees the above warning and determines whether or not to continue processing under the original encoding conditions. To stop the processing, remove the pull-down processing from the encoding conditions and restart the encoding work from the beginning.

In the video encoding according to the embodiment of the present invention performed by the above-described procedure, the threshold value for determining the stability of the pull-down pattern of the encoded material is set as follows, for example.

PD_MATCH_LIMIT = 75 (%)

PD_ERROR_LIMIT = 0.5 (%)

PD_FRAME_LIMIT = 10000 (frame)

The initial values are set as follows.

pd_match[ps] = 0; (number of times p_mode was the same)

pd_errnb[ps] = 0; (number of times the phase started to shift)

(0 ≤ ps ≤ 6)

pd_error = pd_error_back = 0;

pd_match_max = 0

Next, the measurement of the pull-down pattern will be described.

FIG. 10 shows an example of the result of comparing the pull-down pattern of the material to be encoded, measured at the time of provisional encoding, with the pull-down patterns based on the above seven initial phases in the present embodiment. In this figure, error is set to 1 when the measurement result and the pull-down pattern differ. pd_match[ps] indicates the number of frames for which error is 0, that is, the number of times p_mode is the same. Also, pd_error[ps] is equal to the number of edges where error goes from 0 to 1.

In this example, it can be seen that the pull-down pattern based on the initial phase ps = 2 is closest to the measured pull-down pattern. Here,

total_frame = 13

Pattern matching rate pd_match_ratio = 84.5%

Error occurrence rate pd_error_ratio = 7.7%.

When this result is judged using the thresholds set above, the measured pull-down pattern of the material is determined to be stable: the pattern matching rate exceeds PD_MATCH_LIMIT, and because the total number of frames is below PD_FRAME_LIMIT, the error occurrence rate condition does not apply. In this way, the initial phase with the largest value of pd_match for the measured pull-down pattern is selected.
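The threshold judgment applied to this example can be sketched as below. This is only an illustration under stated assumptions: the function name pd_is_stable is hypothetical, and the comparisons follow the wording of step (6) (less than or equal to, greater than or equal to) with the example threshold values given above.

```c
#include <assert.h>
#include <stdbool.h>

/* Example threshold values from the text. */
#define PD_MATCH_LIMIT 75.0     /* percent */
#define PD_ERROR_LIMIT 0.5      /* percent */
#define PD_FRAME_LIMIT 10000L   /* frames  */

/* Hypothetical helper: returns true when the pattern is judged stable. */
static bool pd_is_stable(double pd_match_ratio, double pd_error_ratio,
                         long frame_nb)
{
    assert(frame_nb > 0);
    /* Too few frames agree with the best assumed phase. */
    bool low_match = pd_match_ratio <= PD_MATCH_LIMIT;
    /* The error rate is only considered for sufficiently long material. */
    bool many_errors = frame_nb >= PD_FRAME_LIMIT &&
                       pd_error_ratio >= PD_ERROR_LIMIT;
    return !(low_match || many_errors);
}
```

With the figures quoted for FIG. 10 (84.5%, 7.7%, 13 frames), this sketch returns true, because the error rate test is skipped for material shorter than PD_FRAME_LIMIT.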

FIGS. 11 and 12 show a specific example of an algorithm for judging the stability of a pull-down pattern in the video encoding processing as an embodiment of the present invention described above.

In step S1 of FIG. 11, pd_match[ps], the number of times p_mode is the same, and pd_errnb[ps] are initialized to zero. Here, ps, which represents the initial phase of the pull-down pattern, satisfies 0 ≤ ps ≤ 6 because the initial phase is limited to seven patterns as described above. In addition, pd_error, pd_error_back, pd_match_max, and ps are also initialized to 0.

Next, in step S2, the video frame number k is set to 0.

Next, in step S3, it is determined whether k < 5. If k < 5, the process proceeds to step S4. Otherwise, the process proceeds to step S5. This is because, in the initial phases of the pull-down pattern shown in FIG. 9, up to the first four frames are singular points.

Then, in step S4, pm = pd_cycle[ps][k mod 5] is set, and in step S5, pm = pd_cycle[ps][k mod 5 + 5] is set.

Next, in step S6, pd_error_back = pd_error is set. Then, in step S7, it is determined whether pm == p_mode[k]. If this condition is satisfied, the process proceeds to step S8; if not, it proceeds to step S9. Here, as in the C language, “=” means assignment, while “==” means a test of whether the two sides are equal.

Next, in step S8, the value of pd_match[ps] is incremented by 1 and the value of pd_error is set to 0.

On the other hand, in step S9, the value of pd_error is set to 1.

Then, in step S10, it is determined whether pd_error == 1 and pd_error_back == 0. If this condition is satisfied, the process proceeds to step S11, and the value of pd_errnb[ps] is incremented by one. If the condition of step S10 is not satisfied, the process skips step S11 and proceeds to step S12.

Next, in step S12, the frame number k is incremented by one.

In step S13, it is determined whether the frame number k has exceeded kend. If this condition is not satisfied, that is, if the last frame has not been reached, the process returns to step S3 and the above procedure is repeated. If this condition is satisfied, that is, if the last frame has been processed, the process proceeds to step S14, in which ps is incremented by 1 so that the same processing is performed for the pull-down pattern based on the next initial phase.

In step S15, it is determined whether ps > 6. If this condition is not satisfied, that is, if the processing has not been completed for all the pull-down patterns based on the seven initial phases described above, the process returns to step S2 and the above procedure is repeated. If this condition is satisfied, that is, if the processing has been completed for all seven pull-down patterns, the process proceeds to step S16 in FIG. 12, where ps = 0 is set.
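The counting loop of steps S1 to S15 might be written in C roughly as follows. This is a sketch, not the patented implementation: the function name pd_count and the macro PD_PHASES are hypothetical, the measured pattern is assumed to be an int array of length frame_nb = kend + 1, and pd_error is reset for each phase, which is an assumption the flowchart does not state.

```c
#include <assert.h>
#include <stddef.h>

#define PD_PHASES 7   /* hypothetical name: number of assumed initial phases */

/* Table copied as printed in the text. */
static const int pd_cycle[PD_PHASES][10] = {
    {2, 2, 2, 2, 2, 2, 2, 2, 2, 2},
    {2, 3, 4, 0, 0, 0, 0, 0, 0, 0},
    {2, 3, 4, 0, 1, 2, 3, 4, 0, 1},
    {2, 2, 3, 4, 0, 1, 2, 3, 4, 0},
    {2, 2, 2, 3, 4, 0, 1, 2, 3, 4},
    {2, 2, 2, 2, 3, 4, 0, 1, 2, 3},
    {2, 2, 2, 2, 2, 3, 4, 0, 1, 2},
};

/* Fill pd_match[ps] and pd_errnb[ps] for every assumed initial phase. */
static void pd_count(const int *p_mode, size_t frame_nb,
                     long pd_match[PD_PHASES], long pd_errnb[PD_PHASES])
{
    assert(p_mode != NULL);
    for (int ps = 0; ps < PD_PHASES; ps++) {            /* S14, S15 */
        int pd_error = 0;      /* assumption: reset for each phase */
        pd_match[ps] = 0;                               /* S1 */
        pd_errnb[ps] = 0;
        for (size_t k = 0; k < frame_nb; k++) {         /* S2, S12, S13 */
            int pm = (k < 5) ? pd_cycle[ps][k % 5]      /* S3, S4 */
                             : pd_cycle[ps][k % 5 + 5]; /* S5 */
            int pd_error_back = pd_error;               /* S6 */
            if (pm == p_mode[k]) {                      /* S7 */
                pd_match[ps]++;                         /* S8 */
                pd_error = 0;
            } else {
                pd_error = 1;                           /* S9 */
            }
            if (pd_error == 1 && pd_error_back == 0)    /* S10 */
                pd_errnb[ps]++;                         /* S11: rising edge */
        }
    }
}
```

Counting only the rising edges of the error means that a long run of mismatched frames contributes one to pd_errnb[ps], so the counter reflects how often the phase slips rather than how long it stays wrong.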

In step S17, it is determined whether pd_match_max is smaller than pd_match[ps]. If so, the process proceeds to step S18, where pd_match_max = pd_match[ps] and pd_error_max = pd_errnb[ps] are set. If the condition of step S17 is not satisfied, step S18 is skipped and the process proceeds to step S19. In step S19, ps is incremented by 1 so that the same processing is performed for the pull-down pattern based on the next initial phase.

In step S20, it is determined whether ps > 6, that is, whether the processing has been completed for all seven pull-down patterns based on the initial phases. If this condition is not satisfied, the process returns to step S17 and the above procedure is repeated. If the condition is satisfied, the process proceeds to step S21, where the error occurrence rate pd_error_ratio and the pattern matching rate pd_match_ratio are calculated.

In step S22, it is determined whether the pattern matching rate pd_match_ratio is smaller than the threshold PD_MATCH_LIMIT, or whether the total number of measured frames frame_nb is equal to or greater than the threshold PD_FRAME_LIMIT and the error occurrence rate pd_error_ratio is greater than the threshold PD_ERROR_LIMIT. If this condition is satisfied, the measured pull-down pattern is determined to be not stable, and pd_stable = 0 is set in step S23. If this condition is not satisfied, the measured pull-down pattern is determined to be stable, and pd_stable = 1 is set in step S24.
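Steps S16 to S21 could be sketched in C as below. The struct pd_result, its field best_ps, and the function name pd_select are hypothetical names introduced for illustration; the flowchart itself keeps only pd_match_max and pd_error_max.

```c
#include <assert.h>

#define PD_PHASES 7   /* hypothetical name: number of assumed initial phases */

/* Hypothetical container for the outcome of steps S16 to S21. */
struct pd_result {
    int    best_ps;          /* phase with the most matching frames */
    double pd_match_ratio;   /* percent */
    double pd_error_ratio;   /* percent */
};

static struct pd_result pd_select(const long pd_match[PD_PHASES],
                                  const long pd_errnb[PD_PHASES],
                                  long frame_nb)
{
    assert(frame_nb > 0);
    long pd_match_max = 0;
    long pd_error_max = 0;
    int best_ps = 0;
    for (int ps = 0; ps < PD_PHASES; ps++) {       /* S16, S19, S20 */
        if (pd_match_max < pd_match[ps]) {         /* S17 */
            pd_match_max = pd_match[ps];           /* S18 */
            pd_error_max = pd_errnb[ps];
            best_ps = ps;
        }
    }
    struct pd_result r;
    r.best_ps = best_ps;
    /* S21: both rates are percentages of the measured frame count. */
    r.pd_match_ratio = (double)pd_match_max / (double)frame_nb * 100.0;
    r.pd_error_ratio = (double)pd_error_max / (double)frame_nb * 100.0;
    return r;
}
```

Because the comparison in step S17 is strict, the lowest-numbered phase wins when two phases tie on pd_match[ps].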

FIG. 13 shows a basic processing procedure of the video encoder according to the present invention, including the calculation of the stability of the pull-down pattern described above.

First, in step S31, the encoding conditions are input, and in step S32, the encoding difficulty (Difficulty) parameters are input. In step S33, the stability of the pull-down pattern of the material is calculated by provisional encoding based on these parameters.

In step S34, the stability of the calculated pull-down pattern is judged. If the pattern is judged not to be stable, a warning is displayed in step S35, and in step S36 the operator decides whether to continue encoding under the initial conditions. If it is decided in step S36 that processing is to continue under the initial encoding conditions, the process proceeds to step S37. Likewise, if the calculated pull-down pattern is judged to be stable in step S34, the process proceeds to step S37.

If it is determined in step S36 that encoding is not to be continued, the video encoding process is terminated by performing encoding without pull-down processing in step S44.

Next, in step S37, detection / processing of a scene change is performed.

In step S38, chapter (CHAPTER) boundary processing for converting the boundary into a sequence of P-pictures and I-pictures is performed. In step S39, interpolation and correction of the encoding difficulty (Difficulty) are performed. In step S40, the target number of bits is calculated. In step S41, an address (ADDRESS) is calculated. In step S42, the target rate tg_rate is calculated.

The processing in each of the steps S37 to S41 can be performed in the same manner as the processing in the conventional processing procedure shown in FIG.

Then, in step S43, a control file for the encoder is created, and the video encoder processing ends.

Next, an image encoding device according to the present invention will be described.

FIG. 14 shows a configuration example of a video encoding system as an embodiment of the image encoding device according to the present invention.

This video encoding system applies the image encoding method according to the present invention described above, and is used to compress and encode video information for DVD (Digital Versatile Disk, Digital Video Disk) and to perform authoring or the like. Its basic configuration can be almost the same as that of the conventional video encoding system shown in FIG.

The main controller 11 is composed of a computer assigned to the video encoding system, performs data communication with a supervisor 3 connected via a network 2, and controls the operation of the entire video encoding system.

More specifically, under the management of a graphical user interface (GUI) section 14, the main controller 11 receives control from the supervisor 3 and accepts operations by an operator (not shown), and controls the operation of the encoder 12 and the video tape recorder (VTR) 10 through the bit assignment section 15, the encoder control section 16, and the VTR control section 17 managed by the GUI section 14. Accordingly, the main controller 11 encodes the material to be processed in accordance with the encoding conditions notified from the supervisor 3 and notifies the supervisor 3 of the processing result. Further, the main controller 11 can accept settings made by the operator through the GUI section 14 and change the detailed encoding conditions.

Specifically, the GUI section 14 of the main controller 11 manages three programs: the bit allocation program “BIT_ASSIGN” of the bit assignment section 15, the encoder control program “CTRL_ENC” of the encoder control section 16, and the VTR control program of the VTR control section 17.

Further, the bit assignment section 15 determines the conditions of the encoding process in units of frames according to the encode file “v.enc” notified from the supervisor 3, and notifies the encoder control section 16 of control data based on these conditions in the file format “CTL file”.

At this time, the bit assignment section 15 sets the bit allocation for the encoding process and changes the set conditions according to the operator's operations. In addition, when the compressed video data D2 is recorded on RAID 4 from the encoder 12 via SCSI or the like, the bit assignment section 15 notifies the supervisor 3 of the address data “v.adr” on RAID 4 together with information “vxxx.aui”, such as the data amount, necessary for the multiplexing process in the subsequent stage.

The encoder control section 16 controls the operation of the encoder 12 via Ethernet (ETHER) or the like according to the control file “CTL file” notified from the bit assignment section 15. Further, the encoder control section 16 notifies the bit assignment section 15, in frame units, of the encoding difficulty “difficulty” required for the encoding process, and also notifies the bit assignment section 15 of the address data “v.adr” on RAID 4 at which the compressed video data D2 is recorded and of the data “vxxx.aui” required for the subsequent multiplexing process.

The VTR control section 17 controls the operation of the video tape recorder (VTR) 10 via RS-422 (9-pin remote) or the like according to the edit list notified from the supervisor 3, and plays back the desired material to be edited.

The video tape recorder (VTR) 10 reproduces the video data D1 recorded on the magnetic tape in accordance with the edit list notified from the supervisor 3 via the main controller 11, and outputs “SDI”, “REF V”, and “TIME CODE” for the video data D1 to be processed to the encoder 12. A digital VTR is usually used as the VTR 10. The encoder 12 switches its operation according to the conditions notified from the supervisor 3 via the main controller 11, and compression-encodes the video data D1 output from the VTR 10 by the MPEG (Moving Picture Experts Group) method.

At this time, the encoder 12 notifies the main controller 11 of the result of the encoding process, and the main controller 11 controls the encoding conditions of the data compression and the amount of generated bits. As a result, the main controller 11 can grasp the amount of bits generated by the data compression in frame units.

Also, at the time of the pre-encode for condition setting in two-pass encoding (the provisional encoding), the encoder 12 simply compresses the video data from the VTR 10 and notifies the main controller 11 of the processing result only. At the time of the final data compression (the main encoding), it records the compressed video data D2 on RAID 4 and also notifies the main controller 11 of the address at which the data was recorded, the data amount, and the like.

The monitor device 13 is configured to monitor the video data D2 compressed by the encoder 12. With this monitor device 13, the video encoding system allows a so-called preview, in which the operator checks the result of the data compression as needed. The operator can operate the main controller 11 based on the preview result to change the encoding conditions in detail.

According to the image encoding method and the image encoding apparatus described above, when pull-down processing is specified as a condition of two-pass variable rate encoding, the suitability of the material to be encoded can be determined before the main encoding is executed, so the man-hours for work such as reviewing the encoding conditions during DVD authoring or the like can be reduced. In other words, in two-pass variable rate encoding with 2-3 pull-down, the stability of the pull-down pattern is determined at the time of pre-encoding (provisional encoding), along with the measurement of the encoding difficulty indicating the complexity of the material, so that the suitability of the material is determined before the main encoding and the encoding conditions can be reviewed promptly at an early stage.

Claims

1. A two-pass image encoding method for encoding input video material according to pull-down processing, comprising:

a measuring step of measuring a pull-down pattern of the input video material; and

a determining step of determining the stability of the measured pull-down pattern.

2. The image encoding method according to claim 1, further comprising a display step of displaying whether the input video material is suitable for encoding by pull-down processing, based on a result of the determination in the determining step.

3. The image encoding method according to claim 1, wherein the determining step comprises:

a selecting step of comparing the measured pull-down pattern with each of the pull-down patterns based on a plurality of assumed initial phases, and selecting the initial phase that gives the pull-down pattern closest to the measured pull-down pattern among the pull-down patterns;

an error calculating step of calculating an error of the pull-down pattern based on the selected initial phase with respect to the total number of frames of the video material; and

a stability determining step of determining, before encoding, the stability of the pull-down pattern based on the selected initial phase, on the basis of the calculated error.

4. The image encoding method according to claim 3, wherein, in the stability determining step, the matching rate and the error occurrence rate between the measured pull-down pattern of the video material and the pull-down pattern based on the selected initial phase are each compared with a threshold value to determine the stability of encoding by pull-down processing of the video material.

5. The image encoding method according to claim 1, wherein the encoding according to the pull-down processing is performed by not encoding repeated fields according to the measured pull-down pattern.

6. A two-pass image encoding device that encodes input video material based on a pull-down pattern, comprising:

measuring means for measuring the pull-down pattern of the input video material;

selecting means for comparing the measured pull-down pattern with each of the pull-down patterns based on a plurality of assumed initial phases, and selecting the initial phase that gives the pull-down pattern closest to the measured pull-down pattern among the pull-down patterns;

error calculating means for calculating an error of the pull-down pattern based on the selected initial phase with respect to the total number of frames of the video material; and

stability determining means for determining, before encoding, the stability of the pull-down pattern based on the selected initial phase, on the basis of the calculated error.

7. The image encoding device according to claim 6, further comprising display means for displaying whether the input video material is suitable for encoding by pull-down processing, based on the determination result of the stability determining means.

8. The image encoding device according to claim 6, wherein the stability determining means compares the matching rate and the error occurrence rate between the measured pull-down pattern of the video material and the pull-down pattern based on the selected initial phase each with a threshold value to determine the stability of encoding by pull-down processing of the video material.