WO2001019082A9

WO2001019082A9 - Converting non-temporal based compressed image data to temporal based compressed image data

Info

Publication number: WO2001019082A9
Application number: PCT/US2000/022892
Authority: WO
Inventors: Daniel J Holmes; Christopher J Prinos
Original assignee: Media 100 Inc
Priority date: 1999-09-07
Filing date: 2000-08-18
Publication date: 2002-06-27
Also published as: CA2384166A1; WO2001019082A1; EP1221259A1; AU6921200A

Abstract

The application relates to creating an MPEG file from non-temporal based compressed image data that are stored on randomly accessible storage and edited on a nonlinear video editor to create a video program (fig.1, 10). The editor generates information indicating the complexity of portions of image data simultaneously with storing and/or editing of the image data (fig.1, 10). The information is accessed to determined MPEG compression quality levels for groups of pictures of the image data based upon information (fig.1, 12). The groups of pictures of non-temporal based compressed image data are converted into MPEG compressed data using the respective quality levels (fig.1, 12). Also disclosed are displaying a representation including an indication of complexity of the portions; concurrently with compressing of the image data, decompressing the image data of the compressed file and playing it on a monitor to permit real time viewing of the quality of compression (fig.1, 14); and editing an MPEG compressed file by accessing the non-temporal based compressed image data to make changes without creating artifacts at boundaries of the portions of image data in the sequence in the MPEG file (fig.1, 12).

Description

CONVERTING NON-TEMPORAL BASED COMPRESSED IMAGE DATA TO TEMPORAL BASED COMPRESSED IMAGE DATA

Background of the Invention

The invention relates to converting compressed data from an image-based compression type to a time and image based compression type.

When a video program is created at a nonlinear video editor, analog video source materials are often digitized and stored on disc for later random access for use in creating the video program. The Motion Joint Photographic Experts Group (MJPEG) compression technique is often used because it is an image based compression. I.e., the data are compressed on a field-by-field basis, without consideration of temporal changes, which permits making edit cuts at field boundaries without artifacts or loss of content. The Motion Pictures Expert Group (MPEG) compression technique, on the other hand, is a time and image based compression. MPEG employs (1) compressed frames based on prior frame content, (2) compressed frames based on subsequent frame content and (3) compressed frames based solely on present frame content . The latter frames are known as I frames, and a new scene (i.e., clip) must begin with an I frame. While the use of temporal changes in MPEG compression permits more compression for a given quality, the edit decisions must be made in a way such that new segments begin with I frames to avoid artifacts . Video and other media are often compressed into

MPEG format "files" that are stored, e.g., on a digital versatile disc (DVD), or transmitted, e.g., over the Internet, and then decompressed and played back at a desired time and location. With DVDs, the data volume of the DVD and the length (time) of the program yield an average bit rate for the decompressed data, which rate cannot be exceeded in order to store the entire program on the DVD. MPEG files sent over networks also have an associated maximum bit rate determined by the physical medium capabilities, and the maximum desired transmission time for a given file.

When creating MPEG files, it is common to identify the most complex portion of the source material, in terms of spatial and temporal changes, and select a compression quality level that provides acceptable quality for the complex portion. The same compression quality level is then used in creating the entire MPEG file, even in the compression of portions with very little spatial and temporal changes. With constant compression settings, portions with more spatial and temporal complexity will have higher amounts of data, and thus a higher bit rate, than portions with less complexity. Depending upon the codec employed, different parameters are used for the compression settings. In general, the parameters include a maximum bit rate and quality levels. The maximum bit rate guarantees that there will not be a condition that overloads the system bandwidth. The quality level can be expressed as a number (e.g., on a scale of 1 to 112) that determines the quantization factors used by the codec. The quantization factors determine the extent of compression. In general, the more the video data are compressed, the poorer the quality of the image, while the less the video data are compressed, the better the quality of the image. It is possible to adjust the compression settings during a program, and even within a video clip, so as to make efficient use of bandwidth but still achieve minimum quality levels.

"Spatial compressing" means a type of compression that is not based on temporal changes; it thus includes JPEG compression and MPEG I -frame only compression among other types.

Summary of the Invention In one aspect, the invention features, in general, creating an MPEG file from non-temporal based compressed image data that are stored on randomly accessible storage and edited on a nonlinear video editor to create a video program. The nonlinear video editor generates information indicating the complexity of portions of image data simultaneously with storing and/or editing of the image data. The information is accessed to determine MPEG compression quality levels for groups of pictures of the image data based upon this information. The groups of pictures of the non-temporal based compressed image data are converted into MPEG compressed data using the respective quality levels.

In another aspect, the invention features, in general, creating a compressed file of sequences of images in a nonlinear video and displaying a representation of the sequence of portions during editing, the representation including an indication of complexity of the portions. The representation can be, e.g., usefully used to determine the more complex and less complex portions of the video program being created. Preferably the compressing is variable bit rate compressing, and the display of the complexity of the data includes display of the actual bit rates associated with the sequence of portions in the compressed file.

In another aspect, the invention features, in general, creating a compressed file of sequences of images in a nonlinear video editor and, concurrently with compressing of the image data, decompressing the image data of the compressed file and playing it on a monitor to permit real time viewing of the quality of compression. In preferred embodiments the editor can receive inputs from a user interface to adjust quality level during the compressing.

In a further aspect, the invention features, in general, creating an MPEG file from non-temporal based compressed image data in a nonlinear video editor and thereafter editing the MPEG compressed file by accessing the non-temporal based compressed image data to make changes without creating artifacts at boundaries of the portions of image data in the sequence in the MPEG file.

Preferred embodiments of the invention may include one or more of the following features. In preferred embodiments, the non-temporal based compressed image data are spatially compressed data (e.g., JPEG data) that are compressed from source uncompressed image data. The uncompressed data are processed at an MPEG encoder to generate information on complexity simultaneously with JPEG compressing and storing of the JPEG data. Other advantages and features of the invention will be apparent from the following description of a preferred embodiment thereof and from the claims.

Description of Drawings Fig. 1 is a block diagram of a nonlinear video editing system with capabilities to convert compressed data having a non-MPEG compression format to MPEG compressed data.

Fig. 2 is a more detailed block diagram of some of the components of the Fig. 1 system.

Fig. 3 shows a time line for a video program displayed on a monitor of the Fig. 1 system.

Fig. 4 shows a time line displaying average and actual bit rates for the Fig. 3 program.

Description of Particular Embodiments Referring to Fig. 1, nonlinear video editing system 10 includes host computer 12 having monitor 14, user input devices 16 (e.g., keyboard and mouse), video editing software 18, video editing peripheral boards 20, 22 in expansion slots of computer 12, video tape recorder (VTR) 24, DVD unit 26, and mass storage disc 28. Computer 12 is also connected to network 29 (e.g., a local area network that has a connection to the Internet) . Referring to Fig. 2, host computer 12 includes CPU

30, PCI bus 32, and input/output ports 34 for connection to monitor 14 and user input devices 16. Peripheral boards 20, 22 are connected to PCI bus 32. Boards 20, 22 are also connected directly to each other via "over-the- top" connector 36, which conveys parallel video data (per ITU-R BT-656 standard) between boards 20, 22. Board 20 includes JPEG coder/decoder (codec) 38, and board 22 includes MPEG codec 42. Board 20 also includes input/output (I/O) port 46 for source video and multiplexer 40, which receives as inputs either decompressed video from codec 38 or uncompressed video from video I/O port 46. Board 22 also includes video I/O port 44. Boards 20, 22 each include busmaster circuits 48, 50 for transferring data to and from PCI bus 32. Board 20 also has audio I/O ports (not shown) . Boards 20, 22 also include other video processing circuits, buffers, and controllers (not shown) to carry out various nonlinear video editing and data transfer functions.

Referring to Fig. 3, time line 50 has solid vertical lines 52 to designate boundaries between video clips 54 that together make up a video program. Time line 50 differs from prior time lines in that it graphically shows the temporal and spatial complexity of the video material that has been compressed by the density of light vertical lines 56, which provide a shading. The extent of temporal complexity is determined by the number of I- frames per unit time. Typically there are two I -frames per second, and the B- and P- frames between the I -frames indicate differences from the last I-frame. Where the video is changing quickly, the codec inserts I-frames at greater than a two per second rate . The extent of spatial complexity is determined by the magnitude of the I-frame values. On time line 50, weighted values of the temporal and spatial complexity are combined to determine the density of vertical lines 56 so that darker shading (increased density) indicates relatively complex video (with spatial and temporal change) whereas a relative absence of shading lines 56 indicates relatively non- changing video (e.g., a black screen or blue sky) . Referring to Fig. 4, time line 60 shows average bit rate 62 (dashed line) , for the entire program, and actual bit rate 64 (solid line) , which varies depending on the complexity of the video material. Thus, the higher bit rate portions of actual bit rate 64 correspond to the portions of relatively complex video in Fig. 3, and the lower bit rate portions of actual bit rate 64 correspond to portions of relatively non-changing video in Fig. 3. In operation, nonlinear video editing system 10 receives video at input/output port 46. The source video can be in analog or digital form and can be compressed digital video. If the source material is analog video, e.g., from VTR 24, it is first digitized. Digital video data are then compressed at JPEG codec 38 prior to transfer by bus master circuit 48 to mass storage disc 28. If the source data are compressed, they can be decompressed (via an appropriate decoder) and then compressed at JPEG codec 38. The MJPEG compressed data are compressed at JPEG codec 38 on a field-by-field basis, with the Q factors used to compress each field being adjusted based upon the volume of data that resulted from compression of one or more prior fields. The data stored for each MJPEG compressed field include MJPEG data words, the Q factors that were used, a data count, target bit rate, and actual bit rate. U.S. Patent No. 5,909,250, which is hereby incorporated by reference, describes a video editing system with adaptive JPEG compression with storage of Q factors and a data count . After the source video material have been stored on storage disc 28, the user is ready to create a video program.

At the same time that source material is being input and compressed into MJPEG compressed data, uncompressed video data from I/O port 46 are also directed through multiplexer 40 and over-the-top connector 36 to MPEG codec 42. The uncompressed data are processed in MPEG codec 42, and the actual bit rate for a given setting, the number of I-frames per unit time, and the magnitudes of the values of the I-frames are monitored and correlated with the video clip being stored on disk 28. Also, any time that MJPEG compressed data are accessed and decompressed during the editing and creation of a video program, the decompressed video are directed through multiplexer 40 and over-the-top connector 36 to MPEG codec 42. Similarly, any time that video clips are modified, or new video clips created, e.g., by adding titles or effects, the noncompressed video data are directed to MPEG codec 42. The noncompressed data are processed in MPEG codec 42, for further processing and gathering of information regarding the complexity of the content of the source (or processed) video. This information is correlated with the respective video clip stored on disk 28.

The user creates a video program from the MJPEG compressed video by selecting segments (also referred to as video "clips") of the stored material to be employed in the final program. As the user assembles the program, time line 50 shows the video clips 54 that have been selected for inclusion in the program. In this process, the user can trim the video clips 54 (by deleting portions from the beginning or end) and move the location of a clip 54 on the time line 50. The user can also create effects using one clip or a combination of clips, and then store the processed, compressed video. Alternatively, the settings for the effects processors can be determined during editing, with one or more MJPEG compressed video streams being processed at a video effects circuit at the time of playback. The video clips can be trimmed at field boundaries without loss of content or artifacts, as the MJPEG compressed fields do not depend on prior or subsequent frames for content . The user can then play back the created program for viewing on monitor 14 or storage on a tape at VTR 24. During playback, CPU 30 randomly accesses the segments in proper sequence (as shown on time line 50, Fig. 3) and controls decompression at JPEG codec 38. The output of JPEG decoder 38 is parallel digital video according the ITU-R BT-656 standard.

The user can also cause the video program to be converted to MPEG format , for storage on a DVD at DVD unit 26 or for transmission over the Internet via network 29.

When converting the video program from MJPEG compressed data to an MPEG compressed file, system 10 efficiently and automatically varies the MPEG compression so as to use higher bit rates for the image data having complex temporal and spatial changes and lower bit rates for image data having smaller changes at the same time that desired image quality is maintained for all portions of the file and storage volume and bit rate constraints for the storage and/or transport mediums are met. In doing this, system 10 uses information as to the complexity of the image that had already been compiled from processing of the source uncompressed video during input and the processing of decompressed video during playing during editing. This information is graphically displayed on the time line in Figs. 3 and 4.

In determining the settings to be used by MPEG codec 42, the transmission and/or storage constraints for the MPEG file to be created need to be considered. These include:

1. program time (PT) ,

2. maximum file size (MFS) ,

3. average bit rate (MABR) , and

4. max bit rate for bursts (MBRB) . The program time is determined by the video program that the user has created and wishes to convert to an MPEG file. The maximum file size is particularly important for DVDs. E.g., a DVD5 disc has a 4.7 Gbyte maximum file size, and a DVD9 disc has 8.45 Gbyt maximum file size. For a DVD, the MABR is taken has the MFS divided by PT, and the MBRB is determined by the DVD player. For transmission over networks, or other uses for the MPEG file, the MABR and MBRB may be set by physical limitations or other constraints.

In determining how to adjust the quality of MPEG compression, clips 84 are first broken down into smaller segments having similar image complexity. The smaller segments of JPEG fields are referred to as groups of fields (GOFs) . Each clip is a group of GOFs (GOGOF) . In the MPEG program, the corresponding MPEG frames are referred to as groups of pictures (GOPs) , and each clip is thus a group of GOPs (GOGOPs) .

The same MPEG quality level (QL) is used in compressing each GOF into a GOP. Each GOP consists of MPEG I, B and P frames. The selected quality level thus depends on the complexity for all image data in a program and the MABR, so that the MABR will be maintained. In addition, the MBRB cannot be exceeded for any particular GOP.

Prior to playback of the MJPEG program, CPU 30 accesses the MPEG compression information stored for each clip and the transmission and/or storage constraints (both as noted above), and determines the GOP boundaries. If there are any clips that do not have stored compression information, these clips are decompressed and passed through MPEG codec 42 to obtain such information. Monitor 14 then displays time line 50, with shading lines 56 indicating image complexity, and time line 60. Shading lines 56 can also be optionally displayed throughout the editing process. Then, the video program that has been created on system 10 is played by sequentially accessing the stored MJPEG fields according to time line 50, decoding the stored MJPEG fields at JPEG codec 38 into parallel, uncompressed video data passed over connector 36, and encoding the parallel uncompressed video data into MPEG I, B, and P frames at MPEG encoder 42. The MPEG encoded output of MPEG codec 42 is then sent to DVD unit 26, for writing on a DVD, or is stored in mass storage 28 or other storage (not shown) for later access . The MPEG encoded output of codec 42 is also optionally simultaneously decoded at codec 42 and displayed on monitor 14 to permit the user to view the quality of the video after MPEG encoding/decoding. Based upon the viewed output, the user can manually adjust MPEG compression settings on the fly to see how the adjustments affect image quality.

System 10 thus can be used to automatically determine the highest quality for a given program length and maximum MPEG file size. Alternatively, for a given quality level, system 10 can minimize the file size. In addition, system 10 permits conversion to MPEG with variable compression levels in less than two passes, thus avoiding the need for multiple passes. System 10 can execute the conversion in close to real time, and without the need for operator intervention; however system 10 additionally provides operator monitoring and immediate MPEG compression adjustment, with immediate ability to see the change on the end product. This thus permits the operator to know that maximum quality or minimum bit size are being achieved for given program and file/transmission constraints, and permits the operator to verify the quality at the same time and easily make any adjustments that are desired. If a user decides to edit the video program after conversion to an MPEG file, he can simply do so by deleting, replacing or adding MPEG compressed data for a GOF or GOGOFs . This can be done at a field boundary by going back to the JPEG fields and extending, if necessary, the edited portion to adjacent portions of the raw material so that a new I frame is used and a complete sequence of B and P frames is used. Thus, some of the material that is being retained will be replaced by converting the JPEG source material to MPEG compressed data, thereby avoiding artifacts that would otherwise be caused and also avoiding the need to MPEG compress the entire program. Other embodiments of the invention are within the scope of the appended claims. E.g., instead of JPEG compression, other spatial compression techniques, such as MPEG I-frame only compression, can be used. In this case, codec 38 would be an MPEG codec.

Also, in addition to compressing video program, system 10 can be used to compress various forms of multimedia data and other data such as Web pages.

Also, information on the complexity of the source video is automatically collected during MJPEG compression at JPEG codec 38 and stored with the JPEG compressed fields, and this information can be accessed in determining the appropriate settings for the MPEG compression. This information is used by CPU 30 in controlling MPEG encoder 42.

Other embodiments of the invention will be apparent from the following description of a preferred embodiment thereof and from the claims .

What is claimed is:

Claims

1. A method of creating an MPEG file from non- temporal based compressed image data comprising storing non-temporal based compressed image data on randomly accessible storage, editing said image data on a nonlinear video editor to create a video program, generating information indicating the complexity of portions of said image data simultaneously with said storing and/or said editing, accessing said information to determine MPEG compression quality levels for groups of pictures of said image data based upon said information, and converting said groups of pictures of said non- temporal based compressed image data into MPEG compressed data using said respective quality levels.

2. The method of claim 1 wherein said non-temporal based compressed image data are spatially compressed data, and further comprising compressing source uncompressed image data at a spatial compression encoder to obtain said spatially compressed data.

3. The method of claim 1 wherein said non-temporal based compressed image data are JPEG compressed data, and further comprising JPEG compressing source uncompressed image data at a JPEG encoder to obtain said compressed JPEG compressed data.

4. The method of claim 3 wherein said generating comprises processing said uncompressed data at an MPEG encoder to generate said information simultaneously with said JPEG compressing and storing of said JPEG data.

5. A method of creating a compressed file of sequences of images in a nonlinear video editor comprising storing sequential image data, storing information indicating the complexity of said sequential image data, determining the identity and sequence of portions of said sequential image data to be included in said compressed file, displaying a representation of said sequence of portions, said representation including an indication of complexity of said portions, and compressing said sequence of portions of said sequential image data to create said compressed file.

6. The method of claim 5 wherein said compressing is MPEG compressing.

7. The method of claim 5, wherein said compressing is variable bit rate compressing, and further comprising displaying the actual bit rates associated with said sequence of portions in said compressed file.

8. A method of creating a compressed file of sequences of images in a nonlinear video editor comprising storing sequential image data, determining the identity and sequence of portions of said sequential image data to be included in said compressed file, compressing said sequence of portions of said sequential image data to create said compressed file, and concurrently with said compressing, decompressing said sequential image data of said compressed file and playing it on a monitor to permit real time viewing of the quality of compression.

9. The method of claim 8 wherein said compressing is MPEG compressing.

10. The method of claim 8 wherein said compressing is variable quality level compressing, and further comprising receiving inputs from a user interface to adjust quality level during said compressing.

11. A method of creating an MPEG file from non- temporal based compressed image data in a nonlinear video editor comprising storing non-temporal based compressed image data, determining the identity and sequence of portions of said non-temporal based compressed image data to be included in said MPEG file, converting said portions of said sequence of portions of said non-temporal based compressed image data into MPEG compressed data of said MPEG file, storing said MPEG file, determining changes to be made to said identity and sequence of portions in said MPEG file, and accessing said non-temporal based compressed image data to make said changes without creating artifacts at boundaries of said portions of image data in said sequence in said MPEG file.

12. The method of claim 11 wherein said non- temporal based compressed image data are JPEG compressed data, and further comprising JPEG compressing source uncompressed image data at a JPEG encoder to obtain said JPEG compressed data.

13. The method of claim 1 further comprising displaying a representation of portions of said video program, said representation including an indication of complexity of said portions.

14. The method of claim 11 further comprising displaying a representation of said sequence of portions, said representation including an indication of complexity of said portions.

15. The method of claim 1 or 11, further comprising, concurrently with said converting, decompressing said MPEG compressed data and playing it on a monitor to permit real time viewing of the quality of compression.

16. The method of claim 5, further comprising, concurrently with said compressing, decompressing said compressed file and playing it on a monitor to permit real time viewing of the quality of compression.

17. Apparatus for creating an MPEG file from non- temporal based compressed image data comprising randomly accessible storage for said non-temporal based compressed image data, a nonlinear video editor to create a video program from said image data, said nonlinear video editor generating information indicating the complexity of portions of said image data simultaneously with storing and/or editing, said nonlinear video editor accessing said information to determine MPEG compression quality levels for groups of pictures of said image data based upon said information, and a temporal based compression coder that converts said groups of pictures of said non-temporal based compressed image data into MPEG compressed data using said respective quality levels.

18. The apparatus of claim 17 wherein said non- temporal based compressed image data are JPEG compressed data, and further comprising a JPEG encoder that compresses source uncompressed image data to obtain said compressed JPEG compressed data.

19. The apparatus of claim 18 wherein said temporal based compression coder is an MPEG encoder that generates said information simultaneously with said JPEG compressing of said source uncompressed image data.

20. Apparatus for creating a compressed file of sequences of images comprising randomly accessible storage for sequential image data, a nonlinear video editor that stores information indicating the complexity of said sequential image data, said nonlinear video editor determining the identity and sequence of portions of said sequential image data to be included in said compressed file, a monitor that displays a representation of said sequence of portions, said representation including an indication of complexity of said portions, and a compression coder that compresses said sequence of portions of said sequential image data to create said compressed file.

21. Apparatus for creating a compressed file of sequences of images comprising randomly accessible storage for sequential image data, a nonlinear video editor for determining the identity and sequence of portions of said sequential image data to be included in said compressed file, a compression coder that compresses said sequence of portions of said sequential image data to create said compressed file, a compression decoder that receives data of said compressed file as it is being created by said compression coder and decompresses said compressed file concurrently with compressing at said compression coder, and a monitor that receives compressed data from said compression decoder and plays it to permit real time viewing of the quality of compression.

22. Apparatus for creating an MPEG file from non- temporal based compressed image data in a nonlinear video editor comprising randomly accessible storage for non-temporal based compressed image data, a nonlinear video editor that determines the identity and sequence of portions of said non-temporal based compressed image data to be included in said MPEG file, a decoder and an MPEG encoder that convert said portions of said sequence of portions of said non-temporal based compressed image data into MPEG compressed data of said MPEG file, means for storing said MPEG file, means for determining changes to be made to said identity and sequence of portions in said MPEG file, and means for accessing said non-temporal based compressed image data to make said changes without creating artifacts at boundaries of said portions of image data in said sequence in said MPEG file.