CN1698370A - Data processing device - Google Patents

Data processing device

Info

Publication number
CN1698370A
CN1698370A CN200480000376A CN100536554C
Authority
CN
China
Prior art keywords
data
video
audio
vobu
stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200480000376
Other languages
Chinese (zh)
Other versions
CN100536554C (en)
Inventor
伊藤正纪
冈内理
中村正
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Godo Kaisha IP Bridge 1
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd filed Critical Matsushita Electric Industrial Co Ltd
Publication of CN1698370A publication Critical patent/CN1698370A/en
Application granted granted Critical
Publication of CN100536554C publication Critical patent/CN100536554C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Abstract

A data processing device includes: a signal input section for receiving a video signal and an audio signal; a compression section for compression-coding the video signal and the audio signal to generate video data and audio data; a stream assembling section for generating a plurality of packs by dividing the video data and the audio data, generating a plurality of data units by multiplexing video packs related to the video data and audio packs related to the audio data, and generating a data stream containing the plurality of data units; and a recording section for recording the data stream onto a recording medium. The stream assembling section determines the video packs and audio packs to be contained in each data unit based at least on the video playback time. When not all of the audio data corresponding to the video data stored in a predetermined data unit is contained in that data unit, at least the portion of the audio data not contained is copied as copy data and placed in the data stream.

Description

Data processing device
Technical field
The present invention relates to a method and apparatus for recording, in real time, content including video and audio.
Background technology
Various data streams that compress and encode video (picture) signals and audio (sound) signals at low bit rates have been standardized. A known example of such a data stream is the system stream of the MPEG-2 Systems standard (ISO/IEC 13818-1). System streams come in three kinds: the program stream (PS), the transport stream (TS), and the PES stream.
In recent years, optical discs such as phase-change discs and MO discs have been attracting attention as recording media for data streams, replacing magnetic tape. The DVD Video Recording standard (hereinafter the "VR standard") (DVD Specifications for Re-writable/Re-recordable Discs Part 3 VIDEO RECORDING VERSION 1.0, September 1999) was established as a standard for recording content data streams in real time onto a phase-change disc (for example, a DVD) in an editable form. In addition, the DVD Video standard (hereinafter the "Video standard") was established as a standard for package media used to play back data streams of pre-recorded content such as movies.
Fig. 1 shows the data structure of an MPEG-2 program stream 10 conforming to the VR standard (hereinafter this stream is referred to as the "VR standard stream 10").
The VR standard stream 10 includes a plurality of video objects (Video Objects; VOBs) #1, #2, ..., #k. For example, if the VR standard stream 10 is content shot with a camcorder, each VOB stores the moving-picture data generated by a single recording operation, from when the user starts recording until recording stops.
Each VOB includes a plurality of VOB units (Video Object Units; VOBUs) #1, #2, ..., #n. Each VOBU is a data unit that mainly contains video data whose playback time ranges from 0.4 second to 1 second.
Below, the data structure of a VOBU is described taking as examples VOBU #1, placed first in Fig. 1, and the immediately following VOBU #2.
VOBU #1 is composed of a set of packs, the lower layer of the MPEG program stream. The data length of each pack in the VR standard stream 10 (pack length) is constant at 2 KB (2048 bytes). At the beginning of the VOBU, a real-time information pack (RDI pack) 11, denoted "R" in Fig. 1, is placed. Following the RDI pack 11 come a plurality of video packs denoted "V" (video pack 12, etc.) and audio packs denoted "A" (audio pack 13, etc.). If the video data is at a variable bit rate, the data size of each VOBU varies within a range not exceeding the maximum recording/playback rate even when the playback time is the same; if the video data is at a fixed bit rate, the data size of each VOBU is roughly constant.
Each pack stores the following information. As described in Japanese Laid-Open Patent Publication No. 2001-197417, the RDI pack 11 stores information used to control playback of the VR standard stream 10, for example information indicating the playback timing of the VOBU and information for controlling copying of the VR standard stream 10. The video pack 12 stores video data compressed by MPEG-2. The audio pack 13 stores audio data compressed by, for example, the MPEG-2 Audio standard. Adjacent video packs 12 and audio packs 13 store, for example, video data and audio data that are played back in synchronization.
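As a sketch of how a reader of such a stream might tell these pack types apart, the following parser classifies 2048-byte packs by the stream ID of their first PES packet. This is illustration only, not code from the patent: the stream-ID values (0xBF private stream 2 for the RDI pack, 0xE0–0xEF for video, 0xC0–0xDF for MPEG audio) and the simplified pack-header handling are common MPEG-2 program-stream conventions assumed here.

```python
PACK_SIZE = 2048  # every pack in a VR-standard stream is 2048 bytes

def classify_pack(pack: bytes) -> str:
    """Classify a 2048-byte program-stream pack by the stream ID of its
    first PES packet. Stream-ID ranges are illustrative assumptions."""
    assert len(pack) == PACK_SIZE and pack[:4] == b"\x00\x00\x01\xba"
    # Skip the 14-byte MPEG-2 pack header plus any stuffing bytes,
    # then look for the next 00 00 01 start-code prefix.
    i = 14 + (pack[13] & 0x07)          # low 3 bits give the stuffing length
    while pack[i:i + 3] != b"\x00\x00\x01":
        i += 1
    sid = pack[i + 3]
    if sid == 0xBF:                     # private stream 2 -> RDI pack ("R")
        return "RDI"
    if 0xE0 <= sid <= 0xEF:             # video stream -> video pack ("V")
        return "video"
    if 0xC0 <= sid <= 0xDF:             # MPEG audio stream -> audio pack ("A")
        return "audio"
    return "other"

def make_pack(stream_id: int) -> bytes:
    """Build a minimal synthetic pack for demonstration purposes."""
    body = b"\x00\x00\x01\xba" + b"\x44" * 9 + b"\x00"  # fake header, 0 stuffing
    body += b"\x00\x00\x01" + bytes([stream_id])
    return body.ljust(PACK_SIZE, b"\xff")

vobu = [make_pack(0xBF), make_pack(0xE0), make_pack(0xC0)]
print([classify_pack(p) for p in vobu])   # ['RDI', 'video', 'audio']
```

A real demultiplexer would also parse the system header and the PES flags; this sketch only reads far enough into each pack to recover the stream ID.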
VOBU #2 is also composed of a plurality of packs. At the beginning of VOBU #2 an RDI pack 14 is placed, followed by a plurality of video packs 15, audio packs 16, and so on. The content of the information stored in each pack is the same as in VOBU #1.
An RDI pack is not necessarily recorded at the beginning of every VOBU in a VOB. In that case, a video pack must be recorded at the beginning of the VOBU.
Fig. 2 shows the relationship between the video stream composed of the video data in the video packs and the audio stream composed of the audio data in the audio packs.
Specifically, in VOBU #i, a picture 21b of the video stream is composed of the video data stored in one or more packs including video pack 22. The next picture is composed of the video data stored in further video packs, and the picture after that of the video data stored in still later video packs. On the other hand, audio frame 23b is composed of the audio data stored in audio pack 23a. The same holds for the other audio packs. The data of one audio frame may also be stored divided across two or more audio packs, and a plurality of audio frames may be included in one audio pack.
It is assumed here that the data of the audio frames included in a VOBU is complete within that VOBU. That is, the data of the audio frames included in a VOBU is entirely present within the VOBU and is not carried over into the next VOBU.
Video frames and audio frames are played back based on information specifying the playback time (the presentation time stamp; PTS) stored in the packet header of each video pack and audio pack. In Fig. 2, video picture 21b and audio frame 23b are played back at almost the same time; that is, the two are played back in synchronization.
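As an illustration of PTS-based synchronization, the sketch below pairs video pictures with the audio frames closest to them in presentation time. The 90 kHz time-stamp clock is the MPEG systems convention; the frame intervals in the example (NTSC pictures every 3003 ticks, 32 ms audio frames every 2880 ticks) are assumptions for illustration.

```python
PTS_CLOCK = 90_000  # MPEG system time stamps tick at 90 kHz

def sync_pairs(video_pts, audio_pts, tolerance_ms=20):
    """Pair each video picture with the audio frame whose PTS is closest,
    provided the two fall within the tolerance -- i.e., they are 'played
    back at almost the same time' like picture 21b and frame 23b in Fig. 2."""
    pairs = []
    for v in video_pts:
        a = min(audio_pts, key=lambda x: abs(x - v))
        if abs(a - v) * 1000 // PTS_CLOCK <= tolerance_ms:
            pairs.append((v, a))
    return pairs

video = [i * 3003 for i in range(4)]   # 29.97 Hz pictures
audio = [i * 2880 for i in range(4)]   # 32 ms audio frames
print(sync_pairs(video, audio))
# [(0, 0), (3003, 2880), (6006, 5760), (9009, 8640)]
```

Each picture finds an audio frame within a few milliseconds of its own PTS, which is what the decoder relies on to output the two in synchronization.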
Now consider video packs 24a and 24b of VOBU #i. The last picture 24c of VOBU #i is composed of the video data stored in the video packs from video pack 24a to video pack 24b. As described above, each VOBU is constructed using the video playback time and the like as a reference; audio receives no particular consideration in its construction. Therefore, even though playback time information (PTS) is attached so that audio frame 25c is played back in synchronization with video picture 24c, the data of audio frame 25c is stored in audio packs 25a, 25b, and so on of the next VOBU #(i+1).
The reason the recording position of an audio frame deviates in this way from that of the video frame with which it is played back in synchronization is that, in the system target decoder (P-STD) that defines the multiplexing rules for video packs and audio packs, the buffer for video data (for example, 224 KB) is much larger than the buffer for audio data (for example, 4 KB). Because only a small amount of audio data can be accumulated, the audio data is multiplexed so as to be read in shortly before its playback time.
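The imbalance can be made concrete with a small calculation using the buffer sizes quoted above (taking them as 224 KB for video and 4 KB for audio); the bit rates chosen (5 Mbps video, 256 kbps audio) are assumptions for illustration.

```python
def buffered_ms(buffer_bytes, bitrate_bps):
    """Milliseconds of stream the buffer can hold at the given bit rate."""
    return buffer_bytes * 8 * 1000 / bitrate_bps

video_ms = buffered_ms(224 * 1024, 5_000_000)   # P-STD video buffer
audio_ms = buffered_ms(4 * 1024, 256_000)       # P-STD audio buffer
print(round(video_ms), round(audio_ms))          # 367 128
```

Under these assumptions the audio buffer holds only about 128 ms of audio, versus roughly 367 ms of video, so audio frames can be multiplexed at most a fraction of a second ahead of their playback time and the frames for a VOBU's last pictures naturally spill into the next VOBU.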
For such a program stream, the user can register a desired VOBU playback order as a "playlist". Based on the playlist, the playback apparatus plays back the video and so on by reading the data of a given VOBU, and then continues playback by reading data from the beginning of the next specified VOBU.
However, when video data and audio data that should be played back in synchronization are stored in different VOBUs, playlist-based playback suffers from the problem that the audio is interrupted partway through. The reason is that although the data of the VOBU to be played back is read, the audio data stored in the following VOBU, which is not a playback target, is not read. In that case, only the video is played back, and the audio that should be played back in synchronization with it is not.
For example, in Fig. 2, suppose the playlist specifies that VOBU #k (k ≠ (i+1)) is to be played back after VOBU #i. In that case, after the data of video picture 24c of VOBU #i has been read, the data in the next VOBU #k is read. Therefore, the data of audio frame 25c stored in VOBU #(i+1), which should be played back in synchronization with video picture 24c, is not read, and that audio is not played back. As a result, the user hears the sound cut off partway through.
Where in VOBU #k the audio frame corresponding to the first video picture of VOBU #k is stored differs from VOBU to VOBU. The storage position is determined by the relationship between VOBU #k and the VOBU preceding it (VOBU #(k-1)); specifically, by the bit rate of the program stream and the buffer sizes of the system target decoder (P-STD). Therefore, even when the audio frames of VOBU #i are complete within VOBU #i, the audio frame that should be played back in synchronization with VOBU #k is not necessarily stored right at its beginning. For this reason as well, the user hears the sound cut off partway through.
Summary of the invention
An object of the present invention is to significantly shorten, or eliminate, the periods during which the sound is cut off partway through, even when video and audio are played back based on a playlist or the like.
A data processing device according to the present invention includes: a signal input section that receives a video signal and an audio signal; a compression section that generates video data and audio data by compression-coding the video signal and the audio signal; a stream assembling section that generates a plurality of packs by dividing the video data and the audio data, generates a plurality of data units by multiplexing video packs related to the video data and audio packs related to the audio data, and generates a data stream including the plurality of data units; and a recording section that records the data stream onto a recording medium. The stream assembling section determines the video packs and audio packs to be included in each data unit based at least on the video playback time, and, when not all of the audio data corresponding to the video data stored in a predetermined data unit is included in that predetermined data unit, includes in the data stream copy data obtained by copying at least the portion of the audio data that is not included.
The stream assembling section may store the copy data corresponding to a data unit at least in the video pack placed at the beginning of the following data unit.
The stream assembling section may store the corresponding copy data in the data unit itself.
The stream assembling section may store the copy data in a dedicated audio stream within the data stream.
The stream assembling section may store the copy data in a dedicated private data stream within the data stream.
The stream assembling section may include in the predetermined data unit copy data obtained by copying all of the audio data synchronized with the video data.
The stream assembling section may store copy data obtained by copying all of the audio data synchronized with the video data in a dedicated audio stream within the data stream.
The stream assembling section may store copy data obtained by copying all of the audio data synchronized with the video data in a dedicated audio stream within the data stream, and may specify and record, as transfer time information indicating the transfer time of the copy data, a transfer time shifted earlier by a predetermined time than the transfer time in the copy-source data unit.
The stream assembling section may generate the data stream as a first file including the plurality of data units and a second file including the copy data, in which case the recording section records the data units and the copy data contiguously onto the recording medium.
The stream assembling section may generate the second file from copy data obtained by copying all of the audio data corresponding to the video data.
Rate information may be added to the audio data, the audio data having a data length according to a first rate; the compression section compression-codes the audio signal at a second rate lower than the first rate and stores the result as the audio data; and the stream assembling section stores the copy data in a free area corresponding to the difference between the data length defined with respect to the first rate and the data length defined with respect to the second rate.
A data processing method according to the present invention includes: a step of receiving a video signal and an audio signal; a step of generating video data and audio data by compression-coding the video signal and the audio signal; a step of generating a plurality of packs by dividing the video data and the audio data, generating a plurality of data units by multiplexing video packs related to the video data and audio packs related to the audio data, and generating a data stream including the plurality of data units; and a step of recording the data stream onto a recording medium. The step of generating the data stream determines the video packs and audio packs to be included in each data unit based at least on the video playback time, and, when not all of the audio data corresponding to the video data stored in a predetermined data unit is included in that predetermined data unit, includes in the data stream copy data obtained by copying at least the portion of the audio data that is not included.
The step of generating the data stream may store the copy data corresponding to a data unit in the video pack placed at the beginning of the following data unit.
The step of generating the data stream may include in the predetermined data unit copy data obtained by copying all of the audio data corresponding to the video data.
The step of generating the data stream may generate the data stream as a first file including the plurality of data units and a second file including the copy data, in which case the recording step records the data units and the copy data contiguously onto the recording medium.
The step of generating the data stream may generate the second file from copy data obtained by copying all of the audio data corresponding to the video data.
Rate information may be added to the audio data, the audio data having a data length according to a first rate. The step of generating the audio data generates the audio data by compression-coding the audio signal at a second rate lower than the first rate; the step of generating the data stream sets the first rate, which is higher than the second rate, as the rate information for the audio data included in the predetermined data unit, and stores the copy data in a free area corresponding to the difference between the data length defined with respect to the first rate and the data length defined with respect to the second rate.
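The rate-information aspect above can be illustrated numerically. Assuming AC-3-style audio, one frame covers 1536 samples (32 ms at 48 kHz), so a frame slot occupies bitrate × 32 ms / 8 bytes; the specific rates below (448 kbps declared, 256 kbps actually coded) are illustrative assumptions, not values from the patent.

```python
FRAME_MS = 32  # one AC-3 frame = 1536 samples = 32 ms at 48 kHz (assumed codec)

def frame_bytes(bitrate_bps):
    """Size in bytes of one audio frame slot at the given (declared) rate."""
    return bitrate_bps * FRAME_MS // 8 // 1000

first_rate = 448_000    # rate written into the rate information
second_rate = 256_000   # rate actually used for compression-coding
free_area = frame_bytes(first_rate) - frame_bytes(second_rate)
print(frame_bytes(first_rate), frame_bytes(second_rate), free_area)
# 1792 1024 768
```

Declaring the higher rate reserves a 1792-byte slot per frame of which only 1024 bytes carry coded audio, leaving 768 bytes of free area per frame in which the copy data can be stored.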
On a recording medium according to the present invention, a data stream including a plurality of data units is recorded. Each of the data units is composed by multiplexing video packs related to video data and audio packs related to audio data. The video data and part of the audio data corresponding to the video data are stored in a predetermined data unit, while the partial audio data forming the remainder of the audio data corresponding to the video data is not stored in the predetermined data unit. The data stream further includes copy data obtained by copying that partial audio data.
A data processing device according to the present invention receives and decodes the above data stream and outputs a video signal and an audio signal. The data processing device includes: a playback control section that designates, among the data included in the data stream, the data to be read as the playback target; a reading section that, based on the instruction from the playback control section, reads from the predetermined data unit of the data stream the video data and the part of the audio data corresponding to the video data; and a decoding section that outputs the video signal and the audio signal in synchronization by decoding the video data and the part of the audio data. After the above instruction, the playback control section further instructs reading of the copy data, and the decoding section decodes the copy data after decoding the part of the audio data and outputs it in synchronization with the video signal.
Description of drawings
Fig. 1 schematically shows the data structure of an MPEG-2 program stream 10 conforming to the VR standard.
Fig. 2 schematically shows the relationship between a video stream composed of the video data in video packs and an audio stream composed of the audio data in audio packs.
Fig. 3 is a block diagram showing the functional configuration of a data processing device 30.
Fig. 4 schematically shows the data structure of the VR standard stream 10.
Fig. 5 schematically shows the relationship between the VR standard stream 10 and the recording areas of an optical disc 131.
Fig. 6 schematically shows how the recorded VR standard stream 10 and its management information are managed by the file system of the optical disc 131.
Fig. 7 schematically shows the relationship among the VOBUs, the video stream, and the audio stream according to Embodiment 1.
Fig. 8 is a flowchart showing the recording procedure of the data processing device 30.
Fig. 9 schematically shows the relationship among the VOBUs, the video stream, and the audio stream according to Embodiment 2.
Fig. 10 schematically shows the relationship among the VOBUs, the video stream, and the audio stream according to Embodiment 3.
Fig. 11 schematically shows the relationship among the VOBUs, the video stream, and the audio stream according to Embodiment 4.
Fig. 12 schematically shows the relationship among the VOBUs, the video stream, and the audio stream according to Embodiment 5.
Fig. 13 schematically shows the relationship among the VOBUs, the video stream, and the audio stream according to a variation of Embodiment 5.
Fig. 14 schematically shows the relationship among the VOBUs, the video stream, and the audio stream according to Embodiment 6.
Fig. 15 schematically shows the data structure of an audio frame of the AC-3 standard and the position and size of its additional information.
Figs. 16(a) and (b) schematically show the data structure of an audio pack having a sub-stream ID according to the kind of audio data.
Fig. 17 shows the data structure of an audio frame of the MPEG-1 Audio standard.
Embodiment
(Embodiment 1)
Below, the configuration of the data processing device according to the present embodiment is described, together with the data structure of the data stream related to the processing of the data processing device. Then, the recording and playback operations performed by the data processing device are described. In this specification, an MPEG-2 program stream conforming to the DVD Video Recording standard (VR standard), that is, a VR standard stream, is taken as the example data stream.
Fig. 3 is a block diagram showing the functional configuration of the data processing device 30. The data processing device 30 has a recording function of recording a VR standard stream 10 in real time onto a recording medium represented by phase-change optical discs 131 such as DVD-RAM discs and Blu-ray discs (BD). The data processing device 30 also has a playback function of reading, decoding, and playing back the recorded VR standard stream 10. However, to carry out the processing according to the present invention, the data processing device 30 need not necessarily be provided with both the recording function and the playback function. The data processing device 30 is, for example, a stationary device or a camcorder.
Below, the configuration related to the recording function of the data processing device 30 is described. The data processing device 30 includes: a video signal input section 100; an audio signal input section 102; an MPEG-2 PS encoder 170; a recording section 120; a continuous data area detection section 160; a recording control section 161; and a logical block management section 163.
First, an outline of the recording operation of the data processing device 30 is given. When generating and recording a VR standard stream 10, the PS assembling section 104 (described later) of the MPEG-2 PS encoder 170 composes each video object unit (VOBU), the data unit, by determining the video packs and audio packs it will contain based at least on the video playback time. When the complete audio data corresponding to the video is not contained in the same VOBU, copy data obtained by copying at least the audio data not contained is included and recorded in the VR standard stream 10. Here, "audio corresponding to video" means "audio played back in synchronization with the video".
The copy data is stored in the following VOBU (for example, in the user data area of its first video pack), or in a separate audio file other than the VR standard stream 10 file. Alternatively, the audio data may be stored as a private stream, or as additional information, so that the video and the audio to be played back in synchronization with it are contained in the same VOBU.
Furthermore, the complete audio data corresponding to the video may be interleaved in the same VOBU as a separate audio stream, stored in a separate audio file other than the VR standard stream 10 file, or stored as a private stream.
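The copy-data decision outlined above can be sketched as follows. This is a simplified model, not the patent's implementation: frames are represented only by their playback intervals in seconds, and the multiplexer's choice of which audio frames actually fit into the VOBU is passed in rather than derived from the P-STD rules.

```python
def copy_data_for_unit(video_frames, audio_frames, contained_audio):
    """Determine the copy data for one data unit (VOBU).

    video_frames    -- (start, end) playback intervals of the unit's video
    audio_frames    -- (start, end) intervals of every audio frame
    contained_audio -- the audio frames the multiplexer fit into the unit

    Audio 'corresponding to' the unit is every frame overlapping the unit's
    video playback span; whatever part of it is not contained is copied."""
    unit_start = min(s for s, _ in video_frames)
    unit_end = max(e for _, e in video_frames)
    corresponding = [a for a in audio_frames
                     if a[1] > unit_start and a[0] < unit_end]
    return [a for a in corresponding if a not in contained_audio]

video = [(0.0, 0.5)]                                       # one VOBU of video
audio = [(0.00, 0.15), (0.15, 0.30), (0.30, 0.45), (0.45, 0.60)]
contained = audio[:2]          # suppose only the first two frames fit
print(copy_data_for_unit(video, audio, contained))
```

The returned frames, here the last two, are the ones whose data ends up in the next VOBU and is therefore duplicated as copy data so that playlist playback of this unit alone still has its full audio.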
Below, the general functions of the components related to the recording function of the data processing device 30 are described with reference to Figs. 3 to 6. Then, the specific recording operation of the data processing device 30 is described with reference to Figs. 7, 8, and so on.
The video signal input section 100 is a video signal input terminal that receives a video signal representing video data. The audio signal input section 102 is an audio signal input terminal that receives an audio signal representing audio data. For example, when the data processing device 30 is a video recorder, the video signal input section 100 and the audio signal input section 102 are connected to the video output section and the audio output section of a tuner section (not shown), respectively, and receive a video signal and an audio signal from them. When the data processing device 30 is a movie recorder, a camcorder, or the like, the video signal input section 100 and the audio signal input section 102 respectively receive the video signal and the audio signal output from the CCD (not shown) and the microphone of the camera.
The MPEG-2 PS encoder 170 (hereinafter "encoder 170") receives the video signal and the audio signal and generates an MPEG-2 program stream (PS) conforming to the VR standard, that is, the VR standard stream 10. The encoder 170 includes a video compression section 101, an audio compression section 103, and a PS assembling section 104. The video compression section 101 and the audio compression section 103 compression-code, based on the MPEG-2 standard, the video data and the audio data obtained from the video signal and the audio signal, respectively. The PS assembling section 104 divides the compression-coded video data and audio data into video packs and audio packs of 2 KB each, arranges these packs in order to compose one VOBU, and generates the VR standard stream 10 by adding an RDI pack 27 at the beginning.
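The splitting performed by the PS assembling section 104 can be sketched as below. The header layout is a placeholder (a fixed 24-byte header standing in for the real pack and PES headers), so this shows only the fixed 2048-byte packing, not a compliant multiplexer.

```python
PACK_SIZE = 2048
HEADER_SIZE = 24          # placeholder for pack header + PES header

def packetize(elementary: bytes, stream_tag: bytes):
    """Divide an elementary stream into fixed 2048-byte packs, each carrying
    a placeholder header followed by payload (padded on the final pack)."""
    payload_size = PACK_SIZE - HEADER_SIZE
    packs = []
    for off in range(0, len(elementary), payload_size):
        chunk = elementary[off:off + payload_size]
        header = (b"\x00\x00\x01" + stream_tag).ljust(HEADER_SIZE, b"\x00")
        packs.append(header + chunk.ljust(payload_size, b"\xff"))
    return packs

video_packs = packetize(b"v" * 5000, b"\xe0")   # assumed video stream tag
audio_packs = packetize(b"a" * 1500, b"\xc0")   # assumed audio stream tag
print(len(video_packs), len(audio_packs))        # 3 1
```

A VOBU is then just an ordered interleaving of such video and audio packs behind the RDI pack, with the interleaving order chosen to satisfy the P-STD buffer rules discussed earlier.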
Fig. 4 shows the data structure of the VR standard stream 10. The VR standard stream 10 includes a plurality of VOBUs. Although Fig. 4 shows two VOBUs, more may be included. Each VOBU in the VR standard stream 10 is composed of a plurality of packs. Since these packs and the information included in each pack were described with reference to Fig. 1, the description is omitted here.
Below, the data structure of video pack 12-1 and so on is described. The video pack 12 stores video data 12a compressed by MPEG-2. The video pack 12 includes a pack header 12b and a PES packet header 12c used to identify it as a video pack. Furthermore, if it is the first video pack of a VOBU, the pack header 12b also includes a system header (not shown).
The video data 12a of video pack 12-1 shown in Fig. 4, together with the video data 12d of the subsequent video pack 12-2 and so on, constitutes the data of an I-frame 44. After the I-frame, the video packs constituting a B-frame 45 or a P-frame are recorded.
The video data 12a includes a sequence header 41, user data 42, and a GOP header 43. The MPEG-2 standard defines a "Group of Pictures" (GOP) that bundles a plurality of video frames. The sequence header 41 indicates the beginning of a sequence composed of a plurality of GOPs. The GOP header 43 indicates the beginning of each GOP. The first frame of a GOP is an I-frame. Since these headers are well known, their detailed description is omitted. The user data 42 is placed between the sequence header 41 and the GOP header 43, and arbitrary data can be written in it.
Start codes identifying their respective beginnings are added at the start of the sequence header 41, the user data 42, and the GOP header 43: "000001B3" for the sequence header 41, "000001B5" for the user data 42, and "000001B8" for the GOP header 43 (all in hexadecimal notation). The user data 42 is read continuously until the start code of the GOP header 43 is detected; when the start code of the GOP header 43 is detected, the data obtained up to that point, minus the leading header B5 of the user data 42, is obtained as the user data.
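The extraction procedure just described can be sketched as a plain byte scan using the start-code suffixes given in the text (B3, B5, B8). MPEG-2 syntax forbids start-code emulation inside the payload, so a simple search for the next start-code prefix suffices in this sketch; the sample stream bytes are fabricated for illustration.

```python
SEQ, USER, GOP = 0xB3, 0xB5, 0xB8   # start-code suffixes from the text

def extract_user_data(video_data: bytes) -> bytes:
    """Return the bytes after the user-data start code, up to (but not
    including) the GOP header start code, as described above."""
    start = video_data.index(bytes([0, 0, 1, USER])) + 4
    end = video_data.index(bytes([0, 0, 1, GOP]), start)
    return video_data[start:end]

stream = (b"\x00\x00\x01\xb3" + b"\x10\x20"   # sequence header (fake body)
          + b"\x00\x00\x01\xb5" + b"hello"     # user data
          + b"\x00\x00\x01\xb8" + b"\x7f")     # GOP header
print(extract_user_data(stream))   # b'hello'
```

This is the mechanism that lets copy data stored in the user data area of a VOBU's first video pack be recovered on playback: scan to the B5 code, then collect bytes until the B8 code of the GOP header appears.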
In principle, the total GOP playback time in each VOBU is adjusted to fall within the range from 0.4 second to 1.0 second, but as an exception the playback time of the last VOBU is adjusted within the range from 0 second to 1.0 second. This is because recording of the VR standard stream 10, which is recorded in real time, may be stopped before 0.4 second has elapsed. Within these ranges, the video playback time is allowed to vary from VOBU to VOBU.
The recording section 120, based on instructions from the recording control section 161, controls the pickup 130 and starts recording the VR standard stream 10, in units of video object units (VOBUs), from the position of the logical block number instructed by the recording control section 161. At this time, the recording section 120 divides each VOBU into 32 KB units, adds an error-correcting code to each unit, and records it on the optical disc 131 as one logical block. When a VOBU ends partway through a logical block, the next VOBU is recorded continuously without leaving a gap.
Fig. 5 shows the relationship between the VR standard stream 10 and the recording areas of the optical disc 131. Each VOBU of the VR standard stream 10 is recorded in a continuous data area of the optical disc 131. The continuous data area is composed of physically continuous logical blocks, and this area holds data amounting to 17 seconds or more of playback time at the maximum rate. The data processing device 30 adds an error-correcting code to each logical block. The data size of a logical block is 32 KB. Each logical block includes sixteen 2 KB sectors.
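The 17-second figure translates into disc space as follows. The maximum rate used (about 10.08 Mbps, the DVD program-stream ceiling) is an assumption for illustration, while the 32 KB logical block and 2 KB sector sizes come from the text.

```python
LOGICAL_BLOCK = 32 * 1024     # one ECC block
SECTOR = 2 * 1024             # sixteen sectors per logical block
MAX_RATE_BPS = 10_080_000     # assumed program-stream maximum (~10.08 Mbps)

def continuous_area_blocks(seconds):
    """Smallest number of whole logical blocks that can hold `seconds`
    of data recorded at the maximum rate."""
    need_bytes = seconds * MAX_RATE_BPS // 8
    return -(-need_bytes // LOGICAL_BLOCK)   # ceiling division

blocks = continuous_area_blocks(17)
print(blocks, blocks * LOGICAL_BLOCK // SECTOR)   # 654 10464
```

Under this assumed rate, a 17-second continuous data area corresponds to roughly 654 contiguous logical blocks (about 10,464 sectors), which is the amount of physically continuous free space the detection section must find.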
Continuous data region detecting part 160 is checked the behaviour in service by CD 131 sectors of logical block management department 163 management, and detects and can store the data suitable with above-mentioned time span but untapped continuous idle logical block zone.
It is not necessary to always detect a continuous free logical block area of 17 seconds or more in order to guarantee continuous playback; for example, the data size of the continuous free logical block area may be determined dynamically while tracking the accumulated amount of remaining playback data. That is, if at some point during recording a continuous data area worth 20 seconds can be secured, continuous playback can be guaranteed by securing, as the subsequent continuous data area, an area worth 14 seconds.
Recording control section 161 controls the operation of recording section 120. Recording control section 161 instructs recording section 120 to record VR standard stream 10 onto optical disc 131 as a data file (for example, file name "VR_MOVIE.VRO"). Recording section 120 also records onto optical disc 131 the management information file for the VR standard stream (file name "VR_MANGR.IFO") received from recording control section 161. The management information includes, for each VOBU, for example the data size, the number of video fields contained, and the data size of the leading I-frame.
The control operation of recording control section 161 is described more specifically below. Recording control section 161 first instructs continuous data area detection section 160, in advance, to detect a continuous free logical block area. Recording control section 161 notifies recording section 120 of the logical block number each time a write in units of logical blocks occurs, and notifies logical block management section 163 when a logical block has been used up. Recording control section 161 may have continuous data area detection section 160 dynamically detect the size of the continuous free logical block area. Continuous data area detection section 160 converts the remainder of one continuous data area at the maximum recording/playback rate and, for example at the moment it falls to 3 seconds' worth, detects the next continuous data area. When one continuous data area becomes full, recording control section 161 instructs writing to the next continuous data area.
Fig. 6 shows the state in which the recorded VR standard stream 10 and the management information are managed by the file system of optical disc 131. For example, a file system of the UDF (Universal Disk Format) standard, or the ISO/IEC 13346 ("Volume and file structure of write-once and rewritable media using non-sequential recording for information interchange") file system, is used. In Fig. 6, the continuously recorded VR standard stream 10 is recorded under file name VR_MOVIE.VRO, and the management information under file name VR_MANGR.IFO. For each file, the file name and the position of the file entry are managed by an FID (File Identifier Descriptor). A file is associated with the data areas constituting that file by the allocation descriptors in the file entry; the head sector number of each area is set in an allocation descriptor as the position of the data. The file entry of the VR standard stream file contains allocation descriptors a–c for managing the respective continuous data areas (CDA: Continuous Data Area) a–c. The reason the file is divided into the plural areas a–c is that partway through area a there is a defective logical block, a non-writable PC file, or the like. The file entry of the management information file, on the other hand, holds an allocation descriptor referring to the area in which the management information is recorded.
Logical block management section 163 grasps and manages the usage state of each logical block number, based on the used logical block numbers notified by recording control section 161. That is, using the space bitmap descriptor defined by the UDF or ISO/IEC 13346 file structures, which records the usage state of each sector constituting a logical block, it records and manages whether each area is in use or unused. In the final stage of the recording process, the file identifier descriptor (FID) and the file entry are written into the file management area on the disc.
A subset of the UDF standard conforms to the ISO/IEC 13346 standard. By connecting a phase-change disc drive to a PC via a 1394 interface and the SBP-2 (Serial Bus Protocol 2) protocol, a file written in UDF-compliant form can also be handled as a single file on the PC.
Next, the actual recording operation of data processing device 30 according to the present embodiment is described. In the following description, the term "corresponding" denotes video and audio, or video data and audio data, that are to be played back synchronously.
Now assume that corresponding video data and audio data generated by PS assembling section 104 are not all contained in one VOBU of the VR standard stream. As described above, because VOBUs are delimited based on the video playback time and the like, part of the audio data may be stored in a subsequent VOBU different from that of the corresponding video data. The audio data contained in the same VOBU as the video data comprises an integer number of audio frames.
Fig. 7 shows the relationship between VOBUs, a video stream and an audio stream according to the present embodiment. The top row shows the set of VOBUs constituting VR standard stream 10, arranged as one MPEG-2 program stream (MPEG file); the second row shows the set of video frames stored in each VOBU; and the third row shows the set of audio data corresponding to that video data. The video data contained in VOBU#i is denoted V(i), and the audio data to be played back in synchronization with video data V(i) is denoted A0(i). The positional relationship between the storage position of the audio data A0(i), played back in synchronization with each set of video frames, and the VOBU boundaries is shown in the third row with vertical dotted lines (the same applies to Figs. 9 to 14 described later).
Under the above assumption, the storage position of audio data A0(i), played back in synchronization with video data V(i), begins partway through VOBU#i, and its end is stored in the beginning part of VOBU#(i+1). In Fig. 7, the data A stored from the beginning of VOBU#(i+1) up to the front end of audio data A0(i+1) corresponds to audio data stored in VOBU#(i+1), which is different from VOBU#i in which the video data is stored. This audio data is hereinafter called the "separately stored data".
When generating VOBU#i and VOBU#(i+1), PS assembling section 104 generates copy data representing the same content as the separately stored data. This copy data is stored in the leading video pack of VOBU#(i+1), which follows VOBU#i. Specifically, the copy data is stored in the user data area of the leading video pack (for example, user data area 42 of Fig. 4). Storing the copy data in user data area 42 means that all of the video and audio data is stored within VR standard stream 10 (the file). The copy data is a copy of the separately stored audio data itself.
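The copy step can be modelled as follows. This is a simplified sketch under our own assumptions: the dictionary fields and frame labels are illustrative, and real copy data would be audio pack payloads, not frame-name strings.

```python
def attach_copy_data(vobus):
    """vobus: list of dicts with 'audio_needed' (frames that should play with
    this VOBU's video) and 'audio_stored' (frames actually multiplexed into it).
    The shortfall of VOBU#i (the "separately stored data") is copied into the
    user-data area of VOBU#(i+1)."""
    for i in range(len(vobus) - 1):
        need = vobus[i]["audio_needed"]
        have = vobus[i]["audio_stored"]
        separated = [f for f in need if f not in have]  # spilled into VOBU#(i+1)
        vobus[i + 1]["user_data"] = list(separated)     # the copy data
    return vobus

vobus = [
    {"audio_needed": ["a0", "a1", "a2"], "audio_stored": ["a0", "a1"], "user_data": []},
    {"audio_needed": ["a3", "a4"], "audio_stored": ["a2", "a3", "a4"], "user_data": []},
]
attach_copy_data(vobus)
print(vobus[1]["user_data"])  # ['a2'] — the frame that spilled into VOBU#(i+1)
```

Note that frame "a2" now exists twice in the model — once as regular multiplexed audio in VOBU#(i+1) and once as copy data in its user-data area — which mirrors the redundancy the embodiment deliberately introduces.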
At this time, only the elementary stream may be copied, or the copy may be made in units of packs. When the copy is made in units of packs, however, the SCR value in the pack header of the audio pack need not retain its meaning as a transfer time, so its value may be copied as-is. The PTS value in the PES packet header within the pack can be used directly.
Likewise, when generating VOBU#(i+1) and VOBU#(i+2), PS assembling section 104 also generates copy data representing the same content as the separately stored data, stored in VOBU#(i+2), of audio data A0(i+1) corresponding to video data V(i+1). This copy data is then stored in the leading video pack of VOBU#(i+2), which follows VOBU#(i+1).
Since PS assembling section 104 has the function of attaching PTSs by grasping which picture of the video and which frame of the audio are to be played back synchronously, it can grasp which part of audio data A0 constitutes the separately stored data. Identifying the separately stored data is therefore easy.
Fig. 8 is a flowchart showing the recording procedure of data processing device 30. First, in step S81, video signal input section 100 and audio signal input section 102 receive a video signal and an audio signal, respectively. In step S82, video compression section 101 and audio compression section 103 compression-code the video data and audio data obtained from the respective signals.
In the next step S83, PS assembling section 104 generates VOBU#i based on the video playback time and the like. The arrangement (order) of the video packs and other packs within VOBU#i is determined according to the rules of the system target decoder model; for example, the arrangement (order) of the packs is determined so as to satisfy the buffer capacity rules specified by the program stream system target decoder (P-STD) model.
Then, in step S84, it is judged whether the corresponding video data and audio data are stored in the same VOBU. If they are stored in the same VOBU, the data of the generated VOBU is passed to recording section 120, which records the data onto optical disc 131. Thereafter, the process is repeated from step S83.
If the corresponding video data and audio data are not stored in the same VOBU, that is, if part of the audio data corresponding to the video data (data A) is stored in the subsequent VOBU as separately stored data, the process proceeds to step S85. In step S85, PS assembling section 104 writes the separately stored data (partial data A of Fig. 7) into the user data area of the leading video pack of the next VOBU#(i+1) and outputs it to recording section 120. Recording section 120 records this data onto optical disc 131.
Then, in step S86, it is judged whether PS assembling section 104 has processed all of the video data and audio data. If processing has not finished, the process is repeated from step S83; if processing has finished, the recording operation ends.
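The loop of steps S83–S86 can be sketched as follows. The names (`record_loop`, `spill`, `user_data`) are our own illustrative shorthand, and the write callback stands in for recording section 120.

```python
def record_loop(vobu_iter, write):
    """Sketch of steps S83-S86: for each generated VOBU, first store the
    previous VOBU's spilled audio (the separately stored data) into this
    VOBU's leading user-data area (step S85), then record it (step S84)."""
    prev_spill = None
    for vobu in vobu_iter:
        if prev_spill:                      # step S85: attach copy data
            vobu["user_data"] = prev_spill
        write(vobu)                         # step S84: record the VOBU
        prev_spill = vobu.get("spill")      # audio not contained in this VOBU

written = []
record_loop(
    [{"id": 0, "spill": ["a2"]}, {"id": 1, "spill": None}],
    written.append,
)
print(written[1].get("user_data"))  # ['a2']
```

In the actual device the spill would be detected by PS assembling section 104 from the PTSs, rather than passed in as a precomputed field.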
Next, referring again to Fig. 3, the functions of the components related to the playback function of data processing device 30 are described, followed by a description of the playback operation of data processing device 30.
Data processing device 30 includes: video display section 110; audio output section 112; playback section 121; conversion section 141; output interface section 140; playback control section 162; playlist playback control section 164; and MPEG2-PS decoder 171.
Video display section 110 is a display device such as a television set for outputting video, and audio output section 112 is a loudspeaker or the like for outputting audio. Video display section 110 and audio output section 112 are not essential parts of data processing device 30 and may be provided as external devices. Based on an instruction from playback control section 162, playback section 121 reproduces, as a digital signal, the VR standard stream 10 read out from optical disc 131 by optical pickup 130 as an analog signal. Playback control section 162 identifies the VOBU to be played back and the data contained in that VOBU, and instructs optical pickup 130 to read out that data. Playlist playback control section 164 plays back the scenes of the moving picture in the order specified by the user; each scene is managed, for example, in units of VOBUs.
MPEG2-PS decoder 171 (hereinafter "decoder 171") includes: program stream decomposition section 114; video expansion section 111; and audio expansion section 113. Program stream decomposition section 114 (hereinafter "PS decomposition section") separates video data and audio data from VR standard stream 10. Video expansion section 111 decodes the video data, compression-coded according to the MPEG-2 standard, in accordance with that standard, and outputs it as a video signal. Likewise, audio expansion section 113 decodes the audio data, compression-coded according to the MPEG-1 Audio standard, in accordance with that standard, and outputs it as an audio signal.
First, the general playback operation of data processing device 30 is described. When data processing device 30 plays back the recorded VR standard stream 10, reading of data from optical disc 131 and decoding (playback) of the read data are performed in parallel. At this time, the operation is controlled so that the data read-out rate is higher than the maximum data playback rate, ensuring that the data to be played back does not run out. As a result, as long as playback of VR standard stream 10 continues, extra data to be played back is secured per unit time, by the difference between the data read-out rate and the maximum data playback rate. By playing back this extra secured data during periods when pickup 130 cannot read data (for example, during a seek operation), data processing device 30 can play back VR standard stream 10 without interruption.
For example, suppose the data read-out rate of playback section 121 is 11.08 Mbps, the maximum data playback rate of PS decomposition section 114 is 10.08 Mbps, and the maximum travel time of the pickup is 1.5 seconds. To play back VR standard stream 10 without interruption, extra data of 15.12 Mbit is needed to cover the movement of pickup 130. To secure this data, continuous reading for 15.12 seconds is needed — that is, the time obtained by dividing 15.12 Mbit by the difference between the data read-out rate of 11.08 Mbps and the maximum data recording/playback rate of 10.08 Mbps. Since during those 15.12 seconds of continuous reading a maximum of 167.53 Mbit (i.e., 16.62 seconds' worth of playback data) is read, continuous data playback can be guaranteed by securing continuous data areas of 16.62 seconds (about 17 seconds) or more. There may be some defective logical blocks partway through a continuous data area; in that case, however, the continuous data area must be secured somewhat larger than 16.62 seconds' worth of playback time, anticipating the read-out time needed to skip over the defective logical blocks during playback.
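The arithmetic above can be reproduced directly; all the rates and the seek time are the example values assumed in the text:

```python
read_rate = 11.08e6     # bits/s, drive read-out rate
play_rate = 10.08e6     # bits/s, maximum playback rate
seek_time = 1.5         # s, maximum pickup travel time

# Data consumed by the decoder while the pickup is seeking:
buffer_bits = play_rate * seek_time            # 15.12 Mbit of surplus needed

# Time to accumulate that surplus at the rate difference of 1 Mbps:
fill_time = buffer_bits / (read_rate - play_rate)   # 15.12 s of reading

# Data read during that continuous read, expressed as playback time:
area_bits = read_rate * fill_time
print(round(area_bits / 1e6, 2), round(area_bits / play_rate, 2))
# ≈ 167.53 Mbit, ≈ 16.62 s of playback data
```

This is why the continuous data area is sized at about 17 seconds: 16.62 seconds is the exact minimum under these assumed rates, plus a margin.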
Next, the specific playback operation of data processing device 30 is described. First, the operation of data processing device 30 when playing back video and audio sequentially from the beginning of VR standard stream 10 is described.
Playback control section 162 identifies the VOBUs to be played back, and controls optical pickup 130 so that data is read out in order from the beginning. PS decomposition section 114 separates the VR standard stream 10 reproduced via pickup 130 and playback section 121 into video data and audio data. Video expansion section 111 and audio expansion section 113 decode the video data and audio data, respectively; as a result, the video based on the resulting video signal is displayed on video display section 110, and the audio based on the audio signal is output to audio output section 112.
Next, the operation of data processing device 30 when playing back the VR standard stream 10 recorded on optical disc 131 based on a "playlist", which specifies the VOBU playback order desired by the user, is described.
Now suppose that a certain part of the playlist specifies that VOBU#k (k ≠ i+1) be played back after VOBU#i has been played back. Playlist playback control section 164 first controls optical pickup 130 to read out VOBU#i. PS decomposition section 114 separates the data of VOBU#i, reproduced via pickup 130 and playback section 121, into video data and audio data, and they are decoded and output. At this time, if data is written in the user data area of the video pack at the beginning of VOBU#i, that data is ignored, since it is not audio data corresponding to the video of VOBU#i.
When the data of VOBU#i has been read to the end, playlist playback control section 164 controls optical pickup 130 so as to read out the data in the user data area of the video pack at the beginning of the subsequent VOBU#(i+1). Since this data is the separately stored data of the audio corresponding to the video contained in VOBU#i, audio expansion section 113 decodes this separately stored data after decoding the audio data within VOBU#i, and outputs it as audio. Thereafter, based on an instruction from playlist playback control section 164, the data of VOBU#k, the next playback target, is read out; PS decomposition section 114 obtains the data of VOBU#k via playback section 121, decodes it and outputs it.
Since an RDI pack is placed at the beginning of each VOBU and a video pack is placed at the position immediately after it, the separately stored data in the video pack at the beginning of the subsequent VOBU can be read out easily and quickly. The same applies when the separately stored data is recorded across more than one video pack near the beginning of the VOBU. Since data processing device 30 also reads out this separately stored data during playback and thereby obtains all of the audio data corresponding to the video contained in the VOBU, the audio is played back without being cut off partway. Instead of storing the separately stored data of audio data A0(i) in the user data of the video pack at the beginning of VOBU#(i+1), it may be multiplexed into VOBU#i as a private stream.
Data processing device 30 can also output the recorded data without performing the stream separation and decoding described above. That is, conversion section 141 converts the read-out VR standard stream 10 into a prescribed format (for example, the DVD-Video standard format), and output interface section 140 outputs the converted stream. At this time, by also reading out the data in the user data area of the video pack at the beginning of the subsequent VOBU, in addition to the data of the VOBUs of the VR standard stream 10 to be read, playback without the audio being cut off partway can be performed at the device at the output destination. Output interface section 140 is, for example, an interface conforming to the IEEE 1394 standard, and can control data read-out from an external device and data write processing from an external device.
Embodiment 2 and the subsequent embodiments are various modifications related to the recording/playback operation of the data processing device 30 of the present embodiment. Unless otherwise noted, each component of the data processing device 30 described in Embodiment 1 is assumed to have the same function in the following embodiments.
(Embodiment 2)
In Embodiment 1, it is assumed that one corresponding pair of a video stream and an audio stream is stored in VR standard stream 10, and that the audio data that could not be stored in the same VOBU as the video data (the separately stored data) is copied into (a video pack of) the following VOBU.
In the present embodiment, in addition to each corresponding video stream and audio stream, another audio stream that copies the data of that audio stream is also recorded. The recording operation of the present embodiment is described in detail below.
Fig. 9 shows the relationship between VOBUs, a video stream and audio streams according to the present embodiment. This VR standard stream 10 is defined as one MPEG file as in Embodiment 1, but differs from Embodiment 1 in that two audio streams are multiplexed. The audio stream corresponding to the video stream is here assumed to be "audio stream #0". Separately stored data exists in audio stream #0.
PS assembling section 104 records a copy of the data of audio stream #0 onto optical disc 131 as another audio stream #1. More specifically, PS assembling section 104 generates the audio packs of audio stream #1 by copying the data of audio stream #0 corresponding to the video contained in VOBU#i, and multiplexes these audio packs into VOBU#i of VR standard stream 10. Audio streams #0 and #1 can each be identified by the stream ID written in the packet header of each pack. The amount of copied data must satisfy the restriction of the allowable range of the audio buffer of the system target decoder (P-STD) for program streams. In Fig. 9, copies of the data A0(i), A0(i+1), A0(i+2), etc. constituting audio stream #0 are stored as A1(i), A1(i+1), A1(i+2), etc.
However, since audio stream #0 and audio stream #1 are assumed to have the same bit rate, the copy data of A0(i) cannot necessarily be stored entirely within VOBU#i. Only when the total playback time of the video frames in VOBU#i equals the total transfer time of the data of VOBU#i (the difference between the SCR value at the beginning of VOBU#i and the SCR value at the beginning of VOBU#(i+1)) can the copy data of A0(i) be stored exactly.
However, in order to play back the audio corresponding to the video without interruption once reading of VOBU#i is finished, it is necessary to obtain as much of the audio data corresponding to that video as possible. Therefore, PS assembling section 104 generates the SCR and PTS for audio stream #1 by modifying the MPEG-standard SCR and PTS carried in the audio packs of audio stream #0. That is, when storing packs representing the same audio data, PS assembling section 104 sets the SCR and PTS values carried in the packs of audio stream #1 to values smaller by a prescribed amount than the SCR and PTS values carried in the packs of audio stream #0. The smaller the SCR and PTS become, the earlier the position in VR standard stream 10 at which the pack can be placed for read-out. As a result, more of the data that in Embodiment 1 would have been separately stored data in VOBU#(i+1) can be stored within VOBU#i.
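The timestamp shift can be sketched as below. This is an illustrative model under our own assumptions: timestamps are plain integers, and the function name and tick values are not from the patent.

```python
def make_shifted_copy(audio_packs, offset):
    """audio_packs: list of (scr, pts, payload) tuples for stream #0.
    Returns the stream #1 copies with both timestamps reduced by `offset`,
    so the multiplexer can schedule them earlier in the program stream."""
    return [(scr - offset, pts - offset, payload)
            for scr, pts, payload in audio_packs]

stream0 = [(1000, 1200, b"f0"), (1900, 2100, b"f1")]
stream1 = make_shifted_copy(stream0, offset=600)
print(stream1[0])  # (400, 600, b'f0') — scheduled 600 ticks earlier
```

The fixed `offset` here corresponds to the prescribed amount recorded in the user data area so that the playback side can undo the shift.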
PS assembling section 104 records data indicating the prescribed amount by which the SCR and PTS were reduced, for example in the user data area 42 of the video pack placed at the beginning of VOBU#i.
Next, the playback operation of data processing device 30 according to the present embodiment is described. Since the following description is particularly effective for playback based on a playlist, that case is described as an example.
When decoding the video of VOBU#i recorded on optical disc 131, playlist playback control section 164 decodes stream #1 rather than stream #0. This is because, compared with stream #0, more of the audio data corresponding to the video data stored in VOBU#i comes from stream #1.
However, it is necessary to account for the time offset with which the copy data of audio stream #1 was recorded relative to audio stream #0. For the reason described above, the SCR and PTS of each audio pack of stream #1 are set to values smaller than those of stream #0, and therefore cannot be used as-is for playback synchronized with the video. Therefore, PS decomposition section 114 reads the playback-time offset from the user data area 42 set in the video pack at the beginning of VOBU#i and adds this value to the PTS — that is, delays the playback time — before playing back the audio. In this way, video and audio can be played back synchronously.
The difference between the PTS of the audio frame AF#0 of audio stream #0 synchronized with the video frame at the beginning of VOBU#i and the PTS of the audio frame containing the copy data of AF#0 may, for example, be recorded in the management information file for the moving picture stream file "VR_MOVIE.VRO". The difference may also be recorded in the manufacturer-specific data area within the RDI pack of each VOBU. With this, when playing back VOBU#i, the playback control section subtracts the difference from the time stamp of the video frame at the beginning of the VOBU, and can play back the audio frames of audio stream #1 from the subtraction result onward.
The playback-time offset may be recorded in the manufacturer-specific data area within the RDI pack of each VOBU.
When the recorded moving picture file is played back by playback application software on a PC connected to an optical disc drive, audio stream #0 is played back. That is, when the moving picture file is played back as a general MPEG file, audio stream #0 is used.
Even when not all of the audio data corresponding to each VOBU is contained in it, the amount of separately stored data generated for audio stream #0 can be kept small to some extent, so nearly seamless playback of the audio can be achieved in playback based on a playlist.
Information about the recorded content of audio stream #1 may be recorded separately. For example, a flag indicating that copy data of audio stream #0 is stored in audio stream #1 may be recorded in the management information file for the moving picture stream file "VR_MOVIE.VRO". This flag is desirably recorded at least in units of VOBs; it may also be recorded within the moving picture stream VOB, within audio stream #1, or elsewhere. With this flag, it can be distinguished whether audio stream #1 stores other audio different from audio stream #0 or stores a copy of audio stream #0.
(Embodiment 3)
In Embodiment 1, the separately stored data is stored in the user data area 42 within a video pack.
In the present embodiment, data processing device 30 records the separately stored data as a separate file, different from the MPEG file defining VR standard stream 10.
Fig. 10 shows the relationship between VOBUs, a video stream and an audio stream according to the present embodiment. When generating VOBU#i, PS assembling section 104 identifies the separately stored data associated with that VOBU and generates audio data #i, which copies that separately stored data. PS assembling section 104 physically interleaves this audio data with the VOBUs constituting VR standard stream 10; the audio data and the VOBUs are recorded as one audio file and one MPEG file, respectively. PS assembling section 104 interleaves audio data #i immediately after VOBU#i.
On the other hand, in playback based on a playlist, even when the playlist specifies playback of VOBU#k (k ≠ i+1) after playback of VOBU#i, playlist playback control section 164 reads not only VOBU#i but also the subsequent audio data #i, and thereafter reads the data of VOBU#k, the next playback target. After the data is separated into video data and audio data by PS decomposition section 114, video expansion section 111 and audio expansion section 113 decode and output the video data and audio data. In particular, audio expansion section 113 decodes and plays back the audio data #i contained in the audio data file after decoding and playing back the audio data of the audio packs contained in VOBU#i.
Since the audio data corresponding to the separately stored data is stored immediately after the VOBU being played back, continuous read-out of this audio data can be achieved easily and quickly. Since data processing device 30 also reads out this separately stored data during playback and thereby obtains all the audio data corresponding to the video contained in the VOBU, the audio is played back without being cut off partway.
In the present embodiment, the copy of the separately stored data is recorded after the corresponding VOBU, but it may also be recorded before the corresponding VOBU.
(Embodiment 4)
In Embodiment 3, the data processing device generates and records an audio file, different from the MPEG file, based only on the separately stored data in the audio stream. The audio data #i associated with, for example, VOBU#i is recorded after VOBU#i.
The data processing device according to the present embodiment, on the other hand, generates and records an audio file, different from the MPEG file, for the entire audio stream. Moreover, the audio data associated with each VOBU is recorded before that VOBU.
Fig. 11 shows the relationship between VOBUs, a video stream and an audio stream according to the present embodiment. When generating VOBU#i, PS assembling section 104 identifies audio data A0(i) corresponding to the video data V(i) contained in that VOBU, and generates audio data #i, which copies the data constituting audio data A0(i). PS assembling section 104 physically interleaves this audio data with the VOBUs constituting VR standard stream 10; the audio data and the VOBUs are recorded as one audio file and one MPEG file, respectively. PS assembling section 104 interleaves audio data #i immediately before VOBU#i.
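The on-disc layouts of Embodiments 3 and 4 can be contrasted with a small sketch; the function and labels are illustrative only, and "interleave" here means the physical recording order, not program-stream multiplexing.

```python
def interleave(vobus, audio_files, audio_first):
    """Return the physical recording order: each per-VOBU audio file is
    placed just before its VOBU (Embodiment 4) or just after it
    (Embodiment 3)."""
    layout = []
    for vobu, audio in zip(vobus, audio_files):
        layout += [audio, vobu] if audio_first else [vobu, audio]
    return layout

print(interleave(["VOBU#0", "VOBU#1"], ["A#0", "A#1"], audio_first=True))
# ['A#0', 'VOBU#0', 'A#1', 'VOBU#1']  — Embodiment 4 layout
```

With `audio_first=False` the same call yields the Embodiment 3 layout, `['VOBU#0', 'A#0', 'VOBU#1', 'A#1']`.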
On the other hand, in playback based on a playlist, playlist playback control section 164 instructs that audio data #i be read first, before VOBU#i is read. Thus, since the reading of audio data #i finishes, and its decoding by audio expansion section 113 finishes, before the reading of VOBU#i ends, all of the audio can be played back in synchronization with the video of VOBU#i. Therefore, even when playback of VOBU#k (k ≠ i+1) is specified afterwards, seamless playback of the audio can be achieved.
Although the present embodiment describes recording audio data #i before VOBU#i, audio data #i may also be recorded after VOBU#i, as in Embodiment 3. In that case, audio data #i is read after VOBU#i is played back and before another VOBU is read.
In Embodiments 3 and 4 above, the data structure within the audio file is not specifically addressed; it may be an audio elementary stream, an MPEG-2 program stream containing an audio stream, an MP4 stream containing an audio stream, or another system stream.
(Embodiment 5)
In Embodiment 1, the separately stored data associated with VOBU#i is stored in the next VOBU#(i+1).
In the present embodiment, on the other hand, the separately stored data associated with VOBU#i is stored within VOBU#i itself as another stream.
Fig. 12 shows the relationship between VOBUs, a video stream and an audio stream according to the present embodiment. PS assembling section 104 copies the separately stored data A associated with VOBU#i and multiplexes it into VOBU#i as a private stream dedicated to the separately stored data A.
In VR standard stream 10, stream IDs exist in order to identify the video streams and audio streams contained in the stream. The stream ID is stored in the PES packet header; for example, the stream ID of a video stream is 0xE0, and the stream ID of an audio stream is 0xC0 or 0xBD. 0xBD is the value defined by the MPEG-2 Systems standard for use in private streams. When 0xBD is used for an audio stream in the VR standard, the compression coding of that audio stream is further identified by the one byte immediately following the PES packet header. In the present embodiment, 0xBD is used as the stream ID of the newly provided private stream.
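The stream-ID dispatch described above can be sketched as follows. The specific ID values (0xE0 video, 0xC0 audio, 0xBD private_stream_1) are from the text; treating them as ranges, and the sub-stream byte convention, follow the general MPEG-2 Systems usage and are assumptions of this sketch.

```python
def classify(stream_id, first_payload_byte=None):
    """Rough classification of a PES packet by its stream_id."""
    if 0xE0 <= stream_id <= 0xEF:
        return "video"
    if 0xC0 <= stream_id <= 0xDF:
        return "audio"
    if stream_id == 0xBD:
        # For private_stream_1 the byte right after the PES header
        # identifies the actual sub-stream / compression coding.
        if first_payload_byte is None:
            return "private_stream_1"
        return "private_stream_1 (sub-stream 0x{:02X})".format(first_payload_byte)
    return "other"

print(classify(0xE0))        # video
print(classify(0xC0))        # audio
print(classify(0xBD, 0x80))  # private_stream_1 (sub-stream 0x80)
```

A demultiplexer built this way would route the 0xBD packets carrying the copied separately stored data to the audio decoder alongside stream #0.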
In playback based on a playlist, even when the playlist specifies playback of VOBU#k (k ≠ i+1) after playback of VOBU#i, the separately stored data A contained as a private stream is read continuously with audio stream #0 and played back, so that playback without the audio being cut off partway can be achieved easily.
In private information stream, just do not separate the total data that storage data A also copies audio stream by copy, it can be compound in the VOBU#i as the special-purpose private information stream of the part multichannel of separating storage data A.Figure 13 represents according to the relation between the VOBU of the variation of present embodiment, video flowing and the audio stream.
PS combination section 104 will comprise the copy record of video corresponding audio data as private information stream 1 (stream_ID=0xBD) special-purpose in the VOBU#i with VOBU#i.The buffer sizes of supposing the System Target Decoder that this private information stream uses has the size of the voice data that can accumulate 2 seconds amounts at least.Here, the meaning of " 2 seconds " is the numerical value with (1 second) addition maximum regeneration time of delay of maximum regeneration time (1 second) of the contained video of VOBU and System Target Decoder.
When regenerating based on playlist, even when playlist is specified the regeneration of VOBU#k (k ≠ (i+1)) after VOBU#i regeneration, if the voice data of regeneration copy of the voice data #0 of storage in private information stream 1 is then realized the situation of not cut away audio frequency midway easily.
As present embodiment, flow as private information by the data record that will copy audio stream, under the situation that mpeg file is edited with VOBU unit, can easily seamless reproducing audio data.Its reason is for example when the editing and processing of carrying out in conjunction with 2 VOBU, and contained private information stream is combined and obtain the separation storage data of combination in these VOBU.
(execution mode 6)
In the first example of Embodiment 5, the PS assembling section 104 stored the separately-stored data A associated with VOBU#i as a private stream within VOBU#i.
In the present embodiment, by contrast, a copy of the separately-stored data A associated with VOBU#i is recorded as additional data within the audio frames of VOBU#i.
Figure 14 shows the relation among the VOBUs, the video stream and the audio stream according to the present embodiment. The PS assembling section 104 stores a copy of the separately-stored data A into the additional-data (AD) regions of the audio frames of audio stream #0 in VOBU#i.
Figure 15 shows the data structure of an audio frame of the AC-3 standard generated by the audio compression section 103. An AC-3 audio frame consists of synchronization information (SI), bit stream information (BSI), audio blocks (ABn through ABn+5) and auxiliary data (AD).
Rate information indicating the bit rate of the audio frame is recorded in the synchronization information (SI). In the present embodiment, the bit rate of the audio frame is assumed to be 448 kbps (the frame size code indicates 448 kbps). The audio frame accordingly has the data length corresponding to the bit rate specified in the synchronization information (1792 bytes in Figure 15). However, the audio compression section 103 actually records the valid data (the synchronization information, bit stream information and audio blocks) at 256 kbps or below, deliberately leaving the auxiliary-data region free for the separately-stored data A to be recorded into later.
In this way, an auxiliary-data region is secured whose data length (768 bytes, corresponding to 192 kbps) is the difference between the 1-frame data length corresponding to the 448 kbps rate (1792 bytes) and the 1-frame data length corresponding to the 256 kbps rate (1024 bytes). The PS assembling section 104 stores the copy of the separately-stored data A shown in Figure 14 into this auxiliary-data region. The average bit rate of the audio corresponding to the separately-stored data A is assumed to be 192 kbps, i.e. no more than the difference between 448 kbps and 256 kbps.
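The frame-size arithmetic above can be checked with a short sketch. The bit rates and byte counts come from the text; the figure of 1536 samples per frame at a 48 kHz sampling rate is an assumption taken from the AC-3 frame format, and the function name is illustrative.

```python
# Sketch of the auxiliary-data sizing of Embodiment 6.
# 1536 samples per AC-3 frame at 48 kHz gives a 32 ms frame duration.

SAMPLES_PER_FRAME = 1536
SAMPLE_RATE_HZ = 48_000

def frame_bytes(bitrate_bps: int) -> int:
    """Data length of one AC-3 frame at the given bit rate."""
    return bitrate_bps * SAMPLES_PER_FRAME // SAMPLE_RATE_HZ // 8

declared = frame_bytes(448_000)   # length implied by SI: 1792 bytes
recorded = frame_bytes(256_000)   # valid data actually recorded: 1024 bytes
auxiliary = declared - recorded   # free region: 768 bytes, i.e. 192 kbps
print(declared, recorded, auxiliary)  # 1792 1024 768
```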
By reserving a free region in each audio frame of the originally recorded audio stream and copying the separately-stored data into it as described above, audio data that is not actually stored in a VOBU (the separately-stored data) can nevertheless be carried within it.
During playlist-based reproduction, when reading of a VOBU's data has finished, the PS disassembling section 114 analyzes the data stream, and the audio expansion section 113 can obtain the copy of the separately-stored data A, which could not be obtained with the data structure of the prior art. As a result, even in video scenes where the audio would normally be cut off midway, the audio can be reproduced continuously and in synchronization with the video.
Alternatively, half of the bit rate specified in the synchronization information (SI) may be used as the actual bit rate and the remaining half as the bit rate of the separately-stored data. For example, while the AC-3 audio stream is declared as 448 kbps, the actual bit rate may be set to 224 kbps and the bit rate of the separately-stored data likewise to 224 kbps. By constructing the audio frames in this way, all of the audio data of audio stream #0 can be stored in the auxiliary-data regions. The copy of the separately-stored data may take the form of consecutive audio frames conforming to the AC-3 standard, and one audio frame's worth of separately-stored data A may be recorded across the auxiliary-data regions of two AC-3 audio frames. The data structure of the separately-stored data may also be an MPEG-2 program stream containing an audio elementary stream, or some other system stream.
Although the present embodiment stores only the separately-stored data in the auxiliary-data region, the whole of audio stream #0 may be stored there if the recording space can be secured.
(execution mode 7)
In Embodiment 6, the separately-stored data A was stored in the auxiliary-data (AD) region of audio frames of the AC-3 standard. In the present embodiment, the separately-stored data A is stored in the ancillary-data (ancillary_data) region of audio frames of the MPEG-1 Audio standard. The other structures are the same as in Embodiment 6.
Figure 17 shows the data structure of an audio frame of the MPEG-1 Audio standard in the present embodiment. An audio frame of the MPEG-1 Audio standard has a header, an error check, audio data and ancillary data (ancillary_data), and the audio compression section 103 generates audio frames with the data structure shown in Figure 17.
Information indicating the bit rate, sampling frequency and layer of the audio frame is recorded in the header. In the present embodiment, these are assumed to be 384 kbps, 48 kHz and Layer II, respectively. Each audio frame then has the data length corresponding to the bit rate specified in the header. However, the audio compression section 103 actually keeps the total of the header, the error check and the audio data to the equivalent of 256 kbps or below, deliberately leaving the ancillary-data region free for the copy of the separately-stored data A to be recorded into later.
In this way, an ancillary-data region is secured whose data length (384 bytes, corresponding to 128 kbps) is the difference between the 1-frame data length corresponding to the 384 kbps rate (1152 bytes) and the 1-frame data length corresponding to the 256 kbps rate (768 bytes). The PS assembling section 104 stores the copy of the separately-stored data A shown in Figure 14 into this region. The average bit rate of the audio stored as the copy of the separately-stored data A is assumed to be less than 128 kbps.
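The same arithmetic applies here with the Layer II frame parameters. The bit rates and byte counts come from the text; the figure of 1152 samples per MPEG-1 Layer II frame is an assumption taken from that format, and the function name is illustrative.

```python
# Sketch of the ancillary-data sizing of Embodiment 7.
# 1152 samples per Layer II frame at 48 kHz gives a 24 ms frame duration.

SAMPLES_PER_FRAME = 1152
SAMPLE_RATE_HZ = 48_000

def frame_bytes(bitrate_bps: int) -> int:
    """Data length of one MPEG-1 Layer II frame at the given bit rate."""
    return bitrate_bps * SAMPLES_PER_FRAME // SAMPLE_RATE_HZ // 8

declared = frame_bytes(384_000)   # length implied by the header: 1152 bytes
recorded = frame_bytes(256_000)   # header + error check + audio data: 768 bytes
ancillary = declared - recorded   # free region: 384 bytes, i.e. 128 kbps
print(declared, recorded, ancillary)  # 1152 768 384
```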
By reserving a free region in each audio frame of the originally recorded audio stream and copying the separately-stored data into it as described above, audio data that is not actually stored in a VOBU (the separately-stored data) can nevertheless be carried within it.
During playlist-based reproduction, when reading of a VOBU's data has finished, the PS disassembling section 114 analyzes the data stream, and the audio expansion section 113 can obtain the copy of the separately-stored data A, which could not be obtained with the data structure of the prior art. As a result, even in video scenes where the audio would normally be cut off midway, the audio can be reproduced continuously and in synchronization with the video.
Although the present embodiment stores only the separately-stored data in the ancillary-data region, the whole of audio stream #0 may be stored there if the recording space can be secured.
Further, the copy of the separately-stored data may take the form of consecutive audio frames conforming to the MPEG-1 Audio standard, and one audio frame's worth of separately-stored data A may be recorded across the ancillary-data regions of two MPEG-1 Audio frames. The data structure of the separately-stored data may also be an MPEG-2 program stream containing an audio elementary stream, or some other system stream.
The embodiments described so far have addressed how to record and reproduce the copy of the separately-stored data, or the copy data of the whole of audio stream #0. However, the data processing device 30 may instead perform no special processing at recording time and simply read the separately-stored data itself at reproduction time. Specifically, when the playlist specifies that VOBU#k (k ≠ i+1) be reproduced after VOBU#i, the playlist reproduction control section 164 must read the separately-stored data after reading the data of VOBU#i, and only then may it start reading VOBU#k. This makes redundant recording of the separately-stored data unnecessary while still enabling seamless audio reproduction. However, since up to 1 second's worth of program stream may then have to be read under the MPEG-2 standard, seamless reproduction of the video may become difficult. It is therefore desirable, when generating a program stream for this case, to generate as little separately-stored data as possible.
To construct VOBUs that contain no separately-stored data at all, the compression coding can be constrained: for example, the video expansion section 111 can generate each frame so that the video frame size in each VOBU is no more than "video bit rate / number of frames per second". No audio-related separately-stored data is then generated, because one frame's worth of audio data can be transferred within each video frame period. Note, however, that limiting the data size of I (intra) frames in this way may cause a reduction in picture quality.
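The per-frame bound quoted above is a one-line calculation. This is a minimal sketch with an assumed example bit rate and frame rate; the function name and the example figures are illustrations, not values from the text.

```python
# Sketch of the frame-size bound "video bit rate / frames per second":
# if every coded video frame stays within this bound, each frame can be
# transferred within one frame period, leaving room alongside it for the
# audio of the same period, so no separately-stored data arises.

def max_frame_bytes(video_bitrate_bps: int, frames_per_second: int) -> int:
    """Upper bound on one coded video frame, per the rule in the text."""
    return video_bitrate_bps // frames_per_second // 8

# Example with assumed figures: a 10 Mbps stream at 30 frames per second.
bound = max_frame_bytes(10_000_000, 30)
print(bound)  # 41666
```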
Alternatively, the audio expansion section 113 may compression-code the audio data under a constraint that the separately-stored data contain no more than a prescribed number of audio frames (for example, 4 frames).
In this specification, the VR-standard stream, which is a program stream, has been used as an example; however, an MPEG-1 system stream or an MPEG-2 transport stream may also be used. The transport stream may be in a format conforming to a digital television broadcasting standard that uses transport streams, or to digital data broadcasting that uses transport streams. When a transport stream is used, transport stream packets are utilized. The "pack" referred to herein is one known exemplary form of a packet.
Although the VR-standard stream, which is a program stream, has been used as an example, a data stream based on the ISO Base Media File Format specified in ISO/IEC 14496-12 may also be used.
Although the recording medium has been described as a phase-change optical disc, other disc-shaped recording media may also be used, for example optical discs such as Blu-ray Discs, DVD-RAM, DVD-R, DVD-RW, DVD+RW, MO, CD-R and CD-RW, as well as hard disks. The medium may also be a semiconductor memory such as a flash memory. Correspondingly, although the read/write head has been described as an optical pickup, it may be, for example, a pickup together with a magnetic head when the recording medium is an MO, or a magnetic head when it is a hard disk.
This specification has described techniques for reproducing audio without cutting it off midway during playlist reproduction. Strictly speaking, however, there can be cases in which audio data is absent for no more than the duration of one audio frame. This is because, when VOBU#k (k ≠ i+1) is read immediately after VOBU#i, the video frame period and the audio frame period differ slightly and cannot be perfectly synchronized. To prevent this loss of less than one frame of audio data, the data of one additional audio frame following the separately-stored data may be included (Embodiments 1, 3, 4 and 5). Although extra audio data is thereby included, the extra portion need not be reproduced.
As the audio compression scheme in Embodiments 1 through 5, MPEG-1 Audio, MPEG-2 Audio, AAC, AC-3 and the like can generally be used. In the case of AC-3, the audio data may be stored in a VOBU as a private stream (stream_id = 0xBD), as shown in Figure 16(a). In that case it must be distinguishable from other streams that also use a private stream, such as the private stream 1 of Embodiment 5 that stores the separately-stored data. The PS assembling section 104 therefore provides a 1-byte sub-stream ID (0x80) immediately following the PES packet header so that the stream can be identified. Figure 16(a) shows the data structure of an audio pack that has sub-stream ID 0x80 and contains AC-3 data.
To distinguish the private stream described in Embodiment 5 from the private stream used for AC-3, a sub-stream ID of a different value can be used. Figure 16(b) shows the data structure of an audio pack that has sub-stream ID 0xFF and contains the data. This value (0xFF) is a value specified in the DVD Audio standard.
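The two-level identification (stream ID, then the 1-byte sub-stream ID following the PES packet header) can be sketched as follows, using only the values named in the text and in Figures 16(a) and 16(b); the function name is an assumption.

```python
# Sketch of telling the AC-3 private stream apart from the private stream
# of Embodiment 5 by the sub-stream ID that follows the PES packet header.

PRIVATE_STREAM_1 = 0xBD  # stream ID shared by both private streams

def classify_private(stream_id: int, sub_stream_id: int) -> str:
    if stream_id != PRIVATE_STREAM_1:
        return "not a private stream"
    if sub_stream_id == 0x80:
        return "AC-3 audio data"           # Figure 16(a)
    if sub_stream_id == 0xFF:
        return "separately-stored data"    # Figure 16(b)
    return "unknown sub-stream"

print(classify_private(0xBD, 0xFF))  # separately-stored data
```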
The separately-stored data in Embodiments 5 through 7 may be an elementary stream, or may be a copy that includes the PES packet headers. The above description has not specified with which VOBU the two audio frames near a VOBU boundary should be reproduced in synchronization; for example, audio frames whose PTS is at or after that of a video frame may be regarded as corresponding to the same VOBU. Also, although the MPEG-2 video stream has been cited as an example of the video data in the present embodiment, other compression coding formats such as the MPEG-4 video stream and the MPEG-4 AVC video stream may also be used.
(Industrial Applicability)
According to the present invention, a recording device can be obtained that, even when a given data unit (for example, a VOBU) does not contain all of the audio data corresponding to its video, records copy data of at least the audio data that is not contained at a position that is easy to access together with that data unit (for example, at the beginning of the next VOBU, or immediately before or immediately after that VOBU).
In this way, particularly when video and audio are reproduced in synchronization based on a playlist, all of the audio data that should be reproduced in synchronization can be obtained when the data unit containing the video data is accessed. A reproduction device can therefore be obtained that greatly reduces interruption of the audio when crossing scene boundaries, providing the user with an improved audiovisual environment.

Claims (18)

1. A data processing device comprising:
a signal input section which receives a video signal and an audio signal;
a compression section which generates video data and audio data by compression-coding said video signal and said audio signal;
a stream assembling section which generates a plurality of packets by dividing said video data and said audio data, generates a plurality of data units in which video packets relating to said video data and audio packets relating to said audio data are multiplexed, and generates a data stream containing a plurality of said data units; and
a recording section which records said data stream onto a recording medium;
wherein said stream assembling section determines, based at least on video reproduction time, the video packets and audio packets to be contained in each data unit, and, when not all of the audio data corresponding to the video data stored in a predetermined data unit is contained in said predetermined data unit, includes in said data stream copy data obtained by copying at least the portion of said audio data that is not contained.
2. The data processing device according to claim 1, wherein said stream assembling section stores the copy data corresponding to said data unit at least in the video packet placed at the beginning of the succeeding data unit.
3. The data processing device according to claim 1, wherein said stream assembling section stores the corresponding copy data within said data unit itself.
4. The data processing device according to claim 1, wherein said stream assembling section stores said copy data in a dedicated audio stream within said data stream.
5. The data processing device according to claim 1, wherein said stream assembling section stores said copy data in a dedicated private data stream within said data stream.
6. The data processing device according to claim 1, wherein said stream assembling section includes, in said predetermined data unit, copy data obtained by copying all of the audio data synchronized with said video data.
7. The data processing device according to claim 6, wherein said stream assembling section stores said copy data in a dedicated private data stream within said data stream.
8. The data processing device according to claim 1, wherein said stream assembling section stores copy data obtained by copying all of the audio data synchronized with said video data into a dedicated audio stream within said data stream.
9. The data processing device according to claim 1, wherein said stream assembling section stores copy data obtained by copying all of the audio data synchronized with said video data into a dedicated audio stream within said data stream, and records, as transfer time information indicating the transfer time of said copy data, a transfer time set earlier by a prescribed time than the transfer time in the data unit of the copy source.
10. The data processing device according to claim 1, wherein said stream assembling section generates said data stream as a first file containing said plurality of data units and a second file containing said copy data; and
said recording section records said data units and said copy data continuously onto said recording medium.
11. The data processing device according to claim 10, wherein said stream assembling section generates said second file from copy data obtained by copying all of the audio data corresponding to said video data.
12. The data processing device according to claim 1, wherein:
said audio data has a data length according to a first rate;
said compression section compression-codes said audio signal at a second rate smaller than said first rate and stores the result as said audio data; and
said stream assembling section stores said copy data into a free region corresponding to the difference between a first data length of said audio data defined with respect to said first rate and a second data length defined with respect to said second rate.
13. A data processing method comprising:
a step of receiving a video signal and an audio signal;
a step of generating video data and audio data by compression-coding said video signal and said audio signal;
a step of generating a plurality of packets by dividing said video data and said audio data, generating a plurality of data units in which video packets relating to said video data and audio packets relating to said audio data are multiplexed, and generating a data stream containing a plurality of said data units; and
a step of recording said data stream onto a recording medium;
wherein the step of generating said data stream determines, based at least on video reproduction time, the video packets and audio packets to be contained in each data unit, and, when not all of the audio data corresponding to the video data stored in a predetermined data unit is contained in said predetermined data unit, includes in said data stream copy data obtained by copying at least the portion of said audio data that is not contained.
14. The data processing method according to claim 13, wherein the step of generating said data stream stores the copy data corresponding to said data unit at least in the video packet placed at the beginning of the succeeding data unit.
15. The data processing method according to claim 13, wherein the step of generating said data stream includes, in said predetermined data unit, copy data obtained by copying all of the audio data corresponding to said video data.
16. The data processing method according to claim 13, wherein the step of generating said data stream generates said data stream as a first file containing said plurality of data units and a second file containing said copy data; and
said recording step records said data units and said copy data continuously onto said recording medium.
17. The data processing method according to claim 16, wherein the step of generating said data stream generates said second file from copy data obtained by copying all of the audio data corresponding to said video data.
18. The data processing method according to claim 13, wherein:
said audio data has a data length according to a first rate;
the step of generating said audio data generates said audio data by compression-coding said audio signal; and
the step of generating said data stream, for the audio data contained in said predetermined data unit, generates said audio data with said first rate, which is larger than a second rate, set as its rate information, and stores said copy data into a free region corresponding to the difference between a first data length of said audio data defined with respect to said first rate and a second data length defined with respect to said second rate.
CNB2004800003767A 2003-03-06 2004-03-03 Data processing device and method thereof Expired - Fee Related CN100536554C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2003059931 2003-03-06
JP059931/2003 2003-03-06
JP118252/2003 2003-04-23

Publications (2)

Publication Number Publication Date
CN1698370A true CN1698370A (en) 2005-11-16
CN100536554C CN100536554C (en) 2009-09-02

Family

ID=35350224

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2004800003767A Expired - Fee Related CN100536554C (en) 2003-03-06 2004-03-03 Data processing device and method thereof

Country Status (1)

Country Link
CN (1) CN100536554C (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102227773A (en) * 2008-11-26 2011-10-26 韩国科亚电子股份有限公司 Recording multimedia data
CN103503449A (en) * 2011-04-28 2014-01-08 松下电器产业株式会社 Video processing device and video processing method
US9357200B2 (en) 2011-04-28 2016-05-31 Panasonic Intelectual Property Management Co., Ltd. Video processing device and video processing method
CN103503449B (en) * 2011-04-28 2016-06-15 松下知识产权经营株式会社 Image processor and image treatment method

Also Published As

Publication number Publication date
CN100536554C (en) 2009-09-02

Similar Documents

Publication Publication Date Title
CN1195301C (en) Recording/reproducing apparatus and method, and recording medium
CN1278332C (en) DVD disc, device and method for reproducing same
CN1197388C (en) Device and method of transmission stream recording and recurrence and program recording medium
CN1247019C (en) Transmission data stream recording/reproducing device and method, and program/data record medium
CN1129318C (en) Data coding/decoding method and equipment and record medium for coded data
EP1422710B1 (en) Apparatus and method for recording information on information recording medium
TW425544B (en) Information storage and information transmission media with parental control
CN1305039C (en) Signal record/reproduce device and method
CN1244106C (en) Information recording device and method, and information recording medium recorded with recording control program
CN1666514A (en) Data processing device
CN1516159A (en) Information recording medium, its recording and playback device and method
CN1175056A (en) Information recording medium and recording and reproducting device thereof
CN1516963A (en) Data recording device and method, program storage medium and program
CN1808615A (en) Recording and reproduction device, and editing method
CN1833439A (en) Data processing device and data processing method
CN1662983A (en) Optical disk equipment and optical disk recording method
CN1489865A (en) Information processing method and apparatus, programme storage medium, programme and information recording medium
CN1729689A (en) Data stream format conversion method and recording method for the same
CN1187750C (en) Information record device and method, and information record medium with recorded record controlling program
JPWO2004080071A1 (en) Data processing device
CN1266932C (en) Regeneration device
CN1833285A (en) Data processing apparatus
KR20080044638A (en) Photographing apparatus for automatically setting up mode and method thereof
CN1957609A (en) Data processor
CN1269127C (en) Informaiton recording device and method, and information recording medium recorded with recording control program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: INTELLECTUAL PROPERTY BRIDGE NO. 1 CO., LTD.

Free format text: FORMER OWNER: MATSUSHITA ELECTRIC INDUSTRIAL CO, LTD.

Effective date: 20140210

CP03 Change of name, title or address

Address after: Osaka Japan

Patentee after: Matsushita Electric Industrial Co.,Ltd.

Address before: Osaka Japan

Patentee before: Matsushita Electric Industrial Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20140210

Address after: Tokyo, Japan

Patentee after: Godo Kaisha IP Bridge 1

Address before: Osaka Japan

Patentee before: Matsushita Electric Industrial Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090902

Termination date: 20150303

EXPY Termination of patent right or utility model