WO2021149892A1 - Apparatus and method for recording video data - Google Patents

Apparatus and method for recording video data

Info

Publication number
WO2021149892A1
Authority
WO
WIPO (PCT)
Prior art keywords
bitstream
frame
updated
idr
period
Prior art date
Application number
PCT/KR2020/013182
Other languages
French (fr)
Inventor
Sun Young Lee
Original Assignee
Atins Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Atins Inc.
Publication of WO2021149892A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/177 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a group of pictures [GOP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/40 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/27 Server based end-user applications
    • H04N21/274 Storing end-user multimedia data in response to end-user request, e.g. network recorder
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources

Definitions

  • the present invention relates to an apparatus and a method for recording video data.
  • CCTV Closed-circuit television
  • CCTV systems installed for video surveillance and security purposes in governments and public institutions are increasing in number, and such CCTV systems are being expanded to various locations such as intersections and alleys in residential areas to collect traffic information or to secure social safety nets.
  • the CCTV system includes a camera group 110 consisting of IP cameras installed at each designated location as shown in FIG. 1, a video control device 120, and a video storage device 130.
  • the video footage captured by each IP camera may be monitored at the control center 140 located at a remote location.
  • the video control device 120 may be, for example, a device that monitors and manages digital video footage captured by a plurality of IP cameras.
  • the video storage device 130 may be a device for video recording, monitoring, and event handling of IP cameras connected through a network.
  • each IP camera is connected to the video control device 120, which is, for example, a Video Management System (VMS), through a wired or wireless communication network, and the video control device 120 receives the video footage captured by each IP camera (i.e., the video footage of each channel) in the form of a bitstream and provides it to the control center 140.
  • VMS Video Management System
  • the video control device 120 provides the received bitstream to the video storage device 130, for example, an NVR (Network Video Recorder) device, to store the bitstream, and when the control center 140 requests a certain bitstream, retrieves the requested bitstream from the video storage device 130 and provides it to the control center 140.
  • NVR Network Video Recorder
  • the video storage device 130 receives and stores a bitstream.
  • the video storage device 130 may be connected to a monitor to check whether a bitstream is seamlessly being received.
  • the video storage device 130 may include a decoder capable of outputting video footage corresponding to the bitstream on the monitor.
  • the video storage device 130 is set to store and maintain the received bitstream for a predetermined period (e.g., 30 days).
  • FHD Full high definition
  • a large storage space is required.
  • about 2 GB of storage space is required to store one hour length video footage at 30 fps (frames per second) in FHD. Therefore, about 1.44 TB storage space is required to store 24 hour length video data of one single channel for 30 days.
  • at least 1,152 TB of storage space must be provided in the video storage device 130.
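  • the storage figures above can be reproduced with a short calculation. The following is a minimal sketch that uses decimal units (1 TB = 1000 GB); the constant and function names are illustrative, and the channel count is not stated in this text: it is only the number implied by dividing 1,152 TB by the per-channel requirement.

```python
GB_PER_HOUR = 2        # FHD at 30 fps, figure quoted in the text
HOURS_PER_DAY = 24
RETENTION_DAYS = 30    # example retention period from the text


def storage_per_channel_tb(gb_per_hour=GB_PER_HOUR, days=RETENTION_DAYS):
    """Storage needed by one channel over the retention period, in decimal TB."""
    return gb_per_hour * HOURS_PER_DAY * days / 1000


per_channel_tb = storage_per_channel_tb()

# Assumption: the 1,152 TB total is the per-channel figure times a channel
# count. The count itself is inferred here, not taken from the text.
implied_channels = round(1152 / per_channel_tb)

print(per_channel_tb, implied_channels)
```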
  • the present invention is to provide apparatus and method for recording video data capable of significantly reducing the cost of building CCTV system by effectively compressing, storing, and using vast amounts of data for high-resolution and high-definition video footage without any video quality degradation.
  • apparatus for recording video data includes a transcoder, configured for transcoding at least a portion of a first bitstream provided by a network video transmitter to process a second bitstream having an updated IDR (Instantaneous Decoder Refresh) period that has a time length different from that of an original IDR of the first bitstream, and a storage unit, configured for storing the second bitstream is provided.
  • IDR Instantaneous Decoder Refresh
  • the transcoder designates at least one of the Intra frames as a transcoding target frame and changes it into an Inter frame in order for the second bitstream to have the updated IDR period, wherein the updated IDR period is applied equally throughout the second bitstream, or is determined by interpreting the first bitstream in accordance with a predetermined bitstream characteristic analysis rule.
  • the transcoder is configured for designating Intra frames belonging to the second GOP to the last GOP within each updated IDR period in the first bitstream as the transcoding target frames, wherein the Intra frames belonging to each of a plurality of GOPs in each updated IDR period are processed to have a predetermined reference relationship in the second bitstream, wherein the predetermined reference relationship is any one of IPPP structure, IBBB structure, and IBBP structure.
  • the time length of the updated IDR period is calculated by an equation that is predefined to have the number of channels that provide bitstream to be stored in the storage unit, and a total storage capacity of the storage unit, a size of the free storage capacity of the storage unit, and a time period during which the bitstream is stored and maintained in the storage unit as factors of consideration.
  • the predetermined bitstream characteristic analysis rule for adaptively applying the updated IDR period is, for reconstructed pixels of the Intra frames belonging to the second GOP to the last GOP among a plurality of GOPs in the first bitstream, to calculate a difference value between corresponding reconstructed pixels in Intra frame of the first GOP or a reference frame, to specify the Intra frame as a transcoding target frame only when the difference value between frames is less than a predetermined threshold value, and to change the Intra frame into Inter frame.
  • the transcoder changes the Intra frame designated as the transcoding target frame into the Inter frame by a lossless coding technique.
  • a flag value in at least one of SPS (Sequence Parameter Set), PPS (Picture Parameter Set), PH (Picture Header), and CU syntax is adjusted in order for an encoder of the transcoder to operate in the lossless coding technique.
  • a computer program stored on a computer-readable medium for executing a method for storing video data
  • the computer program causes a computer to execute the following steps, the steps including (a) receiving a first bitstream from a network video transmitter, and (b) transcoding at least a portion of the first bitstream to process a second bitstream having an updated IDR (Instantaneous Decoder Refresh) period that has a time length different from that of an original IDR of the first bitstream, wherein at least one of the Intra frames is designated as a transcoding target frame and changed into an Inter frame in order for the second bitstream to have the updated IDR period, wherein the updated IDR period is applied equally throughout the second bitstream, or is determined by interpreting the first bitstream in accordance with a predetermined bitstream characteristic analysis rule.
  • Intra frames belonging to the second GOP to the last GOP within each updated IDR period in the first bitstream are designated as the transcoding target frames, wherein the Intra frames belonging to each of a plurality of GOPs in each updated IDR period are processed to have a predetermined reference relationship in the second bitstream, wherein the predetermined reference relationship is any one of IPPP structure, IBBB structure, and IBBP structure.
  • the time length of the updated IDR period is calculated by an equation that is predefined to have the number of channels that provide bitstream to be stored in the storage unit, and a total storage capacity of the storage unit, a size of the free storage capacity of the storage unit, and a time period during which the bitstream is stored and maintained in the storage unit as factors of consideration.
  • the predetermined bitstream characteristic analysis rule for adaptively applying the updated IDR period is, for reconstructed pixels of the Intra frames belonging to the second GOP to the last GOP among a plurality of GOPs in the first bitstream, to calculate a difference value between corresponding reconstructed pixels in Intra frame of the first GOP or a reference frame, to specify the Intra frame as a transcoding target frame only when the difference value between frames is less than a predetermined threshold value, and to change the Intra frame into Inter frame.
  • FIG. 1 illustrates a configuration of general CCTV system
  • FIG. 2 is a block diagram of apparatus for recording video data according to one embodiment of the present invention.
  • FIG. 3 illustrates bitstream having IPPP structure
  • FIG. 4 illustrates images selected from a surveillance video that IP camera at a fixed location has captured
  • FIG. 5 illustrates a bitstream transcoding technique according to one embodiment of the present invention.
  • FIG. 6 illustrates a structure of HEVC encoder that is reconfigured according to one embodiment of the present invention.
  • the video storage device 130 may include a decoder 210, a transcoder 220, and a storage unit 230.
  • the decoder 210 provides video footage decoded of the received bitstream to the monitor so as to check whether the bitstreams of each channel are seamlessly received from the network video transmitter (NVT) 240. An operator will be able to easily check whether the bitstream is seamlessly received by watching the video footage presented on the monitor.
  • NVT network video transmitter
  • the NVT 240 provides a bitstream to be stored in the video storage device 130, and may be, for example, one or more of the video control device 120, IP cameras, and other NVR devices. If one NVT 240 provides a high-resolution bitstream together with a low-resolution bitstream, both corresponding to video footage captured by one IP camera, the decoder 210 may decode the low-resolution bitstream and provide it to the monitor in order to reduce the decoder's processing complexity.
  • the transcoder 220 has a typical configuration that includes a decoder and an encoder. As described later, it transcodes the input bitstream so that the same quality is maintained with a relatively small volume of data, and stores the transcoded bitstream in the storage unit 230.
  • a video compression format of the bitstream generated and transmitted by the IP camera includes an intra (I) frame and inter frames per second in an IPPP structure as illustrated in FIG. 3A.
  • I frame is an IDR (Instantaneous Decoder Refresh) frame and has a function to clear a DPB (Decoded Picture Buffer) in the decoder unit.
  • the IDR frame is required for random access.
  • IDR frames are arranged at predetermined time intervals (i.e., IDR periods).
  • IBBB structure IBBP structure
  • the present invention may adopt any of known video formats without any limitation.
  • one I (Intra) frame and successive 29 P (Predictive) frames are located within a time interval of 1 second.
  • I frame is required to reduce image error propagation due to camera movement or scene change, or to enable random access. Accordingly, in the case of encoding general content such as a drama or a movie, it is encoded so that one I frame per second is included.
  • the compressed I frame has a significantly larger volume of data than that of the compressed P frame. For example, when encoding 30 frames per second, the similarity of successive frames in time is very high, thus P frame that is encoded with a difference between the current frame and the preceding reference frame has a compression performance relatively higher than I frame.
  • FIG. 4 illustrates four snap shots selected from surveillance video footage captured by an IP camera at a fixed location. For reference, these snap shots were selected at intervals of about 30 seconds to 1 minute from surveillance video footage captured by the IP camera installed in an intersection area.
  • surveillance video footages captured by IP cameras show that only some moving objects change their position and shape over time, but the background that occupies most of scene remains still or not much changed. This is a distinctive characteristic in video footage captured by IP camera installed at a fixed location to continuously shoot a designated area.
  • IP cameras not only have few camera movements or scene changes, but also have significantly lower need for random access to specific time points of images taken by IP cameras.
  • by reflecting the characteristics of the video footage generated by the CCTV system, the video storage device 130 according to one embodiment operates to significantly reduce the volume of data of the bitstream stored in the storage unit 230.
  • FIG. 5 illustrates a bitstream transcoding technique by the transcoder 220 of the video storage device 130.
  • the video compression format of the bitstream corresponding to video footage captured by IP camera has IPPP structure in which I frame (i.e., IDR frame) is placed first at each predetermined IDR period.
  • this specification illustrates, as an example, the case in which the reference frame for every P frame is the one immediately preceding frame, but the invention is not limited thereto; P frames may also have a plurality of reference frames that precede them in the encoding/decoding order.
  • the transcoder 220 may reduce the volume of data of bitstream to be stored in the storage unit 230 by applying a relatively long updated IDR period, and reprocessing the bitstream to have one I frame in the updated IDR period.
  • the transcoder 220 may transcode the most preceding first frame in the updated IDR period as I frame, and all other frames as P frames that are dependently encoded with reference to each preceding frame to reduce the volume of data of bitstream to be stored.
  • the transcoder 220 may reuse one I frame and a plurality of P frames belonging to the first GOP (Group of pictures; see FIG. 5(a)) of successive GOPs without reprocessing.
  • in order to reprocess the I frame belonging to the second GOP of the successive GOPs as a P frame, all frames from the most preceding I frame to the immediately preceding P frame (i.e., the last P frame of the first GOP) are decoded, and then the I frame belonging to the second GOP is encoded as a P T10 frame with reference to the immediately preceding P frame.
  • the reconstructed pixel value of the P T10 frame then differs from the reconstructed pixel value of the I frame because the I frame is converted to the P T10 frame in a lossy manner.
  • the P frame that follows can therefore no longer use the P T10 frame as its reference frame as it is.
  • the corresponding P frame must also be changed into a P T11 frame by being decoded and then encoded with reference to P T10.
  • the reprocessing of the I frame of the second GOP into a P frame and the reprocessing of the remaining P frames must be performed in the same way for all successive GOPs within the updated IDR period.
  • reprocessing every frame from the second GOP to the last GOP within the updated IDR period into a P frame that references the immediately preceding frame requires decoding and encoding each target frame separately, which remarkably increases the complexity of transcoding.
  • the video quality may also deteriorate in the process of decoding and re-encoding a plurality of frames in a lossy manner.
  • the transcoder 220 may reprocess the bitstream using distinctive characteristics of the surveillance video footage captured by the IP camera.
  • the transcoder 220 encodes the reconstructed pixel value of the I frame in a lossless manner by removing Transform, Quantization, Inverse Transform, Inverse Quantization, and In-loop Filter from the encoder, and finally generates a P frame that has the same reconstructed pixel value as the I frame (see FIG. 6(b)). This makes it possible to remarkably reduce the complexity that may occur in the transcoding process, and also solves the problem of video quality degradation.
  • the transcoder 220 designates only the I frames in the second GOP to the last GOP, out of all frames within the updated IDR period, as transcoding target frames to be changed into P frames by a lossless encoding process, and then reprocesses them into a bitstream having an IPPP encoding structure within the updated IDR period (see FIG. 5(c)).
  • the transcoder 220 decodes the I frame of the second GOP of successive GOPs belonging to the updated IDR period and encodes the reconstructed pixel value of the I frame into a P T1 frame referencing the I frame of the first GOP.
  • the reconstructed value of the I frame (that is, the pixel value to be encoded and also the original pixel value before encoding), which is the transcoding target frame, is the same as the reconstructed value of the reprocessed P T1 frame.
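  • the equality of the reconstructed values can be illustrated with a toy model. The following is a minimal sketch, assuming a zero-motion prediction in which each pixel of the target I frame is predicted from the co-located pixel of the reference frame; the function names, the four-pixel frames, and the `qstep` parameter are illustrative assumptions and do not describe the actual encoder of the transcoder 220.

```python
def encode_inter(target, reference, qstep=None):
    """Return the prediction residual of `target` with respect to `reference`.

    qstep=None models bypassing transform and quantization (lossless).
    An integer qstep models a simple lossy quantizer applied to the residual.
    """
    residual = [t - r for t, r in zip(target, reference)]
    if qstep is None:
        return residual
    return [round(d / qstep) * qstep for d in residual]


def reconstruct(residual, reference):
    """Decoder side: add the residual back onto the reference pixels."""
    return [r + d for r, d in zip(reference, residual)]


first_gop_i = [100, 102, 98, 101]    # reconstructed pixels, I frame of the first GOP
second_gop_i = [101, 102, 97, 105]   # reconstructed pixels, I frame of the second GOP

lossless = reconstruct(encode_inter(second_gop_i, first_gop_i), first_gop_i)
lossy = reconstruct(encode_inter(second_gop_i, first_gop_i, qstep=4), first_gop_i)

print(lossless == second_gop_i)  # the bypass path reproduces the I frame exactly
print(lossy == second_gop_i)     # the quantized path does not
```

  • in this model the lossless path stores the residual unchanged, so later P frames that originally referenced the I frame can keep referencing the reprocessed frame without being re-encoded.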
  • frames within the updated IDR period can have IPPP encoding structure by applying a rule of transcoding the I frame as the transcoding target frame in the third GOP of successive GOPs, into P T2 frame with reference to P T1 frame that was transcoded in the immediately preceding GOP (e.g., the second GOP), in other words, a rule of transcoding the transcoding target frame in the third GOP and successive GOPs with reference to the immediately preceding transcoding target frame.
  • frames within the updated IDR period can have IPPP encoding structure by applying a rule of transcoding, the I frame as the transcoding target frame in the third GOP of successive GOPs into P T2 frame with reference to the I frame in the first GOP, same as the I frame in the second GOP, in other words, a rule of transcoding all transcoding target frames in and after the second GOP with reference to the I frame in the first GOP.
  • the transcoder 220 deals only with the transcoding target frames (i.e., the I frames from the second GOP onward). When decoding, it independently decodes these I frames, which were independently encoded; when encoding, it encodes each of them with reference to the I frame in the first GOP and/or the preceding transcoding target frame. This is advantageous in that the complexity of the transcoding process can be remarkably reduced while the compression rate is remarkably improved. In addition, there is an advantage of removing image quality degradation by applying a lossless encoding technique in the transcoding process.
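  • the two reference rules described above (referencing the immediately preceding transcoding target frame, or always referencing the I frame of the first GOP) can be sketched as a small planning function. This is an illustrative sketch only: the function name, the tuple layout, and the `P_T1`, `P_T2` labels are assumptions, and frames are identified simply by their index within one updated IDR period.

```python
def plan_transcoding(num_gops, frames_per_gop, chain=True):
    """List the I frames to be changed into P frames within one updated IDR period.

    Returns tuples of (frame_index, new_label, reference_frame_index).
    chain=True  : each target references the preceding transcoding target frame.
    chain=False : every target references the I frame of the first GOP.
    """
    plan = []
    first_i = 0            # index of the I frame of the first GOP (kept as is)
    previous_target = first_i
    for gop in range(1, num_gops):       # second GOP to last GOP
        index = gop * frames_per_gop     # the I frame opens each GOP
        reference = previous_target if chain else first_i
        plan.append((index, f"P_T{gop}", reference))
        previous_target = index
    return plan


print(plan_transcoding(3, 30, chain=True))
print(plan_transcoding(3, 30, chain=False))
```

  • all other frames are left untouched in both variants, which is where the reduction in transcoding complexity comes from.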
  • fast search can become available.
  • the storage unit 230 retrieves a bitstream consisting of the first I frame and the P frames, such as the P T1 and P T2 frames, that are reprocessed transcoding target frames, and provides it to the control center 140.
  • the search speed can be about thirty times faster than the conventional search by selecting one frame per second as a decoding target frame, decoding 30 frames per second, and searching 30 second-length video in 1 second.
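  • the fast-search arithmetic can be sketched as follows, assuming one-second GOPs at 30 fps so that the selected frame of each second is the first frame of its GOP; the function names and parameters are illustrative assumptions.

```python
def fast_search_frames(duration_s, fps=30):
    """Indices of the frames decoded during fast search: one per second of footage."""
    return [second * fps for second in range(duration_s)]


def search_speedup(decode_rate_fps=30, selected_per_second=1):
    """Seconds of footage covered per second of searching."""
    return decode_rate_fps / selected_per_second


print(fast_search_frames(3))
print(search_speedup())
```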
  • after the fast search, it is also possible to provide a normal search function for a specific period of time. This may be an advantage of the transcoder 220 according to one embodiment that is not available with a bitstream generated by an IP camera that simply extends the IDR period, or with a bitstream that results from general transcoding.
  • the compression performance of the conventional video storage device is as follows.
  • the video footage is FHD at 30 fps
  • the video compression format is IPPP
  • the IDR period is 1 second.
  • the data volume of a P frame is estimated to be roughly a few tens of times smaller than that of an I frame, but for convenience of calculation, the data volume of the I frame is set to 20 and the data volume of the P frame is set to 1, a ratio of 20:1.
  • the volume of data per second is 49 (i.e., 20x1+1x29), and the total volume of data for a 10-second interval is 490.
  • when the I frames of the successive GOPs are transcoded into P frames by the transcoder 220 so as to correspond to the updated IDR period, there are 1 I frame and 299 P frames in the 10-second period (i.e., 29 of the first GOP and 270 of the remaining 9 GOPs), and the total volume of data is 319 (i.e., 20x1+1x299), which is about a 35% improvement in compression performance compared to conventional video storage devices.
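  • the comparison above can be checked with a short calculation. This sketch reuses the 20:1 size ratio and 30 fps from the text and, like the text, counts a reprocessed frame as an ordinary P frame; the function and constant names are illustrative.

```python
I_SIZE, P_SIZE = 20, 1   # relative frame sizes assumed in the text
FPS = 30


def volume(idr_period_s, window_s=10):
    """Relative data volume of `window_s` seconds with one I frame per IDR period."""
    total_frames = window_s * FPS
    i_frames = window_s // idr_period_s
    return i_frames * I_SIZE + (total_frames - i_frames) * P_SIZE


original = volume(1)     # one I frame every second
updated = volume(10)     # one I frame every 10 seconds
saving = 1 - updated / original

print(original, updated, round(saving * 100, 1))
```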
  • the compression performance may be remarkably improved even though the restriction on random access increases.
  • the updated IDR period may be fixed to a specific value set by the operator in consideration of typical characteristics of the video footage captured by an IP camera.
  • the updated IDR period may be adaptively determined in the bitstream in consideration of the characteristics of the bitstream by a controller provided in the video storage device 130.
  • the controller provided in the video storage device 130 adaptively determines whether or not to change the transcoding target intra frame (i.e., I frame) to an inter frame (i.e., P frame or B frame) in the received bitstream so a plurality of different updated IDR periods (e.g., updated IDR periods adaptively determined to have time lengths such as 3 seconds, 10 seconds, 7 seconds, 20 seconds, etc.) may be applied in the reprocessed bitstream.
  • a difference value between the reconstructed pixel values of the transcoding target intra frame and the reconstructed pixel values of the intra frame in the first GOP or a difference value between the reconstructed pixel values of the transcoding target intra frame and the reconstructed pixel values of a reference frame (intra frame or inter frame) is calculated, and the transcoding is not performed so the transcoding target intra frame remains as intra frame if the calculated difference value exceeds a predetermined threshold value, while the transcoder 220 performs the transcoding to process as inter frame if the calculated difference value is less than the threshold value.
  • the controller can adaptively adjust the updated IDR period, and the difference value between the reconstructed pixel values of frames may be calculated using an error measurement technique such as SAD (Sum of Absolute Difference), MSE (Mean Squared Error), SSE (Sum of Squared Error), SAE (Sum of Absolute Error) that calculates a difference value between corresponding pixels.
  • SAD Sum of Absolute Difference
  • MSE Mean Squared Error
  • SSE Sum of Squared Error
  • SAE Sum of Absolute Error
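  • the adaptive decision described above can be sketched with SAD and MSE as the difference measures. This is a minimal illustration on flat lists of reconstructed pixel values; the function names, sample pixels, and threshold are assumptions, and because the text does not say what happens when the difference equals the threshold, this sketch keeps the frame as an intra frame in that case.

```python
def sad(a, b):
    """Sum of Absolute Difference between corresponding pixels."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mse(a, b):
    """Mean Squared Error between corresponding pixels."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def should_transcode(target_intra, reference, threshold, metric=sad):
    """True if the target intra frame is similar enough to be changed into an inter frame."""
    return metric(target_intra, reference) < threshold


reference = [10, 11, 10, 9]        # e.g. intra frame of the first GOP
static_scene = [10, 10, 10, 10]    # background barely changed
changed_scene = [10, 60, 10, 90]   # large change, e.g. a scene change

print(should_transcode(static_scene, reference, threshold=20))
print(should_transcode(changed_scene, reference, threshold=20))
```

  • a frame for which the function returns False stays an intra frame and therefore starts a new updated IDR period, which is how periods of different lengths can arise in one bitstream.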
  • the processed bitstream may have only one intra frame as a whole and thus all frames belong to one updated IDR period.
  • the maximum value of the updated IDR period that can be adaptively selected may be predetermined so that a bitstream capable of random access above the minimum level may be processed.
  • a controller (not shown) provided in the control center 140 and/or the video storage device 130 may adaptively calculate and apply the optimal updated IDR period by use of an equation that is predefined to have the number of channels in which bitstream is to be stored in the video storage device 130, and a total storage capacity of the storage unit 230, a size of the free storage capacity of the storage unit 230, and a time period during which the bitstream is stored and maintained in the storage unit 230 as factors of consideration.
  • the updated IDR period may be calculated to be relatively long.
  • the limit range of the updated IDR period calculated by the equation may be set in advance, and may be limited in advance so that the updated IDR period within the limit range is applied.
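  • the text does not give the equation itself, so the following is only a hypothetical sketch of how such a calculation could look. It assumes the 20:1 frame-size ratio and the 2 GB per hour figure quoted earlier, picks the shortest updated IDR period whose projected storage fits the free capacity, and clamps the result to a preset limit range. Total storage capacity, which the text also lists as a factor, is not modelled here. All names and default values are assumptions.

```python
I_SIZE, P_SIZE, FPS = 20, 1, 30   # relative sizes borrowed from the 20:1 example


def relative_rate(idr_period_s):
    """Data rate relative to a bitstream with a 1-second IDR period."""
    per_period = I_SIZE + (FPS * idr_period_s - 1) * P_SIZE
    baseline_per_second = I_SIZE + (FPS - 1) * P_SIZE
    return per_period / idr_period_s / baseline_per_second


def pick_idr_period(channels, free_tb, retention_days,
                    gb_per_hour=2.0, limits=(1, 60)):
    """Shortest period (seconds) within `limits` whose projected storage fits `free_tb`.

    Hypothetical rule: returns the upper limit when no period in range fits.
    """
    low, high = limits
    baseline_tb = channels * gb_per_hour * 24 * retention_days / 1000
    for period in range(low, high + 1):
        if baseline_tb * relative_rate(period) <= free_tb:
            return period
    return high


print(pick_idr_period(channels=100, free_tb=150, retention_days=30))
print(pick_idr_period(channels=100, free_tb=100, retention_days=30))
print(pick_idr_period(channels=100, free_tb=50, retention_days=30))
```

  • under these assumptions, less free capacity leads to a longer updated IDR period, and the limit range keeps a minimum level of random access.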
  • the video storage device 130 may have higher processing complexity than the conventional video storage device because the transcoding process is performed by the transcoder 220.
  • the transcoder 220 transcodes (i.e., decodes and encodes) only one I frame out of 30 frames per second, and assuming that the complexity of the encoding process for one frame is about 5 times the complexity of the decoding process, the work of the transcoder 220 may be estimated as equivalent to decoding about 6 frames in terms of decoding processing complexity.
  • the transcoder 220 thus incurs the complexity of processing 6 further frames through the additional transcoding process compared to the conventional video storage device. Therefore, when the conventional video storage device decodes 30 frames per second, the video storage device 130 according to one embodiment can be considered to decode 36 frames per second, which means that the complexity increases by up to 20%.
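  • the complexity estimate can be written out as a short calculation in units of "one frame decoded". The factor of 5 for encoding is the assumption stated in the text; the function name and parameters are illustrative.

```python
def decode_equivalent_load(fps=30, transcoded_per_second=1, encode_cost=5):
    """Per-second load of the video storage device, in decode-equivalent frames."""
    monitoring = fps                                          # frames decoded for the monitor
    transcoding = transcoded_per_second * (1 + encode_cost)   # one decode plus one encode
    return monitoring + transcoding


load = decode_equivalent_load()
increase = load / 30 - 1

print(load, round(increase * 100))
```

  • the figure is an upper bound because the I frame that opens each updated IDR period is not transcoded.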
  • the amount saved by increasing the compression rate of the bitstream to prevent an increase in storage capacity is significantly larger so the industrial applicability of the video storage device according to one embodiment will be sufficient.
  • the encoder unit of the transcoder 220 may be newly developed to perform the aforementioned processing.
  • since the configuration and operation structure of the encoder unit of the transcoder 220 can be easily understood by those skilled in the art through the above description, a detailed description thereof will be omitted.
  • the encoder unit of the transcoder 220 may use the same structure as the conventional encoder illustrated in FIG. 6(a), and may be made to operate according to one embodiment by setting flag(s).
  • FIG. 6(a) illustrates the structure of an HEVC encoder.
  • the modules constituting the encoder structure include Block Structure, Intra Prediction, Inter Prediction, Transform, Quantization, Inverse Transform, Inverse Quantization, In-loop Filter, and Entropy Coding.
  • an encoder consisting of the aforementioned blocks generally operates with a lossy coding technique, which causes loss in the final coded result.
  • most of the loss may occur in the Quantization block, and some loss may also occur in the Transform and In-loop Filter blocks.
  • the encoder unit of the transcoder 220 can be adjusted to perform lossless encoding processing.
  • transquant_bypass_enabled_flag i.e., a flag that specifies on/off of the transquant bypass function
  • CU coding unit
  • cu_transquant_bypass_flag which is a flag of the CU syntax
  • transquant_bypass_enabled_flag is set to on to enable transquant bypass for each CU
  • cu_transquant_bypass_flag values for all CUs in the transcoding target frame are set to on so that the transcoding target frame is losslessly encoded, according to one embodiment.
  • the encoder unit of the transcoder 220 can use, as illustrated in FIG. 6(b), an encoder structure in which only the minimum number of modules (i.e., Block Structure, Intra Prediction, Inter Prediction and Entropy Coding, which guarantee lossless operation) operate.
  • the improved encoder structure can greatly reduce the encoding complexity by omitting the operation of several modules, and it is possible to simplify the implementation because the minimum number of modules can be included even if the encoder unit of the transcoder 220 is newly developed.
  • the video storage device 130 using the aforementioned improved encoder structure stores a bitstream that is processed by the aforementioned technique, which modifies the header information of the bitstream that indicates the reference relationship regarding transcoding.
  • the reference relationship indicates reference frame information on a target frame, and the reference frame information may determine whether to delete the reconstructed frames in the DPB in the encoder and/or decoder.
  • the corresponding bitstream may be retrieved from the storage unit 230 and provided to the control center 140.
  • the video storage device 130 reprocesses the bitstream received from the NVT 240 and stores it in the storage unit 230, but when a specific video footage is requested from the control center 140, the video storage device 130 may reprocess the corresponding bitstream by transcoding P frames that were the transcoding target frames into I frames and provide to the control center 140.
  • processing complexity may be increased by reprocessing the bitstream without any image quality degradation in a lossless encoding technique, but when the corresponding image is used in the control center 140, the random access performance may be relatively improved. That is, if the lossless encoding technique is applied, the I frame to be transcoded may be reprocessed into a P frame, and then the corresponding P frame may be reprocessed into an I frame.
  • the aforementioned method of storing video data may be implemented as a software program or application embedded in a digital processing device such as a video processing device, and may be performed as an automated procedure according to a time-series sequence. Codes and code segments constituting the program or the like can be easily inferred by a computer programmer in the art. Further, the program is stored in a computer readable media, and is read and executed by a computer to implement the method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

An apparatus and a method for recording video data are provided. The apparatus for recording video data includes a transcoder configured for transcoding at least a portion of a first bitstream provided by a network video transmitter to generate a second bitstream having an updated IDR (Instantaneous Decoder Refresh) period whose time length differs from that of the original IDR period of the first bitstream, and a storage unit configured for storing the second bitstream.

Description

APPARATUS AND METHOD FOR RECORDING VIDEO DATA
The present invention relates to apparatus and method for recording video data.
Recently, the number of CCTV (Closed-circuit television) systems installed for video surveillance and security purposes in governments and public institutions is increasing, and such CCTV systems are being expanded to various locations such as intersections and alleys in residential areas to collect traffic information or to secure social safety nets.
In general, the CCTV system includes a camera group 110 consisting of IP cameras installed at each designated location as shown in FIG. 1, a video control device 120, and a video storage device 130. The video footage captured by each IP camera may be monitored at the control center 140 located at a remote location.
Here, the video control device 120 may be, for example, a device that monitors and manages digital video footage captured by a plurality of IP cameras, and the video storage device 130 may be a device for video recording, monitoring, and event handling of IP cameras connected through a network.
As shown, each IP camera is connected to the video control device 120, which is, for example, a Video Management System (VMS), through a wired or wireless communication network, and the video control device 120 receives video footages captured by each IP camera (i.e., the video footages of each channel) in the form of a bitstream and provides to the control center 140.
In addition, the video control device 120 provides the received bitstream to the video storage device 130, for example, NVR (Network Video Recorder) device to store the bitstream, and when the control center 140 requests a certain bitstream, retrieves the requested bitstream from the video storage device 130 and provides to the control center 140.
The video storage device 130 receives and stores a bitstream. The video storage device 130 may be connected to a monitor to check whether a bitstream is seamlessly being received. The video storage device 130 may include a decoder capable of outputting video footage corresponding to the bitstream on the monitor.
In general, the video storage device 130 is set to store and maintain the received bitstream for a predetermined period (e.g., 30 days).
However, the latest CCTV video footage is processed in FHD (Full high definition), and thus a large storage space is required. For example, about 2 GB of storage space is required to store one hour length video footage at 30 fps (frames per second) in FHD. Therefore, about 1.44 TB storage space is required to store 24 hour length video data of one single channel for 30 days. Similarly, in the case of a CCTV system having 800 channels, at least 1,152 TB of storage space must be provided in the video storage device 130.
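The storage figures above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming 1 TB is taken as 1,000 GB (which is what the quoted figures imply); the function name `storage_tb` is hypothetical and used only for illustration.

```python
def storage_tb(gb_per_hour: float, hours_per_day: int, days: int, channels: int = 1) -> float:
    """Storage needed to retain footage, in TB (assumption: 1 TB = 1000 GB)."""
    total_gb = gb_per_hour * hours_per_day * days * channels
    return total_gb / 1000.0


# Figures taken from the description: 2 GB per hour, 24 hours a day, 30 days.
single_channel = storage_tb(2, 24, 30)
system_800 = storage_tb(2, 24, 30, channels=800)
print(f"one channel: {single_channel:.2f} TB, 800 channels: {system_800:.0f} TB")
```

The requirement grows linearly with both the retention period and the number of channels, which is why the description treats the per-frame data volume as the quantity worth reducing.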
Therefore, if there are a plurality of IP cameras, namely, a plurality of channels whose bitstreams are to be stored, a large amount of storage space proportional to the number of channels must be secured in the video storage device 130, which causes a significant increase in the cost of building a CCTV system. In addition, as CCTV video footage is gradually improved to 4K and 8K, a larger storage space will be needed in the long run, which will inevitably increase the cost of building a CCTV system exponentially.
So far, however, development of technologies for CCTV systems has focused on encryption of bitstreams to be stored due to the needs of information security, and efforts to develop technologies for compressing and storing bitstreams more efficiently are insufficient.
The present invention is to provide apparatus and method for recording video data capable of significantly reducing the cost of building CCTV system by effectively compressing, storing, and using vast amounts of data for high-resolution and high-definition video footage without any video quality degradation. Other objectives and advantages will be easily understood from the following description.
According to one aspect of the present invention, there is provided an apparatus for recording video data, including a transcoder configured for transcoding at least a portion of a first bitstream provided by a network video transmitter to generate a second bitstream having an updated IDR (Instantaneous Decoder Refresh) period whose time length differs from that of the original IDR period of the first bitstream, and a storage unit configured for storing the second bitstream.
The transcoder designates at least one of the Intra frames as a transcoding target frame and changes it into an Inter frame in order for the second bitstream to have the updated IDR period, wherein the updated IDR period is applied equally throughout the second bitstream, or is determined by interpreting the first bitstream in accordance with a predetermined bitstream characteristic analysis rule.
If the updated IDR period is applied equally throughout the second bitstream, the transcoder is configured for designating Intra frames belonging to the second GOP to the last GOP within each updated IDR period in the first bitstream as the transcoding target frames, wherein the Intra frames belonging to each of a plurality of GOPs in each updated IDR period are processed to have a predetermined reference relationship in the second bitstream, wherein the predetermined reference relationship is any one of IPPP structure, IBBB structure, and IBBP structure.
The time length of the updated IDR period is calculated by an equation that is predefined to have the number of channels that provide bitstream to be stored in the storage unit, and a total storage capacity of the storage unit, a size of the free storage capacity of the storage unit, and a time period during which the bitstream is stored and maintained in the storage unit as factors of consideration.
The predetermined bitstream characteristic analysis rule for adaptively applying the updated IDR period is, for reconstructed pixels of the Intra frames belonging to the second GOP to the last GOP among a plurality of GOPs in the first bitstream, to calculate a difference value between corresponding reconstructed pixels in Intra frame of the first GOP or a reference frame, to specify the Intra frame as a transcoding target frame only when the difference value between frames is less than a predetermined threshold value, and to change the Intra frame into Inter frame.
The transcoder changes the Intra frame designated as the transcoding target frame into the Inter frame by a lossless coding technique.
A flag value in at least one of SPS(Sequence Parameter Set), PPS(Picture Parameter Set), PH(Picture Header), and CU syntax (Coding Unit Syntax) is adjusted in order for an encoder of the transcoder to operate in the lossless coding technique.
According to another aspect of the present invention, there is provided a computer program stored on a computer-readable medium for executing a method for storing video data, wherein the computer program causes a computer to execute the following steps, the steps including (a) receiving a first bitstream from a network video transmitter, and (b) transcoding at least a portion of the first bitstream to generate a second bitstream having an updated IDR (Instantaneous Decoder Refresh) period whose time length differs from that of the original IDR period of the first bitstream, wherein at least one of the Intra frames is designated as a transcoding target frame and changed into an Inter frame in order for the second bitstream to have the updated IDR period, wherein the updated IDR period is applied equally throughout the second bitstream, or is determined by interpreting the first bitstream in accordance with a predetermined bitstream characteristic analysis rule.
If the updated IDR period is applied equally throughout the second bitstream, at the step (b), Intra frames belonging to the second GOP to the last GOP within each updated IDR period in the first bitstream are designated as the transcoding target frames, wherein the Intra frames belonging to each of a plurality of GOPs in each updated IDR period are processed to have a predetermined reference relationship in the second bitstream, wherein the predetermined reference relationship is any one of IPPP structure, IBBB structure, and IBBP structure.
The time length of the updated IDR period is calculated by an equation that is predefined to have the number of channels that provide bitstream to be stored in the storage unit, and a total storage capacity of the storage unit, a size of the free storage capacity of the storage unit, and a time period during which the bitstream is stored and maintained in the storage unit as factors of consideration.
The predetermined bitstream characteristic analysis rule for adaptively applying the updated IDR period is, for reconstructed pixels of the Intra frames belonging to the second GOP to the last GOP among a plurality of GOPs in the first bitstream, to calculate a difference value between corresponding reconstructed pixels in Intra frame of the first GOP or a reference frame, to specify the Intra frame as a transcoding target frame only when the difference value between frames is less than a predetermined threshold value, and to change the Intra frame into Inter frame.
Other aspects, features, and advantages will be more apparent from accompanying drawings, claims and detailed description.
According to an embodiment of the present invention, it is possible to effectively compress, store, and use massive data of high-resolution and high-definition images without any image quality degradation, thereby significantly reducing the cost of building a CCTV system.
FIG. 1 illustrates a configuration of general CCTV system;
FIG. 2 is a block diagram of apparatus for recording video data according to one embodiment of the present invention;
FIG. 3 illustrates bitstream having IPPP structure;
FIG. 4 illustrates images selected from a surveillance video that IP camera at a fixed location has captured;
FIG. 5 illustrates a bitstream transcoding technique according to one embodiment of the present invention; and
FIG. 6 illustrates a structure of HEVC encoder that is reconfigured according to one embodiment of the present invention.
The invention can be modified in various forms and specific embodiments will be described and shown below. However, the embodiments are not intended to limit the invention, but it should be understood that the invention includes all the modifications, equivalents, and replacements belonging to the concept and the technical scope of the invention.
The terms used in the following description are intended to merely describe specific embodiments, but not intended to limit the invention. An expression of the singular number includes an expression of the plural number, so long as it is clearly read differently. The terms such as "include" and "have" are intended to indicate that features, numbers, steps, operations, elements, components, or combinations thereof used in the following description exist and it should thus be understood that the possibility of existence or addition of one or more other different features, numbers, steps, operations, elements, components, or combinations thereof is not excluded.
FIG. 2 is a block diagram of apparatus for recording video data according to one embodiment of the present invention, FIG. 3 illustrates bitstream having IPPP structure, FIG. 4 illustrates snap shots selected from a surveillance video that IP camera at a fixed location has captured, FIG. 5 illustrates a bitstream transcoding technique according to one embodiment of the present invention, and FIG. 6 illustrates a structure of HEVC encoder that is reconfigured according to one embodiment of the present invention.
Referring to FIG. 2, the video storage device 130 according to one embodiment may include a decoder 210, a transcoder 220, and a storage unit 230.
The decoder 210 decodes the received bitstream and provides the decoded video footage to the monitor so that it can be checked whether the bitstreams of each channel are seamlessly received from the network video transmitter (NVT) 240. An operator will be able to easily check whether the bitstream is seamlessly received by watching the video footage presented on the monitor.
Here, the NVT 240 provides a bitstream to be stored in the video storage device 130, and may be, for example, one or more of the video control device 120, IP cameras, and other NVR devices. If one NVT 240 provides a high-resolution bitstream together with a low-resolution bitstream, both corresponding to video footage captured by one IP camera, the decoder 210 may decode the low-resolution bitstream and provide it to the monitor in order to reduce the decoder processing complexity.
The transcoder 220 has a typical configuration including a decoder and an encoder and, as described later, transcodes the input bitstream into a bitstream that maintains the same quality while having a relatively small volume of data, and stores the transcoded bitstream in the storage unit 230.
In general, a video compression format of the bitstream generated and transmitted by the IP camera includes an intra (I) frame and inter frames per second in an IPPP structure as illustrated in FIG. 3(a). The I frame is an IDR (Instantaneous Decoder Refresh) frame and has a function to clear a DPB (Decoded Picture Buffer) in the decoder unit. The IDR frame is required for random access. IDR frames are arranged at predetermined time intervals (i.e., IDR periods). For the convenience of explanation, this specification focuses mainly on the IPPP structure, but as is well known, there are various video compression formats such as the IBBB structure, the IBBP structure, and so on, and the present invention may adopt any known video format without limitation.
For example, in a bitstream at 30 fps in which the IDR period is set to 1 second, one I (Intra) frame and successive 29 P (Predictive) frames are located within a time interval of 1 second.
I frame is required to reduce image error propagation due to camera movement or scene change, or to enable random access. Accordingly, in the case of encoding general content such as a drama or a movie, it is encoded so that one I frame per second is included.
I frame is encoded independently from other frames, while P frame is encoded dependently on one or more reference frames that are preceding frames in encoding/decoding order. Accordingly, as illustrated in FIG. 3(b), the compressed I frame has a significantly larger volume of data than that of the compressed P frame. For example, when encoding 30 frames per second, the similarity of successive frames in time is very high, thus P frame that is encoded with a difference between the current frame and the preceding reference frame has a compression performance relatively higher than I frame.
FIG. 4 illustrates four snap shots selected from surveillance video footage captured by an IP camera at a fixed location. For reference, these snap shots were selected at intervals of about 30 seconds to 1 minute from surveillance video footage captured by the IP camera installed in an intersection area.
In general, surveillance video footages captured by IP cameras show that only some moving objects change their position and shape over time, but the background that occupies most of scene remains still or not much changed. This is a distinctive characteristic in video footage captured by IP camera installed at a fixed location to continuously shoot a designated area.
In addition, these IP cameras not only have few camera movements or scene changes, but also have significantly lower need for random access to specific time points of images taken by IP cameras.
By reflecting the characteristics of the video footage generated by CCTV system, the video storage device 130 according to one embodiment operates to significantly reduce the volume of data of the bitstream stored in the storage unit 230.
FIG. 5 illustrates a bitstream transcoding technique by the transcoder 220 of the video storage device 130.
As illustrated in FIG. 5(a), the video compression format of the bitstream corresponding to video footage captured by IP camera has IPPP structure in which I frame (i.e., IDR frame) is placed first at each predetermined IDR period. For the convenience of explanation, this specification illustrates that the reference frame for all P frames is set to one immediately preceding frame as example, but is not limited thereto, P frames may also have a plurality of reference frames preceding in the encoding/decoding order.
Since the volume of compressed data of I frame is significantly larger than that of P frame, the transcoder 220 according to one embodiment may reduce the volume of data of bitstream to be stored in the storage unit 230 by applying a relatively long updated IDR period, and reprocessing the bitstream to have one I frame in the updated IDR period.
As an example of the methods that can be considered, as illustrated in FIG. 5(b), the transcoder 220 may transcode the most preceding first frame in the updated IDR period as I frame, and all other frames as P frames that are dependently encoded with reference to each preceding frame to reduce the volume of data of bitstream to be stored. The transcoder 220 may reuse one I frame and a plurality of P frames belonging to the first GOP (Group of pictures; see FIG. 5(a)) of successive GOPs without reprocessing.
In order to reprocess I frame belonging to the second GOP of the successive GOPs as P frame, all of the most preceding I frame to the immediately preceding P frame (i.e., the last P frame of the first GOP) are decoded, and then I frame belonging to the second GOP is encoded with reference to the immediately preceding P frame as PT10 frame.
Even for the P frame in the second GOP that references the I frame transcoded into the PT10 frame, the reconstructed pixel values of the PT10 frame differ from those of the I frame because the I frame is converted to the PT10 frame in a lossy manner. The P frame, therefore, can no longer use the PT10 frame as its reference frame as it is. In order to prevent mismatch, the corresponding P frame must also be changed into a PT11 frame by being decoded and then encoded with reference to the PT10 frame.
The reprocessing the I frame of the second GOP into P frame and the reprocessing the remaining P frames must be commonly performed for all successive GOPs within the updated IDR period.
In this way, reprocessing every frame in the second GOP to the last GOP of the successive GOPs within the updated IDR period into a P frame referencing the immediately preceding frame requires decoding and encoding to be performed separately for each target frame, which remarkably increases the complexity of transcoding. In addition, there is a problem in that the video quality may deteriorate in the process of decoding and re-encoding a plurality of frames in a lossy manner.
In order to improve the aforementioned problem and reprocess it into a bitstream with a significantly small volume of data, the transcoder 220 according to one embodiment may reprocess the bitstream using distinctive characteristics of the surveillance video footage captured by the IP camera.
That is, the transcoder 220 encodes the reconstructed pixel value of the I frame in a lossless manner by ridding the encoder of Transform, Quantization, Inverse Transform, Inverse Quantization, and In-loop Filter and finally generates the P frame that has the same reconstructed pixel value as I frame. (See FIG. 6(b)) Through this, it is possible to remarkably reduce the complexity that may occur in the transcoding process, and also solve the problem of video quality degradation.
Specifically, the transcoder 220 designates only the I frames in the second GOP to the last GOP, among all frames within the updated IDR period, as transcoding target frames to be changed into P frames by a lossless encoding process, and then reprocesses the bitstream into one having an IPPP encoding structure within the updated IDR period. (See FIG. 5(c))
That is, the transcoder 220 decodes the I frame of the second GOP of successive GOPs belonging to the updated IDR period and encodes the reconstructed pixel value of the I frame into a PT1 frame referencing the I frame of the first GOP.
Since lossless encoding is performed to ensure that the original pixel value before encoding and the reconstructed pixel value after the corresponding pixel value is encoded and reconstructed are the same, the reconstructed value of the I frame (that is, the pixel value to be encoded and also the original pixel value before encoding), which is the transcoding target frame, is the same as the reconstructed value of the reprocessed PT1 frame.
Therefore, since no mismatch will occur in the reference pixel value (frame) of the P frame immediately after referencing the I frame in the second GOP of successive GOPs, there is no need to reprocess the corresponding P frame. This is the same for successive GOPs. However, frames within the updated IDR period can have IPPP encoding structure by applying a rule of transcoding the I frame as the transcoding target frame in the third GOP of successive GOPs, into PT2 frame with reference to PT1 frame that was transcoded in the immediately preceding GOP (e.g., the second GOP), in other words, a rule of transcoding the transcoding target frame in the third GOP and successive GOPs with reference to the immediately preceding transcoding target frame.
As another example, frames within the updated IDR period can have an IPPP encoding structure by applying a rule of transcoding the I frame that is the transcoding target frame in the third GOP of successive GOPs into a PT2 frame with reference to the I frame in the first GOP, in the same way as the I frame in the second GOP, in other words, a rule of transcoding all transcoding target frames in and after the second GOP with reference to the I frame in the first GOP.
Accordingly, it is sufficient to selectively reference the I frame in the first GOP and/or the preceding transcoding target frame, irrespective of the immediately preceding P frame, in order to re-encode the transcoding target frame after it is decoded. This is possible because of the distinctive characteristics of the video footage captured by the IP camera of the CCTV system (i.e., most of the video scene consists of the same background area), so that a preceding I frame that is temporally far away can be referenced.
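The two reference rules described above can be sketched as a small planning routine. This is an illustrative sketch only, not the transcoder's actual implementation: the function name `plan_targets`, the rule names `"chain"` and `"first"`, and the label `I0` for the I frame of the first GOP are hypothetical.

```python
def plan_targets(num_gops: int, rule: str = "chain") -> list[tuple[str, str]]:
    """Return (target, reference) labels for the I frames of GOPs 2..num_gops.

    rule="chain": each target references the preceding transcoding target
                  frame (PT1 references the I frame of the first GOP).
    rule="first": every target references the I frame of the first GOP.
    """
    if rule not in ("chain", "first"):
        raise ValueError(f"unknown rule: {rule}")
    plan = []
    for k in range(1, num_gops):  # k-th transcoding target, located in GOP k+1
        if rule == "chain" and k > 1:
            reference = f"PT{k - 1}"
        else:
            reference = "I0"  # hypothetical label for the first GOP's I frame
        plan.append((f"PT{k}", reference))
    return plan


print(plan_targets(4, "chain"))
print(plan_targets(4, "first"))
```

In both rules the ordinary P frames of each GOP are left untouched, because lossless encoding keeps the reconstructed pixel values of each PT frame identical to those of the I frame it replaces.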
As described above, the transcoder 220 according to one embodiment deals with only the transcoding target frames (i.e., I frames after the second GOP), when decoding, independently decodes the I frames that were independently encoded, and when encoding, encodes it with reference to the I frame in the first GOP and/or the preceding transcoding target frame, so it is advantageous that the complexity of the transcoding process can be remarkably reduced while the compression rate can be remarkably improved. In addition, there is an advantage of removing image quality degradation by applying a lossless encoding technique in the transcoding process.
When the bitstream is reprocessed through the transcoder 220 according to one embodiment, a fast search becomes available. For example, when a fast search for a specific bitstream is requested from the control center 140, the storage unit 230 retrieves a bitstream consisting of the first I frame and the reprocessed transcoding target frames (i.e., P frames such as the PT1 and PT2 frames) and provides it to the control center 140. In a conventional search, all 30 frames of each second are selected and decoded, so 1 second of video is searched in 1 second. In the fast search, only one frame per second of video is selected as a decoding target frame, so decoding 30 frames per second covers 30 seconds of video in 1 second, which is about thirty times faster than the conventional search. In addition, after the fast search, it is also possible to provide a normal search function for a specific period of time. This is an advantage of the transcoder 220 according to one embodiment that cannot be obtained from a bitstream generated by an IP camera that simply extends the IDR period, or from a bitstream that results from general transcoding.
Comparison between the compression performance of the conventional video storage device and the video storage device according to one embodiment for storing the bitstream corresponding to the video footage captured by the IP camera is as follows. For reference, the video footage is FHD at 30 fps, the video compression format is IPPP, and the IDR period is 1 second. In general, the data volume of the P frame is estimated to be several tens of times smaller than that of the I frame, but for convenience of calculation, the data volume of the I frame is set to 20 and the data volume of the P frame is set to 1, i.e., a ratio of 20:1.
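The frame selection behind the fast search can be illustrated as follows. This is a minimal sketch assuming a 30 fps bitstream with a 1-second original IDR period and transcoding target frames that reference only the first I frame or a preceding transcoding target frame; the helper names `build_period` and `fast_search_frames` and the frame labels are hypothetical.

```python
def build_period(num_gops: int, fps: int = 30) -> list[str]:
    """Frame types of one updated IDR period: I P..P, then PT P..P per later GOP."""
    frames = ["I"] + ["P"] * (fps - 1)
    for _ in range(num_gops - 1):
        frames += ["PT"] + ["P"] * (fps - 1)
    return frames


def fast_search_frames(frame_types: list[str]) -> list[int]:
    """Indices of the frames decoded during fast search (I and PT frames only)."""
    return [i for i, t in enumerate(frame_types) if t in ("I", "PT")]


period = build_period(num_gops=10)
picked = fast_search_frames(period)
# One frame is picked per second of video, so a decoder running at 30 frames
# per second covers len(period) / len(picked) seconds of video each second.
print(len(period), len(picked), len(period) // len(picked))
```

Because the selected frames form a decodable chain by themselves, no ordinary P frame has to be decoded during the fast search.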
In addition, for convenience of comparison, it is specified as a case where the updated IDR period is changed to 10 seconds in consideration of the characteristics of the CCTV system video.
In this case, in the conventional video storage device, since one I frame and 29 P frames exist per second, the volume of data per second is 49 (i.e., 20x1+1x29), and the total volume of data is 490 for the 10-second interval.
However, in the video storage device 130 according to one embodiment, since the I frames of the successive GOPs are transcoded into P frames so as to correspond to the updated IDR period by the transcoder 220, there are 1 I frame and 299 P frames in the 10 second period (i.e., 29 of the first GOP and 270 of the remaining 9 GOPs), and the total volume of data is 319 (i.e., 20x1+1x299), which is about 35% improvement in compression performance compared to conventional video storage devices.
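The comparison above can be checked numerically. The sketch below uses the same simplifying assumptions as the text (an I frame costs 20 units, a P frame costs 1 unit, and a losslessly transcoded frame is counted like an ordinary P frame); the function names are hypothetical.

```python
def period_volume(num_i: int, num_p: int, i_cost: int = 20, p_cost: int = 1) -> int:
    """Data volume of a group of frames under the assumed I:P cost ratio."""
    return num_i * i_cost + num_p * p_cost


def compare(fps: int = 30, idr_s: int = 1, updated_idr_s: int = 10,
            i_cost: int = 20, p_cost: int = 1) -> tuple[int, int, float]:
    """Return (conventional volume, reprocessed volume, relative saving)."""
    gops = updated_idr_s // idr_s
    frames_per_gop = fps * idr_s
    conventional = gops * period_volume(1, frames_per_gop - 1, i_cost, p_cost)
    reprocessed = period_volume(1, gops * frames_per_gop - 1, i_cost, p_cost)
    return conventional, reprocessed, 1 - reprocessed / conventional


conventional, reprocessed, saving = compare()
print(conventional, reprocessed, f"{saving:.1%}")
```

Calling `compare` with a different `updated_idr_s` or cost ratio shows how the saving depends on those assumptions.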
As the updated IDR period is set to be longer, the compression performance may be further improved, although the restriction on random access increases.
Thus, as an example, the updated IDR period may be fixed to a specific value set by the operator in consideration of typical characteristics of the video footage captured by an IP camera.
In addition, as another example, the updated IDR period may be adaptively determined in the bitstream in consideration of the characteristics of the bitstream by a controller provided in the video storage device 130.
That is, the controller provided in the video storage device 130 adaptively determines whether or not to change the transcoding target intra frame (i.e., I frame) to an inter frame (i.e., P frame or B frame) in the received bitstream so a plurality of different updated IDR periods (e.g., updated IDR periods adaptively determined to have time lengths such as 3 seconds, 10 seconds, 7 seconds, 20 seconds, etc.) may be applied in the reprocessed bitstream.
In this case, under control of the controller, a difference value between the reconstructed pixel values of the transcoding target intra frame and the reconstructed pixel values of the intra frame in the first GOP or a difference value between the reconstructed pixel values of the transcoding target intra frame and the reconstructed pixel values of a reference frame (intra frame or inter frame) is calculated, and the transcoding is not performed so the transcoding target intra frame remains as intra frame if the calculated difference value exceeds a predetermined threshold value, while the transcoder 220 performs the transcoding to process as inter frame if the calculated difference value is less than the threshold value. The controller can adaptively adjust the updated IDR period, and the difference value between the reconstructed pixel values of frames may be calculated using an error measurement technique such as SAD (Sum of Absolute Difference), MSE (Mean Squared Error), SSE (Sum of Squared Error), SAE (Sum of Absolute Error) that calculates a difference value between corresponding pixels.
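The threshold decision described above can be sketched as follows. This is an illustrative sketch that treats each frame as a flat sequence of reconstructed pixel values. The function names are hypothetical, and keeping the frame as an intra frame when the difference equals the threshold is an assumption made here, since the text only specifies the cases above and below the threshold.

```python
def sad(a, b):
    """Sum of Absolute Differences between corresponding pixels."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mse(a, b):
    """Mean Squared Error between corresponding pixels."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def should_transcode(target, reference, threshold, metric=sad):
    """True if the target intra frame should be changed into an inter frame.

    Assumption: a difference equal to the threshold keeps the intra frame.
    """
    if len(target) != len(reference):
        raise ValueError("frames must have the same number of pixels")
    return metric(target, reference) < threshold


target = [10, 20, 30]      # toy reconstructed pixels of the target I frame
reference = [12, 20, 27]   # toy reconstructed pixels of the reference frame
print(sad(target, reference), should_transcode(target, reference, threshold=6))
```

The threshold has to be chosen for the metric in use, since SAD and SSE grow with the frame size while MSE is normalized by the number of pixels.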
In some cases, if all of the difference values between frames are equal to or less than the threshold value, the processed bitstream may have only one intra frame as a whole, so that all frames belong to a single updated IDR period. Since this is unfavorable for random access, a maximum value of the adaptively selectable updated IDR period may be predetermined so that the processed bitstream retains at least a minimum level of random access capability.
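The adaptive decision described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: frames are modeled as flat lists of reconstructed pixel values, SAD is used as the error measure, each candidate is compared against the most recent frame kept as intra, a difference equal to the threshold is treated as "transcode" (an assumption), and `max_run` is a hypothetical cap, counted in GOPs, standing in for the predetermined maximum updated IDR period.

```python
def sad(frame_a, frame_b):
    """Sum of absolute differences between corresponding pixels."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b))


def choose_targets(intra_frames, threshold, max_run):
    """Decide, per GOP, whether its I frame is transcoded into an inter frame.

    intra_frames: reconstructed I frames, one per GOP, as flat pixel lists.
    Returns a list of booleans: True means "transcode to inter frame".
    """
    decisions = []
    anchor = None  # most recent frame that stayed intra
    run = 0        # GOPs already in the current updated IDR period
    for frame in intra_frames:
        keep_intra = (
            anchor is None                      # very first frame stays intra
            or run >= max_run                   # cap reached: force random access point
            or sad(frame, anchor) > threshold   # content changed too much
        )
        if keep_intra:
            decisions.append(False)
            anchor, run = frame, 1
        else:
            decisions.append(True)
            run += 1
    return decisions


# Hypothetical 2-pixel "frames": a small change, a large change, then a static scene.
frames = [[10, 10], [10, 11], [50, 50], [50, 50], [50, 50], [50, 50]]
print(choose_targets(frames, threshold=5, max_run=3))
```

In the example the third frame stays intra because of the large difference, and the last frame stays intra only because the cap of three GOPs per updated IDR period has been reached.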
In addition, as another example, a controller (not shown) provided in the control center 140 and/or the video storage device 130 may adaptively calculate and apply an optimal updated IDR period by using a predefined equation whose factors are the number of channels whose bitstreams are to be stored in the video storage device 130, the total storage capacity of the storage unit 230, the free storage capacity of the storage unit 230, and the time period during which the bitstream is stored and maintained in the storage unit 230.
In this case, when the number of channels is relatively large compared to the total storage capacity or the free storage capacity of the storage unit 230, or when the storage maintenance period of the bitstream is long, the updated IDR period may be calculated to be relatively large. Of course, a limit range for the updated IDR period calculated by the equation may be set in advance so that only an updated IDR period within the limit range is applied.
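The text does not disclose the equation itself, so the following is only a hypothetical sketch of one formula with the stated behavior: the result grows with the number of channels and the retention period, shrinks with total and free capacity, and is clamped to a preset limit range. The function name, the `scale` constant, the units, and the default limits are all assumptions.

```python
def updated_idr_period_s(channels, total_gb, free_gb, retention_days,
                         scale=100.0, min_period_s=1.0, max_period_s=30.0):
    """Illustrative (invented) formula for an updated IDR period in seconds."""
    if channels < 1 or total_gb <= 0 or retention_days <= 0:
        raise ValueError("channels, total_gb and retention_days must be positive")
    if not 0 <= free_gb <= total_gb:
        raise ValueError("free_gb must lie between 0 and total_gb")
    # Storage demand relative to capacity: more channels or longer retention raise it.
    load = channels * retention_days / total_gb
    # Fraction of the storage unit already in use (0.0 = empty, 1.0 = full).
    used = 1.0 - free_gb / total_gb
    raw = scale * load * (1.0 + used)
    # Clamp to the preset limit range so random access never degrades past a floor.
    return max(min_period_s, min(max_period_s, raw))


# Hypothetical deployment: 16 channels, 8 TB total, 4 TB free, 30-day retention.
print(updated_idr_period_s(16, 8000, 4000, 30))
```

Any monotone formula with a clamp would serve the same purpose; the point of the sketch is only the direction of each dependency and the limit range.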
Next, the video storage device 130 according to one embodiment may have higher processing complexity than the conventional video storage device because the transcoding process is performed by the transcoder 220.
The transcoder 220 transcodes (i.e., decodes and then encodes) only one I frame out of the 30 frames per second. Assuming that the complexity of encoding one frame is about 5 times the complexity of decoding it, transcoding that one frame (one decoding plus one encoding) may be estimated as equivalent to decoding about 6 frames in terms of decoding complexity.
Accordingly, the transcoder 220 incurs the complexity of processing 6 additional frames through the transcoding process, compared to the conventional video storage device. Therefore, if the conventional video storage device decodes 30 frames per second, the video storage device 130 according to one embodiment can be considered to decode 36 frames per second, which means that the complexity increases by up to 20%.
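The estimate above can be reproduced with a few lines of arithmetic. The sketch below uses the figures given in the text (30 frames per second, one transcoded I frame per second, encoding about 5 times as costly as decoding); the function and parameter names are illustrative.

```python
def decode_equivalent_load(fps=30, transcoded_per_second=1, encode_to_decode_ratio=5):
    """Return (total decode-equivalent frames per second, relative increase)."""
    # Each transcoded frame costs one decode plus one encode (about 5 decodes).
    extra = transcoded_per_second * (1 + encode_to_decode_ratio)
    return fps + extra, extra / fps


total, increase = decode_equivalent_load()
print(total, f"{increase:.0%}")  # 36 20%
```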
However, compared to the increase in processor cost caused by the added complexity when the transcoder 220 is included in the conventional video storage device, the amount saved by increasing the compression rate of the bitstream, and thereby avoiding an increase in storage capacity, is significantly larger, so the video storage device according to one embodiment has sufficient industrial applicability.
When the transcoder 220 according to one embodiment is implemented in the video storage device 130, the encoder unit of the transcoder 220 may be newly developed to perform the aforementioned processing. In this case, since the configuration and operation structure of the encoder unit of the transcoder 220 can be easily understood by those skilled in the art through the above description, a detailed description thereof will be omitted.
Alternatively, the encoder unit of the transcoder 220 may use the same structure as the conventional encoder illustrated in FIG. 6(a), with flag(s) set to enable the operation according to one embodiment.
FIG. 6(a) illustrates the structure of an HEVC encoder. As shown, the modules constituting the encoder structure include Block Structure, Intra Prediction, Inter Prediction, Transform, Quantization, Inverse Transform, Inverse Quantization, In-loop Filter, and Entropy Coding.
An encoder made up of the aforementioned blocks generally operates using a lossy coding technique, which causes loss in the final coded result. In particular, much of the loss may occur in the Quantization block, and some loss may also occur in the Transform and In-loop Filter blocks.
Therefore, by adjusting the flag values of SPS (Sequence Parameter Set), which includes information on the sequence level, PPS (Picture Parameter Set), which includes information on the picture level, PH (Picture Header), SH (Slice Header) and/or CU syntax (Coding Unit Syntax), the encoder unit of the transcoder 220 can be adjusted to perform lossless encoding processing.
In general, transquant_bypass_enabled_flag (i.e., a flag that specifies on/off of the transquant bypass function) in the PPS is turned on so that transquant bypass can be enabled or disabled for each coding unit (CU), and cu_transquant_bypass_flag, which is a flag of the CU syntax, is set to on so that the block (i.e., coding unit) is losslessly encoded. Specifically, according to one embodiment, transquant_bypass_enabled_flag is set to on to allow transquant bypass for each CU, and cu_transquant_bypass_flag is set to on for all CUs in the transcoding target frame so that the transcoding target frame is losslessly encoded.
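The flag settings described above can be sketched with a small in-memory model. The two flag names are the syntax elements named in the text; the `Picture` and `CodingUnit` classes and the `mark_lossless` helper are hypothetical and do not correspond to any real encoder API.

```python
from dataclasses import dataclass, field


@dataclass
class CodingUnit:
    cu_transquant_bypass_flag: int = 0  # CU-level flag; 1 = bypass transform/quantization


@dataclass
class Picture:
    transquant_bypass_enabled_flag: int = 0  # PPS-level flag; 1 = per-CU bypass allowed
    coding_units: list = field(default_factory=list)


def mark_lossless(picture):
    """Set the flags so that every CU of a transcoding target frame is coded losslessly."""
    picture.transquant_bypass_enabled_flag = 1
    for cu in picture.coding_units:
        cu.cu_transquant_bypass_flag = 1
    return picture


# Hypothetical transcoding target frame split into four coding units.
target = mark_lossless(Picture(coding_units=[CodingUnit() for _ in range(4)]))
print(target.transquant_bypass_enabled_flag,
      [cu.cu_transquant_bypass_flag for cu in target.coding_units])
```

Frames that are not transcoding targets would simply keep the per-CU flag at 0.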
Through this, the encoder unit of the transcoder 220 can use, as illustrated in FIG. 6(b), an encoder structure in which only the minimum number of modules operate (i.e., Block Structure, Intra Prediction, Inter Prediction and Entropy Coding, which guarantee lossless coding).
The improved encoder structure can greatly reduce the encoding complexity by omitting the operation of several modules, and it can simplify the implementation because only the minimum number of modules needs to be included even if the encoder unit of the transcoder 220 is newly developed.
In the process of storing the reprocessed bitstream in the storage unit 230, the video storage device 130 using the aforementioned improved encoder structure stores a bitstream processed by the aforementioned technique, which modifies the header information of the bitstream that indicates the reference relationship affected by the transcoding. The reference relationship indicates reference frame information for a target frame, and the reference frame information may determine whether to delete reconstructed frames from the DPB (decoded picture buffer) in the encoder and/or decoder. Thereafter, when specific video footage is requested by the control center 140, the corresponding bitstream may be retrieved from the storage unit 230 and provided to the control center 140. In this case, management efficiency can be improved by providing the reprocessed bitstream stored in the storage unit 230 to the control center 140 as it is, but random access may be relatively restricted when the video footage is used in the control center 140.
In addition, the video storage device 130 reprocesses the bitstream received from the NVT 240 and stores it in the storage unit 230, but when specific video footage is requested by the control center 140, the video storage device 130 may reprocess the corresponding bitstream by transcoding the P frames that were the transcoding target frames back into I frames and provide it to the control center 140. In this case, processing complexity may increase, but because the bitstream is reprocessed by a lossless encoding technique without any image quality degradation, the random access performance may be relatively improved when the corresponding video is used in the control center 140. That is, if the lossless encoding technique is applied, an I frame to be transcoded may be reprocessed into a P frame, and the corresponding P frame may later be reprocessed back into an I frame.
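Why the I frame to P frame to I frame round trip loses nothing can be seen from a toy model of lossless inter coding, in which the residual is stored without transform or quantization. The sketch below assumes zero-motion prediction from co-located pixels and flat pixel lists; both function names are hypothetical and a real encoder would also apply prediction modes and entropy coding.

```python
def encode_inter_lossless(target, reference):
    """Code the target frame as an exact residual against the reference (no quantization)."""
    return [t - r for t, r in zip(target, reference)]


def decode_inter_lossless(residual, reference):
    """Rebuild the target frame exactly from the residual and the same reference."""
    return [d + r for d, r in zip(residual, reference)]


# Hypothetical reconstructed pixels of a reference frame and a transcoding target I frame.
reference = [100, 101, 102, 103]
target_i_frame = [100, 103, 99, 103]

residual = encode_inter_lossless(target_i_frame, reference)  # stored as the P frame
restored = decode_inter_lossless(residual, reference)        # re-coded as an I frame on request
print(restored == target_i_frame)  # True
```

Because the restored pixels are identical to the original reconstruction, re-encoding them as an I frame for delivery adds no further degradation beyond that of the original camera encoding.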
So far, the description has focused on the case where the video storage device 130 provided with the transcoder 220 according to one embodiment is applied to a CCTV system, but the present invention can also be applied, without limitation, to various applications in which a video storage device receives and stores a bitstream corresponding to video footage captured by a fixed camera without scene changes.
It should be understood that the aforementioned method of storing video data may be implemented as a software program or application embedded in a digital processing device such as a video processing device, and may be performed as an automated procedure according to a time-series sequence. Codes and code segments constituting the program or the like can be easily inferred by a computer programmer in the art. Further, the program is stored in a computer-readable medium, and is read and executed by a computer to implement the method.
While the invention has been described above with reference to exemplary embodiments, it will be understood by those skilled in the art that the invention can be modified and changed in various forms without departing from the concept and scope of the invention described in the appended claims.

Claims (11)

  1. Apparatus for recording video data, comprising:
    a transcoder, configured for transcoding at least a portion of a first bitstream provided by a network video transmitter to process a second bitstream having an updated IDR (Instantaneous Decoder Refresh) period that has a time length different from that of an original IDR of the first bitstream; and
    a storage unit, configured for storing the second bitstream.
  2. The apparatus of claim 1, wherein the transcoder designates at least one of Intra frames as a transcoding target frame and changes it into an Inter frame in order for the second bitstream to have the updated IDR period,
    wherein the updated IDR period is applied equally throughout the second bitstream, or is determined by interpreting the first bitstream in accordance with a predetermined bitstream characteristic analysis rule.
  3. The apparatus of claim 2, wherein if the updated IDR period is applied equally throughout the second bitstream, the transcoder is configured for designating Intra frames belonging to the second GOP to the last GOP within each updated IDR period in the first bitstream as the transcoding target frames,
    wherein in the second bitstream, the Intra frames belonging to each of a plurality of GOPs in each updated IDR period are processed to have a predetermined reference relationship,
    wherein the predetermined reference relationship is any one of IPPP structure, IBBB structure, and IBBP structure.
  4. The apparatus of claim 2, wherein the time length of the updated IDR period is calculated by an equation that is predefined to have the number of channels that provide bitstream to be stored in the storage unit, and a total storage capacity of the storage unit, a size of the free storage capacity of the storage unit, and a time period during which the bitstream is stored and maintained in the storage unit as factors of consideration.
  5. The apparatus of claim 2, wherein the predetermined bitstream characteristic analysis rule for adaptively applying the updated IDR period is, for reconstructed pixels of the Intra frames belonging to the second GOP to the last GOP among a plurality of GOPs in the first bitstream, to calculate a difference value between corresponding reconstructed pixels in Intra frame of the first GOP or a reference frame, to specify the Intra frame as a transcoding target frame only when the difference value between frames is less than a predetermined threshold value, and to change the Intra frame into Inter frame.
  6. The apparatus of claim 2, wherein the transcoder changes the Intra frame designated as the transcoding target frame into the Inter frame by a lossless coding technique.
  7. The apparatus of claim 6, wherein a flag value in at least one of SPS (Sequence Parameter Set), PPS (Picture Parameter Set), PH (Picture Header), and CU syntax (Coding Unit Syntax) is adjusted in order for an encoder of the transcoder to operate in the lossless coding technique.
  8. A computer program stored on a computer-readable medium for executing a method for storing video data, wherein the computer program causes a computer to execute the following steps, the steps comprising:
    (a) receiving a first bitstream from a network video transmitter; and
    (b) transcoding at least a portion of the first bitstream to process a second bitstream having an updated IDR (Instantaneous Decoder Refresh) period that has a time length different from that of an original IDR of the first bitstream,
    wherein at least one of Intra frames is designated as a transcoding target frame and changed into an Inter frame in order for the second bitstream to have the updated IDR period,
    wherein the updated IDR period is applied equally throughout the second bitstream, or is determined by interpreting the first bitstream in accordance with a predetermined bitstream characteristic analysis rule.
  9. The computer program of claim 8, wherein if the updated IDR period is applied equally throughout the second bitstream, at the step (b), Intra frames belonging to the second GOP to the last GOP within each updated IDR period in the first bitstream are designated as the transcoding target frames,
    wherein the Intra frames belonging to each of a plurality of GOPs in each updated IDR period are processed to have a predetermined reference relationship in the second bitstream,
    wherein the predetermined reference relationship is any one of IPPP structure, IBBB structure, and IBBP structure.
  10. The computer program of claim 8, wherein the time length of the updated IDR period is calculated by an equation that is predefined to have the number of channels that provide bitstream to be stored in the storage unit, and a total storage capacity of the storage unit, a size of the free storage capacity of the storage unit, and a time period during which the bitstream is stored and maintained in the storage unit as factors of consideration.
  11. The computer program of claim 8, wherein the predetermined bitstream characteristic analysis rule for adaptively applying the updated IDR period is, for reconstructed pixels of the Intra frames belonging to the second GOP to the last GOP among a plurality of GOPs in the first bitstream, to calculate a difference value between corresponding reconstructed pixels in Intra frame of the first GOP or a reference frame, to specify the Intra frame as a transcoding target frame only when the difference value between frames is less than a predetermined threshold value, and to change the Intra frame into Inter frame.
PCT/KR2020/013182 2020-01-22 2020-09-28 Apparatus and method for recording video data WO2021149892A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2020-0008843 2020-01-22
KR1020200008843A KR102271118B1 (en) 2020-01-22 2020-01-22 Apparatus and method for recording video data

Publications (1)

Publication Number Publication Date
WO2021149892A1 true WO2021149892A1 (en) 2021-07-29

Family

ID=76601943

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2020/013182 WO2021149892A1 (en) 2020-01-22 2020-09-28 Apparatus and method for recording video data

Country Status (2)

Country Link
KR (1) KR102271118B1 (en)
WO (1) WO2021149892A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4383705A1 (en) * 2022-12-06 2024-06-12 Axis AB A method and device for pruning a video sequence

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102510454B1 (en) * 2022-09-02 2023-03-16 주식회사 솔디아 Video streaming delivery system with reduced transmission volume

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070038700A (en) * 2005-10-06 2007-04-11 삼성전자주식회사 Method for coding of moving picture frame with less flickering and apparatus therefor
KR20070051757A (en) * 2005-11-15 2007-05-18 한국전자통신연구원 A method of scalable video coding for varying spatial scalability of bitstream in real time and a codec using the same
KR20080033813A (en) * 2006-10-13 2008-04-17 삼성전자주식회사 / Method and apparatus for encoding /decoding data
JP2014064124A (en) * 2012-09-20 2014-04-10 Casio Comput Co Ltd Video processing device, video processing method, and program
JP2018019186A (en) * 2016-07-26 2018-02-01 Kddi株式会社 Transcoding system, transcoding method, computer readable recording medium, decoder and encoder

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102012037B1 (en) 2019-05-16 2019-08-19 주식회사 경림이앤지 Transcoding and encryption transmission device of video and audio data of IP based CCTV camera

Also Published As

Publication number Publication date
KR102271118B1 (en) 2021-06-30

Similar Documents

Publication Publication Date Title
US20220303583A1 (en) Video coding using constructed reference frames
KR101108661B1 (en) Method for coding motion in a video sequence
US10009628B2 (en) Tuning video compression for high frame rate and variable frame rate capture
WO2009110741A2 (en) Method and apparatus for encoding and decoding image by using filtered prediction block
WO2011019246A2 (en) Method and apparatus for encoding/decoding image by controlling accuracy of motion vector
AU2005272046B2 (en) Method and apparatus for detecting motion in MPEG video streams
JPH06181569A (en) Method and device for encoding and decoding picture and picture recording medium
EP3402204A1 (en) Encoding a video stream having a privacy mask
WO2021149892A1 (en) Apparatus and method for recording video data
CN115209153A (en) Encoder, decoder and corresponding methods
WO2009113812A2 (en) Method and apparatus for encoding and decoding image
US20080069226A1 (en) Motion picture encoder, motion picture decoder,and method for generating encoded stream
CN114946181A (en) Reference picture management method for video coding
KR100327952B1 (en) Method and Apparatus for Segmentation-based Video Compression Coding
US11849138B2 (en) Video coding and decoding
WO2019135636A1 (en) Image coding/decoding method and apparatus using correlation in ycbcr
JP2007180970A (en) Video processor and monitoring camera system
CN110572675A (en) Video decoding method, video encoding method, video decoding apparatus, video encoding apparatus, storage medium, video decoder, and video encoder
RU2795812C1 (en) Method and device for image coding/decoding using adaptive color conversion and method for bitstream transmission
KR102580900B1 (en) Method and apparatus for storing video data using event detection
JP3356413B2 (en) Image decoding method and apparatus
JP2003189242A (en) Video recording and reproducing device and reproducing method
US11490121B2 (en) Transform device, decoding device, transforming method, and decoding method
RU2793777C1 (en) Method and device for internal prediction based on internal subsegments in image coding system
RU2811759C2 (en) Method and device for image encoding/decoding using adaptive color conversion and method for bitstream transmission

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20915585

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20915585

Country of ref document: EP

Kind code of ref document: A1