WO2023192579A1 - Systems and methods for region packing based compression - Google Patents
- Publication number
- WO2023192579A1 (PCT/US2023/017072)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- region
- frame
- video
- bounding box
- interest
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/167—Position within a video image, e.g. region of interest [ROI]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/182—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/20—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/88—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving rearrangement of data among different coding units, e.g. shuffling, interleaving, scrambling or permutation of pixel data or permutation of transform coefficient data among different blocks
Definitions
- the present application relates generally to video encoding and decoding and more particularly relates to video encoding and decoding using object and/or region detection and packing at an encoder and region unpacking at a decoder.
- a video codec can include an electronic circuit or software that compresses or decompresses digital video. It can convert uncompressed video to a compressed format or vice versa.
- a device that compresses video (and/or performs some function thereof) can typically be called an encoder, and a device that decompresses video (and/or performs some function thereof) can be called a decoder.
- a format of the compressed data can preferably conform to a standard video compression specification such as HEVC, AV1, VVC and the like.
- VCM (video coding for machines) refers broadly to video coding and decoding for machine consumption and, while the disclosed systems and methods may be standard compliant, the disclosure is not limited to a specific proposed protocol or standard.
- Video and image analysis methods and applications often attempt to detect and track specific classes of objects and regions of interest.
- the tasks may only depend on specific objects or regions.
- Object classes and regions of interest in a video may depend on the tasks an analysis engine or machine task system is expected to perform.
- video content may be compressed by identifying objects of interest in a video frame and only transmitting information related to such objects and omitting other objects or regions which are not of interest. Further compression efficiency may be realized by packing objects of interest identified in a frame into a contiguous region prior to video compression.
- the presently disclosed method for compressing video and image data focuses on compression that preserves objects in each frame.
- a general system using this method detects one or more regions of interest or objects of interest in a video frame, tightly packs regions in a frame while discarding regions that are not of interest.
- the term region may refer to an area in an image with a common characteristic (e.g., color, texture, water, grass, sky, etc.) or including a specific object of interest (e.g., cat, dog, person, car, etc.).
- a video encoder for compression using region packing may include a region detection module receiving a video frame for encoding, identifying a region of interest in the video frame based on target task parameters, and generating a bounding box for the region of interest.
- a region extractor module may be coupled to the region detection module and, for each identified region of interest, the region extractor may obtain the pixels within the bounding box from the video frame.
- a region packing module receives the identified regions of interest and arranges the bounding boxes in a packed frame while substantially omitting data in the frame outside the identified regions of interest.
- a video encoder receives the packed frame and generates an encoded bitstream therefrom.
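The extraction and packing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the packer simply stacks extracted sub-images vertically and pads with black pixels, and NumPy arrays stand in for video frames; helper names are hypothetical.

```python
import numpy as np

def extract_region(frame: np.ndarray, box: tuple) -> np.ndarray:
    """Region extractor: return the pixels inside an (x, y, w, h) bounding box."""
    x, y, w, h = box
    return frame[y:y + h, x:x + w]

def pack_regions(regions: list) -> tuple:
    """Naive region packer: stack the extracted sub-images vertically into one
    packed frame, padding widths with black (zero) pixels. Returns the packed
    frame and the (x, y) placement of each region within it."""
    width = max(r.shape[1] for r in regions)
    height = sum(r.shape[0] for r in regions)
    packed = np.zeros((height, width) + regions[0].shape[2:], dtype=regions[0].dtype)
    placements, y = [], 0
    for r in regions:
        packed[y:y + r.shape[0], :r.shape[1]] = r
        placements.append((0, y))
        y += r.shape[0]
    return packed, placements

# Example: two detected regions of interest in a synthetic 100x100 frame.
frame = np.arange(100 * 100, dtype=np.uint16).reshape(100, 100)
boxes = [(10, 10, 20, 15), (50, 60, 30, 25)]
regions = [extract_region(frame, b) for b in boxes]
packed, placements = pack_regions(regions)
```

The placements would be carried alongside the original bounding boxes in the bitstream headers so the decoder can invert the packing.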
- the bounding box is a rectangle and the region detector module generates parameters representing the size and location of the bounding box including coordinates in the frame for a corner of the bounding box, a width parameter, and a height parameter.
- the region detector may include one or more object detectors.
- the region detector may also detect a region comprising a region of color, texture, or other region characteristic or feature.
- a video decoder for decoding a video bitstream encoded using region packing includes a video decoder module receiving an encoded bitstream including at least one encoded region therein.
- a region unpacking module is coupled to the video decoder module and identifies parameters of a bounding box for the encoded region.
- a frame reconstruction module is provided and uses the parameters to position and size the bounding box within a reconstructed frame and populate the bounding box with decoded pixels corresponding to the region.
- Fig. 1 is a simplified block diagram illustrating components of a region packing based video compression system.
- Fig. 2 is a simplified diagram illustrating an exemplary frame of video having multiple objects therein.
- Fig. 3 is the simplified diagram of Fig. 2 in which “car” objects and “cat” objects have been identified.
- Figs 4A-4D are images illustrating an object of interest and various representations of the objects of interest with different treatment of background pixels.
- Figs. 5A and 5B illustrate two examples of region packing in which the objects in Fig. 3 are packed.
- Figs. 6A and 6B illustrate the exemplary packed frame of Fig. 5A output from a decoder and used to recreate the unpacked frame of Fig. 2, including the objects of interest.
- Fig. 7 is a simplified flow diagram illustrating the process of unpacking and reconstructing an image frame based on the decoded, packed frame data.
- Fig. 8 is a simplified block diagram further detailing an embodiment of a decoder in accordance with the present disclosure.
- Figure 1 is a simplified block diagram illustrating components of a region packing based video compression system, including an encoder 100, transmission channel 105 for compressed video, and a receiver/decoder 110.
- the region detection module 115 takes at least one picture/frame as input and detects regions of interest in the picture.
- the regions can be different objects in the frame or portions of the picture with similar texture.
- region detector 115 can use two or more frames as input to identify regions in a frame that have similar motion.
- the detected regions can be rectangular or any arbitrary shape. It will be appreciated, however, that for efficient compression and packing, regions may preferably be restricted to rectangular shapes.
- each detected region may correspond to an object and in such cases an object detector may be employed to perform the functions of the region detector 115.
- a receiver system 110 may send target task parameters 120 to the region detector 115 to change the behavior of the region detection module 115.
- the target task parameters 120 may indicate the type of regions that the region detection module 115 should identify and detect.
- the target task parameters 120 may also identify other region parameters, such as whether a rectangular or arbitrary shaped region should be detected.
- receiver system 110 may dynamically request different types of regions or objects that are to be detected.
- Region detection module 115 may be comprised of multiple detection systems that can be selected based on the target task parameters 120. For example, region detection module 115 may select a specific detector optimized for a particular class of objects, such as a first detector for people objects and a different detector for car objects. Region detection module 115 may be configured to detect regions of a specific color, such as red regions, or specific areas, such as water surface or sky. In another example, a region detector may be configured to detect specific objects, such as a backpack. It will be appreciated that some region detection systems may be able to detect multiple types of objects.
- Region detection module 115 may use previously configured target task parameters 120 without a need for additional information from the receiver system.
- the region detection module 115 produces bounding boxes of the regions of interest when the regions are rectangular.
- a bounding box definition specifies the location, size and shape of the bounding box.
- a bounding box may be defined by the coordinates of the top-left corner of the box, box width, and box height. Any other protocol which allows the position, size and shape to be specified may also be employed.
- the coordinates of two diagonally opposite corners may define a rectangular bounding box. Bounding boxes of more than one region may overlap. In some cases, the entire area of a frame may be included in detected regions. In some cases, only a small portion of the input frame may be included in the detected regions.
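The two bounding-box conventions above are interchangeable; a small sketch of the conversion, with hypothetical helper names:

```python
def bbox_from_corners(x1, y1, x2, y2):
    """Convert a two-corner definition (diagonally opposite corners) into the
    (x, y, width, height) form used elsewhere in this description."""
    x, y = min(x1, x2), min(y1, y2)
    return x, y, abs(x2 - x1), abs(y2 - y1)

def corners_from_bbox(x, y, w, h):
    """Inverse conversion: top-left and bottom-right corner coordinates."""
    return x, y, x + w, y + h
```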
- a binary mask may be used to identify the region.
- a binary mask can be represented with 1s and 0s for each pixel of the image, where a value of 1 indicates that the pixel belongs to the region of interest and a value of 0 indicates the pixel is not in the region of interest.
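A minimal sketch of the binary-mask representation, assuming NumPy arrays stand in for image frames:

```python
import numpy as np

# A 6x8 frame with a region of interest marked by a binary mask:
# 1 = pixel belongs to the region, 0 = pixel is outside it.
mask = np.zeros((6, 8), dtype=np.uint8)
mask[1:4, 2:6] = 1  # an arbitrary 3x4 region of interest

frame = np.arange(48, dtype=np.uint8).reshape(6, 8)
region_pixels = frame[mask == 1]  # the 12 pixels inside the region
suppressed = frame * mask         # everything outside the region zeroed out
```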
- Figure 2 is an example of a sample frame having a number of objects therein.
- Region detection module 115 can be configured to identify all objects or only a subset of objects of interest. In this case, there are six objects: a white car 205, a black car 210, a black cat 215, a white car 220, a white car 225, and a tree 230.
- Each object is defined by a bounding box with the (x,y) coordinates of the top left corner, the width of the bounding box, and the height of the bounding box.
- (O1x, O1y) are the (x,y) coordinates of the top left corner, O1W is the width of the box, and O1H is the height of the box.
- the tree object 330 is not detected and is not processed as a detected region.
- the region detector in the example may be configured with target task parameters set to detect at least cats and cars. It will be appreciated that these objects are merely exemplary and a wide range of anticipated objects can be detected.
- the detected regions and/or objects can be applied to a region extraction module 125.
- Region extraction can be a separate functional element or can be combined with region detection module 115 or region packing module 130.
- the region extraction module 125 uses the input image and the bounding box as input data and extracts the sub-images that correspond to the detected regions.
- regions may correspond to a specific object class or classes.
- the extracted sub-images may include pixels in the bounding box that are not part of the detected object or region of interest.
- Such pixels are called background pixels. Background pixels can be handled in three different ways: 1) replaced by black or another solid-color pixel; 2) replaced by the average pixel value of all the background pixels; or 3) left unmodified.
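The three background-pixel treatments can be sketched with a hypothetical helper, assuming a per-pixel object mask accompanies each extracted sub-image:

```python
import numpy as np

def treat_background(sub_image: np.ndarray, object_mask: np.ndarray, mode: str) -> np.ndarray:
    """Apply one of the three background-pixel treatments to an extracted
    sub-image. `object_mask` is 1 for object pixels, 0 for background."""
    out = sub_image.copy()
    bg = object_mask == 0
    if mode == "solid":         # 1) replace with black (or another solid color)
        out[bg] = 0
    elif mode == "average":     # 2) replace with the mean background value
        out[bg] = sub_image[bg].mean().astype(sub_image.dtype)
    elif mode == "unmodified":  # 3) leave background pixels as-is
        pass
    return out
```

Leaving the background unmodified trades bitrate for context that may help the machine task at the receiver, as noted above.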
- background pixel information may help detect the objects of interest on the receiver side and improve the machine task performance at the receiver.
- Figs. 4A through 4D in which penguins are the objects of interest.
- Fig. 4A illustrates the original image, which includes a number of penguin objects.
- In Fig. 4B, regions outside the objects are replaced by black pixels.
- In Fig. 4C, regions outside the objects are replaced by pixels having the average value of the background pixels in the object bounding boxes, and in Fig. 4D, regions outside the objects in the object bounding boxes are left unmodified.
- the region packing module 130 extracts the sub-images corresponding to each region and packs them into compact regions for compression.
- the detected regions are extracted and packed into a compact region and compressed using efficient video compression.
- Video compression can generally take place using conventional compression methods, such as those employed in known video codec standards such as VVC, AV1, HEVC and the like.
- the regions may be packed in multiple arrangements as shown in Figs. 5A and 5B, which illustrate two examples of region packing arrangements in accordance with the present disclosure.
- the arrangement of objects of interest 505, 510, 515, 520, and 525 may be selected to maximize the compression performance of the video encoder used.
- the region packing arrangement may be changed as a part of the encoding process. In the example shown in Fig. 5A, having a black cat 515a (object 03) placed above black car 510a (object 01) may produce the best compression. In each case, the tree object (Fig. 3, 330) in the original frame is not among the objects of interest and is not detected or included in the packed frame.
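As a crude stand-in for selecting the arrangement that maximizes compression performance, one could compare the padded area of candidate arrangements; the sketch below (hypothetical helpers) considers only vertical and horizontal stacking, whereas a real encoder would evaluate many more arrangements:

```python
def packed_area(sizes, vertical: bool) -> int:
    """Area of the padded packed frame for a simple stack arrangement:
    regions stacked vertically (shared width) or horizontally (shared height).
    `sizes` is a list of (height, width) pairs."""
    heights = [h for h, w in sizes]
    widths = [w for h, w in sizes]
    if vertical:
        return sum(heights) * max(widths)
    return max(heights) * sum(widths)

def choose_arrangement(sizes):
    """Pick the stacking direction that wastes fewer padding pixels."""
    v, h = packed_area(sizes, True), packed_area(sizes, False)
    return ("vertical", v) if v <= h else ("horizontal", h)
```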
- Object parameters such as the bounding box and object position are needed at the decoder to recover the position of the objects in the reconstructed frame.
- the object list, the bounding box, and object placement in the packed frame are preferably included in video bitstream headers.
- An exemplary syntax for the frame region information header is shown in the table below.
- the frame region information may be included in header such as picture or slice header of a frame.
- Frame regions information semantics can be extended to support more than 2 dimensions.
- the semantics will be extended with three additional parameters: fri_object_z_pos (z coordinate of the object in the packed frame), fri_object_bbox_z_pos (z coordinate of the object in the reconstructed frame), and fri_object_bbox_depth (depth of the object).
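The extended semantics might be carried in a structure like the following sketch. Only the three z-axis field names are taken from the description above; the 2-D field names are inferred by analogy and are assumptions, as is the use of a plain data structure rather than a bit-level syntax:

```python
from dataclasses import dataclass

@dataclass
class FrameRegionInfo:
    """Per-region fields of a hypothetical frame region information header."""
    fri_object_x_pos: int        # x position of the object in the packed frame (assumed name)
    fri_object_y_pos: int        # y position of the object in the packed frame (assumed name)
    fri_object_bbox_x_pos: int   # x position in the reconstructed frame (assumed name)
    fri_object_bbox_y_pos: int   # y position in the reconstructed frame (assumed name)
    fri_object_bbox_width: int   # width of the object bounding box (assumed name)
    fri_object_bbox_height: int  # height of the object bounding box (assumed name)
    fri_object_z_pos: int = 0       # z coordinate of object in the packed frame
    fri_object_bbox_z_pos: int = 0  # z coordinate of object in the reconstructed frame
    fri_object_bbox_depth: int = 0  # depth of the object
```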
- the video encoder 135 is suitable for encoding single frames or a sequence of frames. An image encoder may also be used. Frames with packed regions are encoded with compression efficiency suitable for targeted use at the receiver/decoder 140.
- the frame packing arrangement is usually determined as a part of the encoding step.
- the encoder 135 receives the original frame and the region bounding boxes as input and as a part of the encoding process, determines the region packing arrangement that maximizes the compression performance.
- the encoder 135 includes the frame region information in the compressed video bitstream.
- the original video width and height are also encoded in the compressed video bitstream.
- the Point Cloud Compression (PCC) encoder can be used instead of, or in conjunction with, the video encoder.
- the corresponding video decoder 140 uses the compressed video bitstream as input and outputs a decoded region-packed frame and the frame region information.
- the original video width and height are also decoded from the video bitstream.
- Video decoder 140 can take the form of known video decoders that are compliant with the encoding scheme used by encoder 135, such as VVC, HEVC, or AV1 standard compliant decoders and the like.
- the region unpacking stage 145 receives the decoded frame which includes the packed objects (Fig. 6A), frame region information, and original frame dimensions from the video decoder 140 as input and reconstructs the frame with objects/regions 605, 610, 615, 620, 625 placed in their correct positions from the original frame (Fig. 6B).
- the reconstruction process in this case will copy pixels in the bounding box of a given object to the corresponding location of the object in the original frame.
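The per-object copy described above can be sketched as follows, assuming each region record carries its placement in the packed frame and its bounding box in the original frame (record keys are hypothetical):

```python
import numpy as np

def reconstruct_frame(packed: np.ndarray, regions, out_shape) -> np.ndarray:
    """Copy each region's pixels from the decoded packed frame into its
    original position. `regions` is a list of dicts with the packed-frame
    placement (px, py) and the original-frame bounding box (x, y, w, h)."""
    frame = np.zeros(out_shape, dtype=packed.dtype)  # undetected areas stay empty
    for r in regions:
        px, py, x, y, w, h = r["px"], r["py"], r["x"], r["y"], r["w"], r["h"]
        frame[y:y + h, x:x + w] = packed[py:py + h, px:px + w]
    return frame
```

Regions that were never detected and packed (such as the tree object) simply remain empty in the reconstructed frame, consistent with the description below.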
- the reconstructed frame in Fig. 6B is used as input to the machine task system 150 that performs the desired operations.
- the regions from the packed frame are extracted and placed in corresponding places in the reconstructed frame (Fig. 6B) using the bounding box information for each of the packed regions.
- the reconstructed frame preferably has the same dimensions as the input frame, although scaling of the reconstructed frame is also possible.
- the reconstructed frame will generally not have regions that are not detected and packed at the encoder system 100.
- the tree region object 330 shown in Fig. 3 was not detected and was not packed in the bitstream and will not be present in the reconstructed frame.
- background information around the objects in the original frame may not be present in the packed bitstream, further reducing the data to be encoded and decoded.
- the machine task system 150 uses the reconstructed frame (Fig. 6B) as input to perform the intended tasks.
- the machine task system 150 may dynamically send target task parameters to the encoding system 100.
- the encoding system 100, in response to the updated target task parameters, can preferably update the type and number of region/object detectors selected to encode the video frame.
- A simplified example of region unpacking for a single region/object in the decoded frame is presented in Fig. 7.
- the figure further illustrates the process for unpacking objects, such as object “04”.
- each detected object is packed with information sufficient to identify the object/region's position and size in the original frame. In one example, this can take the form of the coordinates of one corner of a rectangular bounding box, e.g., the top left corner, as well as the width and height of the object.
- the video decoder 140 will output the packed frame 705.
- information about each object is used to position the object in the reconstructed frame.
- in region unpacking for object O4 from the packed frame 705, the coordinates O4x and O4y locate the top left hand corner of a rectangular bounding box for the object in the reconstructed frame, O4W specifies the width of the bounding box, and O4H specifies the height of the bounding box for O4.
- the remaining objects are extracted and placed in the reconstructed frame 715 concurrently or subsequently using substantially the same process.
- Fig. 8 is a simplified block diagram further illustrating an example of a decoder in accordance with the present disclosure.
- Coded video is received at an entropy decoding module 805.
- the semantic and video payload information is decoded from the binary representation and passed to an inverse quantization module 810 (for video payload), in-loop filters 825 (for video information), and the frame unpacking component 845 (for packing semantics).
- the inverse quantization module 810 applies the operation that inverts the quantization employed during encoding and produces the frequency coefficients of the residual.
- An inverse transform processor 815 is coupled to the inverse quantization module 810 and applies a complementary operation that inverts the forward transform employed during encoding and produces pixel values of the residual. These values are added in a summation stage 820 to the previously decoded frames to reconstruct the current frame.
- the in-loop filters 825 apply processing at the boundaries of the predicted blocks in order to smooth-out the abrupt changes between blocks.
- a decoded picture buffer 830 stores the decoded video frames that are used for prediction of the other frames in the independent group-of-pictures.
- the size of the buffer is typically controlled by the decoder parameters.
- the decoder includes an intra prediction processing block 835 in which the pixel value prediction is performed based on the information contained in the current frame. All the previously decoded blocks of the frame can be used to predict the next block in the frame.
- the decoder further includes a motion compensated prediction module 840 in which the blocks in the current frame are predicted from the collocated or displaced matching blocks in the neighboring frames, using motion vectors to describe displacement.
- a frame unpack module 845 is coupled to the decoded picture buffer and the entropy decoder 805.
- the frame unpack module 845 takes the fully decoded video frames and using the packing semantic information received from the entropy decoder 805 unpacks the regions placing them in the specified locations in the reconstructed frame, such as illustrated in Fig. 7.
- the reconstructed frame processor 850 provides the final output of the decoder that generally has the dimensions of the input frame at the encoder side and contains all the regions of interest in locations as in the input frame. It will be appreciated, however, that in some applications the encoder/decoder might decide to encode locations and scales of the regions that do not match the input locations and scales.
- Detectron2: With an object detector from the Detectron2 library (Girshick et al. 2018, Detectron, retrieved from https://github.com/facebookresearch/detectron), inferences for each frame are used to black out all pixels outside of the object bounds. Region coordinates output by the model are then used to perform packing such that all regions are arranged into an optimal bin size. Each of the packed frames serves as input to the video encoder.
- the compressed frames are unpacked using the region and location parameters included in the bitstream.
- the reconstructed images are then finally processed through an object segmentation model implemented with Detectron2.
- the table describes results using a VVC reference encoder (Bross et al., Overview of the Versatile Video Coding (VVC) Standard and its Applications. IEEE Transactions on Circuits and Systems for Video Technology 31, 10 (October 2021), 3736-3764. DOI:https://doi.org/10.1109/TCSVT.2021.3101953), VTM, in intra-coding mode.
- the columns indicate the average bits per pixel (BPP) and mean average precision (mAP) across quantization parameters 22, 27, 32, 37, 42, and 47 for the aforementioned 100 images.
- “Blk Packed” corresponds to packed frames where a black color is used for any pixels outside of a region box.
- “Original” columns show results for the same 100 images not processed with region packing.
- region packing significantly reduces BPP while simultaneously maintaining high mAP.
- An encoded frame processed with region packing, in the majority of cases, has comparable precision to that of the original, untransformed video frame.
- BD-rate numbers show that such a packing system can produce outputs with lower BPP for the same precision.
- BD-mAP results indicate that there is some potential to improve mAP for equivalent BPP.
- any one or more of the aspects and embodiments described herein may be conveniently implemented using digital electronic circuitry, integrated circuitry, specially designed application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), computer hardware, firmware, software, and/or combinations thereof, as realized and/or implemented in one or more machines (e.g., one or more computing devices that are utilized as a user computing device for an electronic document, one or more server devices, such as a document server, etc.) programmed according to the teachings of the present specification, as will be apparent to those of ordinary skill in the computer art.
- Various aspects or features may include implementation in one or more computer programs and/or software that are executable and/or interpretable on a programmable system including at least one programmable processor, which can be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
- Such software may be a computer program product that employs a machine-readable storage medium.
- a machine-readable storage medium may be any medium that is capable of storing and/or encoding a sequence of instructions for execution by a machine (e.g., a computing device) and that causes the machine to perform any one of the methodologies and/or embodiments described herein.
- Examples of a machine-readable storage medium include, but are not limited to, a magnetic disk, an optical disc (e.g., CD, CD-R, DVD, DVD-R, etc.), a magneto-optical disk, a read-only memory “ROM” device, a random access memory “RAM” device, a magnetic card, an optical card, a solid-state memory device, an EPROM, an EEPROM, Programmable Logic Devices (PLDs), and/or any combinations thereof.
- a machine-readable medium, as used herein, is intended to include a single medium as well as a collection of physically separate media, such as, for example, a collection of compact discs or one or more hard disk drives in combination with a computer memory.
- a machine-readable storage medium does not include transitory forms of signal transmission.
- Such software may also include information (e.g., data) carried as a data signal on a data carrier, such as a carrier wave.
- machine-executable information may be included as a data-carrying signal embodied in a data carrier in which the signal encodes a sequence of instructions, or portion thereof, for execution by a machine (e.g., a computing device) and any related information (e.g., data structures and data) that causes the machine to perform any one of the methodologies and/or embodiments described herein.
- Examples of a computing device include, but are not limited to, an electronic book reading device, a computer workstation, a terminal computer, a server computer, a handheld device (e.g., a tablet computer, a smartphone, etc.), a web appliance, a network router, a network switch, a network bridge, any machine capable of executing a sequence of instructions that specify an action to be taken by that machine, and any combinations thereof.
- a computing device may include and/or be included in a kiosk.
- any one or more of the aspects and embodiments described herein may be conveniently implemented using one or more machines (e.g., one or more decoder and/or encoders that are utilized as a user decoder and/or encoder for an electronic document, one or more server devices, such as a document server, etc.) programmed according to the teachings of the present specification, as will be apparent to those of ordinary skill in the computer art.
- Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those of ordinary skill in the software art.
- Aspects and implementations discussed above employing software and/or software modules may also include appropriate hardware for assisting in the implementation of the machine executable instructions of the software and/or software module.
- phrases such as “at least one of” or “one or more of” may occur followed by a conjunctive list of elements or features.
- the term “and/or” may also occur in a list of two or more elements or features. Unless otherwise implicitly or explicitly contradicted by the context in which it is used, such a phrase is intended to mean any of the listed elements or features individually or any of the recited elements or features in combination with any of the other recited elements or features.
- the phrases “at least one of A and B;” “one or more of A and B;” and “A and/or B” are each intended to mean “A alone, B alone, or A and B together.”
- a similar interpretation is also intended for lists including three or more items.
- the phrases “at least one of A, B, and C;” “one or more of A, B, and C;” and “A, B, and/or C” are each intended to mean “A alone, B alone, C alone, A and B together, A and C together, B and C together, or A and B and C together.”
- use of the term “based on,” above and in the claims is intended to mean, “based at least in part on,” such that an unrecited feature or element is also permissible.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
Claims
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020247034035A KR20240169641A (en) | 2022-04-01 | 2023-03-31 | Systems and methods for region packing-based compression |
| CN202380031819.1A CN119301944A (en) | 2022-04-01 | 2023-03-31 | System and method for region encapsulation based compression |
| EP23781862.0A EP4505705A4 (en) | 2022-04-01 | 2023-03-31 | SYSTEMS AND METHODS FOR AREA PACKAGING-BASED COMPRESSION |
| US18/902,154 US20250324067A1 (en) | 2022-04-01 | 2024-09-30 | Systems and methods for region packing based compression |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202263326313P | 2022-04-01 | 2022-04-01 | |
| US63/326,313 | 2022-04-01 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US18/902,154 Continuation US20250324067A1 (en) | 2022-04-01 | 2024-09-30 | Systems and methods for region packing based compression |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023192579A1 true WO2023192579A1 (en) | 2023-10-05 |
Family
ID=88203317
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2023/017072 Ceased WO2023192579A1 (en) | 2022-04-01 | 2023-03-31 | Systems and methods for region packing based compression |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US20250324067A1 (en) |
| EP (1) | EP4505705A4 (en) |
| KR (1) | KR20240169641A (en) |
| CN (1) | CN119301944A (en) |
| WO (1) | WO2023192579A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2026024820A1 (en) * | 2024-07-23 | 2026-01-29 | Snap Inc. | Processing and transmitting active regions of display for improved performance |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB2636752A (en) * | 2023-12-20 | 2025-07-02 | Sony Interactive Entertainment Europe Ltd | Method and system |
| US20250234010A1 (en) * | 2024-01-17 | 2025-07-17 | Tencent America LLC | Adaptive coding based on region of interest |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110228846A1 (en) * | 2005-08-26 | 2011-09-22 | Eran Eilat | Region of Interest Tracking and Integration Into a Video Codec |
| US20170339417A1 (en) * | 2016-05-23 | 2017-11-23 | Intel Corporation | Fast and robust face detection, region extraction, and tracking for improved video coding |
| US20190253719A1 (en) * | 2016-10-21 | 2019-08-15 | Peking University Shenzhen Graduate School | Describing Method and Coding Method of Panoramic Video ROIs |
| US20190379856A1 (en) * | 2018-06-08 | 2019-12-12 | Lg Electronics Inc. | Method for processing overlay in 360-degree video system and apparatus for the same |
| WO2021211884A1 (en) * | 2020-04-16 | 2021-10-21 | Intel Corporation | Patch based video coding for machines |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP6612135B2 (en) * | 2016-01-14 | 2019-11-27 | 日立オートモティブシステムズ株式会社 | Vehicle detection device and light distribution control device |
| CN112929662B (en) * | 2021-01-29 | 2022-09-30 | 中国科学技术大学 | Coding method for solving object overlapping problem in code stream structured image coding method |
- 2023
- 2023-03-31 KR KR1020247034035A patent/KR20240169641A/en active Pending
- 2023-03-31 CN CN202380031819.1A patent/CN119301944A/en active Pending
- 2023-03-31 EP EP23781862.0A patent/EP4505705A4/en active Pending
- 2023-03-31 WO PCT/US2023/017072 patent/WO2023192579A1/en not_active Ceased
- 2024
- 2024-09-30 US US18/902,154 patent/US20250324067A1/en active Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4505705A4 * |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4505705A1 (en) | 2025-02-12 |
| KR20240169641A (en) | 2024-12-03 |
| EP4505705A4 (en) | 2026-04-01 |
| US20250324067A1 (en) | 2025-10-16 |
| CN119301944A (en) | 2025-01-10 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20250324067A1 (en) | Systems and methods for region packing based compression | |
| JP2014525176A (en) | Intensity-based chromaticity intra prediction | |
| AU2022202473B2 (en) | Method, apparatus and system for encoding and decoding a tensor | |
| WO2023048070A1 (en) | Systems and methods for compression of feature data using joint coding in coding of multi-dimensional data | |
| AU2025201260A1 (en) | Method, apparatus and system for encoding and decoding a tensor | |
| JP2025533723A (en) | Method, apparatus, and system for encoding and decoding tensors | |
| CN118872268A (en) | Method, device and system for encoding and decoding tensors | |
| US20250227255A1 (en) | Systems and methods for object boundary merging, splitting, transformation and background processing in video packing | |
| US20250254362A1 (en) | Systems and methods for region packing based encoding and decoding | |
| US20260105643A1 (en) | Method, apparatus and system for encoding and decoding a tensor | |
| WO2024211956A1 (en) | Method, apparatus and system for encoding and decoding a tensor | |
| JP2025533727A (en) | Method, apparatus, and system for encoding and decoding tensors | |
| CN118872269A (en) | Method, device and system for encoding and decoding tensors | |
| US20250227254A1 (en) | Systems and methods for region detection and region packing in video coding and decoding for machines | |
| WO2026085554A1 (en) | Method, apparatus and system for encoding and decoding a plurality of tensors | |
| WO2025208169A1 (en) | Method, apparatus and system for encoding and decoding a plurality of tensors | |
| WO2025213210A1 (en) | Method, apparatus and system for encoding and decoding a plurality of tensors | |
| CN121925850A (en) | Encoding and decoding methods based on feature-adaptive mapping of regions of interest in VCM | |
| WO2024072769A1 (en) | Systems and methods for region detection and region packing in video coding and decoding for machines | |
| AU2022202474A1 (en) | Method, apparatus and system for encoding and decoding a tensor | |
| AU2022202472A1 (en) | Method, apparatus and system for encoding and decoding a tensor |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23781862; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 202380031819.1; Country of ref document: CN |
| | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112024020161; Country of ref document: BR |
| | WWE | Wipo information: entry into national phase | Ref document number: 202417076557; Country of ref document: IN |
| | ENP | Entry into the national phase | Ref document number: 20247034035; Country of ref document: KR; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023781862; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2023781862; Country of ref document: EP; Effective date: 20241104 |
| | ENP | Entry into the national phase | Ref document number: 112024020161; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20240927 |
| | WWP | Wipo information: published in national office | Ref document number: 202380031819.1; Country of ref document: CN |