WO2022220207A1 - Information processing device and method - Google Patents

Information processing device and method

Info

Publication number
WO2022220207A1
WO2022220207A1 PCT/JP2022/017458 JP2022017458W WO2022220207A1 WO 2022220207 A1 WO2022220207 A1 WO 2022220207A1 JP 2022017458 W JP2022017458 W JP 2022017458W WO 2022220207 A1 WO2022220207 A1 WO 2022220207A1
Authority
WO
WIPO (PCT)
Prior art keywords
image quality
quality improvement
information
image
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/017458
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
充 勝股
光浩 平林
優 池田
健史 筑波
健治 近藤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Group Corp
Original Assignee
Sony Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Group Corp
Priority to US18/552,438 (patent US12537981B2)
Priority to JP2023514636A (patent JPWO2022220207A1)
Publication of WO2022220207A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/231: Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234: Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418: Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83: Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84: Generation or processing of descriptive data, e.g. content descriptors
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85: Assembly of content; Generation of multimedia applications
    • H04N21/854: Content authoring
    • H04N21/85406: Content authoring involving a specific file format, e.g. MP4 format

Definitions

  • the present disclosure relates to an information processing device and method, and more particularly, to an information processing device and method that enable an easy grasp of image quality improvement technology required for content reproduction.
  • VVC Versatile Video Coding
  • ISOBMFF International Organization for Standardization Base Media File Format
  • MPEG-4 Moving Picture Experts Group - 4
  • A VVC File Format using ISOBMFF is under development (for example, see Non-Patent Document 2 and Non-Patent Document 3).
  • A method of applying MPEG-DASH (Moving Picture Experts Group Dynamic Adaptive Streaming over HTTP (Hypertext Transfer Protocol)) and adaptively distributing content according to the bit rate and the like has been considered (see, for example, Non-Patent Document 4). Content distribution to which such technology is applied includes, for example, distribution of 360-degree video (see, for example, Non-Patent Document 5).
  • MPEG-DASH Moving Picture Experts Group Dynamic Adaptive Streaming over HTTP (Hypertext Transfer Protocol)
  • A method has been proposed in which the parameter data necessary for applying super-resolution technology, which is one of the image quality improvement technologies, to a picture is stored in the VVC bitstream, and the client uses that parameter data to apply the super-resolution technology (see, for example, Non-Patent Document 6).
  • A method has also been proposed in which the value of the amount of processing required to apply the image quality improvement technology is stored in the VVC bitstream, and the client determines whether playback with the image quality improvement technology applied is possible based on this value (see, for example, Non-Patent Document 7).
  • the client that plays the content had to perform complicated processing such as extracting and parsing the VVC bitstream from the file in order to understand the image quality improvement technology required for that playback.
  • An information processing apparatus includes an image quality improvement technology information generation unit that generates image quality improvement technology information related to an image quality improvement technology for improving the image quality of an image to be encoded, and a file generation unit that generates a content file storing encoded data of the image and stores the image quality improvement technology information in the content file.
  • An information processing method includes generating image quality improvement technology information related to an image quality improvement technology for improving the image quality of an image to be encoded, generating a content file storing encoded data of the image, and storing the image quality improvement technology information in the content file.
  • An information processing apparatus includes an acquisition unit that acquires encoded data of an image to be reproduced from a content file based on image quality improvement technology information related to an image quality improvement technology for improving the image quality of an image, and a decoding unit that decodes the encoded data.
  • An information processing method includes acquiring encoded data of an image to be reproduced from a content file based on image quality improvement technology information related to an image quality improvement technology for improving the image quality of an image, and decoding the encoded data.
  • image quality improvement technical information related to an image quality improvement technique for improving the image quality of an image to be encoded is generated, and a content file storing encoded data of the image is generated.
  • the image quality improvement technology information is stored in the content file.
  • Encoded data of an image to be reproduced is acquired from a content file based on image quality improvement technology information related to an image quality improvement technology for improving the image quality of an image, and the encoded data is decoded.
  • A diagram illustrating an example of syntax and semantics of an SEI message.
  • A diagram illustrating an example of syntax and semantics of an SEI message.
  • A diagram explaining an NNR level correspondence table.
  • A diagram explaining an encoding/decoding method.
  • A diagram explaining an example of image quality improvement technical information.
  • A diagram explaining a storage example of discrimination information.
  • A diagram illustrating an example of storing parameter data.
  • A diagram illustrating an example of storing parameter data.
  • A diagram illustrating an example of syntax regarding parameter data.
  • A diagram illustrating an example of syntax regarding parameter data.
  • A diagram explaining a storage example of image quality improvement technical information.
  • A diagram explaining an example of syntax regarding image quality improvement technical information.
  • A diagram explaining an example of storing image quality improvement technical information in an MPD.
  • A diagram explaining an example of storing image quality improvement technical information in an MPD.
  • A diagram explaining an example of image quality improvement technical information.
  • A diagram explaining an example of syntax regarding image quality improvement technical information.
  • A block diagram showing a main configuration example of a file generation device.
  • A flowchart describing an example of the flow of file generation processing.
  • A block diagram showing a main configuration example of a client device.
  • A flowchart explaining an example of the flow of reproduction processing.
  • A diagram showing a main configuration example of a delivery system.
  • A flowchart explaining an example of the flow of reproduction processing.
  • A block diagram showing a main configuration example of a computer.
  • Non-Patent Document 1: (above)
  • Non-Patent Document 2: (above)
  • Non-Patent Document 3: (above)
  • Non-Patent Document 4: (above)
  • Non-Patent Document 5: (above)
  • Non-Patent Document 6: (above)
  • Non-Patent Document 7: (above)
  • Non-Patent Document 8: Recommendation ITU-T H.264 (04/2017) "Advanced video coding for generic audiovisual services", April 2017
  • Non-Patent Document 9: Recommendation ITU-T H.265 (02/18) "High efficiency video coding", February 2018
  • The content described in the non-patent documents and patent documents mentioned above also serves as a basis for determining whether the support requirements are met.
  • For example, the Quad-Tree Block Structure and the QTBT (Quad Tree Plus Binary Tree) Block Structure described in the above non-patent documents are within the scope of disclosure of the present technology and satisfy the support requirements of the claims, even if they are not directly described in the embodiments.
  • Similarly, technical terms such as Parsing, Syntax, and Semantics are within the scope of disclosure of the present technology and satisfy the support requirements of the claims, even if they are not directly described in the embodiments.
  • The term "block" (not a block indicating a processing unit) used in the description as a partial area of an image (picture) or a processing unit indicates an arbitrary partial area in a picture unless otherwise specified. Its size, shape, characteristics, etc. are not limited.
  • The "block" includes TB (Transform Block), TU (Transform Unit), PB (Prediction Block), PU (Prediction Unit), SCU (Smallest Coding Unit), CU (Coding Unit), LCU (Largest Coding Unit), CTB (Coding Tree Block), CTU (Coding Tree Unit), sub-blocks, macro-blocks, tiles, or slices.
  • the block size may be specified not only directly but also indirectly.
  • the block size may be specified using identification information that identifies the size.
  • the block size may be designated by a ratio or a difference from the size of a reference block (for example, LCU or SCU).
  • the above-mentioned information indirectly specifying a size may be used as the information. By doing so, the information amount of the information can be reduced, and the coding efficiency can be improved in some cases.
  • This block size specification also includes specification of a range of block sizes (for example, specification of a range of permitted block sizes).
  • As described in Non-Patent Document 1, VVC (Versatile Video Coding) is an image coding method that derives a prediction residual of a moving image, applies a coefficient transform, quantizes the result, and encodes it.
  • As described in Non-Patent Document 4, a method of applying MPEG-DASH (Moving Picture Experts Group Dynamic Adaptive Streaming over HTTP (Hypertext Transfer Protocol)) and adaptively distributing content according to the bit rate and the like has been considered.
  • As described in Non-Patent Document 5, content distribution to which such technology is applied includes distribution of 360-degree video.
  • As described in Non-Patent Document 6, a method has been proposed in which parameter data necessary for applying a super-resolution technique, which is one of the image quality improvement techniques, to a picture is stored in a VVC bitstream, and the client applies the super-resolution technique using that parameter data.
  • SEI Supplemental Enhancement Information
  • A method has also been proposed in which the value of the amount of processing necessary for applying the image quality improvement technique is stored in the VVC bitstream, and the client determines, based on that value, whether playback with the image quality improvement technique applied is possible.
  • FIG. 3 is a correspondence table for setting the NNR level.
  • Image quality improvement technology refers to technology that improves picture quality.
  • Image quality improvement techniques include noise reduction, which removes noise from pictures, and edge enhancement, which sharpens contours and boundaries.
  • a super-resolution technique is a technique for complementing resolution when generating a high-resolution picture from a low-resolution picture.
  • VVC post-filters that apply super-resolution techniques to decoded pictures are being studied.
  • the decoder can generate a high-resolution picture from the low-resolution picture after decoding.
  • bit rate code amount
  • Parameter data applied to super-resolution technology using this deep learning technology is transmitted for each content or each picture.
  • the client that plays the content had to perform complicated processing such as extracting and parsing the VVC bitstream from the file in order to understand the image quality improvement technology required for that playback.
  • Until the client downloads the content file, extracts the VVC bitstream from the content file, and parses the VVC bitstream, it has no information about the image quality improvement techniques.
  • the client selects the bitrate of the content to be downloaded, and the server distributes the segment file with the selected bitrate.
  • When the image quality improvement technique can be applied as described above, the client needs to select the segment file to be downloaded in consideration of the application of the image quality improvement technique.
  • However, unless the client parses the downloaded data as described above, it cannot even know whether information regarding the image quality improvement technique is included. Therefore, it was difficult to select segment files to be downloaded in consideration of the application of image quality improvement technology.
  • the client's processing volume may change according to various conditions.
  • conventional methods do not provide the client with information for the client to determine whether image quality improvement techniques can be applied. Therefore, it is difficult for the client to determine whether the image quality improvement technique is applicable.
  • Although data may be required for image quality improvement for each content, it has been difficult to provide such data to the client.
  • An information processing device is provided with an image quality improvement technology information generation unit that generates image quality improvement technology information related to an image quality improvement technology for improving the image quality of an image to be coded, and a file generation unit that generates a content file storing the coded data of the image and stores the image quality improvement technology information in the content file.
  • image quality improvement technical information relating to an image quality improvement technique for improving the image quality of an image to be encoded is generated, and a content file storing the encoded data of the image is generated.
  • the image quality improvement technology information is stored in the content file.
  • An information processing device is provided with an acquisition unit that acquires encoded data of an image to be reproduced from a content file based on image quality improvement technology information related to an image quality improvement technology for improving the image quality of an image, and a decoding unit that decodes the encoded data.
  • Coded data of an image to be reproduced is acquired from a content file based on image quality improvement technical information related to an image quality improvement technique for improving the image quality of an image, and the coded data is decoded.
  • a content file is a file that stores content data.
  • the content is arbitrary as long as it includes images. For example, information other than images, such as sound, may be included. Also, this image may be a moving image or a still image. Any encoding method can be used.
  • the decoding method may be any method as long as it is a method corresponding to the encoding method (a method capable of correctly decoding the encoded data encoded by the encoding method).
  • The information processing device (for example, the client device) that reproduces the content can easily grasp the image quality improvement technique required for reproducing the content.
  • “easily” means that there is no need to parse content data.
  • the client device will be able to select segment files in consideration of the application of image quality improvement technology.
  • the server can provide high quality, viewable segment files at low bitrates. That is, it is possible to suppress an increase in the amount of data transmission.
  • CDN Content Delivery Network
  • Costs are charged according to the amount of data transmitted from the CDN, so suppressing the amount of data transmission also suppresses an increase in cost.
  • the server can provide delivery services regardless of whether the client device is capable of applying image quality enhancement techniques.
  • the client can ensure sufficient quality with low-bitrate segment files by applying image quality improvement technology. Therefore, high-quality display is possible even when the transmission band is narrow. Also, if the image quality improvement technique of the super-resolution technique can be applied, even if a segment file with a resolution lower than the resolution of the client's display is reproduced, high-quality display is possible.
  • the increase in the amount of data transmission can be suppressed by applying the present technology as described above, thereby suppressing the increase in the cost.
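The segment selection described above can be sketched as follows. This is only an illustration under stated assumptions: the `Representation` model, its field names, and the selection policy in `select_representation` are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Representation:
    """One downloadable variant of the content (hypothetical model)."""
    bitrate_bps: int
    width: int
    height: int
    # Resolution after applying the image quality improvement technology,
    # or None when no such technology is signalled for this variant.
    improved_width: Optional[int] = None
    improved_height: Optional[int] = None


def effective_height(rep: Representation, can_improve: bool) -> int:
    """Height the viewer actually sees, given the client's capability."""
    if can_improve and rep.improved_height is not None:
        return rep.improved_height
    return rep.height


def select_representation(
    reps: Sequence[Representation],
    bandwidth_bps: int,
    can_improve: bool,
) -> Optional[Representation]:
    """Pick the variant with the best displayed height that fits the band.

    Ties are broken in favour of the lower bitrate, which is what keeps
    the amount of transmitted data down (assumed policy).
    """
    affordable = [r for r in reps if r.bitrate_bps <= bandwidth_bps]
    if not affordable:
        return None
    return max(
        affordable,
        key=lambda r: (effective_height(r, can_improve), -r.bitrate_bps),
    )
```

With this policy, a client that can apply super-resolution prefers a low-bitrate variant whose improved resolution matches a high-bitrate one, while a client that cannot apply it falls back to the high-bitrate variant.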
  • The content of the image quality improvement technical information is arbitrary. For example, as shown in the second row from the top of the table in FIG. 4, determination information for determining whether the image quality improvement technology is applicable may be transmitted as this image quality improvement technology information (#1-1).
  • the image quality improvement technology information may include determination information for determining whether the image quality improvement technology is applicable.
  • Syntax 101 shown in FIG. 5 is an example of the syntax of this discrimination information.
  • "ImageQualityImprovementInformation" indicates discrimination information.
  • The content of this discrimination information is arbitrary.
  • The determination information may include the type of image quality improvement technique, the amount of processing, the result of the image quality improvement technique, configuration information for image quality improvement, and the like.
  • "level_idc_flag" is flag information indicating whether or not the "level_idc" field exists. If this flag is true (e.g. "1"), it indicates that the level_idc field exists. Conversely, if this flag is false (e.g. "0"), it indicates that the level_idc field does not exist.
  • "display_size_flag" is flag information indicating whether or not the "display_width" field and the "display_height" field exist. If this flag is true (e.g. "1"), it indicates that the display_width and display_height fields are present. Conversely, if this flag is false (e.g. "0"), it indicates that the display_width and display_height fields are not present.
  • "type_uri" is a URI that indicates the type of image quality improvement technology. For example, in the case of a super-resolution filter using a neural network under consideration in VVC, "urn:mpeg:vvc:postfilter:superresolution:2021" is defined as this value.
  • "level_idc" indicates level information that serves as a guideline for the processing of the image quality improvement technology indicated by "type_uri". This level information is defined for each "type_uri". If "level_idc" does not exist, it indicates that the image quality improvement technology indicated by "type_uri" has only one processing level. Also, for example, in the case of "type_uri" in the above example, nnr_level_idc described in Non-Patent Document 7 is stored.
  • "display_width" and "display_height" respectively indicate the width and height of the displayed image after applying the image quality improvement technology indicated by "type_uri". Note that when these parameters do not exist, the width and height of the displayed image are the same as before the image quality improvement technology is applied.
  • "quality" indicates the value of the image quality (quality) of the display image after applying the image quality improvement technology indicated by "type_uri".
  • "config_data_size" indicates the number of bytes of "config_data". Note that if there is no "config_data", the value of "config_data_size" is set to "0".
  • "config_data" indicates the initialization data of the image quality improvement technology indicated by "type_uri". Data to be stored is determined for each "type_uri". For example, in the case of the above example of "type_uri", it includes neural network topology information, parameter data format information, and the like.
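The fields listed above can be modelled as a small record with a round-trip serializer. This is a minimal sketch, not the syntax of FIG. 5: the field names come from the text, but the byte layout (one flags byte, a NUL-terminated UTF-8 `type_uri`, 8-bit `level_idc` and `quality`, 16-bit display sizes, 32-bit `config_data_size`) is assumed purely for illustration.

```python
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageQualityImprovementInformation:
    type_uri: str
    quality: int
    level_idc: Optional[int] = None        # present only if level_idc_flag
    display_width: Optional[int] = None    # present only if display_size_flag
    display_height: Optional[int] = None
    config_data: bytes = b""

    def to_bytes(self) -> bytes:
        level_idc_flag = self.level_idc is not None
        display_size_flag = (
            self.display_width is not None and self.display_height is not None
        )
        # ASSUMED layout: the two flags occupy the top two bits of one byte.
        out = bytearray([(level_idc_flag << 7) | (display_size_flag << 6)])
        out += self.type_uri.encode("utf-8") + b"\x00"
        if level_idc_flag:
            out += struct.pack(">B", self.level_idc)
        if display_size_flag:
            out += struct.pack(">HH", self.display_width, self.display_height)
        out += struct.pack(">B", self.quality)
        out += struct.pack(">I", len(self.config_data))  # config_data_size
        out += self.config_data
        return bytes(out)

    @classmethod
    def from_bytes(cls, buf: bytes) -> "ImageQualityImprovementInformation":
        flags = buf[0]
        level_idc_flag = bool(flags & 0x80)
        display_size_flag = bool(flags & 0x40)
        end = buf.index(b"\x00", 1)          # terminator of type_uri
        type_uri = buf[1:end].decode("utf-8")
        pos = end + 1
        level_idc = width = height = None
        if level_idc_flag:
            (level_idc,) = struct.unpack_from(">B", buf, pos)
            pos += 1
        if display_size_flag:
            width, height = struct.unpack_from(">HH", buf, pos)
            pos += 4
        quality, size = struct.unpack_from(">BI", buf, pos)
        pos += 5
        return cls(type_uri, quality, level_idc, width, height,
                   buf[pos:pos + size])
```

The optional fields mirror the flag semantics in the text: when `level_idc` or the display size is absent, the corresponding flag is cleared and no bytes are written for it.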
  • Parameter data applied in processing to which the image quality improvement technology is applied may be transmitted (#1-1-1).
  • the image quality improvement technology information may include parameter data applied in processing to which the image quality improvement technology is applied.
  • Syntax 102 shown in FIG. 5 is an example of the syntax of this parameter data.
  • "ImageQualityImprovementData" indicates parameter data.
  • The content of this parameter data is arbitrary. Note that this parameter data may not be necessary depending on the type of image quality improvement technology.
  • "data_size" indicates the number of bytes of "data".
  • "data" indicates the parameter data of the image quality improvement technology indicated by "type_uri". For example, in the case of the above example of "type_uri", it contains neural network parameter data.
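The relationship between "data_size" and "data" can be sketched as a length-prefixed payload. The 32-bit big-endian width of `data_size` and both helper names are assumptions made for this illustration; the text only states that `data_size` is the number of bytes of `data`.

```python
import struct


def pack_improvement_data(data: bytes) -> bytes:
    """Serialize parameter data as data_size followed by data."""
    # ASSUMPTION: data_size is a 32-bit big-endian unsigned integer.
    return struct.pack(">I", len(data)) + data


def unpack_improvement_data(buf: bytes) -> bytes:
    """Return the `data` payload, checking it against `data_size`."""
    if len(buf) < 4:
        raise ValueError("buffer too short to hold data_size")
    (data_size,) = struct.unpack_from(">I", buf, 0)
    data = buf[4:4 + data_size]
    if len(data) != data_size:
        raise ValueError("truncated parameter data")
    return data
```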
  • An information processing device (for example, a client device) that reproduces content can determine whether or not to apply the designated image quality improvement technology based on such image quality improvement technology information.
  • the image quality improvement technical information as described above may be stored in a file container and transmitted, for example, as shown in the fourth row from the top of the table in FIG. 4 (#1-2).
  • A file generation unit may store image quality improvement technical information in a file container that stores content (encoded data of an image). Also, in an information processing device (for example, a client device), an acquisition unit may acquire encoded data of an image to be reproduced based on image quality improvement technical information stored in a file container in which content (encoded data of an image) is stored.
  • the format and specifications of the file container are arbitrary.
  • For example, it may be ISOBMFF (International Organization for Standardization Base Media File Format). It may also be a Matroska Media Container. Of course, other formats may be used.
  • the client device that reproduces the content can acquire the image quality improvement technical information without decoding the content (encoded image data). Therefore, the client device can easily grasp the image quality improvement technique required for reproducing the content.
  • Discrimination information may be stored and transmitted in a video track, as shown in the fifth row from the top of the table in FIG. 4.
  • the file generation unit may store image quality improvement technical information in the ISOBMFF video track.
  • the acquisition unit may acquire the encoded data of the image to be reproduced based on the image quality improvement technical information stored in the video track.
  • a video track is a track in which content (encoded data of an image) is stored. That is, the image quality improvement technical information (including discrimination information) regarding the image may be stored in the same track as the encoded data of the image.
  • A file generation unit may store image quality improvement technical information (including discrimination information) in a box in a sample entry of a video track in which content (encoded data of an image) is stored. Further, in the information processing device (for example, the client device), the acquisition unit may acquire the encoded data of the image to be reproduced based on the image quality improvement technical information (including the discrimination information) stored in the box in the sample entry.
  • Visual sample entry 113 in the Video Track ('trak') 112 in the movie box ('moov') 111 of content file 110 is extended: within the sample description box ('stsd') 114, an ImageQualityImprovementInfoBox ('iqii') 115 is newly defined, and image quality improvement technical information (including discrimination information) is stored in this box.
  • an ImageQualityImprovementInfoBox ('iqii') is defined and ImageQualityImprovementInformation is stored in that box, as shown in the syntax 121 of FIG.
  • This ImageQualityImprovementInformation is discrimination information and corresponds to syntax 101 in FIG. 5. That is, various parameters defined in the syntax 101 are stored in the ImageQualityImprovementInfoBox.
  • VisualSampleEntry is then extended to contain the ImageQualityImprovementInfoBox, as shown in syntax 122 of FIG.
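A client could locate the 'iqii' box inside a visual sample entry without touching the coded video data, roughly as sketched below. The box header handling (32-bit size, four-character type, 64-bit largesize when size is 1, size 0 meaning "to the end") and the 78 bytes of fixed VisualSampleEntry fields follow ISOBMFF; the function names are hypothetical, and the payload of 'iqii' is returned unparsed because its exact layout is not reproduced here.

```python
import struct
from typing import Iterator, Optional, Tuple

# Fixed SampleEntry + VisualSampleEntry fields that precede the child boxes.
VISUAL_SAMPLE_ENTRY_FIXED_BYTES = 78


def iter_boxes(buf: bytes, start: int, end: int) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_start, payload_end) for each box in buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:  # 64-bit largesize follows the type field
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:  # box extends to the end of the enclosing container
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("malformed box")
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size


def find_iqii(sample_entry: bytes) -> Optional[bytes]:
    """Return the payload of the 'iqii' box in one visual sample entry box."""
    for _, payload_start, payload_end in iter_boxes(sample_entry, 0, len(sample_entry)):
        children = payload_start + VISUAL_SAMPLE_ENTRY_FIXED_BYTES
        for box_type, start, end in iter_boxes(sample_entry, children, payload_end):
            if box_type == "iqii":
                return sample_entry[start:end]
    return None
```

Only box headers are read, which is the point made in the text: the determination information is reachable at the file-format level, with no need to extract and parse the bitstream.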
  • The client device that reproduces the content can easily (for example, without parsing the content (encoded data of the image)) determine whether the image quality improvement technology can be applied (whether it has the processing capability to reproduce the content with the image quality improvement technology applied, whether the image quality improvement technology is one that it can execute, etc.).
  • The client device can grasp the resolution after applying the image quality improvement technique. Therefore, the client device can easily select whether or not to apply the image quality improvement technique by comparing that resolution with the resolution of the display screen. For example, the client device can perform control such that the image quality improvement technique is not applied when the resulting resolution would be higher than that of the display screen.
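The determination described above can be sketched as a simple predicate. The function name, the `max_level_by_type` capability table, and the assumption that a larger `level_idc` means a heavier processing load are all illustrative; the text defines the fields but not a decision rule.

```python
from typing import Mapping, Optional


def should_apply_improvement(
    type_uri: str,
    level_idc: Optional[int],
    display_width: Optional[int],
    display_height: Optional[int],
    max_level_by_type: Mapping[str, int],
    screen_width: int,
    screen_height: int,
) -> bool:
    """Decide whether to apply the signalled image quality improvement."""
    # The client must implement the technology named by type_uri at all.
    if type_uri not in max_level_by_type:
        return False
    # ASSUMPTION: a higher level_idc requires more processing capability.
    if level_idc is not None and level_idc > max_level_by_type[type_uri]:
        return False
    # Skip the technology when its output would exceed the display screen.
    if display_width is not None and display_height is not None:
        if display_width > screen_width or display_height > screen_height:
            return False
    return True
```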
  • the client device can arbitrarily choose whether or not to apply the ImageQualityImprovementInfoBox.
  • the file generator cannot force the client device to apply ImageQualityImprovementInfoBox.
  • the ImageQualityImprovementInfoBox (that is, discrimination information) may be stored in a configuration box defined for each codec (encoding/decoding method).
  • The file generation device can control, individually for each codec, whether or not the image quality improvement technique can be used.
  • For example, the file generation device can make the image quality improvement technology applicable only to AVC (Advanced Video Coding); in other words, it can exercise this control without affecting other codecs.
  • an ImageQualityImprovementInfoBox (that is, discrimination information) may be stored in a user data box (udta box) that stores user data.
  • The image quality improvement technical information (including discrimination information) may be stored using a restriction scheme as shown in the seventh row from the top of the table in FIG. 4 (#1-2-1-2).
  • the file generation device performs the following three processes.
  • The client device can arbitrarily select whether or not to apply the ImageQualityImprovementInfoBox, as in the case of <Discrimination information storage 1>.
  • the file generation device can require the application of image quality improvement technology.
  • Restricted sample entry ('resv') 144 is used in the Video Track ('trak') 142 in the movie box ('moov') 141 of content file 140.
  • a Restricted Scheme Information Box ('rinf') 145 is stored in the Restricted sample entry ('resv') 144 thereof.
  • the scheme type (scheme_type) in the scheme type box ('schm') 146 is defined as 'iqip'.
  • ImageQualityImprovementInfoBox ('iqii') 148 is stored in the scheme information box ('schi') 147 in the Restricted Scheme Information Box ('rinf') 145.
  • The "width" and "height" of the TrackHeaderBox are set to the "width" and "height" displayed after applying the image quality improvement technique.
  • The "width" and "height" of TrackHeaderBox will be set to the "width" and "height" when image quality improvement technology is not applied.
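  • The box nesting used by the restriction scheme can be sketched as follows. The box header layout (32-bit size followed by a four-character type) and the version/flags word of a FullBox are standard ISOBMFF; everything else is an assumption made for illustration: the helper names are hypothetical, the 'iqii' payload is an opaque placeholder because its syntax is not reproduced here, and the OriginalFormatBox ('frma') that a real 'rinf' also carries is omitted.

```python
import struct


def box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Serialize a plain box: 32-bit big-endian size, 4-byte type, payload."""
    if len(box_type) != 4:
        raise ValueError("box type must be four bytes")
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def full_box(box_type: bytes, payload: bytes = b"", version: int = 0, flags: int = 0) -> bytes:
    """Serialize a FullBox: a box whose payload starts with version and flags."""
    return box(box_type, struct.pack(">I", (version << 24) | flags) + payload)


def iter_boxes(data: bytes):
    """Yield (type, payload) for each box found back to back in data."""
    offset = 0
    while offset + 8 <= len(data):
        (size,) = struct.unpack_from(">I", data, offset)
        if size < 8 or offset + size > len(data):
            break  # malformed or unsupported size; stop parsing
        yield data[offset + 4:offset + 8], data[offset + 8:offset + size]
        offset += size


def find_child(data: bytes, box_type: bytes):
    """Return the payload of the first box of the given type, or None."""
    for found_type, payload in iter_boxes(data):
        if found_type == box_type:
            return payload
    return None


def build_rinf(iqii_payload: bytes) -> bytes:
    """Build 'rinf' holding 'schm' (scheme_type 'iqip') and 'schi' > 'iqii'."""
    schm = full_box(b"schm", b"iqip" + struct.pack(">I", 0))  # type + version
    schi = box(b"schi", box(b"iqii", iqii_payload))
    return box(b"rinf", schm + schi)


def read_iqii_from_rinf(rinf_box: bytes):
    """Return the 'iqii' payload if the scheme type is 'iqip', else None."""
    rinf = find_child(rinf_box, b"rinf")
    if rinf is None:
        return None
    schm = find_child(rinf, b"schm")
    schi = find_child(rinf, b"schi")
    if schm is None or schi is None or schm[4:8] != b"iqip":
        return None
    return find_child(schi, b"iqii")


print(read_iqii_from_rinf(build_rinf(b"\x01\x02")))
```

  • A reader that does not recognize the 'iqip' scheme type would treat the 'resv' track as unplayable, which is how this layout lets the file generation device require the technique.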
  • When storing image quality improvement technical information in ISOBMFF, parameter data may be stored in a video track and transmitted, for example, as shown in the eighth row from the top of the table in FIG. 4 (#1-2-1-3).
  • The parameter data may be stored in a box newly defined in the sample entry (SampleEntry) and transmitted (#1-2-1-3-1).
  • For example, in the case of FIG. 10, in the Video Track ('trak') 152 in the movie box ('moov') 151 of content file 150, the Visual sample entry 153 in the sample description box ('stsd') 154 is extended, and ImageQualityImprovementInfoBox ('iqii') 155 and ImageQualityImprovementDataBox ('iqid') 156 are newly defined. Discrimination information is stored in ImageQualityImprovementInfoBox ('iqii') 155, and parameter data is stored in ImageQualityImprovementDataBox ('iqid') 156.
  • an ImageQualityImprovementDataBox ('iqid') is defined, and the ImageQualityImprovementData is stored in that box, as shown in the syntax 161 of FIG.
  • This ImageQualityImprovementData is parameter data and corresponds to the syntax 102 in FIG. That is, various parameters defined in the syntax 102 are stored in the ImageQualityImprovementDataBox.
  • VisualSampleEntry is then extended to contain its ImageQualityImprovementDataBox, as shown in syntax 162 of FIG.
  • the parameter data can be stored in the video track along with the discrimination information. That is, the client device can obtain both discrimination information and parameter data by referring to the visual sample entry.
  • ImageQualityImprovementDataBox (that is, parameter data) may be stored in a configuration box defined for each codec (encoding/decoding method).
  • The file generation device can individually set the parameter data necessary for applying the image quality improvement technique for each codec. For example, by storing ImageQualityImprovementDataBox in AVCConfigurationBox, the file generation device can set parameter data only for AVC (Advanced Video Coding); that is, it can set the parameter data without affecting other codecs.
  • an ImageQualityImprovementDataBox (that is, parameter data) may be stored in a user data box (udta box) that stores user data.
  • ImageQualityImprovementDataBox should be added to the scheme information box ('schi') in addition to the processing described above in <Storage of Discrimination Information 2>.
  • ImageQualityImprovementData, which is parameter data, may be stored in the ImageQualityImprovementInfoBox.
  • ImageQualityImprovementInformation and ImageQualityImprovementData may be stored in ImageQualityImprovementInfoBox('iqii') as shown in syntax 163 of FIG.
  • the client device can acquire discrimination information and parameter data by referring to the ImageQualityImprovementInfoBox.
  • An ImageQualityImprovementInfoBox containing this discrimination information and parameter data may be stored in the sample description box in VisualSampleEntry, as in the case described above in <Storage of identification information 1>.
  • ImageQualityImprovementData (parameter data) may be stored in the ImageQualityImprovementInfoBox 115 in the example of FIG. 6 (FIG. 7).
  • An ImageQualityImprovementInfoBox containing this discrimination information and parameter data may be stored in a configuration box (ConfigurationBox) in the same manner as described above in <Storage of Discrimination Information 1>.
  • The ImageQualityImprovementInfoBox containing the discrimination information and the parameter data may be stored in the user data box (udta box) in the same manner as described above in <Store discrimination information 1>.
  • an ImageQualityImprovementInfoBox containing this discrimination information and parameter data may be stored using a restriction scheme. That is, in the example of FIG. 8, the ImageQualityImprovementInfoBox 138 may store ImageQualityImprovementData (parameter data). In the example of FIG. 9, ImageQualityImprovementInfoBox 148 may store ImageQualityImprovementData (parameter data).
  • <Storing parameter data 2> When the parameter data to be applied changes within the bitstream (within the sequence), that is, when the parameter data is dynamic, a sample group may be extended to store the parameter data, as shown in the tenth row from the top of the table in FIG. (#1-2-1-3-2).
  • VisualSampleGroupEntry may be extended to define ImageQualityImprovementDataEntry, which is a sample group that stores parameter data.
  • ImageQualityImprovementDataEntry is parameter data and corresponds to the syntax 102 in FIG. That is, various parameters defined in the syntax 102 are stored in the ImageQualityImprovementDataEntry.
  • the parameter data applied to each sample in the bitstream can be made variable (dynamic).
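  • How per-sample parameter data could be resolved through a sample group can be sketched as follows. The run-length mapping of (sample_count, group_description_index) pairs, with index 0 meaning "no group" and other indices being 1-based, is the general ISOBMFF sample-to-group mechanism; applying it to ImageQualityImprovementDataEntry in this way, as well as the function name and the byte-string stand-ins for the entries, are assumptions for illustration.

```python
def parameter_data_for_sample(sample_index, sample_to_group, group_entries):
    """Return the group entry that applies to a 0-based sample index.

    sample_to_group: list of (sample_count, group_description_index) runs.
    group_entries:   list of ImageQualityImprovementDataEntry stand-ins;
                     group_description_index is 1-based, 0 means no group.
    """
    remaining = sample_index
    for sample_count, group_description_index in sample_to_group:
        if remaining < sample_count:
            if group_description_index == 0:
                return None  # sample belongs to no group of this type
            return group_entries[group_description_index - 1]
        remaining -= sample_count
    return None  # samples past the last run belong to no group


# Two samples use entry 1, three use entry 2, one has no parameter data.
runs = [(2, 1), (3, 2), (1, 0)]
entries = [b"params-A", b"params-B"]
print([parameter_data_for_sample(i, runs, entries) for i in range(6)])
```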
  • The discrimination information is stored in the ImageQualityImprovementInformationBox as described above in <Store discrimination information 1> and <Store discrimination information 2>.
  • the discrimination information may also be stored using sample groups.
  • VisualSampleGroupEntry may be extended to define an ImageQualityImprovementInfoEntry, in which ImageQualityImprovementInformation and ImageQualityImprovementData are stored, as shown in syntax 172 of FIG.
  • the client device can acquire the discrimination information and the parameter data by referring to the ImageQualityImprovementInfoEntry.
  • ImageQualityImprovementInformation does not change in the bitstream (it is static). That is, each sample group entry contains the same ImageQualityImprovementInformation.
  • the image quality improvement technical information may be stored in the metadata track and transmitted as shown in the 11th row from the top of the table in FIG. 2-2).
  • the file generation unit may store image quality improvement technical information in a metadata track that stores metadata.
  • the acquisition unit may acquire the encoded data of the image to be reproduced based on the image quality improvement technical information stored in the metadata track.
  • a metadata track is a track different from the video track in which the content (encoded data of the image) is stored. That is, image quality improvement technical information (including discrimination information) regarding an image may be stored in a track different from the track on which the encoded data of the image is stored.
  • the metadata track is extended and the image quality improvement technology Meta track is defined.
  • the metafile 180 on the lower side of FIG. 13 is a file that stores metadata tracks (that is, image quality improvement technology Metatracks) corresponding to the video tracks stored in the content file 190 on the upper side of FIG.
  • An ImageQualityImprovementMetadataSampleEntry ('iqim') derived from MetadataSampleEntry is defined as shown in the syntax 201 of FIG. 14 in order to indicate the image quality improvement technology Metatrack. That is, ImageQualityImprovementMetadataSampleEntry ('iqim') 185 is newly defined in the sample description box ('stsd') 184 of the image quality improvement technology Meta Track ('trak') 182 in the movie box ('moov') 181 of the metafile 180 in FIG. 13. This makes it possible to identify this track as the image quality improvement technology Meta track.
  • ImageQualityImprovementMetadataSampleEntry ('iqim') 185 stores ImageQualityImprovementInformationBox ('iqii') 186 (that is, discrimination information).
  • Each sample of the image quality improvement technology Meta track (ImageQualityImprovementMetadataSample 188 of the media data box ('mdat') 187 in FIG. 13) stores parameter data of the image quality improvement technology. That is, as shown in syntax 202 of FIG. 14, ImageQualityImprovementMetadataSample is defined and ImageQualityImprovementData is stored in its box. This ImageQualityImprovementData is parameter data and corresponds to the syntax 102 in FIG. That is, various parameters defined in the syntax 102 are stored in ImageQualityImprovementMetadataSample.
  • The Video Track to which the image quality improvement technology is applied is specified using the Track reference function, which enables reference between tracks.
  • 'cdsc' is specified as the reference_type of the Track reference box ('tref') 183 in the image quality improvement technology Meta track 182.
  • 'cdsc' indicates metadata that gives detailed information about the referenced track.
  • As the reference destination of the Track reference box ('tref') 183, the Video Track ('trak') 192 in the movie box ('moov') 191 of the content file 190, which stores the image corresponding to the image quality improvement technical information, is designated. That is, as indicated by an arrow 193, the image quality improvement technology Meta Track and the Video Track are associated.
  • a unique reference_type ('iqim') may be defined as the reference_type of the Track reference box ('tref') 183. By doing so, it is possible to clearly indicate that the image quality improvement technique is to be applied.
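  • The association between the Meta track and the Video Track can be modeled as in the following sketch. It does not parse a real file; the Track class, its fields, and the function name are hypothetical, and only the sample entry type 'iqim' and the reference types 'cdsc' and 'iqim' are taken from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    track_id: int
    sample_entry_type: str                           # e.g. 'iqim' for the Meta track
    references: dict = field(default_factory=dict)   # reference_type -> [track IDs]


def meta_tracks_for(tracks, video_track_id, reference_types=("cdsc", "iqim")):
    """Return the image quality improvement Meta tracks that reference a video track."""
    result = []
    for track in tracks:
        if track.sample_entry_type != "iqim":
            continue  # not an image quality improvement technology Meta track
        for reference_type in reference_types:
            if video_track_id in track.references.get(reference_type, []):
                result.append(track)
                break
    return result


tracks = [
    Track(1, "avc1"),
    Track(2, "iqim", {"cdsc": [1]}),
    Track(3, "iqim", {"cdsc": [9]}),
]
print([t.track_id for t in meta_tracks_for(tracks, 1)])
```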
  • the image quality improvement technique information may be stored in MPD (Media Presentation Description) and transmitted, for example, as shown in the twelfth row from the top of the table in FIG. 4 (#1-3).
  • MPD is a control file that stores information for controlling distribution of content files.
  • The client device selects a segment file to receive based on the MPD information and requests the server to deliver it.
  • the server delivers the requested segment files. By doing so, for example, adaptive distribution can be performed with respect to the bit rate and the like.
  • Image quality improvement technology information may be stored in such an MPD.
  • the file generation unit may further generate a control file for controlling distribution of the content file, and store the image quality improvement technical information in the control file.
  • The acquisition unit may acquire a content file based on the image quality improvement technical information stored in the control file, and may acquire the encoded data of the image to be reproduced from the acquired content file.
  • this control file is arbitrary.
  • a case of storing in MPD used in MPEG-DASH will be described below.
  • The image quality improvement technical information may be stored in the Representation (#1-3-1). In that case, as shown in the 14th row from the top of the table in FIG. 4, the image quality improvement technical information may be stored in SupplementalProperty (#1-3-1-1).
  • FIG. 15 is a diagram showing a description example of MPD in which image quality improvement technical information is stored.
  • ImageQualityImprovementInformation (discrimination information) is defined using @schemeIdUri of SupplementalProperty of Representation.
  • @iqi:type_uri, @iqi:level, @iqi:display_width, @iqi:display_height, and @iqi:quality are defined as parameters, and respective values are set. These parameters match the fields of the same names in the syntax 101 of FIG. 5 and have equivalent meanings.
  • This Representation is a Representation of a Video track that includes technical information for improving image quality. Therefore, the file (Video.mp4 in the example of FIG. 15) storing the Video track is specified by BaseURL.
  • The image quality improvement technical information is also stored in the content file (ISOBMFF, etc.) in the same manner as described above in <Storing in ISOBMFF 1>.
  • the client device refers to the image quality improvement technical information stored in the MPD and confirms whether it can be handled.
  • the client device compares the values of iqi:display_width, iqi:display_height, and iqi:quality with other Representations to determine the Representation to acquire.
  • the client device requests the server to distribute the segment file corresponding to the selected Representation.
  • the server delivers the requested segment files.
  • the client device obtains the distributed segment file, extracts the bitstream, and decodes it. Then, a display image is generated by appropriately applying an image quality improvement technique to the obtained image, and the generated display image is displayed.
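  • The Representation selection in the steps above can be sketched as follows. The DASH namespace and the file name Video.mp4 are real or taken from the description; the schemeIdUri value, the iqi namespace URI, the other file name, the attribute values, and the selection policy (prefer the largest displayable resolution that fits the available bit rate, then the lowest bit rate) are all assumptions made for illustration, since the actual MPD description is not reproduced here.

```python
import xml.etree.ElementTree as ET

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"
IQI_NS = "urn:example:iqi"            # hypothetical namespace for iqi: attributes
IQI_SCHEME = "urn:example:iqi:2021"   # hypothetical schemeIdUri value

MPD_XML = f"""<MPD xmlns="{DASH_NS}" xmlns:iqi="{IQI_NS}">
  <Period><AdaptationSet>
    <Representation id="plain" bandwidth="8000000" width="3840" height="2160">
      <BaseURL>Video4k.mp4</BaseURL>
    </Representation>
    <Representation id="sr" bandwidth="2000000" width="1920" height="1080">
      <SupplementalProperty schemeIdUri="{IQI_SCHEME}"
          iqi:type_uri="urn:example:sr" iqi:level="1"
          iqi:display_width="3840" iqi:display_height="2160" iqi:quality="80"/>
      <BaseURL>Video.mp4</BaseURL>
    </Representation>
  </AdaptationSet></Period>
</MPD>"""


def choose_representation(mpd_xml, supported_type_uris, available_bps):
    """Return (id, BaseURL) of the chosen Representation, or None."""
    ns = {"d": DASH_NS}
    best = None
    for rep in ET.fromstring(mpd_xml).iterfind(".//d:Representation", ns):
        bandwidth = int(rep.get("bandwidth"))
        if bandwidth > available_bps:
            continue  # does not fit the available transmission band
        width, height = int(rep.get("width")), int(rep.get("height"))
        for prop in rep.iterfind("d:SupplementalProperty", ns):
            usable = (prop.get("schemeIdUri") == IQI_SCHEME
                      and prop.get(f"{{{IQI_NS}}}type_uri") in supported_type_uris)
            if usable:
                # Use the resolution obtained after applying the technique.
                width = int(prop.get(f"{{{IQI_NS}}}display_width"))
                height = int(prop.get(f"{{{IQI_NS}}}display_height"))
        key = (width * height, -bandwidth)
        if best is None or key > best[0]:
            best = (key, rep.get("id"), rep.findtext("d:BaseURL", namespaces=ns))
    return None if best is None else best[1:]


print(choose_representation(MPD_XML, {"urn:example:sr"}, 3_000_000))
```

  • A client that cannot apply the technique simply ignores the SupplementalProperty and compares Representations by their own width and height.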
  • an information processing device for example, a client device
  • A client device that reproduces the content can easily grasp the image quality improvement technique required for reproducing the content without decoding the bitstream (encoded data of the image). Therefore, the client device can select (request from the server) the content file to be distributed adaptively with respect to the bit rate and the like, taking the application of the image quality improvement technology into account.
  • the server can provide a high-quality displayable segment file at a low bit rate, for example, if the client device can apply image quality improvement technology. That is, it is possible to suppress an increase in the amount of data transmission.
  • CDN Contents Delivery Network
  • the server can provide delivery services regardless of whether the client device is capable of applying image quality enhancement techniques. In other words, the server can distribute content adaptively considering the application of image quality improvement techniques.
  • the client device can ensure sufficient quality with low bit rate segment files by applying image quality improvement technology. Therefore, high-quality display is possible even when the transmission band is narrow. Also, if the image quality improvement technique of the super-resolution technique can be applied, even if a segment file with a resolution lower than the resolution of the client's display is reproduced, high-quality display is possible.
  • the increase in the amount of data transmission can be suppressed by applying the present technology as described above, thereby suppressing the increase in the cost.
  • the value of the parameter "iqi:quality” may be indicated not by a numerical value, but by the expected bit rate when encoding to the same resolution and image quality without using image quality improvement technology.
  • config_data included in ImageQualityImprovementInformation may also be included.
  • image quality improvement technology information may be stored in a metadata track (image quality improvement technology Meta track) different from the video track in which the content (encoded image data) is stored.
  • BaseURL designates a file (for example, SuperResolutionMetadata.mp4 in FIG. 16) that stores the image quality improvement technology Metatrack.
  • The image quality improvement technology Meta track file and the Video Representation to which it is applied are stored in different AdaptationSets. Therefore, for example, as shown in the 15th row from the top of the table in FIG., the Representation of the image quality improvement technology Meta track may be linked to the Representation of the Video bitstream file to which it is applied (#1-3-1-2). For example, this association may be made using Representation@associationId.
  • FIG. 16 is a diagram showing a description example of MPD in which image quality improvement technical information is stored.
  • the image quality improvement technology Meta track and Video track can be stored in different files. Therefore, the client device to which the image quality improvement technology is not applied does not need to acquire the image quality improvement technology Metatrack, so it is possible to reduce the amount of transmission.
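  • Resolving the link made with Representation@associationId can be sketched as follows. The file name SuperResolutionMetadata.mp4 and the use of associationId come from the description; the Representation ids, bandwidth values, and the associationType value are assumptions, since the actual MPD description is not reproduced here.

```python
import xml.etree.ElementTree as ET

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"

MPD_XML = f"""<MPD xmlns="{DASH_NS}"><Period>
  <AdaptationSet>
    <Representation id="video1" bandwidth="2000000">
      <BaseURL>Video.mp4</BaseURL>
    </Representation>
  </AdaptationSet>
  <AdaptationSet>
    <Representation id="meta1" bandwidth="50000"
        associationId="video1" associationType="cdsc">
      <BaseURL>SuperResolutionMetadata.mp4</BaseURL>
    </Representation>
  </AdaptationSet>
</Period></MPD>"""


def associated_meta_urls(mpd_xml, video_representation_id):
    """Return BaseURLs of Representations associated with the given video Representation."""
    ns = {"d": DASH_NS}
    urls = []
    for rep in ET.fromstring(mpd_xml).iterfind(".//d:Representation", ns):
        # associationId is a whitespace-separated list of Representation ids.
        if video_representation_id in rep.get("associationId", "").split():
            urls.append(rep.findtext("d:BaseURL", namespaces=ns))
    return urls


print(associated_meta_urls(MPD_XML, "video1"))
```

  • A client that does not apply the technique never calls such a lookup and therefore never requests the Meta track file.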
  • @width may be set to the same value as iqi:display_width in the Representation of the image quality improvement technology Metatrack.
  • @height may be set to the same value as iqi:display_height.
  • the BaseURL of Representation may be indicated in a file of another format of metadata for performing image quality improvement technology in the same manner as the image quality improvement technology Metatrack.
  • the image quality improvement technical information for each region may be transmitted (#1-4).
  • the image quality improvement technical information stored in the content file (and MPD) may include information for each region (information for each partial region of the image).
  • FIG. 17 is a diagram showing an example of ImageQualityImprovementInformation syntax in that case.
  • parameters shown in bold are information (parameters) for each region.
  • target_region_type is a parameter that specifies the region to be processed. For example, a value of '1' for this parameter may indicate a 'Tile region group entry'. Also, if the value of this parameter is "2", it may indicate "region_wise_packing".
  • target_region_num is a parameter that indicates the number of regions to which image quality improvement technology is applied.
  • ISOBMFF allows still image data to be stored in MetaBox.
  • the image quality improvement technical information may be applied to the still image data. For example, as shown at the bottom of the table in FIG. 4, still image data image quality improvement technical information may be transmitted (#1-5).
  • the file generation unit may associate the image quality improvement technical information with the still image data stored in the content file. Further, in the information processing device (for example, the client device), the acquisition unit may acquire still image data associated with the image quality improvement technical information.
  • ItemFullProperty may be extended to define ImageQualityImprovementProperty as in the syntax shown in FIG.
  • ImageQualityImprovementInformation and ImageQualityImprovementData may be stored in this Property.
  • This ImageQualityImprovementInformation is discrimination information and corresponds to syntax 101 in FIG.
  • this ImageQualityImprovementData is parameter data and corresponds to the syntax 102 in FIG. That is, various parameters defined in syntax 101 and syntax 102 are stored in ImageQualityImprovementProperty.
  • This data can be linked as a Property of the still image data.
  • set the essential field of the ItemPropertyAssociationBox that associates ItemProperty to "1".
  • FIG. 19 is a block diagram showing an example of a configuration of a file generation device, which is one aspect of an information processing device to which the present technology is applied.
  • a file generation device 300 shown in FIG. 19 is a device that encodes video content with VVC and stores it in ISOBMFF.
  • FIG. 19 shows main elements such as the processing unit and data flow, and the elements shown in FIG. 19 are not necessarily all. That is, in the file generation device 300, there may be processing units not shown as blocks in FIG. 19, or there may be processes or data flows not shown as arrows or the like in FIG.
  • file generation device 300 has control unit 301 and file generation processing unit 302 .
  • a control unit 301 controls a file generation processing unit 302 .
  • the file generation processing unit 302 is controlled by the control unit 301 to perform processing related to file generation. For example, the file generation processing unit 302 acquires content data having sub-pictures in pictures, encodes the data, and generates a VVC bitstream.
  • the file generation processing unit 302 further stores the generated VVC bitstream in an ISOBMFF file and outputs the file to the outside of the file generation device 300 .
  • the file generation processing unit 302 has an input unit 311, a preprocessing unit 312, an encoding unit 313, a file generation unit 314, a recording unit 315, and an output unit 316.
  • the input unit 311 acquires content data including images and supplies it to the preprocessing unit 312 .
  • the preprocessing unit 312 extracts information necessary for file generation from the content data.
  • the preprocessing unit 312 supplies the extracted information to the file generation unit 314 .
  • the preprocessing unit 312 supplies content data to the encoding unit 313 .
  • the encoding unit 313 encodes the content data supplied from the preprocessing unit 312 using the VVC method to generate a VVC bitstream.
  • the encoding unit 313 supplies the generated VVC bitstream to the file generation unit 314 .
  • the file generation unit 314 generates an ISOBMFF content file and stores the VVC bitstream supplied from the encoding unit 313 in the content file. At that time, the file generation unit 314 may store the information supplied from the preprocessing unit 312 in the content file as appropriate. Also, the file generation unit 314 may generate an MPD corresponding to the content file.
  • the file generation unit 314 supplies the generated content file and MPD to the recording unit 315.
  • the recording unit 315 has an arbitrary recording medium such as a hard disk or a semiconductor memory, and records the content file and MPD supplied from the file generating unit 314 on the recording medium. Also, the recording unit 315 reads the content file or MPD recorded on the recording medium according to a request from the control unit 301 or the output unit 316 or at a predetermined timing, and supplies it to the output unit 316 .
  • the output unit 316 acquires the content files and MPD supplied from the recording unit 315, and outputs them to the outside of the file generation device 300 (for example, distribution server, playback device, etc.).
  • the preprocessing unit 312 may generate image quality improvement technique information related to an image quality improvement technique for improving the image quality of an image to be encoded.
  • the preprocessing unit 312 can also be said to be an image quality improvement technical information generation unit.
  • the file generation unit 314 may generate a content file that stores the encoded data of the image, and store the image quality improvement technical information in the content file.
  • the image quality improvement technology information may include determination information for determining whether the image quality improvement technology is applicable.
  • the image quality improvement technique information may include parameter data applied in the process to which the image quality improvement technique is applied.
  • the image quality improvement technical information may include information for each partial region of the image.
  • the file generation unit 314 may store the image quality improvement technical information in the video track that stores the image.
  • the file generator may store the image quality improvement technical information in a box within the sample entry.
  • the file generation unit 314 may store the image quality improvement technical information in the metadata track that stores the metadata.
  • the file generation unit 314 may further generate a control file (MPD) for controlling distribution of the content file, and store the image quality improvement technical information in the control file.
  • MPD control file
  • the file generation unit may associate the image quality improvement technical information with the still image data stored in the content file.
  • the client device that reproduces the content can easily grasp the image quality improvement technology required for the reproduction of the content.
  • In step S301, the preprocessing unit 312 of the file generation device 300 acquires image data via the input unit 311 and, based on the image data, sets the image quality improvement technique to be applied to the content.
  • In step S302, the preprocessing unit 312 generates image quality improvement technique information related to an image quality improvement technique for improving the image quality of the image to be encoded.
  • The encoding unit 313 encodes the image data to generate encoded data.
  • In step S303, the file generation unit 314 generates a content file that stores the encoded data of the image generated in step S302. Then, the file generation unit 314 stores the image quality improvement technical information in the content file.
  • In step S304, the file generation unit 314 generates an MPD. Then, the file generation unit 314 stores the image quality improvement technical information in the MPD.
  • The recording unit 315 stores the content file and MPD generated as described above.
  • The output unit 316 reads the content file and MPD at a predetermined timing and outputs them to the outside of the file generation device 300.
  • When the process of step S304 ends, the file generation process ends.
  • the image quality improvement technology information may include determination information for determining whether the image quality improvement technology is applicable.
  • the image quality improvement technique information may include parameter data applied in the process to which the image quality improvement technique is applied.
  • the image quality improvement technical information may include information for each partial region of the image.
  • the file generation unit 314 may store the image quality improvement technical information in the video track storing the image.
  • the file generator may store the image quality improvement technique information in a box within the sample entry.
  • the file generation unit 314 may store the image quality improvement technical information in the metadata track that stores the metadata.
  • the file generation unit 314 may generate an MPD (control file for controlling distribution of the content file) as described above, and store the image quality improvement technical information in the MPD.
  • the file generation unit may associate the image quality improvement technical information with the still image data stored in the content file.
  • the client device that reproduces the content can easily grasp the image quality improvement technique required for reproducing the content.
  • When no MPD is generated, the process of step S304 is omitted.
  • In that case, after step S303, the recording unit 315 stores the content file, and the output unit 316 reads the content file at a predetermined timing and outputs it to the outside of the file generation device 300.
  • FIG. 21 is a block diagram showing an example of a configuration of a client device, which is one aspect of an information processing device to which the present technology is applied.
  • the client device 400 shown in FIG. 21 is a playback device that decodes a VVC bitstream in a VVC file format stored in an ISOBMFF content file, and generates and displays a display image of the generated moving image content.
  • FIG. 21 shows the main components such as the processing units and data flow, and what is shown in FIG. 21 is not necessarily all. That is, in the client device 400, there may be processing units not shown as blocks in FIG. 21, or there may be processes or data flows not shown as arrows or the like in FIG.
  • the client device 400 has a control unit 401 and a reproduction processing unit 402 as shown in FIG.
  • the control unit 401 performs processing related to control of the reproduction processing unit 402 .
  • the reproduction processing unit 402 performs processing related to reproduction of moving image content stored in the content file.
  • the reproduction processing unit 402 is controlled by the control unit 401 to acquire a content file from another device (for example, the file generation device 300, a server, etc.).
  • This content file is an ISOBMFF file container, and stores content (encoded image data).
  • the reproduction processing unit 402 executes reproduction processing on the acquired content file, decodes the bitstream of the moving image content stored in the content file, and generates and displays the display image of the moving image content.
  • the reproduction processing unit 402 has a file acquisition unit 411 , a file processing unit 412 , a decoding unit 413 , a display information generation unit 414 , a display unit 415 , a measurement unit 416 and a display control unit 417 .
  • the file acquisition unit 411 acquires content files supplied from the outside of the client device 400 (for example, the server, the file generation device 300, etc.).
  • the file acquisition unit 411 supplies the acquired content file to the file processing unit 412 .
  • the file processing unit 412 acquires the content file supplied from the file acquisition unit 411.
  • the file processing unit 412 acquires measurement results supplied from the measurement unit 416 .
  • the file processing unit 412 acquires control information supplied from the display control unit 417 .
  • the file processing unit 412 uses the information to extract the encoded data of the image from the content file.
  • the file processing unit 412 supplies the extracted encoded data (bitstream) to the decoding unit 413 .
  • the decoding unit 413 decodes the encoded data (bitstream) and generates (restores) image data.
  • the decoding unit 413 supplies the generated image data (video content data) to the display information generating unit 414 .
  • the display information generation unit 414 acquires data of video content supplied from the decoding unit 413 . Also, the display information generation unit 414 acquires control information supplied from the display control unit 417 . Then, the display information generation unit 414 generates the display image and the like from the acquired moving image content data according to the control information. The display information generation unit 414 supplies the generated display image and the like to the display unit 415 .
  • the display unit 415 has a display device and displays the supplied display image using the display device.
  • the measurement unit 416 measures arbitrary information and supplies the measurement result to the file processing unit 412 .
  • the display control unit 417 controls display by supplying control information to the file processing unit 412 and the display information generation unit 414 .
  • the file processing unit 412 may acquire the encoded data of the image to be reproduced from the content file based on the image quality improvement technology information regarding the image quality improvement technology for improving the image quality of the image.
  • the file processing unit 412 can also be said to be an acquisition unit.
  • the decoding unit 413 may decode the encoded data.
  • the image quality improvement technology information may include determination information for determining whether the image quality improvement technology is applicable.
  • the image quality improvement technique information may include parameter data applied in the process to which the image quality improvement technique is applied.
  • the image quality improvement technical information may include information for each partial region of the image.
  • the file processing unit 412 may acquire the encoded data of the image to be reproduced based on the image quality improvement technical information stored in the video track in which the image is stored. In that case, the file processing unit 412 may acquire the encoded data of the image to be reproduced based on the image quality improvement technical information stored in the box in the sample entry.
  • the file processing unit 412 may acquire the encoded data of the image to be reproduced based on the image quality improvement technical information stored in a metadata track that stores metadata.
  • the file processing unit 412 may acquire still image data associated with the image quality improvement technical information.
  • the client device 400 can easily grasp the image quality improvement technique required for reproducing the content.
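  • As an illustration only, the three kinds of content listed above (determination information, parameter data, and per-region information) could be modelled on the client side as follows. This is a minimal sketch; the class names, field names, and technique strings are hypothetical and are not taken from the specification or from any file format standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class RegionInfo:
    """Hypothetical per-region entry: a rectangle and the technique used there."""
    rect: Tuple[int, int, int, int]  # (x, y, width, height) in pixels
    technique: str


@dataclass
class QualityImprovementInfo:
    """Hypothetical container for image quality improvement technology information."""
    technique: str  # determination information: which technique is needed
    parameters: Dict[str, float] = field(default_factory=dict)  # parameter data
    regions: List[RegionInfo] = field(default_factory=list)     # per-region information

    def is_applicable(self, supported: Set[str]) -> bool:
        """Return True if the client supports every technique referred to."""
        needed = {self.technique} | {r.technique for r in self.regions}
        return needed <= supported


info = QualityImprovementInfo(
    technique="super_resolution",
    parameters={"scale": 2.0},
    regions=[RegionInfo(rect=(0, 0, 960, 540), technique="denoise")],
)
print(info.is_applicable({"super_resolution", "denoise"}))  # True
print(info.is_applicable({"super_resolution"}))             # False
```

Keeping the applicability check next to the data means a client can decide whether content is reproducible before it touches the encoded data.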
  • the file processing unit 412 of the client device 400 acquires the content file via the file acquisition unit 411 in step S401.
  • In step S402, the file processing unit 412 selects reproducible tracks from the content file acquired in step S401 based on the image quality improvement technical information.
  • In step S403, the file processing unit 412 selects a track to be played from among the reproducible tracks selected in step S402 based on other information.
  • In step S404, the file processing unit 412 acquires the bitstream of the track selected in step S403 from the content file. That is, the file processing unit 412 acquires the encoded data of the image to be reproduced from the content file based on the image quality improvement technique information regarding the image quality improvement technique for improving the image quality of the image.
  • In step S405, the decoding unit 413 decodes the bitstream acquired in step S404 to generate (restore) image data.
  • In step S406, the display information generation unit 414 executes processing for applying the image quality improvement technology (also referred to as image quality improvement technology processing) to the image data restored in step S405 based on the image quality improvement technology.
  • In step S407, the display information generation unit 414 generates a display image. Then, the display information generation unit 414 supplies the display image to the display unit 415 to display it. When the process of step S407 ends, the reproduction process ends.
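  • The flow of steps S402 to S407 can be sketched roughly as below. This is not the implementation described in the specification: the `Track` type, the bit-rate budget used as the "other information" in step S403, and the string-based `decode` and `enhance` stand-ins are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Track:
    """Hypothetical stand-in for a video track in a content file."""
    track_id: int
    bitrate_kbps: int
    required_technique: Optional[str]  # None: no technique is required
    bitstream: bytes


def select_playable(tracks: List[Track], supported: Set[str]) -> List[Track]:
    """S402: keep tracks whose required technique the client can apply."""
    return [t for t in tracks
            if t.required_technique is None or t.required_technique in supported]


def select_track(playable: List[Track], max_kbps: int) -> Track:
    """S403: choose by other information (assumed here: a bit-rate budget)."""
    within = [t for t in playable if t.bitrate_kbps <= max_kbps]
    if within:
        return max(within, key=lambda t: t.bitrate_kbps)
    return min(playable, key=lambda t: t.bitrate_kbps)


def decode(bitstream: bytes) -> str:
    """S405 stand-in: a real client would run a video decoder here."""
    return bitstream.decode("ascii")


def enhance(image: str, technique: str) -> str:
    """S406 stand-in: records which technique was applied to the image."""
    return f"{technique}({image})"


def play(tracks: List[Track], supported: Set[str], max_kbps: int) -> str:
    playable = select_playable(tracks, supported)         # S402
    if not playable:
        raise ValueError("no reproducible track")
    track = select_track(playable, max_kbps)              # S403
    image = decode(track.bitstream)                       # S404, S405
    if track.required_technique is not None:
        image = enhance(image, track.required_technique)  # S406
    return f"display({image})"                            # S407


tracks = [
    Track(1, 8000, None, b"hd"),
    Track(2, 2000, "super_resolution", b"sd"),
    Track(3, 1500, "frame_interpolation", b"lo"),
]
print(play(tracks, {"super_resolution"}, 3000))  # display(super_resolution(sd))
print(play(tracks, set(), 3000))                 # display(hd)
```

A client without any image quality improvement capability still finds a reproducible track in this sketch, because tracks that require no technique pass the filter in step S402.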
  • the image quality improvement technology information may include determination information for determining whether the image quality improvement technology is applicable.
  • the image quality improvement technique information may include parameter data applied in the process to which the image quality improvement technique is applied.
  • the image quality improvement technical information may include information for each partial region of the image.
  • the file processing unit 412 may acquire the encoded data of the image to be reproduced based on the image quality improvement technical information stored in the video track in which the image is stored. In that case, the file processing unit 412 may acquire the encoded data of the image to be reproduced based on the image quality improvement technical information stored in the box in the sample entry.
  • the file processing unit 412 may acquire the encoded data of the image to be reproduced based on the image quality improvement technical information stored in the metadata track that stores the metadata.
  • the file processing unit 412 may acquire still image data associated with the image quality improvement technical information.
  • the client device 400 can easily grasp the image quality improvement technology required to reproduce the content.
  • <Application to distribution system> With reference to FIGS. 21 and 22, an example in which the present technology is applied when the client device 400 selects and acquires content (encoded data of an image) from a received content file has been described. That is, it has been described that the client device 400 selects content (encoded data of images) based on the image quality improvement technology information included in the content file, taking into account the application of the image quality improvement technology.
  • the image quality improvement technical information may be stored in the control file (MPD). That is, the present technology can also be applied to a system that adaptively distributes content files using a control file (MPD).
  • FIG. 23 is a block diagram showing a main configuration example of the distribution system.
  • distribution system 500 has file generation device 511 , distribution server 512 , and client device 513 that are communicably connected to each other via network 510 .
  • This distribution system 500 is a system in which a distribution server 512 distributes a content file generated by a file generation device 511 to a client device 513 using MPEG-DASH.
  • the distribution server 512 and the client device 513 use the MPD to achieve adaptive content distribution with respect to, for example, the bit rate.
  • the file generation device 511 generates a plurality of segment files and MPDs with different bit rates etc. as a content file of one content.
  • File generation device 511 uploads them to distribution server 512 .
  • the distribution server 512 uses those files to adaptively distribute the content with respect to the bit rate and the like.
  • the client device 513 first acquires the MPD. Then, the client device 513 refers to the information described in the MPD and selects the segment file with the optimum bit rate etc. from among the plurality of segment files. The client device 513 then requests the distribution server 512 to distribute the selected segment file. The distribution server 512 distributes the requested segment file to the client device 513 . The client device 513 receives the segment file, extracts and decodes the bitstream, and plays the resulting content.
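  • The bit-rate-based selection described above can be sketched as follows, assuming a small hand-written MPD. The sample document, the representation ids, and the selection rule (highest bandwidth that fits the measured throughput, otherwise the lowest one) are illustrative assumptions; only the MPD namespace and the `Representation` `id` and `bandwidth` attributes come from MPEG-DASH itself.

```python
import xml.etree.ElementTree as ET

MPD_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical MPD with three representations of the same content.
SAMPLE_MPD = """<?xml version="1.0"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="low" bandwidth="1000000" width="960" height="540"/>
      <Representation id="mid" bandwidth="3000000" width="1280" height="720"/>
      <Representation id="high" bandwidth="8000000" width="1920" height="1080"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def bandwidth(rep: ET.Element) -> int:
    """Read the declared bandwidth (bits per second) of a Representation."""
    return int(rep.get("bandwidth"))


def pick_representation(mpd_text: str, throughput_bps: int) -> str:
    """Return the id of the representation best suited to the throughput."""
    root = ET.fromstring(mpd_text)
    reps = root.findall(".//mpd:Representation", MPD_NS)
    fitting = [r for r in reps if bandwidth(r) <= throughput_bps]
    chosen = max(fitting, key=bandwidth) if fitting else min(reps, key=bandwidth)
    return chosen.get("id")


print(pick_representation(SAMPLE_MPD, 4_000_000))  # mid
print(pick_representation(SAMPLE_MPD, 500_000))    # low
```

After this selection, the client would request the segment file of the chosen representation from the distribution server.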
  • the network 510 is a communication network that serves as a communication medium between devices.
  • Network 510 may be a wired communication network, a wireless communication network, or both.
  • it may be a wired LAN (Local Area Network), a wireless LAN, a public telephone network, a wide area communication network for wireless mobiles such as the so-called 4G line or 5G line, or the Internet, etc., or a combination thereof.
  • the network 510 may be a single communication network or a plurality of communication networks.
  • the network 510 may be partly or wholly configured by a communication cable of a predetermined standard, such as a USB (Universal Serial Bus) (registered trademark) cable or an HDMI (High-Definition Multimedia Interface) (registered trademark) cable.
  • In FIG. 23, one file generation device 511, one distribution server 512, and one client device 513 are shown, but the number of these devices is arbitrary.
  • the present technology may be applied to such a distribution system 500. That is, the above-described file generation device 300 (FIG. 19) may be applied as the file generation device 511 . Also, as the client device 513, the above-described client device 400 (FIG. 21) may be applied.
  • the file generation unit 314 may further generate a control file (MPD) for controlling distribution of the content file that stores the encoded data of the image, and may store the image quality improvement technical information in that control file.
  • the file processing unit 412 may acquire the content file based on the image quality improvement technical information stored in the control file (MPD) that controls the distribution of the content file, and may acquire the encoded data of the image to be reproduced from the acquired content file.
  • the client device 513 can easily grasp the image quality improvement technique required for content reproduction before acquiring the content file. Therefore, the client device 513 can select a segment file in consideration of application of image quality improvement technology.
  • the distribution server 512 can provide high-quality displayable segment files at a low bit rate. That is, it is possible to suppress an increase in the amount of data transmission.
  • CDN costs are charged based on the amount of data transmitted from the CDN. Therefore, by applying the present technology as described above to suppress an increase in the amount of data transmission, an increase in cost can be suppressed. Additionally, distribution server 512 can provide distribution services regardless of whether client device 513 is capable of applying image quality enhancement techniques.
  • the client device 513 can ensure sufficient quality with low-bit-rate segment files by applying image quality improvement technology. Therefore, high-quality display is possible even when the transmission band is narrow. Also, if the image quality improvement technique of the super-resolution technique can be applied, even if a segment file with a resolution lower than the resolution of the display of the client device 513 is reproduced, high-quality display is possible.
  • by applying the present technology as described above to suppress an increase in the amount of data transmission, an increase in cost can be suppressed.
  • the file processing unit 412 of the client device 400 acquires the MPD via the file acquisition unit 411 in step S501.
  • In step S502, the file processing unit 412 selects reproducible representations based on the image quality improvement technical information stored in the MPD acquired in step S501.
  • In step S503, the file processing unit 412 selects a representation to be played from among the reproducible representations selected in step S502 based on other information.
  • In step S504, the file processing unit 412 acquires the content file corresponding to the selected representation via the file acquisition unit 411.
  • the file processing unit 412 then acquires the bitstream included in the content file acquired in step S504. That is, the file processing unit 412 acquires the content file based on the image quality improvement technical information stored in the control file (MPD) that controls the distribution of the content file, and acquires the encoded data of the image to be reproduced from the acquired content file.
  • In step S505, the decoding unit 413 decodes the bitstream and generates (restores) image data.
  • In step S506, the display information generation unit 414 executes image quality improvement technology processing on the image data restored in step S505 based on the image quality improvement technology.
  • In step S507, the display information generation unit 414 generates a display image. Then, the display information generation unit 414 supplies the display image to the display unit 415 to display it. When the process of step S507 ends, the reproduction process ends.
  • the client device 400 can easily grasp the image quality improvement technology required to reproduce the content.
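  • A rough sketch of the selection in steps S502 and S503 is given below, under the assumption that the image quality improvement technical information in the MPD names a required technique and an upscaling factor for each representation. The `Representation` fields, the way the information is carried, and the selection rule are hypothetical; the specification leaves the "other information" used in step S503 open.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Representation:
    """Hypothetical view of one representation and its quality information."""
    rep_id: str
    bandwidth_bps: int
    height: int
    technique: Optional[str] = None  # technique required for reproduction
    scale: float = 1.0               # upscaling factor the technique provides


def select_reproducible(reps: List[Representation],
                        supported: Set[str]) -> List[Representation]:
    """S502: drop representations needing a technique the client lacks."""
    return [r for r in reps if r.technique is None or r.technique in supported]


def select_representation(reps: List[Representation], display_height: int,
                          throughput_bps: int) -> Representation:
    """S503: cheapest affordable representation that fills the display."""
    affordable = [r for r in reps if r.bandwidth_bps <= throughput_bps]
    if not affordable:
        raise ValueError("no representation fits the throughput")
    sufficient = [r for r in affordable if r.height * r.scale >= display_height]
    if sufficient:
        return min(sufficient, key=lambda r: r.bandwidth_bps)
    return max(affordable, key=lambda r: r.height * r.scale)


reps = [
    Representation("hd", 8_000_000, 1080),
    Representation("sd_sr", 2_000_000, 540, "super_resolution", 2.0),
    Representation("sd", 2_500_000, 540),
]

with_sr = select_reproducible(reps, {"super_resolution"})
print(select_representation(with_sr, 1080, 10_000_000).rep_id)  # sd_sr

without_sr = select_reproducible(reps, set())
print(select_representation(without_sr, 1080, 3_000_000).rep_id)  # sd
```

In this sketch a client that supports super-resolution picks the 2 Mbps representation even when 8 Mbps is affordable, which is the data-transmission saving the text describes.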
  • <Computer> The series of processes described above can be executed by hardware or by software.
  • a program that constitutes the software is installed in the computer.
  • the computer includes, for example, a computer built into dedicated hardware and a general-purpose personal computer capable of executing various functions by installing various programs.
  • FIG. 25 is a block diagram showing an example of the hardware configuration of a computer that executes the series of processes described above by means of a program.
  • a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, and a RAM (Random Access Memory) 903 are connected to one another via a bus 904.
  • An input/output interface 910 is also connected to the bus 904 .
  • An input unit 911 , an output unit 912 , a storage unit 913 , a communication unit 914 and a drive 915 are connected to the input/output interface 910 .
  • the input unit 911 consists of, for example, a keyboard, mouse, microphone, touch panel, input terminal, and the like.
  • the output unit 912 includes, for example, a display, a speaker, an output terminal, and the like.
  • the storage unit 913 is composed of, for example, a hard disk, a RAM disk, a nonvolatile memory, or the like.
  • the communication unit 914 is composed of, for example, a network interface.
  • Drive 915 drives removable media 921 such as a magnetic disk, optical disk, magneto-optical disk, or semiconductor memory.
  • the CPU 901 loads, for example, a program stored in the storage unit 913 into the RAM 903 via the input/output interface 910 and the bus 904 and executes it, whereby the above-described series of processes is performed.
  • the RAM 903 also appropriately stores data necessary for the CPU 901 to execute various processes.
  • a program executed by a computer can be provided by being recorded on removable media 921 such as package media, for example.
  • the program can be installed in the storage unit 913 via the input/output interface 910 by loading the removable medium 921 into the drive 915 .
  • This program can also be provided via wired or wireless transmission media such as local area networks, the Internet, and digital satellite broadcasting.
  • the program can be received by the communication unit 914 and installed in the storage unit 913 .
  • this program can be installed in the ROM 902 or the storage unit 913 in advance.
  • This technology can be applied to any image encoding/decoding method.
  • this technology can be applied to any configuration.
  • the present technology can be applied to various electronic devices.
  • the present technology can be implemented as a part of the configuration of a device, such as a processor (e.g., video processor) as a system LSI (Large Scale Integration) or the like, a module (e.g., video module) using a plurality of processors or the like, a unit (e.g., video unit) using a plurality of modules or the like, or a set (e.g., video set) in which other functions are added to the unit.
  • the present technology can also be applied to a network system configured by a plurality of devices.
  • the present technology may be implemented as cloud computing in which a plurality of devices share and jointly process via a network.
  • this technology may be implemented in cloud services that provide image (moving image) services to arbitrary terminals such as computers, AV (Audio Visual) equipment, portable information processing terminals, and IoT (Internet of Things) devices.
  • a system means a set of multiple components (devices, modules (parts), etc.), and it does not matter whether all the components are in the same housing. Therefore, a plurality of devices housed in separate housings and connected via a network, and a single device housing a plurality of modules in one housing, are both systems.
  • Systems, devices, processing units, etc. to which this technology is applied can be used in any field, such as transportation, medical care, crime prevention, agriculture, livestock industry, mining, beauty, factories, home appliances, weather, and nature monitoring. Moreover, its use is arbitrary.
  • this technology can be applied to systems and devices used to provide viewing content. Further, for example, the present technology can also be applied to systems and devices used for traffic, such as traffic condition supervision and automatic driving control. Further, for example, the technology can be applied to systems and devices that serve security purposes. Also, for example, the present technology can be applied to systems and devices used for automatic control of machines and the like. Furthermore, for example, the technology can be applied to systems and devices used in agriculture and animal husbandry. The present technology can also be applied to systems and devices that monitor natural conditions such as volcanoes, forests, oceans, and wildlife. Further, for example, the technology can be applied to systems and devices used for sports.
  • "flag" is information for identifying a plurality of states, and includes not only information used for identifying two states of true (1) or false (0) but also information that can identify three or more states. Therefore, the value that this "flag" can take may be, for example, two values of 1/0, or three or more values. That is, the number of bits constituting this "flag" is arbitrary, and may be 1 bit or multiple bits.
  • the identification information (including the flag) is assumed not only to include the identification information in the bitstream, but also to include the difference information of the identification information with respect to certain reference information in the bitstream.
  • the "flag" and "identification information" include not only that information but also difference information with respect to reference information.
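  • The two points above (a flag with more than two states, and identification information carried as a difference from reference information) can be illustrated with a small sketch. The flag names, the bit-width helper, and the numeric values are hypothetical and only illustrate the idea.

```python
from enum import IntEnum


class ApplyFlag(IntEnum):
    """Hypothetical flag with three states instead of true/false."""
    NOT_APPLICABLE = 0
    OPTIONAL = 1
    REQUIRED = 2


def bits_needed(num_states: int) -> int:
    """Smallest number of bits that can distinguish num_states values."""
    return max(1, (num_states - 1).bit_length())


def encode_diff(value: int, reference: int) -> int:
    """Store identification information as a difference from a reference."""
    return value - reference


def decode_diff(diff: int, reference: int) -> int:
    """Recover the identification information from the stored difference."""
    return reference + diff


print(bits_needed(2))               # 1
print(bits_needed(len(ApplyFlag)))  # 2
diff = encode_diff(103, reference=100)
print(diff, decode_diff(diff, reference=100))  # 3 103
```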
  • various types of information (metadata, etc.) related to the encoded data may be transmitted or recorded in any form as long as they are associated with the encoded data.
  • the term "associating" means, for example, making it possible to use (link) data of one side while processing the other data. That is, the data associated with each other may be collected as one piece of data, or may be individual pieces of data.
  • information associated with coded data (image) may be transmitted on a transmission path different from that of the coded data (image).
  • the information associated with the encoded data (image) may be recorded on a recording medium different from that of the encoded data (image) (or in another recording area of the same recording medium).
  • this "association" may be a part of the data instead of the entire data. For example, an image and information corresponding to the image may be associated with each other in arbitrary units such as multiple frames, one frame, or a portion within a frame.
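  • As one way to picture such an association, metadata can be kept apart from the encoded data and linked to it by frame range, so that either side can be used while the other is processed. The sketch below assumes frame-indexed entries; the `MetadataEntry` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MetadataEntry:
    """Hypothetical metadata covering frames first_frame..last_frame inclusive."""
    first_frame: int
    last_frame: int
    info: Dict[str, str]


def lookup(entries: List[MetadataEntry], frame: int) -> Optional[Dict[str, str]]:
    """Return the metadata associated with a frame, or None if there is none."""
    for entry in entries:
        if entry.first_frame <= frame <= entry.last_frame:
            return entry.info
    return None


# One entry covers multiple frames, the other covers a single frame.
entries = [
    MetadataEntry(0, 29, {"technique": "super_resolution"}),
    MetadataEntry(30, 30, {"technique": "denoise"}),
]
print(lookup(entries, 12))  # {'technique': 'super_resolution'}
print(lookup(entries, 30))  # {'technique': 'denoise'}
print(lookup(entries, 31))  # None
```

Because the entries only refer to frame numbers, they could travel on a different transmission path or sit on a different recording medium than the encoded data.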
  • a configuration described as one device may be divided and configured as a plurality of devices (or processing units).
  • the configuration described above as a plurality of devices (or processing units) may be collectively configured as one device (or processing unit).
  • part of the configuration of one device (or processing unit) may be included in the configuration of another device (or other processing unit) as long as the configuration and operation of the system as a whole are substantially the same.
  • the above-described program may be executed on any device.
  • the device should have the necessary functions (functional blocks, etc.) and be able to obtain the necessary information.
  • each step of one flowchart may be executed by one device, or may be executed by a plurality of devices.
  • the plurality of processes may be executed by one device, or may be shared by a plurality of devices.
  • a plurality of processes included in one step can also be executed as processes of a plurality of steps.
  • the processing described as multiple steps can also be collectively executed as one step.
  • a computer-executed program may be configured such that the processing of the steps described in the program is executed in chronological order according to the order described in this specification, in parallel, or individually at necessary timings such as when a call is made. That is, as long as there is no contradiction, the processing of each step may be executed in an order different from the order described above. Furthermore, the processing of the steps describing this program may be executed in parallel with the processing of other programs, or may be executed in combination with the processing of other programs.
  • the present technology can also take the following configuration.
  • an image quality improvement technology information generation unit that generates image quality improvement technology information related to an image quality improvement technology for improving the image quality of an image to be encoded; and a file generation unit that generates a content file that stores the encoded data of the image, and stores the image quality improvement technical information in the content file.
  • the image quality improvement technology information includes determination information for determining whether or not the image quality improvement technology is applicable.
  • the image quality improvement technique information includes parameter data applied in processing to which the image quality improvement technique is applied.
  • the file generation unit stores the image quality improvement technical information in a video track that stores the image.
  • the information processing apparatus stores the image quality improvement technical information in a box within a sample entry.
  • the file generation unit stores the image quality improvement technical information in a metadata track that stores metadata.
  • the file generation unit further generates a control file for controlling distribution of the content file, and stores the image quality improvement technical information in the control file.
  • the image quality improvement technical information includes information for each partial area of the image.
  • the information processing apparatus according to any one of (1) to (8), wherein the file generation unit associates the image quality improvement technical information with still image data stored in the content file.
  • (10) An information processing method comprising: generating image quality improvement technical information related to an image quality improvement technique for improving the image quality of an image to be encoded; generating a content file for storing encoded data of the image; and storing the image quality improvement technical information in the content file.
  • the image quality improvement technology information includes determination information for determining whether the image quality improvement technology is applicable.
  • the image quality improvement technique information includes parameter data applied in processing to which the image quality improvement technique is applied.
  • the acquisition unit acquires the encoded data of the image to be reproduced based on the image quality improvement technical information stored in the video track in which the image is stored.
  • the information processing device acquires the encoded data of the image to be reproduced based on the image quality improvement technical information stored in the box in the sample entry.
  • the acquisition unit acquires encoded data of an image to be reproduced based on the image quality improvement technical information stored in a metadata track in which metadata is stored.
  • the information processing apparatus according to any one of (11) to (16), wherein the acquisition unit acquires the content file based on the image quality improvement technical information stored in a control file that controls delivery of the content file, and acquires encoded data of an image to be reproduced from the acquired content file.
  • the information processing apparatus according to any one of (11) to (17), wherein the image quality improvement technical information includes information for each partial area of the image.
  • the acquisition unit acquires still image data associated with the image quality improvement technical information.
  • (20) An information processing method comprising: obtaining encoded data of an image to be reproduced from a content file based on image quality improvement technical information relating to an image quality improvement technique for improving the image quality of the image; and decoding the encoded data.
  • 300 file generation device, 301 control unit, 302 file generation processing unit, 311 input unit, 312 preprocessing unit, 313 encoding unit, 314 file generation unit, 315 recording unit, 316 output unit, 400 client device, 401 control unit, 402 playback processing unit, 411 file acquisition unit, 412 file processing unit, 413 decoding unit, 414 display information generation unit, 415 display unit, 416 measurement unit, 417 display control unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
PCT/JP2022/017458 2021-04-13 2022-04-11 情報処理装置および方法 Ceased WO2022220207A1 (ja)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US18/552,438 US12537981B2 (en) 2021-04-13 2022-04-11 Information processing device and method thereof
JP2023514636A JPWO2022220207A1 2021-04-13 2022-04-11

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163174096P 2021-04-13 2021-04-13
US63/174,096 2021-04-13

Publications (1)

Publication Number Publication Date
WO2022220207A1 (ja) 2022-10-20

Family

ID=83640079

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/017458 Ceased WO2022220207A1 (ja) 2021-04-13 2022-04-11 情報処理装置および方法

Country Status (3)

Country Link
US (1) US12537981B2
JP (1) JPWO2022220207A1
WO (1) WO2022220207A1

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019087905A1 (ja) * 2017-10-31 2019-05-09 シャープ株式会社 画像フィルタ装置、画像復号装置、および画像符号化装置
WO2019193097A1 (en) * 2018-04-05 2019-10-10 Canon Kabushiki Kaisha Method and apparatus for encapsulating images in a file
WO2020058570A1 (en) * 2018-09-20 2020-03-26 Nokia Technologies Oy An apparatus and a method for artificial intelligence
JP2020150516A (ja) * 2019-03-15 2020-09-17 シャープ株式会社 画像復号装置及び画像符号化装置

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101167802B1 (ko) * 2010-12-27 2012-07-25 삼성전기주식회사 회로 기판 및 그 제조 방법
US9866878B2 (en) * 2014-04-05 2018-01-09 Sonic Ip, Inc. Systems and methods for encoding and playing back video at different frame rates using enhancement layers
JP6477699B2 (ja) * 2014-06-20 2019-03-06 ソニー株式会社 情報処理装置および情報処理方法
KR102091072B1 (ko) * 2014-12-23 2020-03-19 삼성전자주식회사 컨텐츠 제공 장치, 디스플레이 장치 및 그 제어 방법
WO2017030425A1 (ko) * 2015-08-20 2017-02-23 엘지전자 주식회사 방송 신호 송신 장치, 방송 신호 수신 장치, 방송 신호 송신 방법, 및 방송 신호 수신 방법
US10939086B2 (en) * 2018-01-17 2021-03-02 Mediatek Singapore Pte. Ltd. Methods and apparatus for encoding and decoding virtual reality content
JP2020150156A (ja) 2019-03-14 2020-09-17 株式会社荏原製作所 基板を処理する方法、および基板処理装置

Also Published As

Publication number Publication date
US12537981B2 (en) 2026-01-27
US20240163491A1 (en) 2024-05-16
JPWO2022220207A1 2022-10-20

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22788132

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023514636

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 18552438

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22788132

Country of ref document: EP

Kind code of ref document: A1

WWG Wipo information: grant in national office

Ref document number: 18552438

Country of ref document: US