WO2024171796A1 - Information processing device, information processing method, and program - Google Patents

Information processing device, information processing method, and program

Info

Publication number
WO2024171796A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
images
tile images
information processing
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2024/002853
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
秀信 秋吉
洋介 河内
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Group Corp
Original Assignee
Sony Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Group Corp
Priority to JP2025501028A (JPWO2024171796A1)
Publication of WO2024171796A1
Legal status: Ceased

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00: Details of television systems
    • H04N5/76: Television signal recording
    • H04N5/91: Television signal processing therefor
    • H04N5/92: Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback

Definitions

  • This technology relates to an information processing device, an information processing method, and a program, and in particular to an information processing device, an information processing method, and a program that enable files to be created in a format that can store images in chronological order, even if the image size is larger than the maximum image size that can be supported in that format.
  • HEIF: High Efficiency Image File Format
  • ISO BMFF: ISO Base Media File Format
  • A HEIF Sequence format is being developed for HEIF; it uses video container technology to store still images from multiple points in time in chronological order, which further increases the compression rate. Because files in this HEIF Sequence format are small, they are suitable for real-time transfer by mobile devices, and further development is anticipated.
  • However, because the HEIF Sequence format uses video container technology, the maximum image size is limited to around 32 megapixels (e.g., around 8K x 4K) by the constraints of the HEVC (High Efficiency Video Coding) format used to encode video.
  • HEVC: High Efficiency Video Coding
  • Meanwhile, the number of pixels in the image sensors that capture images is increasing, and may be 10K x 5K (approximately 50M pixels) or the like.
  • This technology was developed in light of these circumstances, and makes it possible to file images in a format that can store images in chronological order even if the image size is larger than the maximum image size that can be supported by that format.
  • the information processing device or program of the first aspect of the present technology is an information processing device having a generation unit that generates a file by storing a plurality of tile images obtained by dividing an image in each of a plurality of video track areas of a file in a format capable of storing images in chronological order, or a program for causing a computer to function as an information processing device.
  • a file is generated by storing a number of tile images into which an image is divided, in each of a number of video track areas of the file in a format capable of storing images in chronological order.
  • the information processing method includes a generating step in which an information processing device generates a file in a format capable of storing images in chronological order by storing a plurality of tile images obtained by dividing an image in each of a plurality of video track areas of the file, and a combining step in which the plurality of tile images are read from the file generated by the processing in the generating step and combined.
  • a file is generated in a format capable of storing images in chronological order, with each of a number of tile images obtained by dividing an image being stored in each of a number of video track areas of the file, and the tile images are read from the file and combined.
  • the information processing device may be an independent device or a module that is incorporated into another device.
  • FIG. 1 is a diagram showing an example configuration of a digital still camera to which the present technology is applied.
  • FIG. 2 is a diagram showing a first example of a tile image.
  • FIG. 3 is a diagram showing a second example of a tile image.
  • FIG. 4 is a diagram showing an outline of the file structure of a captured image when the size of the captured image is greater than 32 M pixels.
  • FIG. 5 is a diagram showing a tree structure of the file in FIG. 4.
  • FIG. 6 is a diagram showing an outline of the structure of a file of a captured image when the size of the captured image is 32 M pixels or less.
  • FIG. 9 is a flowchart illustrating a small-size recording process.
  • FIG. 10 is a flowchart illustrating a playback process.
  • FIG. 11 is a flowchart illustrating a large-size playback process.
  • FIG. 12 is a flowchart illustrating a small-size playback process.
  • Digital still camera according to one embodiment
  • FIG. 1 is a block diagram showing an example of the configuration of a digital still camera as an embodiment of an information processing device to which the present technology is applied.
  • In the digital still camera 10, a CPU (Central Processing Unit) 11, a memory 12, N (N is a plural number) encoding/decoding units 13-1 to 13-N, and an image processing unit 14 are interconnected by a bus 15. Further connected to the bus 15 are an imaging unit 16, a media IF (Interface) 17 to which a media 20 is connected, a display unit 18, and an operation unit 19.
  • The digital still camera 10 performs continuous shooting, time lapse shooting, etc., and records images captured at multiple times as files in the HEIF Sequence format.
  • The CPU 11 controls each part of the digital still camera 10 based on user instructions received by the operation unit 19, for example by executing a program stored in the memory 12.
  • The CPU 11 causes the imaging unit 16 to capture images based on user instructions, and acquires captured images at multiple times (frames).
  • the CPU 11 controls the image processing unit 14 to generate N tile images by dividing the captured image into rectangular areas of a predetermined size.
  • The N tile images all have the same size, and that size is an integer multiple of the smallest encoding unit, which is set based on the encoding method of the encoding/decoding units 13-1 to 13-N.
  • The size of each tile image is an integer multiple of the CU (Coding Unit).
  • the size of each tile image can be, for example, 1920 pixels by 1080 pixels.
  • Below, when there is no need to particularly distinguish between the encoding/decoding units 13-1 to 13-N, they will be collectively referred to as the encoding/decoding unit 13.
  • The CPU 11 (generation unit) generates a file in the HEIF Sequence format by chronologically storing the encoded data for the N tile images, obtained as a result of encoding by the encoding/decoding unit 13, in each of the N video track areas.
  • A track is media data, such as images and audio, that is played according to an independent timeline.
  • a video track is a track of images (video).
  • Each of the encoded data for the N tile images obtained by dividing an image captured at a specific time is stored in areas corresponding to the same time in different video tracks.
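  • The track layout described above can be sketched with an in-memory model. This is only an illustration under stated assumptions, not a HEIF Sequence writer: `TiledSequence`, `add_frame`, and `read_frame` are hypothetical names, and the byte strings stand in for encoded tile data.

```python
from collections import defaultdict


class TiledSequence:
    """Toy model: one list of samples per video track, indexed by frame."""

    def __init__(self, track_ids):
        self.track_ids = list(track_ids)
        self.tracks = defaultdict(list)  # track ID -> samples in time order

    def add_frame(self, encoded_tiles):
        """Append the N encoded tiles of one capture time, one per track."""
        if len(encoded_tiles) != len(self.track_ids):
            raise ValueError("expected one encoded tile per video track")
        for track_id, sample in zip(self.track_ids, encoded_tiles):
            self.tracks[track_id].append(sample)

    def read_frame(self, frame_index):
        """Return the N samples stored at the same time in every track."""
        return [self.tracks[t][frame_index] for t in self.track_ids]


seq = TiledSequence(track_ids=[1, 2, 3, 4])
seq.add_frame([b"t0-a", b"t0-b", b"t0-c", b"t0-d"])  # first capture time
seq.add_frame([b"t1-a", b"t1-b", b"t1-c", b"t1-d"])  # second capture time
```

  • In this sketch, `seq.read_frame(1)` gathers the four tile samples of the second capture time, one from each track, which is the access pattern playback needs.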
  • the CPU 11 also stores in this file synthesis information, which is meta-information used when synthesizing N tile images.
  • the synthesis information is made up of track information, division number information, valid area information, and division information.
  • the track information is a track ID serving as identification information for identifying each of the N video tracks corresponding to the N tile images.
  • The track IDs are listed in the track information in raster order: rows are taken from the top of the captured image downward, and within each row the tile images are taken from the leftmost column rightward.
  • That is, the track information lists the track ID corresponding to the top left tile image first, and the track ID corresponding to the bottom right tile image last.
  • The division number information is information that indicates the number of divisions of the captured image in the horizontal and vertical directions.
  • the valid area information is information that indicates the area of the captured image within an area consisting of N tile images.
  • the division information is information that indicates whether the captured image is divided or not. Specifically, the division information is information that indicates whether the file is a HEIF Sequence file in which the image is divided and stored, or whether the file is a HEIF Sequence file in which the image is stored as is.
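  • The four kinds of synthesis information can be pictured as one small record. The sketch below is an assumption made for illustration: the class name `SynthesisInfo`, its field names, and the `tile_position` helper do not come from the source, which only defines what each piece of information means.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SynthesisInfo:
    """Hypothetical container for the four kinds of synthesis information."""
    is_divided: bool             # division information
    track_ids: List[int]         # track information, in raster order
    divisions: Tuple[int, int]   # (horizontal, vertical) division counts
    valid_area: Tuple[int, int]  # (width, height) of the captured image

    def tile_position(self, track_id: int) -> Tuple[int, int]:
        """Return (row, column) of the tile stored in the given track."""
        horizontal, _ = self.divisions
        index = self.track_ids.index(track_id)
        return divmod(index, horizontal)


# Six tracks in a 3 x 2 grid covering a 15,000 x 7,000 pixel image.
info = SynthesisInfo(True, [1, 2, 3, 4, 5, 6], (3, 2), (15000, 7000))
```

  • Because the track IDs are listed in raster order, the position of a tile follows from its index alone: here `info.tile_position(1)` is the top left tile `(0, 0)` and `info.tile_position(6)` is the bottom right tile `(1, 2)`.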
  • When the size of a captured image is 32 megapixels or less, the CPU 11 generates a HEIF Sequence file in which the encoded data of captured images at multiple times, obtained as a result of encoding by a single encoding/decoding unit 13, is stored in the area of a single video track. The CPU 11 supplies the generated file to the media 20 via the media IF 17 for recording.
  • Based on a user's instruction, the CPU 11 reads, via the media IF 17, the division information of a file that is recorded on the media 20 and stores the encoded data of the captured image at the time to be played back. If the division information indicates that the captured image is divided, the CPU 11 also reads the remaining synthesis information stored in the file.
  • the CPU 11 reads, via the media IF 17, from the media 20 the encoded data of N tile images stored in an area corresponding to the time to be played back for each of the N video tracks identified by the track information.
  • the CPU 11 supplies each of the encoded data of the N tile images to each of the N encoding/decoding units 13.
  • the CPU 11 controls the image processing unit 14 based on the division number information and valid area information to synthesize the N tile images obtained as a result of decoding by the N encoding/decoding units 13, thereby generating the captured image.
  • If the division information indicates that the captured image is not divided, the CPU 11 reads, from the media 20 via the media IF 17, the encoded data of the captured image stored in the area of the single video track that corresponds to the time to be played back.
  • the CPU 11 supplies the encoded data to one encoding/decoding unit 13.
  • the memory 12 is made of a non-volatile memory or the like.
  • the memory 12 stores programs and the like executed by the CPU 11.
  • The memory 12 stores images captured at multiple times by the imaging unit 16.
  • the memory 12 stores multiple tile images or captured images obtained as a result of decoding by the encoding/decoding unit 13.
  • Each of the N encoding/decoding units 13 encodes one tile image supplied from the image processing unit 14 using an encoding method such as the HEVC method to generate encoded data.
  • the encoding/decoding unit 13 performs encoding so as to suppress or make less noticeable degradation in image quality in the boundary areas between adjacent tile images.
  • the N encoding/decoding units 13 allocate bit rates when encoding each tile image based on the magnitude of the motion vector of each of the N tile images, such that a larger bit rate is allocated to tile images with larger motion vectors. This makes the boundaries less noticeable and makes it possible to suppress degradation in image quality in the boundary areas.
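  • One way to picture this allocation is the sketch below. The source only states that tile images with larger motion vectors receive a larger bit rate; the fixed total budget, the proportional rule, and the `floor_share` parameter are assumptions introduced here for illustration.

```python
def allocate_bitrates(motion_magnitudes, total_bitrate, floor_share=0.5):
    """Split total_bitrate across tiles, favouring tiles with more motion.

    floor_share is the fraction of the budget shared equally so that static
    tiles still receive bits; the rest is divided in proportion to motion.
    Both the proportional rule and floor_share are assumptions.
    """
    n = len(motion_magnitudes)
    if n == 0:
        return []
    total_motion = sum(motion_magnitudes)
    if total_motion == 0:
        return [total_bitrate / n] * n  # no motion anywhere: equal split
    equal_part = total_bitrate * floor_share / n
    weighted_budget = total_bitrate * (1.0 - floor_share)
    return [equal_part + weighted_budget * m / total_motion
            for m in motion_magnitudes]


# Motion vector magnitudes of four tile images (arbitrary units).
rates = allocate_bitrates([1.0, 3.0, 0.0, 4.0], total_bitrate=80_000_000)
```

  • In this example the tile with the largest motion magnitude receives the largest share and the static tile receives only the equal part, while the shares still add up to the total budget.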
  • the encoding/decoding unit 13 may use a predetermined quantization parameter (Q value) that is set in advance as a quantization parameter when dividing and encoding the captured image, and perform encoding at a bit rate corresponding to the quantization parameter. This makes it possible to suppress deterioration of image quality in boundary areas.
  • This quantization parameter may be set for each type of picture, such as an I picture, a P picture, etc.
  • the encoding/decoding unit 13 may encode the captured image at a higher bit rate than when the captured image is encoded as a P picture without being divided. This makes it possible to make degradation in image quality in boundary areas less noticeable.
  • the encoding/decoding unit 13 may encode the boundary area between each tile image and an adjacent tile image using a predetermined quantization parameter that has been set in advance. This reduces the difference in image quality between the boundary areas of each tile image. As a result, degradation in image quality in the boundary areas can be made less noticeable.
  • When generating a reconstructed image (decoded image), the encoding/decoding unit 13 performs a deblocking filter process on the CUs in the boundary area between each tile image and an adjacent tile image. This makes it possible to suppress image quality degradation in the boundary area.
  • One encoding/decoding unit 13 reads out and encodes images captured at multiple times that are stored in memory 12, generating encoded data.
  • the encoding/decoding unit 13 decodes the encoded data of the tile images or captured images supplied from the CPU 11, and supplies the resulting tile images or captured images to memory 12 for storage.
  • the image processing unit 14 generates N tile images by dividing the images captured at multiple times and stored in the memory 12 into N pieces under the control of the CPU 11.
  • the image processing unit 14 supplies each of the N tile images to each of the N encoding/decoding units 13.
  • Under the control of the CPU 11, the image processing unit 14 generates a composite image by combining multiple tile images stored in the memory 12, extracts the captured image from the composite image, and reduces it. When an undivided captured image is stored in the memory 12, the image processing unit 14 reduces that captured image. The image processing unit 14 supplies the reduced captured image to the display unit 18 for display.
  • The imaging unit 16 includes a lens 16a and an image sensor 16b.
  • The imaging unit 16 receives light incident through the lens 16a with the image sensor 16b and converts it into an electrical signal, thereby performing continuous shooting, time-lapse shooting, and the like.
  • The imaging unit 16 supplies the resulting images captured at multiple times to the memory 12 for storage.
  • the media IF 17 reads and writes files from and to the media 20.
  • the display unit 18 is made up of a display and the like.
  • the display unit 18 displays captured images supplied from the image processing unit 14.
  • The operation unit 19 is made up of a touch pad that together with the display unit 18 constitutes a touch panel, operation buttons, a microphone, and the like.
  • the operation unit 19 accepts instructions from the user and supplies the instructions to the CPU 11.
  • the program executed by the digital still camera 10 can be provided, for example, by recording it on the medium 20.
  • the program can also be provided via a wired or wireless transmission medium, such as a local area network, the Internet, or digital satellite broadcasting.
  • the program can be installed in the memory 12 by inserting the medium 20.
  • the program can also be received by a communication unit (not shown) via a wired or wireless transmission medium and installed in the memory 12.
  • the program can be pre-installed in the memory 12.
  • The imaging unit 16 may be provided as a separate unit.
  • FIG. 2 is a diagram showing a first example of a tile image.
  • the size of the captured image 30 is 10K (9600 pixels) x 6K (6480 pixels), which is larger than 8K (7680 pixels) x 4K (4320 pixels).
  • the image processing unit 14 divides the captured image 30 into regions of, for example, 5120 pixels by 3840 pixels, and generates four tile images 31-1 to 31-4. Of these four tile images 31-1 to 31-4, three tile images 31-2 to 31-4 include blank regions 32 that are not part of the captured image 30. Note that, below, when there is no need to particularly distinguish between the tile images 31-1 to 31-4, they will be collectively referred to as tile images 31.
  • Each of the four encoding/decoding units 13 encodes each of the four tile images 31.
  • the CPU 11 generates a file in which the resulting encoded data is stored in an area of a different video track for each tile image 31.
  • the CPU 11 also stores in the file valid area information indicating the area of the 10K x 6K captured image 30 within an area 40 of 10240 pixels x 7680 pixels consisting of the four tile images 31.
  • this valid area information is 9600 as the number of pixels in the horizontal direction and 6480 as the number of pixels in the vertical direction of the captured image 30.
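  • The arithmetic behind this example can be checked with a short sketch. The function name `tile_grid` and its return layout are hypothetical; the tile size of 5120 x 3840 pixels and the image size are the values given above.

```python
def tile_grid(width, height, tile_w=5120, tile_h=3840):
    """Return division counts, padded area size, and tiles with blank regions."""
    cols = (width + tile_w - 1) // tile_w    # horizontal divisions (ceiling)
    rows = (height + tile_h - 1) // tile_h   # vertical divisions (ceiling)
    padded = (cols * tile_w, rows * tile_h)  # area covered by all tiles
    blank = []
    for r in range(rows):
        for c in range(cols):
            right_edge = (c + 1) * tile_w
            bottom_edge = (r + 1) * tile_h
            if right_edge > width or bottom_edge > height:
                blank.append(r * cols + c + 1)  # 1-based, raster-order number
    return (cols, rows), padded, blank


print(tile_grid(9600, 6480))  # ((2, 2), (10240, 7680), [2, 3, 4])
```

  • The result reproduces the figures in the text: a 2 x 2 division, a 10240 x 7680 pixel area, and blank regions in the second to fourth tile images.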
  • FIG. 3 is a diagram showing a second example of a tile image.
  • the size of the captured image 70 is 15,000 pixels x 7,000 pixels, which is greater than 32 megapixels.
  • the image processing unit 14 divides the captured image 70 into regions of, for example, 5120 pixels by 3840 pixels, and generates six tile images 71-1 to 71-6. Of these six tile images 71-1 to 71-6, four tile images 71-3 to 71-6 include blank regions 72 that are not part of the captured image 70. In the following, when there is no need to particularly distinguish between the tile images 71-1 to 71-6, they will be collectively referred to as tile images 71.
  • Each of the six encoding/decoding units 13 encodes each of the six tile images 71.
  • the CPU 11 generates a file in which the resulting encoded data is stored in a different video track area for each tile image 71.
  • the CPU 11 also stores in the file valid area information indicating the area of the captured image 70, which is 15,000 pixels by 7,000 pixels, out of the 15,360 pixels by 7,680 pixel area 80 consisting of the six tile images 71.
  • this valid area information is 15,000 as the number of pixels in the horizontal direction and 7,000 as the number of pixels in the vertical direction of the captured image 70.
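  • The division itself can be sketched on a toy image. The code below is a minimal illustration, assuming the image is a list of pixel rows and that blank regions are filled with a constant; the source does not say what the blank regions contain, so the `fill` value and the name `split_into_tiles` are assumptions.

```python
def split_into_tiles(image, tile_w, tile_h, fill=0):
    """Split a 2D image (list of rows) into equal tiles in raster order.

    Areas outside the image are filled with `fill`, giving the blank regions.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cols = (width + tile_w - 1) // tile_w
    rows = (height + tile_h - 1) // tile_h
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tile = []
            for y in range(r * tile_h, (r + 1) * tile_h):
                row = []
                for x in range(c * tile_w, (c + 1) * tile_w):
                    inside = y < height and x < width
                    row.append(image[y][x] if inside else fill)
                tile.append(row)
            tiles.append(tile)
    return tiles, (cols, rows)


# A 5 x 3 "image" split into 2 x 2 tiles gives a 3 x 2 grid of six tiles.
image = [[10 * y + x for x in range(5)] for y in range(3)]
tiles, divisions = split_into_tiles(image, tile_w=2, tile_h=2)
```

  • The toy sizes mirror the second example in miniature: the grid is 3 x 2, the first two tiles lie entirely inside the image, and the third to sixth tiles contain filled blank regions.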
  • FIG. 4 is a diagram showing an outline of the file structure of a captured image when the size of the captured image is greater than 32 megapixels.
  • the captured image is divided into four tile images.
  • the file format of the captured images is the HEIF Sequence method, and is therefore compliant with ISO BMFF.
  • ISO BMFF has a box structure, and data is stored in boxes as containers.
  • a box is made up of a Box Type, which indicates the type of actual data in the box, and actual data, etc.
  • the actual data contained in a box can have a box structure, which allows ISO BMFF to have a hierarchical structure.
  • the file of the captured image will have an ftyp box, a moov box, an mdat box, a tkrf box, a cnvs box, etc.
  • In the ftyp box, information indicating the HEIF Sequence format is stored as identification information for identifying the file format.
  • the ftyp box also stores information indicating whether the captured image has been divided as division information.
  • the moov box in Figure 4 stores data required for playing and managing the encoded data of the four tile images stored as media data in the mdat box. Specifically, the moov box contains trak boxes for four video tracks corresponding to the four tile images. These trak boxes store information about the corresponding video tracks.
  • the mdat box in Figure 4 has areas for four video tracks. In each video track area, the encoded data for each tile image is stored as media data.
  • the tkrf (Track Reference) box stores track information. This track information is the track IDs of the four video tracks corresponding to the four tile images, for example 1, 2, 3, and 4.
  • the cnvs (Canvas) box stores division number information and valid area information. If the file in FIG. 4 is, for example, the file of the captured image 30 in FIG. 2, the division number information stored in the cnvs box is 2 as the number of horizontal divisions and 2 as the number of vertical divisions.
  • the valid area information is 9600 as the number of horizontal pixels of the captured image 30 and 6480 as the number of vertical pixels.
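  • The box layout can be sketched as follows. The 32-bit big-endian size followed by a four-character type is the general ISO BMFF box header; the payload layouts used here for the tkrf and cnvs boxes (plain 32-bit fields) are assumptions, since the source does not specify their byte format, and the helper names are hypothetical.

```python
import struct


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize one box: 32-bit big-endian size, 4-byte type, payload."""
    if len(box_type) != 4:
        raise ValueError("box type must be four bytes")
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def parse_boxes(data: bytes) -> dict:
    """Parse a flat run of boxes into {type: payload} (no nesting)."""
    boxes, offset = {}, 0
    while offset < len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8 or offset + size > len(data):
            raise ValueError("malformed box")
        boxes[box_type] = data[offset + 8:offset + size]
        offset += size
    return boxes


# Assumed payloads: tkrf = four 32-bit track IDs,
# cnvs = horizontal divisions, vertical divisions, valid width, valid height.
tkrf = make_box(b"tkrf", struct.pack(">4I", 1, 2, 3, 4))
cnvs = make_box(b"cnvs", struct.pack(">4I", 2, 2, 9600, 6480))

boxes = parse_boxes(tkrf + cnvs)
track_ids = struct.unpack(">4I", boxes[b"tkrf"])
h_div, v_div, valid_w, valid_h = struct.unpack(">4I", boxes[b"cnvs"])
```

  • Reading the two boxes back yields the track IDs 1 to 4, the 2 x 2 division, and the 9600 x 6480 valid area of the captured image 30 in FIG. 2.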
  • FIG. 5 is a diagram showing the tree structure of the file in FIG. 4.
  • the ftyp box, tkrf box, cnvs box, mdat box, and moov box described in Figure 4 are provided as boxes in the top layer.
  • the layer below the moov box stores four trak boxes and others. Note that in Figure 5, only the first trak box is shown to simplify the illustration.
  • In the lower layer of the trak box in FIG. 5, a tkhd box and the like are stored.
  • the track ID of the corresponding video track is stored in this tkhd box.
  • the tkhd boxes of the four trak boxes store track IDs of 1, 2, 3, and 4, respectively.
  • the tkhd boxes in FIG. 4 also store image size information indicating the size of the tile image corresponding to the video track. If the file in FIG. 4 is, for example, the file of the captured image 30 in FIG. 2, the image size information is 5120 as the number of pixels in the horizontal direction and 3840 as the number of pixels in the vertical direction of the tile image 31.
  • the valid area information may be stored in an idat box.
  • the track information, number of divisions information, and valid area information may be stored in the same box.
  • the boxes in which the track information, number of divisions information, and valid area information are stored may be provided in a layer other than the top layer, such as a lower layer of the moov box.
  • the track information, number of divisions information, and valid area information may be stored in a uuid box.
  • In FIGS. 4 and 5 the captured image is divided into four parts, but even if it is divided into a number other than four, the file structure is the same as that of FIGS. 4 and 5, except for the number of trak boxes and the synthesis information other than the division information.
  • For example, in the file of the captured image 70 in FIG. 3, the tkhd boxes of the six trak boxes store track IDs of 1, 2, 3, 4, 5, and 6, respectively.
  • The track information in the synthesis information is 1, 2, 3, 4, 5, and 6.
  • The division number information is 3 as the number of horizontal divisions and 2 as the number of vertical divisions.
  • The valid area information is 15,000 as the number of horizontal pixels of the captured image 70 and 7,000 as the number of vertical pixels.
  • FIG. 6 is a diagram showing an outline of the file structure of a captured image when the size of the captured image is 32 M pixels or less.
  • the file of the captured image will have an ftyp box, a moov box, an mdat box, etc.
  • In the ftyp box, information indicating the HEIF Sequence format is stored as identification information for identifying the file format.
  • the ftyp box also stores information indicating that the captured image is not divided as division information.
  • the moov box in Figure 6 stores data necessary for playing and managing the encoded data of the captured image stored as media data in the mdat box. Specifically, the moov box contains a trak box for one video track corresponding to the captured image. This trak box stores information about the corresponding video track.
  • The mdat box in FIG. 6 has an area for one video track, in which the encoded data for the captured image is stored as media data.
  • Fig. 7 is a flow chart for explaining the recording process by the digital still camera 10 of Fig. 1. This recording process is started, for example, when the user operates the operation unit 19 to instruct shooting.
  • In step S11, the digital still camera 10 prepares to record images captured at multiple times.
  • In step S12, the CPU 11 recognizes the size of the captured image specified by the user.
  • In step S13, the CPU 11 determines whether the size of the captured image exceeds 32 megapixels. If it is determined in step S13 that the size exceeds 32 megapixels, the CPU 11 determines the number of divisions of the captured image based on the size of the captured image, and the process proceeds to step S14.
  • In step S14, the CPU 11 activates N encoding/decoding units 13, where N is the number of divisions of the captured image.
  • In step S15, the digital still camera 10 performs large-size recording processing. Details of this large-size recording processing will be described later with reference to FIG. 8. After the processing of step S15, the recording processing ends.
  • On the other hand, if it is determined in step S13 that the size does not exceed 32 megapixels, then in step S16 the CPU 11 activates one encoding/decoding unit 13. In step S17, the digital still camera 10 performs small-size recording processing. Details of this small-size recording processing will be described later with reference to FIG. 9. After the processing of step S17, the recording processing ends.
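  • The branch taken in steps S13, S14, and S16 can be sketched as below. Several points are assumptions: the threshold is read as 32 x 1024 x 1024 pixels so that an 8K x 4K image is not divided, the tile size of 5120 x 3840 pixels is borrowed from the examples of FIGS. 2 and 3, and the name `plan_recording` is hypothetical.

```python
MAX_PIXELS = 32 * 1024 * 1024  # assumed reading of "32 megapixels"


def plan_recording(width, height, tile_w=5120, tile_h=3840,
                   max_pixels=MAX_PIXELS):
    """Return (mode, number of encoding/decoding units to activate)."""
    if width * height > max_pixels:           # S13: size exceeds the threshold
        cols = (width + tile_w - 1) // tile_w
        rows = (height + tile_h - 1) // tile_h
        return "large", cols * rows           # S14: activate N units
    return "small", 1                         # S16: activate one unit


print(plan_recording(9600, 6480))  # ('large', 4)
print(plan_recording(7680, 4320))  # ('small', 1)
```

  • Keeping the threshold as a parameter makes it easy to use a different limit, for instance when the encoding method is not HEVC.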
  • FIG. 8 is a flow chart for explaining the large-size recording process in step S15 of FIG. 7.
  • In step S31, the imaging unit 16 performs continuous shooting, time lapse shooting, or the like to obtain images captured at multiple times, and supplies the images to the memory 12 for storage.
  • In step S32, the image processing unit 14 generates N tile images by dividing the captured image acquired in the processing of step S31 into N pieces.
  • The image processing unit 14 supplies each of the N tile images to each of the N encoding/decoding units 13 started in the processing of step S14 in FIG. 7.
  • In step S33, each of the N encoding/decoding units 13 encodes one tile image supplied from the image processing unit 14, so that encoded data for the N tile images is generated.
  • In step S34, the CPU 11 generates a HEIF Sequence file in which the encoded data for the N tile images generated in the process of step S33 is stored in a different video track area for each tile image.
  • In step S35, the CPU 11 stores the synthesis information in the file generated in step S34.
  • In step S36, the CPU 11 records the file in which the synthesis information was stored in step S35 on the media 20 via the media IF 17. Then, the process returns to step S15 in FIG. 7, and the recording process ends.
  • FIG. 9 is a flow chart for explaining the small-size recording process in step S17 of FIG. 7.
  • In step S51, the imaging unit 16 acquires images captured at multiple times by performing continuous shooting, time lapse shooting, or the like.
  • In step S52, the imaging unit 16 supplies the images captured in step S51 to the memory 12 for storage.
  • In step S53, the encoding/decoding unit 13 started in the process of step S16 in FIG. 7 reads out the captured image stored in the memory 12 in the process of step S52, encodes it, and generates encoded data.
  • In step S54, the CPU 11 generates a HEIF Sequence file in which the encoded data of the captured image generated in the process of step S53 is stored in the area of one video track.
  • In step S55, the CPU 11 supplies the file generated in the process of step S54 to the medium 20 via the media IF 17 and records it. Then, the process returns to step S17 in FIG. 7, and the recording process ends.
  • Fig. 10 is a flowchart for explaining the playback process by the digital still camera 10 of Fig. 1. This playback process is started, for example, when the user operates the operation unit 19 to instruct playback of an image captured at a specific time.
  • In step S71, based on a user instruction supplied from the operation unit 19, the CPU 11 reads, via the media IF 17, the division information of a file that is recorded on the media 20 and stores the encoded data of the captured image to be played back.
  • In step S72, the CPU 11 determines whether or not the captured image to be played back is divided based on the division information read in step S71. Specifically, the CPU 11 determines whether or not the division information indicates that the captured image is divided.
  • If it is determined in step S72 that the captured image to be played back is divided, i.e., if the division information indicates that the captured image is divided, the CPU 11 reads out the track information stored in the file together with the division information read out in the process of step S71. Then, in step S73, the CPU 11 activates N encoding/decoding units 13, where N is the number of track IDs included in the track information.
  • In step S74, the digital still camera 10 performs a large-size playback process. Details of this large-size playback process will be described later with reference to FIG. 11. After the large-size playback process is completed, the playback process ends.
  • On the other hand, if it is determined in step S72 that the captured image to be played back is not divided, i.e., if the division information indicates that the captured image is not divided, the process proceeds to step S75.
  • In step S75, the CPU 11 activates one encoding/decoding unit 13.
  • In step S76, the digital still camera 10 performs small-size playback processing. Details of this small-size playback processing will be described later with reference to FIG. 12. After the small-size playback processing is completed, the playback processing ends.
  • FIG. 11 is a flow chart for explaining the large-size playback process in step S74 of FIG. 10.
  • In step S91, the CPU 11 obtains the division number information and valid area information stored in the file together with the division information read by the processing in step S71 of FIG. 10.
  • In step S92, the CPU 11 reads from the file, based on the track information, the encoded data of the N tile images with the same frame number that are stored in the area corresponding to the time to be played back in each of the N video tracks identified by the track information.
  • The CPU 11 supplies the encoded data of the N tile images, one tile image each, to the N encoding/decoding units 13 started in step S73 of FIG. 10.
  • In step S93, each of the N encoding/decoding units 13 decodes the encoded data for one tile image supplied from the CPU 11, and supplies the resulting tile image to the memory 12 for storage.
  • In step S94, the image processing unit 14 generates a composite image by combining the N tile images stored in the memory 12 based on the division number information.
  • In step S95, the image processing unit 14 extracts the captured image from the composite image generated in the processing of step S94 based on the valid area information.
  • In step S96, the image processing unit 14 reduces the captured image extracted in the processing of step S95.
  • In step S97, the image processing unit 14 supplies the captured image reduced in the processing of step S96 to the display unit 18 for display. Then, the processing returns to step S74 in FIG. 10, and the playback processing ends.
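  • Steps S94 and S95 can be sketched on toy data. The code below assumes decoded tile images are lists of pixel rows given in raster order; the function name `combine_tiles` and the data layout are hypothetical, and decoding, reduction, and display are left out.

```python
def combine_tiles(tiles, divisions, valid_area):
    """Rebuild the captured image from raster-order tiles.

    tiles: list of 2D tiles (lists of rows), all the same size.
    divisions: (horizontal, vertical) division counts.
    valid_area: (width, height) of the captured image inside the composite.
    """
    cols, rows = divisions
    if len(tiles) != cols * rows:
        raise ValueError("tile count does not match the division numbers")
    tile_h = len(tiles[0])
    composite = []
    for r in range(rows):
        for y in range(tile_h):
            line = []
            for c in range(cols):
                line.extend(tiles[r * cols + c][y])
            composite.append(line)  # S94: stitch one row of the composite
    valid_w, valid_h = valid_area
    return [line[:valid_w] for line in composite[:valid_h]]  # S95: crop


# Four 2 x 2 tiles forming a 4 x 4 composite; the valid area is 3 x 3.
tiles = [
    [[1, 2], [4, 5]], [[3, 0], [6, 0]],
    [[7, 8], [0, 0]], [[9, 0], [0, 0]],
]
image = combine_tiles(tiles, divisions=(2, 2), valid_area=(3, 3))
```

  • Cropping to the valid area removes the blank regions, so `image` here is the 3 x 3 grid of values 1 to 9.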
  • FIG. 12 is a flow chart for explaining the small-size playback process in step S76 of FIG. 10.
  • In step S111, the CPU 11 reads out the encoded data of the image captured at the time to be played back, which is stored in the one video track area of the file from which the division information was read by the process of step S71 in FIG. 10.
  • The CPU 11 supplies the encoded data to the one encoding/decoding unit 13 started in step S75 in FIG. 10.
  • In step S112, the one encoding/decoding unit 13 decodes the encoded data supplied from the CPU 11 to generate a captured image.
  • In step S113, the one encoding/decoding unit 13 supplies the captured image generated by the processing of step S112 to the memory 12 for storage.
  • In step S114, the image processing unit 14 reduces the captured image stored in the memory 12 in the process of step S113.
  • In step S115, the image processing unit 14 supplies the captured image reduced in the process of step S114 to the display unit 18 for display. Then, the process returns to step S76 in FIG. 10, and the playback process ends.
  • The digital still camera 10 generates a file in the HEIF Sequence format by storing each of the N tile images into which the captured image is divided in a separate one of the N video track areas of the file. The digital still camera 10 can therefore store, in the HEIF Sequence format, captured images that are larger than the maximum image size the format supports.
  • The threshold for the captured image size used to determine whether or not to divide the image is not limited to 32 Mpixels. The threshold is determined from the maximum image size of the encoding method used for the captured image; if the encoding method is not HEVC, the threshold is based on the maximum image size of that method rather than 32 Mpixels. The threshold does not have to equal the maximum image size, as long as it is equal to or smaller than the maximum image size of the encoding method.
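  • The threshold rule can be sketched as follows. This is a minimal example under stated assumptions: the function name `needs_split` and the constant `HEVC_MAX_PIXELS` are hypothetical, "32 Mpixels" is read as 32 x 10^6 pixels, and the image is assumed to be divided when its pixel count strictly exceeds the threshold; the text above does not state which comparison is used.

```python
from typing import Optional

# Assumption: "32 Mpixels" interpreted as 32 * 10**6 pixels.
HEVC_MAX_PIXELS = 32_000_000


def needs_split(width: int, height: int,
                codec_max_pixels: int = HEVC_MAX_PIXELS,
                threshold: Optional[int] = None) -> bool:
    """Return True when the captured image should be divided into tiles.

    The threshold defaults to the codec's maximum image size and may be
    set lower, but never higher, than that maximum.
    """
    if threshold is None:
        threshold = codec_max_pixels
    if threshold > codec_max_pixels:
        raise ValueError("threshold must not exceed the codec maximum")
    # Assumed comparison: divide only when the size exceeds the threshold.
    return width * height > threshold
```

    For another encoding method, a caller would pass that method's maximum image size as `codec_max_pixels` instead of the HEVC value.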
  • The size of the tile image may be predetermined, regardless of the size of the captured image.
  • In the above description, the number of divisions and the number of encoding/decoding units 13 are both N. If the number of divisions changes according to the size of the captured image, the number of encoding/decoding units 13 can be set to the maximum expected number of divisions. If the encoding/decoding units 13 are realized in software, their number may be changed dynamically. Alternatively, only one encoding/decoding unit 13 may be provided, and this single unit may encode each tile image and decode the encoded data of each tile image in a time-division manner.
  • The programs executed by the digital still camera 10 may be processed chronologically in the order described in this specification, or may be processed in parallel or at the required timing, such as when called.
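  • When the tile size is fixed in advance, the number of divisions follows from the captured image size, and the number of encoding/decoding units can be sized for the largest expected case. The Python sketch below illustrates that arithmetic. The function names are hypothetical, and rounding the tile grid up so that it covers the whole image (leaving a blank area in the edge tiles) is an assumption of this example.

```python
from typing import Iterable, Tuple


def division_counts(image_w: int, image_h: int,
                    tile_w: int, tile_h: int) -> Tuple[int, int]:
    """Return (horizontal, vertical) division counts for a fixed tile size."""
    if min(image_w, image_h, tile_w, tile_h) <= 0:
        raise ValueError("all sizes must be positive")
    cols = -(-image_w // tile_w)  # ceiling division
    rows = -(-image_h // tile_h)
    return cols, rows


def blank_pixels(image_w: int, image_h: int,
                 tile_w: int, tile_h: int) -> int:
    """Pixels covered by the tile grid that are not part of the image."""
    cols, rows = division_counts(image_w, image_h, tile_w, tile_h)
    return cols * tile_w * rows * tile_h - image_w * image_h


def codec_units_needed(expected_sizes: Iterable[Tuple[int, int]],
                       tile_w: int, tile_h: int) -> int:
    """Number of encoding/decoding units for the largest expected division."""
    return max(
        cols * rows
        for cols, rows in (
            division_counts(w, h, tile_w, tile_h) for w, h in expected_sizes
        )
    )
```

    A single encoding/decoding unit operating in a time-division manner would instead process the same `cols * rows` tiles one after another.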
  • This technology can also be applied to devices that generate or play files in a format that can store images in chronological order other than the HEIF Sequence format. For example, it can be applied to devices that generate or play files in a video file format.
  • This technology can also be applied to devices other than digital still cameras, such as mobile terminal devices that generate or play image files.
  • This technology can also be configured as cloud computing, in which a single function is shared and processed collaboratively by multiple devices over a network.
  • Each step described in the above flowcharts can be executed by a single device, or can be shared and executed by multiple devices.
  • When one step includes multiple processes, the multiple processes included in that step can be executed by one device, or can be shared and executed by multiple devices.
  • The present technology can take the following configurations.
  • An information processing device comprising: a generation unit that generates a file in a format capable of storing images in chronological order by storing a plurality of tile images obtained by dividing an image in each of a plurality of video track areas of the file.
  • The generation unit is configured to store each of the plurality of tile images in an area corresponding to an identical time in a different one of the video tracks.
  • The generation unit is configured to store meta information used when combining the plurality of tile images in the file.
  • The meta information includes track information that identifies each of the video tracks corresponding to the tile images.
  • The meta information is configured to include division number information indicating the number of divisions of the image in the horizontal and vertical directions.
  • The meta information includes valid area information indicating an area of the image among the areas consisting of the plurality of tile images.
  • The meta information is configured to include division information indicating whether or not the image is divided.
  • The information processing device configured to store the meta information in a box in a top layer of the file.
  • The information processing device further comprising: an encoding unit that encodes the plurality of tile images for each of the tile images.
  • The information processing device includes a blank area that is not an area of the image.
  • The information processing device according to any one of (1) to (8), wherein the plurality of tile images are configured to have the same size.
  • The information processing device according to (11), further comprising an encoding unit that encodes the plurality of tile images for each of the tile images, wherein the size is an integer multiple of a minimum unit of the encoding.
  • The information processing device according to any one of (1) to (8), further comprising an encoding unit that encodes the plurality of tile images for each of the tile images, wherein the encoding unit is configured to allocate a bit rate for encoding each of the plurality of tile images based on a magnitude of a motion vector of each of the plurality of tile images.
  • The information processing device according to any one of (1) to (8), further comprising an encoding unit that encodes the plurality of tile images for each of the tile images, wherein the encoding unit is configured to perform the encoding using a predetermined quantization parameter.
  • The encoding unit is configured to perform the encoding on a boundary area between the tile image and an adjacent tile image by using the predetermined quantization parameter.
  • The information processing device according to any one of (1) to (8), further comprising an encoding unit that encodes the plurality of tile images for each of the tile images, wherein the encoding unit is configured to perform the encoding at a bit rate that is higher than a bit rate in the case of encoding the image without dividing the image.
  • The information processing device according to any one of (1) to (8), further comprising an encoding unit that encodes the plurality of tile images for each of the tile images, wherein the encoding unit is configured to perform a deblocking filter process on a boundary area between the tile image and an adjacent tile image.
  • An information processing method in which an information processing device performs: a generating step of generating a file in a format capable of storing images in chronological order by storing a plurality of tile images obtained by dividing an image in each of a plurality of video track areas of the file; and a compositing step of reading the plurality of tile images from the file generated in the generating step and combining the tile images.
  • A program for causing a computer to function as: a generation unit that generates a file in a format capable of storing images in chronological order by storing a plurality of tile images obtained by dividing an image in each of a plurality of video track areas of the file.
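  • The meta information named in the configurations above (division information, track information, division number information, and valid area information) can be pictured as a small record. The Python sketch below is illustrative only: the class name, the field names and types, and the consistency checks are hypothetical, and the actual box layout in the file is not described here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TileMetaInfo:
    """Hypothetical container for the meta information used when combining tiles."""
    divided: bool                          # division information
    track_ids: Tuple[int, ...]             # track information, one entry per tile
    divisions: Tuple[int, int]             # (horizontal, vertical) division counts
    valid_area: Tuple[int, int, int, int]  # assumed (left, top, width, height)

    def validate(self) -> None:
        """Check that the fields are mutually consistent."""
        cols, rows = self.divisions
        if cols < 1 or rows < 1:
            raise ValueError("division counts must be at least 1")
        expected_tracks = cols * rows if self.divided else 1
        if len(self.track_ids) != expected_tracks:
            raise ValueError("track count does not match the division counts")
        if len(set(self.track_ids)) != len(self.track_ids):
            raise ValueError("each tile must use a different video track")
        _, _, width, height = self.valid_area
        if width < 1 or height < 1:
            raise ValueError("valid area must not be empty")
```

    A reader of the file could check `divided` first and then use `track_ids`, `divisions`, and `valid_area` to locate, combine, and crop the tile images.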

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)
PCT/JP2024/002853 2023-02-17 2024-01-30 Information processing device, information processing method, and program Ceased WO2024171796A1 (ja)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2025501028A JPWO2024171796A1 2023-02-17 2024-01-30

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2023023189 2023-02-17
JP2023-023189 2023-02-17

Publications (1)

Publication Number Publication Date
WO2024171796A1 (ja) 2024-08-22

Family

ID=92421696

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2024/002853 Ceased WO2024171796A1 (ja) Information processing device, information processing method, and program 2023-02-17 2024-01-30

Country Status (2)

Country Link
JP (1) JPWO2024171796A1
WO (1) WO2024171796A1

Also Published As

Publication number Publication date
JPWO2024171796A1 2024-08-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24756647

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2025501028

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2025501028

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 24756647

Country of ref document: EP

Kind code of ref document: A1