JP7628550B2 - Video compression using recurrent-based machine learning systems - Google Patents

Video compression using recurrent-based machine learning systems

Info

Publication number
JP7628550B2
JP7628550B2 JP2022551741A JP2022551741A JP7628550B2 JP 7628550 B2 JP7628550 B2 JP 7628550B2 JP 2022551741 A JP2022551741 A JP 2022551741A JP 2022551741 A JP2022551741 A JP 2022551741A JP 7628550 B2 JP7628550 B2 JP 7628550B2
Authority
JP
Japan
Prior art keywords
time step
video frame
step operation
data
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2022551741A
Other languages
English (en)
Japanese (ja)
Other versions
JP2023517846A (ja)
JP2023517846A5
Inventor
Golinski, Adam Waldemar
Yang, Yang
Pourreza, Reza
Sautiere, Guillaume Konrad
Van Rozendaal, Ties Jehan
Cohen, Taco Sebastiaan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Publication of JP2023517846A
Publication of JP2023517846A5
Application granted
Publication of JP7628550B2
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/436 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/463 Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Signal Processing (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Studio Devices (AREA)
JP2022551741A 2020-03-03 2021-01-15 Video compression using recurrent-based machine learning systems Active JP7628550B2 (ja)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US202062984673P 2020-03-03 2020-03-03
US62/984,673 2020-03-03
US17/091,570 2020-11-06
US17/091,570 US11405626B2 (en) 2020-03-03 2020-11-06 Video compression using recurrent-based machine learning systems
PCT/US2021/013599 WO2021178050A1 (en) 2020-03-03 2021-01-15 Video compression using recurrent-based machine learning systems

Publications (3)

Publication Number Publication Date
JP2023517846A (ja) 2023-04-27
JP2023517846A5 2023-12-26
JP7628550B2 (ja) 2025-02-10

Family

ID=77554929

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2022551741A Active JP7628550B2 (ja) 2020-03-03 2021-01-15 Video compression using recurrent-based machine learning systems

Country Status (9)

Country Link
US (1) US11405626B2 (en)
EP (1) EP4115617A1 (en)
JP (1) JP7628550B2 (ja)
KR (1) KR20220150298A (ko)
CN (1) CN115211115A (zh)
BR (1) BR112022016793A2 (pt)
PH (1) PH12022551821A1 (en)
TW (1) TW202135529A (zh)
WO (1) WO2021178050A1 (en)

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110677649B (zh) * 2019-10-16 2021-09-28 腾讯科技(深圳)有限公司 Machine learning-based artifact removal method, and artifact removal model training method and apparatus
US12148120B2 (en) * 2019-12-18 2024-11-19 Ati Technologies Ulc Frame reprojection for virtual reality and augmented reality
WO2021220008A1 (en) 2020-04-29 2021-11-04 Deep Render Ltd Image compression and decoding, video compression and decoding: methods and systems
US11425402B2 (en) * 2020-07-20 2022-08-23 Meta Platforms, Inc. Cross-codec encoding optimizations for video transcoding
US11551090B2 (en) * 2020-08-28 2023-01-10 Alibaba Group Holding Limited System and method for compressing images for remote processing
US20220151540A1 (en) * 2020-11-19 2022-05-19 4N Inc. Explainable artificial intelligence system for diagnosis of mental diseases and the control method thereof
CN121771392A (zh) * 2020-12-17 2026-03-31 华为技术有限公司 Decoding and encoding of neural-network-based bitstreams
CN116648906A (zh) * 2020-12-24 2023-08-25 华为技术有限公司 Encoding by indicating feature map data
US11490078B2 (en) * 2020-12-29 2022-11-01 Tencent America LLC Method and apparatus for deep neural network based inter-frame prediction in video coding
US11570465B2 (en) * 2021-01-13 2023-01-31 WaveOne Inc. Machine-learned in-loop predictor for video compression
TWI804181B (zh) * 2021-02-02 2023-06-01 聯詠科技股份有限公司 Video encoding method and video encoder thereof
US11399198B1 (en) * 2021-03-01 2022-07-26 Qualcomm Incorporated Learned B-frame compression
US11831909B2 (en) * 2021-03-11 2023-11-28 Qualcomm Incorporated Learned B-frame coding using P-frame coding system
US20240146938A1 (en) * 2021-03-18 2024-05-02 Nokia Technologies Oy Method, apparatus and computer program product for end-to-end learned predictive coding of media frames
WO2022221205A1 (en) 2021-04-13 2022-10-20 Headroom, Inc. Video super-resolution using deep neural networks
US20230019874A1 (en) * 2021-07-13 2023-01-19 Nintendo Co., Ltd. Systems and methods of neural network training
EP4420352A4 (en) * 2021-10-18 2025-09-03 Op Solutions Llc SYSTEMS AND METHODS FOR OPTIMIZING A LOSS FUNCTION FOR VIDEO CODING FOR MACHINES
US11546614B1 (en) * 2021-10-25 2023-01-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoder and decoder for encoding and decoding images
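CN116112673A (zh) * 2021-11-10 2023-05-12 华为技术有限公司 Encoding and decoding method and electronic device
WO2023092388A1 (zh) * 2021-11-25 2023-06-01 Oppo广东移动通信有限公司 Decoding method, encoding method, decoder, encoder, and encoding and decoding system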
US20230214630A1 (en) * 2021-12-30 2023-07-06 Cron Ai Ltd. (Uk) Convolutional neural network system, method for dynamically defining weights, and computer-implemented method thereof
WO2023138687A1 (en) * 2022-01-21 2023-07-27 Beijing Bytedance Network Technology Co., Ltd. Method, apparatus, and medium for data processing
US12548330B1 (en) 2022-01-26 2026-02-10 Upwork Inc. Determining engagement using sensor data
DE112022006625T5 (de) * 2022-02-08 2024-12-05 Nvidia Corporation Image generation using a neural network
CN114545899B (zh) * 2022-02-10 2024-09-10 上海交通大学 Prior-knowledge-based multi-sensor fault signal reconstruction method for gas turbine systems
US20230262237A1 (en) * 2022-02-15 2023-08-17 Adobe Inc. System and methods for video analysis
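WO2023167502A1 (ko) * 2022-03-02 2023-09-07 엘지전자 주식회사 Feature encoding/decoding method and apparatus, recording medium storing a bitstream, and bitstream transmission method
CN118872263A (zh) * 2022-03-03 2024-10-29 抖音视界有限公司 Method, apparatus, and medium for visual data processing
CN115240099B (zh) * 2022-06-21 2026-04-03 有米科技股份有限公司 Model training method and apparatus based on multimodal associated data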
US20260067479A1 (en) * 2022-06-30 2026-03-05 Interdigital Ce Patent Holdings, Sas Fine-tuning a limited set of parameters in a deep coding system for images
WO2024015638A2 (en) * 2022-07-15 2024-01-18 Bytedance Inc. A neural network-based image and video compression method with conditional coding
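CN119586135A (zh) * 2022-07-19 2025-03-07 字节跳动有限公司 Neural network-based adaptive image and video compression method with variable rate
CN115604475B (zh) * 2022-08-12 2025-06-10 西安电子科技大学 Multimodal source joint coding method
TWI832406B (zh) * 2022-09-01 2024-02-11 國立陽明交通大學 Backpropagation training method and non-transitory computer-readable medium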
WO2024054585A1 (en) 2022-09-09 2024-03-14 Tesla, Inc. Artificial intelligence modeling techniques for vision-based occupancy determination
KR20250086631A (ko) * 2022-09-30 2025-06-13 테슬라, 인크. Systems and methods for accelerated video-based training of machine learning models
CN115294224B (zh) * 2022-09-30 2022-12-16 南通市通州区华凯机械有限公司 Fast image data loading method for driving simulators
US20240169708A1 (en) * 2022-11-10 2024-05-23 Qualcomm Incorporated Processing video data using delta quantization
TWI824861B (zh) * 2022-11-30 2023-12-01 國立陽明交通大學 Machine learning device and training method thereof
KR20240086085A (ko) * 2022-12-09 2024-06-18 삼성전자주식회사 Method and apparatus for restoring frame images based on a semantic map
US12167003B2 (en) * 2023-02-19 2024-12-10 Deep Render Ltd. Method and data processing system for lossy image or video encoding, transmission, and decoding
WO2024175727A1 (en) * 2023-02-22 2024-08-29 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Deep video coding with block-based motion estimation
TWI860054B (zh) * 2023-08-22 2024-10-21 國立清華大學 Method, apparatus, and computer program product for training a machine learning model
WO2025198937A1 (en) * 2024-03-16 2025-09-25 Bytedance Inc. Method, apparatus, and medium for visual data processing
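CN119922332A (zh) * 2025-01-21 2025-05-02 山东大学 Video coding method and system based on implicit neural video representation
CN121053039B (zh) * 2025-11-03 2026-01-27 北京铁力山科技股份有限公司 Video quality restoration method, apparatus, device, and storage medium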

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210044804A1 (en) 2018-11-29 2021-02-11 Beijing Sensetime Technology Development Co., Ltd. Method for video compression processing, electronic device and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10192327B1 (en) * 2016-02-04 2019-01-29 Google Llc Image compression with recurrent neural networks
US10706351B2 (en) * 2016-08-30 2020-07-07 American Software Safety Reliability Company Recurrent encoder and decoder

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
George Toderici et al., Full Resolution Image Compression with Recurrent Neural Networks, Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 5435-5443
Guo Lu et al., DVC: An End-to-end Deep Video Compression Framework, Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 10998-11007

Also Published As

Publication number Publication date
PH12022551821A1 (en) 2024-02-12
JP2023517846A (ja) 2023-04-27
TW202135529A (zh) 2021-09-16
BR112022016793A2 (pt) 2022-10-11
KR20220150298A (ko) 2022-11-10
CN115211115A (zh) 2022-10-18
US11405626B2 (en) 2022-08-02
WO2021178050A1 (en) 2021-09-10
US20210281867A1 (en) 2021-09-09
EP4115617A1 (en) 2023-01-11

Similar Documents

Publication Publication Date Title
JP7628550B2 (ja) Video compression using recurrent-based machine learning systems
US12184893B2 (en) Learned B-frame coding using P-frame coding system
US12003734B2 (en) Machine learning based flow determination for video coding
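CN116965029A (zh) Apparatus and method for coding images using convolutional neural networks
JP7815247B2 (ja) Front-end architecture for neural network based video coding
KR102831175B1 (ko) Bit-rate estimation for video coding with machine learning enhancement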
US20220191523A1 (en) Front-end architecture for neural network based video coding
US11399198B1 (en) Learned B-frame compression
US12177473B2 (en) Video coding using optical flow and residual predictors
US12394100B2 (en) Video coding using camera motion compensation and object motion compensation
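CN117716687A (zh) Implicit image and video compression using machine learning systems
JP7840973B2 (ja) Machine learning based flow determination for video coding
CN116965032A (zh) Machine learning based flow determination for video coding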

Legal Events

Date Code Title Description
RD02 Notification of acceptance of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7422

Effective date: 20230104

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20231215

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20231215

TRDD Decision of grant or rejection written
A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20241218

A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20250107

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20250129

R150 Certificate of patent or registration of utility model

Ref document number: 7628550

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150