CN114097248A - Video stream processing method, apparatus, device, and medium - Google Patents

Video stream processing method, apparatus, device, and medium

Info

Publication number
CN114097248A
Authority
CN
China
Prior art keywords
image
video stream
target person
target
person
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201980079311.2A
Other languages
English (en)
Other versions
CN114097248B (zh)
Inventor
张醒石
常亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN114097248A
Application granted
Publication of CN114097248B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/21805 Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation
    • G06T15/205 Image-based rendering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/111 Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • H04N13/117 Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/251 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42204 User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/437 Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/4728 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/65 Transmission of management data between client and server
    • H04N21/658 Transmission by the client directed to the server
    • H04N21/6587 Control parameters, e.g. trick play commands, viewpoint selection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/816 Monomedia components thereof involving special video data, e.g. 3D video
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8549 Creating video summaries, e.g. movie trailer
    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/0093 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00 with means for monitoring data relating to the user, e.g. head-tracking, eye-tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20068 Projection on vertical or horizontal image axis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N2013/0074 Stereoscopic image analysis
    • H04N2013/0081 Depth or disparity estimation from stereoscopic image signals

Abstract

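This application provides a video stream processing method in the field of artificial intelligence. The method includes: obtaining an image set of a target person from multiple video streams, where each image in the image set includes the target person's frontal face; determining, for the images in the image set, a virtual viewpoint in a target-person view mode; and projecting a target image (an image in the multiple video streams that intersects the target person's field of view) onto the imaging plane corresponding to the virtual viewpoint, based on the depth map of the target image and the pose of the real viewpoint corresponding to the target image, to obtain a video stream in the target-person view mode. The method does not need substantial computing resources or a long rendering time, so it can meet the requirements of services such as live broadcasting, and it improves the interactive experience by providing a video stream in the target-person view mode.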
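The projection step in the abstract can be illustrated with a minimal depth-based forward-warping sketch. It is not the method claimed in the application, only a generic example under stated assumptions: a pinhole camera without skew, the same intrinsic matrix for the real and virtual cameras, and poses given as camera-to-world rotation and translation. The function and parameter names (`project_to_virtual_view`, `k`, `r_real`, `t_real`, `r_virt`, `t_virt`) are hypothetical and do not come from the source.

```python
from math import inf
from typing import List, Optional, Sequence

Matrix = Sequence[Sequence[float]]
Vector = Sequence[float]


def mat_vec(m: Matrix, v: Vector) -> List[float]:
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(m: Matrix) -> List[List[float]]:
    """Transpose a 3x3 matrix (equal to the inverse for a rotation matrix)."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def project_to_virtual_view(
    image: Sequence[Sequence[object]],
    depth: Sequence[Sequence[float]],
    k: Matrix,
    r_real: Matrix,
    t_real: Vector,
    r_virt: Matrix,
    t_virt: Vector,
) -> List[List[Optional[object]]]:
    """Forward-warp `image` from a real camera onto a virtual camera's image plane.

    Assumption: poses are camera-to-world, i.e. X_world = R @ X_cam + t,
    and both cameras share the intrinsic matrix `k`.
    Pixels that receive no sample stay None (holes).
    """
    height, width = len(image), len(image[0])
    fx, fy, cx, cy = k[0][0], k[1][1], k[0][2], k[1][2]
    out: List[List[Optional[object]]] = [[None] * width for _ in range(height)]
    z_buffer = [[inf] * width for _ in range(height)]
    r_virt_inv = transpose(r_virt)

    for v in range(height):
        for u in range(width):
            z = depth[v][u]
            if z <= 0:
                continue  # no valid depth for this pixel
            # Back-project the pixel into the real camera's coordinate frame.
            p_cam = [(u - cx) * z / fx, (v - cy) * z / fy, z]
            # Real camera frame -> world frame.
            p_world = [a + b for a, b in zip(mat_vec(r_real, p_cam), t_real)]
            # World frame -> virtual camera frame.
            p_virt = mat_vec(r_virt_inv, [a - b for a, b in zip(p_world, t_virt)])
            if p_virt[2] <= 0:
                continue  # point lies behind the virtual camera
            # Perspective projection onto the virtual image plane.
            u2 = round(fx * p_virt[0] / p_virt[2] + cx)
            v2 = round(fy * p_virt[1] / p_virt[2] + cy)
            inside = 0 <= u2 < width and 0 <= v2 < height
            # Keep only the nearest surface when several pixels land together.
            if inside and p_virt[2] < z_buffer[v2][u2]:
                z_buffer[v2][u2] = p_virt[2]
                out[v2][u2] = image[v][u]
    return out
```

With identical real and virtual poses the function returns the input image unchanged; moving the virtual camera shifts the content and leaves `None` holes where no source pixel lands, which a practical pipeline would have to fill from other views or by inpainting.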

Description

PCT national-phase application; the description has been published.

Claims (26)

  1. PCT national-phase application; the claims have been published.
CN201980079311.2A 2019-12-30 2019-12-30 Video stream processing method, apparatus, device, and medium Active CN114097248B (zh)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/129847 WO2021134178A1 (zh) 2019-12-30 2019-12-30 Video stream processing method, apparatus, device, and medium

Publications (2)

Publication Number Publication Date
CN114097248A (zh) 2022-02-25
CN114097248B (zh) 2023-03-28

Family

ID=76686122

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980079311.2A Active CN114097248B (zh) 2019-12-30 2019-12-30 Video stream processing method, apparatus, device, and medium

Country Status (4)

Country Link
US (1) US20220329880A1 (zh)
EP (1) EP4072147A4 (zh)
CN (1) CN114097248B (zh)
WO (1) WO2021134178A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115330912A (zh) * 2022-10-12 2022-11-11 中国科学技术大学 Audio- and image-driven training method for generating talking-face video

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
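JP2022077380A (ja) * 2020-11-11 2022-05-23 キヤノン株式会社 Image processing apparatus, image processing method, and program
CN113538316B (zh) * 2021-08-24 2023-08-22 北京奇艺世纪科技有限公司 Image processing method and apparatus, terminal device, and readable storage medium
CN114125490B (zh) * 2022-01-19 2023-09-26 阿里巴巴(中国)有限公司 Live broadcast playback method and apparatus
CN114125569B (zh) * 2022-01-27 2022-07-15 阿里巴巴(中国)有限公司 Live broadcast processing method and apparatus
CN115022613A (zh) * 2022-05-19 2022-09-06 北京字节跳动网络技术有限公司 Video reconstruction method and apparatus, electronic device, and storage medium
CN115314658A (zh) * 2022-07-29 2022-11-08 京东方科技集团股份有限公司 Video communication method and system based on three-dimensional display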

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
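JP2004152133A (ja) * 2002-10-31 2004-05-27 Nippon Telegr & Teleph Corp <Ntt> Virtual viewpoint image generation method, virtual viewpoint image generation apparatus, virtual viewpoint image generation program, and recording medium
US20140368495A1 (en) * 2013-06-18 2014-12-18 Institute For Information Industry Method and system for displaying multi-viewpoint images and non-transitory computer readable storage medium thereof
CN106162137A (zh) * 2016-06-30 2016-11-23 北京大学 Virtual viewpoint synthesis method and apparatus
CN106254916A (zh) * 2016-08-09 2016-12-21 乐视控股(北京)有限公司 Live broadcast playback method and apparatus
CN107155101A (zh) * 2017-06-20 2017-09-12 万维云视(上海)数码科技有限公司 Method and apparatus for generating 3D video for use by a 3D player
CN107809630A (zh) * 2017-10-24 2018-03-16 天津大学 Multi-view video super-resolution reconstruction algorithm based on improved virtual viewpoint synthesis
JP2018107793A (ja) * 2016-12-27 2018-07-05 キヤノン株式会社 Virtual viewpoint image generation apparatus, generation method, and program
CN108376424A (zh) * 2018-02-09 2018-08-07 腾讯科技(深圳)有限公司 Method, apparatus, device, and storage medium for switching the viewing angle of a three-dimensional virtual environment
CN109561296A (zh) * 2017-09-22 2019-04-02 佳能株式会社 Image processing apparatus, image processing method, image processing system, and storage medium
CN109644265A (zh) * 2016-05-25 2019-04-16 佳能株式会社 Control apparatus, control method, and storage medium
CN109712067A (zh) * 2018-12-03 2019-05-03 北京航空航天大学 Depth-image-based virtual viewpoint rendering method
JP2019140483A (ja) * 2018-02-08 2019-08-22 キヤノン株式会社 Image processing system, control method for image processing system, transmission apparatus, transmission method, and program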

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10169646B2 (en) * 2007-12-31 2019-01-01 Applied Recognition Inc. Face authentication to mitigate spoofing
US8106924B2 (en) * 2008-07-31 2012-01-31 Stmicroelectronics S.R.L. Method and system for video rendering, computer program product therefor
CN102186038A (zh) * 2011-05-17 2011-09-14 浪潮(山东)电子信息有限公司 Method for synchronously playing multi-view pictures on a digital television screen
US9851877B2 (en) * 2012-02-29 2017-12-26 JVC Kenwood Corporation Image processing apparatus, image processing method, and computer program product
US20140038708A1 (en) * 2012-07-31 2014-02-06 Cbs Interactive Inc. Virtual viewpoint management system
US10521671B2 (en) * 2014-02-28 2019-12-31 Second Spectrum, Inc. Methods and systems of spatiotemporal pattern recognition for video content development
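CN109716268B (zh) * 2016-09-22 2022-05-17 苹果公司 Eye and head tracking
CN108900857B (zh) * 2018-08-03 2020-12-11 东方明珠新媒体股份有限公司 Multi-view video stream processing method and apparatus
CN109407828A (zh) * 2018-09-11 2019-03-01 上海科技大学 Gaze point estimation method and system, storage medium, and terminal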

Also Published As

Publication number Publication date
EP4072147A1 (en) 2022-10-12
EP4072147A4 (en) 2022-12-14
WO2021134178A1 (zh) 2021-07-08
CN114097248B (zh) 2023-03-28
US20220329880A1 (en) 2022-10-13

Similar Documents

Publication Publication Date Title
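CN114097248B (zh) Video stream processing method, apparatus, device, and medium
CN109345556B (zh) Neural network foreground separation for mixed reality
Garon et al. Deep 6-DOF tracking
US10855909B2 (en) Method and apparatus for obtaining binocular panoramic image, and storage medium
WO2019238114A1 (zh) Three-dimensional reconstruction method, apparatus, device, and storage medium for a dynamic model
US10573060B1 (en) Controller binding in virtual domes
TWI752502B (zh) Method for implementing a lens-splitting effect, electronic device, and computer-readable storage medium
WO2021120157A1 (en) Light weight multi-branch and multi-scale person re-identification
WO2018000609A1 (zh) Method for sharing 3D images in a virtual reality system, and electronic device
WO2014187223A1 (en) Method and apparatus for identifying facial features
US20150172634A1 (en) Dynamic POV Composite 3D Video System
US10764493B2 (en) Display method and electronic device
WO2019080792A1 (zh) Panoramic video image playback method and apparatus, storage medium, and electronic apparatus
CN112543343A (zh) Live picture processing method and apparatus based on co-streaming live broadcast, and electronic device
WO2017092432A1 (zh) Virtual reality interaction method, apparatus, and system
US20180227575A1 (en) Depth map generation device
CN107197135B (zh) Video generation method and video generation apparatus
CN110955329A (zh) Transmission method, electronic device, and computer storage medium
Zhao et al. Laddernet: Knowledge transfer based viewpoint prediction in 360° video
CN114495169A (zh) Training data processing method, apparatus, and device for human pose recognition
CN111292234B (zh) Panoramic image generation method and apparatus
KR102176805B1 (ko) System and method for providing VR content with an indicated view direction
WO2021031210A1 (zh) Video processing method and apparatus, storage medium, and electronic device
CN113887354A (zh) Image recognition method and apparatus, electronic device, and storage medium
CN108965859B (zh) Projection mode identification method, video playback method and apparatus, and electronic device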

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant