WO2021004411A1 - Media processing method - Google Patents

Media processing method

Info

Publication number
WO2021004411A1
Authority
WO
WIPO (PCT)
Prior art keywords
function
media processing
description information
descriptor
requirements
Prior art date
Application number
PCT/CN2020/100297
Other languages
English (en)
French (fr)
Inventor
徐异凌
杨琦
管云峰
Original Assignee
上海交通大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海交通大学
Priority to EP20837305.0A (EP3996374A4)
Priority to US17/597,427 (US11973994B2)
Priority to JP2022500152A (JP7336161B2)
Priority to KR1020217042745A (KR20220012941A)
Publication of WO2021004411A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/65 Updates
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235 Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N21/2353 Processing of additional data, e.g. scrambling of additional data or processing content descriptors specifically adapted to content descriptors, e.g. coding, compressing or processing of metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/21805 Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/222 Secondary servers, e.g. proxy server, cable television Head-end
    • H04N21/2223 Secondary servers, e.g. proxy server, cable television Head-end being a public access point, e.g. for downloading to or uploading from clients
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23412 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/816 Monomedia components thereof involving special video data, e.g. 3D video

Definitions

  • the invention belongs to the field of cloud processing systems, in particular to a media processing method.
  • VR Virtual Reality
  • PC Point Cloud
  • the server needs to collect multiple camera streams and then, through stitching and rendering, combine the multiple videos into spherical data for users to consume.
  • the splicing process is more complicated, and a certain time delay is often caused when the local processor has limited capabilities. In a live broadcast scenario, this delay will have a certain impact on the user experience.
  • Point cloud data is directly generated by scanning or computer.
  • a simple point cloud object has tens of thousands or even hundreds of thousands of points. Compressing the geometric and attribute information of these points is very time-consuming and difficult to meet real-time requirements.
  • the MPEG expert group has established a Network Based Media Processing (NBMP) working group for media cloud processing to conduct standardized research on a series of mechanisms for media cloud processing.
  • NBMP Network Based Media Processing
  • the research on immersive media processing has developed extremely rapidly, and the methods represented by deep learning have shown good performance. By training the learning network, the efficiency of media processing can be improved.
  • Nanjing University proposed to use a variational autoencoder (VAE) based on a three-dimensional convolutional neural network (3D CNN) to perform geometric compression of point clouds, achieving good compression performance.
  • VAE variational autoencoder
  • users or other third parties may form their own solutions for media processing, providing media processing methods or media processing functions, and these solutions can guide the server in the cloud system to perform media processing.
  • the current cloud system does not have a mechanism or interface that allows users to upload media processing methods or media processing functions. How to make the cloud system adopt solutions formed by users or third parties is a difficult problem in this field.
  • the present invention provides a media processing method, designing an interaction mechanism between a cloud system and a user end or a third party, so as to realize the collection and upload of the media processing solution provided by the user end or the third party by the cloud system.
  • a media processing method of the present invention is characterized in that it includes the following steps:
  • the media processing method further includes: using a function outside the system to update the function stored in the system.
  • the media processing method further includes: feeding back the judgment result to the sending source of the media, so that the sending source can modify the description information.
  • the descriptor of the function includes at least one of the following: framework descriptor, general descriptor, input descriptor, output descriptor, processing descriptor, requirement descriptor, configuration descriptor, client assistance descriptor, declaration descriptor, variable descriptor, event descriptor, and security descriptor.
  • the media processing method further includes the following steps:
  • a function that meets the requirements of the description information is further selected, and the function includes configuration parameters;
  • a media processing system implementing the present invention includes a function library, characterized in that the function library is used to perform the following operations:
  • the media processing system further includes: using a function outside the system to update the function stored in the system.
  • the function library feeds back the judgment result to the sending source of the media, so that the sending source can modify the description information.
  • the descriptor of the function includes at least one of the following: framework descriptor, general descriptor, input descriptor, output descriptor, processing descriptor, requirement descriptor, configuration descriptor, client assistance descriptor, declaration descriptor, variable descriptor, event descriptor, and security descriptor.
  • the media processing system further includes a manager and a processing entity; among them,
  • the manager is configured to: send to the function library, according to the received description information of the media processing, a request to find functions that meet the requirements of the description information; select, from the possible functions fed back by the function library, a function that meets the requirements of the description information, the function including configuration parameters; generate a workflow from the selected function, the workflow including the selected function; create, in combination with the workflow, a configuration for each media processing task and send the configuration to the processing entity; and, after the configurations of all tasks have been created successfully, notify the sending source of the media that media processing can start;
  • the function library is used to find possible functions that meet the requirements of the description information according to the request to find the functions that meet the requirements of the description information, and feed them back to the manager;
  • the processing entity is used for confirming that the configuration of the task is successfully created, and feeding back information to the manager.
  • the cloud system first confirms whether there is a function that meets the media processing requirements in the cloud system according to the media processing requirements sent by the media source.
  • the Hook API or function library upload application program interface API can upload media processing methods or media processing functions provided by the user or a third party to realize personalized media processing based on the user terminal or the third party.
  • Figure 1 is a diagram of the cloud system structure using the hook API solution of the application program interface
  • FIG. 2 is a diagram of the cloud system structure using the function library upload application program interface (API) solution.
  • API application program interface
  • the present invention designs an interaction mechanism between the cloud system and the user terminal, using a defined application program interface (API) to collect the user's personalized processing methods or parameters to guide media processing; the designed API is suitable not only for user terminals to upload and update functions but also for any third party to update the function library.
  • API Application Program Interface
  • FIG. 1 shows the structure of the cloud media processing system (cloud system) of the present invention.
  • the components of the cloud system mainly include a manager, a processing entity, and a function library.
  • outside the cloud system there are media sources (which can be NBMP sources) and user terminals.
  • taking NBMP media processing as an example, the media processing method of the cloud system is:
  • the NBMP media that needs media processing sent by the NBMP source contains description information corresponding to the media processing (ie, media processing description).
  • the NBMP source uses the function discovery API to check whether the current function library can provide the functions required by the media processing description.
  • the media processing description can be the type of media to be processed, the way processing functions are called, and so on, and includes information such as compression, upsampling, and video transcoding.
  • after the function library receives the media processing description, it determines whether a function that meets the requirements of the media processing description exists on the cloud system; if it exists, a function that meets the requirements of the description information is selected from the system; if it does not exist, a function is selected from the function library according to priority, or a function that meets the requirements of the description information is selected from outside the system. If necessary, the function library in the cloud system can be updated with functions or function libraries external to the cloud system. The function library feeds back the check result to the NBMP source.
  • the NBMP source can modify the media processing information based on the inspection results.
  • the NBMP source sends media processing information to the manager as part of the request.
  • the manager sends a request to find a function that meets the media processing description to the function library, and the manager sends one or a group of queries to the function library to find the function.
  • Function library for each query, find out possible functions that meet the requirements of media processing description, and use a short list of possible functions, descriptions of possible functions and their configuration information to feed back to the manager.
  • the function descriptor includes at least one of the following: framework descriptor, general descriptor, input descriptor, output descriptor, processing descriptor, requirement descriptor, configuration descriptor, client assistance descriptor, declaration descriptor, variable descriptor, event descriptor, and security descriptor.
  • the manager further selects a function that meets the description information requirement from the possible functions that meet the media processing description requirements fed back by the function library, and the input descriptor in the function contains configuration parameters.
  • the media processing function can be a function corresponding to the media processing mode of H.264 video encoding. Its main components include functional modules such as the access unit delimiter, supplemental enhancement information, primary picture coding, redundant picture coding, instantaneous decoding refresh, hypothetical reference decoder, and hypothetical stream scheduler.
  • the manager generates a workflow from the selected set of functions that satisfy the desired media processing, and the workflow contains the selected functions; in combination with the workflow, it creates a configuration for each media processing task and sends the configuration to the processing entity.
  • the processing entity includes an NBMP task module, and the NBMP task module confirms that the task configuration is successfully created, and feeds back information to the manager.
  • the manager after the configuration of all tasks is successfully created, notifies the sending source of the media, and the media processing can start.
  • the processing entity receives the media stream sent from the media source, performs media processing, and sends it to the user terminal or other receiving device.
  • the present invention designs an application program interface hook API (Hook API) for each media processing function.
  • Hook API application program interface hook API
  • the hook program of the program interface uploads the media processing function provided by the user or third party to the cloud system.
  • Hook API of an application program interface is a technology used to change the execution result of an application program interface (API).
  • API application programming interface
  • through the Hook API, the original function of a system's application program interface (API) can be changed.
  • the basic method is to "touch" the entry point of the application program interface (API) function that needs to be modified through a hook program, and change its address to point to a new custom media processing function.
  • processing entity API processing entity application program interface
  • hook API application program interface
  • the user or a third party can hook the processing entity API; hooking relies on the basic principle of the hook program (Hook): each hook program has an associated list of pointers, called the hook chain, which is maintained by the system.
  • hooking modifies the function entry address of the media processing function in the processing entity API, that is, it replaces the address of the media processing function originally selected through the function library with the address of the user's or third party's own media processing function; after the media processing function runs, it provides the specific value of each parameter, which realizes the replacement of the media processing parameters. The result is then transmitted to the processing entity through the processing entity API to perform the specific media processing tasks.
  • when the Hook API is used for video transcoding, the information included in the Hook API is as follows:
  • the parameters in Table 1 above may be transmitted using any transport protocol, including but not limited to MMT and DASH.
  • the transmitted information must include, but is not limited to, the following information:
  • input and output description parameters: used to describe the type of media processing guided by these parameters and the specific details of the media processing, for example, the transcoding parameters in transcoding, including resolution, frame rate, encoder, encoding method, media packaging format, and so on;
  • the difference between the second embodiment and the first embodiment lies in the manner in which the media processing function provided by the user or a third party is uploaded to the cloud system.
  • the manager judges whether there is a specific media processing function in the function library that can meet the requirements; if so, the manager selects the corresponding media processing function from the function library; if not, the manager can either select from the cloud system the media processing function with the highest priority related to the media processing description, or select from an external user or a third party the media processing function corresponding to the media processing description, which is uploaded to the cloud system through an application program interface (API).
  • the present invention adopts a scheme of defining function library upload application program interface (API).
  • the function library upload application program interface (API) can be used to receive specific media processing functions from the client or a third party to the function library.
  • the manager generates specific media processing tasks through these uploaded media processing functions, and the processing entity can use the received media processing tasks to perform media processing on the received media streams.
  • media processing based on the cloud system can also follow this workflow: before sending the media processing description to the manager, the media source can send a function query request to the function library. The existing cloud system provides a corresponding function query application program interface (function query API), and a keyword search returns all corresponding functions, for example, all functions that can perform video encoding, including H.264, H.265, and so on. If the media source finds that no function in the function library meets the requirements and the media source itself has a corresponding solution or function, it can upload its own function to the function library through the function library upload application program interface (API), and the function is then used in the subsequent generation of media processing tasks.
  • function query API function query application program interface
  • the application program interface (API) should include the identification information of the user function and the specific function description information. Specifically, it needs to include the name of the uploaded function, the function's input description information, its output description information, additional media processing information (such as encoder requirements), and so on.
  • the manager sends a function search request to the function library, and the function library receives media processing functions from the client or a third party through the function library upload application program interface (API).
  • the specific design of the library upload application program interface (API) is as follows:
  • the interactive information involved in the API upload of the function library should include but not limited to the following information:
  • function identification information: used by the function library to mark the function provided by the user, and called when the manager generates a processing task;
  • function location information: used to describe the storage location of the user-provided function;
  • function input and output information: describes the input parameters and output parameters required by the user-provided function.
  • the design of the hook program (Hook API) of the application program interface and the function library upload application program interface (API) of the present invention can be extended to any media processing type.
  • for the Hook API, parameters for different media processing types can be obtained by modifying the general information and the input and output information.
  • General:brand:function_name (general description: brand name: function name)
  • replacing function_name indicates a different media processing purpose, such as compression or upsampling.
  • input and output parameters can be set according to the specific parameters involved in the media processing; only the value bytes in Input:value need to be replaced, and the same applies to the rest.
  • for the function library upload application program interface (API), as with the Hook API, replacing the corresponding bytes enables functions from different users or third parties to be uploaded and used.
  • the cloud system first confirms, according to the media processing requirements sent by the media source, whether a function that meets those requirements exists in the cloud system; by adding a system-layer Hook API or a function library upload application program interface (API) to the existing cloud system, personalized media processing based on the user terminal or a third party can be realized, thereby improving the user's consumption experience and the completeness and robustness of the system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Transfer Between Computers (AREA)
  • Stored Programmes (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

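The present invention discloses a media processing method for uploading functions provided by a user or a third party and for obtaining media processing parameters provided by a user. The method includes the following steps: receiving description information of media processing and determining whether a function that satisfies the requirements of the description information exists in the system; if it exists, selecting from the system a function that satisfies the requirements of the description information; if it does not exist, selecting a function from the system according to priority, or selecting from outside the system a function that satisfies the requirements of the description information. With the technical solution of the present invention, a cloud system can perform personalized media processing based on a user or a third party, thereby improving the user's consumption experience and the completeness and robustness of the system.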

Description

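Media processing method
Technical field
The present invention belongs to the field of cloud processing systems, and in particular relates to a media processing method.
Background
With the rapid development of visual media, virtual reality (VR) and point clouds (PC) are being applied in a growing number of consumer scenarios, such as gaming, live sports broadcasting, and medical care. These scenarios usually require low response latency, which places higher demands on data processing and transmission delay. Taking panoramic video as an example, the server needs to collect multiple camera streams and then, through stitching and rendering, combine the multiple videos into spherical data for users to consume. The stitching process is relatively complex and, when the local processor has limited capability, often introduces a certain delay. In live broadcast scenarios, this delay affects the user experience. Point cloud data is generated by scanning or directly by computer. A simple point cloud object has tens of thousands or even more than a hundred thousand points, and compressing the geometry and attribute information of these points is very time-consuming, which makes real-time requirements hard to meet. The current solution to these problems is to move complex media processing to the cloud and use the strong processing capability of cloud servers to speed up data processing. The MPEG expert group has established the Network Based Media Processing (NBMP) working group for media cloud processing, to carry out standardization research on a series of mechanisms for media cloud processing. Meanwhile, research on immersive media processing is developing extremely rapidly, and methods represented by deep learning have shown good performance. Training learning networks can improve media processing efficiency. Taking point cloud compression as an example, Nanjing University has proposed using a variational autoencoder (VAE) based on a three-dimensional convolutional neural network (3D CNN) for geometry compression of point clouds, achieving good compression performance.
In practice, users or other third parties may form their own solutions for media processing and provide media processing methods or media processing functions, and these solutions can guide the servers in a cloud system in performing media processing. However, current cloud systems have no mechanism or interface that allows users to upload media processing methods or media processing functions. How to make a cloud system adopt solutions formed by users or third parties is a difficult problem in this field.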
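Summary of the invention
The present invention provides a media processing method that designs an interaction mechanism between a cloud system and a user terminal or third party, so that media processing solutions provided by the user terminal or third party can be collected by and uploaded to the cloud system.
In accordance with the above objective, a media processing method implementing the present invention is characterized in that it includes the following steps:
receiving description information of media processing, and determining whether a function that satisfies the requirements of the description information exists in the system;
if it exists, selecting from the system a function that satisfies the requirements of the description information;
if it does not exist, selecting a function from the system according to priority, or selecting from outside the system a function that satisfies the requirements of the description information.
Optionally, the media processing method further includes: updating the functions stored in the system with functions from outside the system.
Optionally, the media processing method further includes: feeding back the determination result to the sending source of the media, so that the sending source can modify the description information.
Optionally, the descriptor of the function includes at least one of the following: framework descriptor, general descriptor, input descriptor, output descriptor, processing descriptor, requirement descriptor, configuration descriptor, client assistance descriptor, declaration descriptor, variable descriptor, event descriptor, and security descriptor.
Optionally, the media processing method further includes the following steps:
finding, according to the description information of the media processing, possible functions that satisfy the requirements of the description information;
further selecting, from the possible functions that satisfy the requirements of the description information, a function that satisfies the requirements of the description information, the function including configuration parameters;
generating a workflow from the selected function that satisfies the requirements of the description information, the workflow including the selected function;
creating, in combination with the workflow, a configuration for each media processing task, and confirming that the configurations of all tasks have been created successfully;
notifying the sending source of the media that media processing can start.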
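In accordance with the above objective, a media processing system implementing the present invention includes a function library and is characterized in that the function library is used to perform the following operations:
receiving description information of media processing, and determining whether a function that satisfies the requirements of the description information exists in the system;
if it exists, selecting from the system a function that satisfies the requirements of the description information;
if it does not exist, selecting a function from the system according to priority, or selecting from outside the system a function that satisfies the requirements of the description information.
Optionally, the media processing system further includes: updating the functions stored in the system with functions from outside the system.
Optionally, the function library feeds back the determination result to the sending source of the media, so that the sending source can modify the description information.
Optionally, the descriptor of the function includes at least one of the following: framework descriptor, general descriptor, input descriptor, output descriptor, processing descriptor, requirement descriptor, configuration descriptor, client assistance descriptor, declaration descriptor, variable descriptor, event descriptor, and security descriptor.
Optionally, the media processing system further includes a manager and a processing entity, wherein:
the manager is used to send to the function library, according to the received description information of the media processing, a request to find functions that satisfy the requirements of the description information; then to further select, from the possible functions fed back by the function library, a function that satisfies the requirements of the description information, the function including configuration parameters; to generate a workflow from the selected function, the workflow including the selected function; to create, in combination with the workflow, a configuration for each media processing task and send the configuration to the processing entity; and, after the configurations of all tasks have been created successfully, to notify the sending source of the media that media processing can start;
the function library is used to find, according to the request to find functions that satisfy the requirements of the description information, possible functions that satisfy those requirements, and to feed them back to the manager;
the processing entity is used to confirm that the configuration of a task has been created successfully and to feed back information to the manager.
With the technical solution of the present invention, which addresses the shortcomings of the prior art, the cloud system first confirms, according to the media processing requirements sent by the media source, whether a function that satisfies those requirements exists in the cloud system. By adding a system-layer application program interface hook (Hook API) or a function library upload application program interface (API), the cloud system can receive uploads of media processing methods or media processing functions provided by a user or third party, realizing personalized media processing based on the user terminal or third party.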
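Brief description of the drawings
Figure 1 is a structural diagram of a cloud system using the application program interface hook (Hook API) solution;
Figure 2 is a structural diagram of a cloud system using the function library upload application program interface (API) solution.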
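Detailed description
The technical solution of the present invention is further described below with reference to the drawings and embodiments. The present invention designs an interaction mechanism between the cloud system and the user terminal, using a defined application program interface (API) to collect the user's personalized processing methods or parameters to guide media processing. The designed API is suitable not only for user terminals to upload and update functions but also for any third party to update the function library.
Embodiment 1
Figure 1 shows the structure of the cloud media processing system (cloud system) of the present invention. The cloud system mainly consists of a manager, a processing entity, and a function library; outside the cloud system there are media sources (which can be NBMP sources) and user terminals. Taking NBMP media processing as an example, the media processing method of the cloud system is as follows:
The NBMP media sent by the NBMP source for media processing contains description information corresponding to that media processing (the media processing description). The NBMP source uses the function discovery API to check whether the current function library can provide the functions required by the media processing description. The media processing description can be the type of media to be processed, the way processing functions are called, and so on, and includes information such as compression, upsampling, and video transcoding.
After the function library receives the media processing description, it determines whether a function that satisfies the requirements of the media processing description exists on the cloud system. If it exists, a function that satisfies the requirements of the description information is selected from the system. If it does not exist, a function is selected from the function library according to priority, or a function that satisfies the requirements of the description information is selected from outside the system. If necessary, the function library in the cloud system can be updated with functions or function libraries external to the cloud system. The function library feeds back the check result to the NBMP source.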
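The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `MediaFunction` record, the capability sets, and the numeric `priority` field are hypothetical, and the order in which the two fallback branches are tried (external match first, then highest priority) is an assumption, since the text leaves that order open.

```python
from dataclasses import dataclass


@dataclass
class MediaFunction:
    name: str           # hypothetical identifier
    capabilities: set   # hypothetical tags, e.g. {"encode", "h264"}
    priority: int = 0   # higher value is preferred when falling back


def select_function(required, library, external=()):
    """Return (function, origin) for a set of required capabilities."""
    # A function in the system satisfies the description: use it.
    matches = [f for f in library if required <= f.capabilities]
    if matches:
        return matches[0], "system"
    # Otherwise look outside the system for a function that satisfies it.
    for f in external:
        if required <= f.capabilities:
            return f, "external"
    # Otherwise fall back to the highest-priority function in the system.
    if library:
        return max(library, key=lambda f: f.priority), "priority"
    return None, "none"
```

The second element of the returned pair plays the role of the check result that the function library feeds back to the source.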
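Based on the check result, the NBMP source can modify the media processing information. The NBMP source sends the media processing information to the manager as part of a request.
According to the received media processing description, the manager sends to the function library a request to find functions that satisfy the media processing description; the manager sends one query or a group of queries to the function library to find functions.
For each query, the function library finds possible functions that satisfy the requirements of the media processing description and feeds back to the manager a short list of possible functions, their descriptions, and their configuration information. The descriptor of a function includes at least one of the following: framework descriptor, general descriptor, input descriptor, output descriptor, processing descriptor, requirement descriptor, configuration descriptor, client assistance descriptor, declaration descriptor, variable descriptor, event descriptor, and security descriptor.
From the possible functions fed back by the function library, the manager further selects a function that satisfies the requirements of the description information; the input descriptor of the function contains configuration parameters. For example, the media processing function can be a function corresponding to H.264 video encoding, whose main components include functional modules such as the access unit delimiter, supplemental enhancement information, primary picture coding, redundant picture coding, instantaneous decoding refresh, hypothetical reference decoder, and hypothetical stream scheduler.
The manager generates a workflow from the selected set of functions that satisfy the desired media processing, and the workflow contains the selected functions. In combination with the workflow, it creates a configuration for each media processing task and sends the configuration to the processing entity.
The processing entity contains an NBMP task module, which confirms that the task configuration has been created successfully and feeds back information to the manager.
After the configurations of all tasks have been created successfully, the manager notifies the sending source of the media that media processing can start.
The processing entity receives the media stream sent from the media source, performs the media processing, and sends the result to the user terminal or another receiving device.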
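The manager's sequence (query, select, build the workflow, configure each task, wait for confirmation, notify) can be sketched as below. All names are hypothetical: `query_library`, `choose`, and `confirm_task` stand in for the function library, the manager's selection rule, and the processing entity's confirmation, none of which the text specifies in code form.

```python
def run_manager(description, query_library, choose, confirm_task):
    """Walk through the manager steps with injected stand-in callables."""
    # The function library answers the query with a short list of candidates.
    candidates = query_library(description)
    # The manager narrows the candidates to those it considers suitable.
    selected = [f for f in candidates if choose(f, description)]
    if not selected:
        raise LookupError("no function satisfies the description")
    # The workflow contains the selected functions.
    workflow = list(selected)
    # One configuration is created per media processing task.
    configs = [
        {"task": i, "function": f, "params": description.get("params", {})}
        for i, f in enumerate(workflow)
    ]
    # The processing entity confirms each configuration; only when all are
    # confirmed would the media source be told that processing can start.
    status = "ready" if all(confirm_task(c) for c in configs) else "failed"
    return {"status": status, "workflow": workflow, "tasks": configs}
```

Passing the collaborators in as callables keeps the sketch independent of any particular transport between manager, function library, and processing entity.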
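The present invention designs an application program interface hook (Hook API) for each media processing function. When a solution formed by a user or other third party is to be used for media processing, the hook uploads the media processing function provided by the user or third party to the cloud system.
The Hook API is a technique for changing the execution result of an application program interface (API). When ready-made means such as controls cannot implement certain functionality, an API is needed as well. For example, if the behavior of certain API functions is unsatisfactory, those APIs can be modified so that they provide better service. Through the Hook API, the original function of a system's API can be changed. The basic method is to use a hook program (Hook) to reach the entry point of the API function that needs to be modified and change its address so that it points to a new, custom media processing function.
Between the manager and the processing entity there is an application program interface (API), called the processing entity API. Using the Hook API technique, a user or third party can hook the processing entity API. Hooking relies on the basic principle of the hook program: each hook program has an associated list of pointers, called the hook chain, which is maintained by the system. Hooking modifies the function entry address of the media processing function in the processing entity API, that is, it replaces the address of the media processing function originally selected through the function library with the address of the user's or third party's own media processing function. After the media processing function runs, it provides the specific value of each parameter, which realizes the replacement of the media processing parameters. The result is then transmitted to the processing entity through the processing entity API to perform the specific media processing tasks.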
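As a rough illustration of replacing a function entry, the sketch below models the processing entity API as a table from function names to callables. This is an analogy in Python rather than address patching: the `ProcessingEntityAPI` class, its method names, and the list used to record replaced entries are all hypothetical.

```python
class ProcessingEntityAPI:
    """Hypothetical stand-in for the processing entity API."""

    def __init__(self):
        self._entries = {}      # function name -> callable (the "entry")
        self._hook_chain = []   # replaced entries, kept so they can be restored

    def register(self, name, func):
        self._entries[name] = func

    def hook(self, name, user_func):
        # Remember the original entry, then point the name at the user function.
        self._hook_chain.append((name, self._entries.get(name)))
        self._entries[name] = user_func

    def unhook(self):
        # Restore the most recently replaced entry.
        name, original = self._hook_chain.pop()
        if original is None:
            self._entries.pop(name, None)
        else:
            self._entries[name] = original

    def call(self, name, *args, **kwargs):
        return self._entries[name](*args, **kwargs)
```

After `hook("transcode", user_func)`, a call through the table runs the user's function and returns the user's parameter values, which mirrors the parameter replacement described above.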
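Taking a transcoding-based Hook API as an example, video transcoding requires specifying the category of media processing (here, video transcoding) and the video's frame rate, resolution, quantization step size, encoder, bit rate, sampling rate, and so on. These parameters were originally generated by a specific media processing function provided by the function library and used to guide task generation. They are defined in the processing entity API; that is, the manager transmits the information defined in the processing entity API to the processing entity, and the processing entity performs the media processing according to this information. Once the entry address of the function library's function in the processing entity API has been replaced with the address of the user's or third party's function, and the manager has determined the function provided by the user terminal or third party, the manager uses the address of that media processing function to create the specific media processing tasks and generate the workflow. The workflow first runs the corresponding media processing function to obtain the specific media processing parameter information, which then guides task generation.
When video transcoding uses the Hook API, the information included in the Hook API is shown in Table 1:
Table 1 Hook API_Transcode
[Table 1 is provided as images in the original publication: Figure PCTCN2020100297-appb-000001, Figure PCTCN2020100297-appb-000002]
The parameters in Table 1 above may be transmitted using any transport protocol, including but not limited to MMT and DASH, and the transmitted information must include, but is not limited to, the following:
(1) General parameters: used for identifying the transmission and configuring the transmission port;
(2) Input and output description parameters: used to describe the type of media processing guided by these parameters and the specific details of the media processing, for example, the transcoding parameters in transcoding, including resolution, frame rate, encoder, encoding method, media packaging format, and so on;
(3) Other parameters: used for optional requirement information, monitoring information, and so on.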
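Embodiment 2
Embodiment 2 differs from Embodiment 1 in the way the media processing function provided by the user or third party is uploaded to the cloud system.
According to the media processing description, the manager determines whether a specific media processing function that can meet the requirements exists in the function library. If so, the manager selects the corresponding media processing function from the function library. If not, the manager can either select from the cloud system the media processing function with the highest priority related to the media processing description, or select from an external user or third party the media processing function corresponding to the media processing description, which is uploaded to the cloud system through an application program interface (API).
As shown in Figure 2, the present invention adopts a solution that defines a function library upload application program interface (API). This function library upload API can be used to receive specific media processing functions transmitted to the function library by a user terminal or third party. The manager generates specific media processing tasks from these uploaded media processing functions, and the processing entity can use the received media processing tasks to perform media processing on the received media streams.
Media processing based on the cloud system can also follow this workflow: before sending the media processing description to the manager, the media source can send a function query request to the function library. The existing cloud system provides a corresponding function query application program interface (function query API), and a keyword search returns all corresponding functions, for example, all functions that can perform video encoding, including H.264, H.265, and so on. If the media source finds that no function in the function library meets the requirements and the media source itself has a corresponding solution or function, it can upload its own function to the function library through the function library upload API, and the function is then used in the subsequent generation of media processing tasks.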
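The query-then-upload workflow can be illustrated with a small in-memory model. The `FunctionLibrary` class, its `query` and `upload` methods, and the keyword sets are hypothetical stand-ins for the function query API and the function library upload API; the text does not define their signatures.

```python
class FunctionLibrary:
    """In-memory stand-in for the function library (hypothetical)."""

    def __init__(self):
        self._functions = {}   # function name -> set of keywords

    def query(self, keyword):
        # Keyword search: return every function tagged with the keyword.
        return sorted(n for n, kws in self._functions.items() if keyword in kws)

    def upload(self, name, keywords):
        self._functions[name] = set(keywords)


def ensure_function(library, keyword, own_function=None):
    """Query first; upload the source's own function only if nothing matches."""
    found = library.query(keyword)
    if not found and own_function is not None:
        name, keywords = own_function
        library.upload(name, keywords)
        found = library.query(keyword)
    return found
```

With two encoders tagged "video encoding" already uploaded, a query for that keyword lists both, while a query for a missing capability returns an empty list until the media source uploads its own function.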
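This API should include the identification information of the user function and the specific function description information. Specifically, it needs to include the name of the uploaded function, input description information, output description information, additional media processing information (for example, encoder requirements), and so on.
If the function library upload API is used, the manager sends a function lookup request to the function library, and the function library receives media processing functions from the client or third party through the function library upload API. The specific design of the function library upload API is shown in Table 2:
Table 2 Function Repository API (function library upload API)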
Figure PCTCN2020100297-appb-000003
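The information exchanged through the function library upload API should include, but is not limited to, the following:
(1) Function identification information: used by the function library to label the function provided by the user, and to invoke it when the manager generates processing tasks;
(2) Function location information: describes where the user-provided function is stored;
(3) Function input and output information: describes the input parameters required by the user-provided function and its output parameters.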
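The three kinds of information listed above could be carried in a record such as the one sketched below. Table 2 is only available as an image here, so the field names, the example URL, and the completeness check are assumptions for illustration and not the actual API layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FunctionUpload:
    """Hypothetical upload record: identification, location, inputs, outputs."""
    function_id: str                  # (1) identification used by the library
    location: str                     # (2) where the function is stored
    inputs: Dict[str, str] = field(default_factory=dict)   # (3) input parameters
    outputs: Dict[str, str] = field(default_factory=dict)  # (3) output parameters


def missing_fields(upload: FunctionUpload) -> List[str]:
    """List the fields that are empty and therefore still need to be supplied."""
    checks = {
        "function_id": upload.function_id,
        "location": upload.location,
        "inputs": upload.inputs,
        "outputs": upload.outputs,
    }
    return [name for name, value in checks.items() if not value]


complete = FunctionUpload(
    "user_transcode_v1",
    "https://example.com/functions/user_transcode_v1",  # placeholder location
    {"stream": "video/H.264"},
    {"stream": "video/H.265"},
)
incomplete = FunctionUpload("user_filter", "")
```

A function library could run a check like `missing_fields` before accepting an upload, so that the manager never receives a function it cannot locate or wire into a workflow.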
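The designs of the Hook API and the function library upload API of the present invention can both be extended to any media processing type. For the Hook API, parameters for different media processing types can be obtained by modifying the general information and the input/output information. For example, in General:brand:function_name, replacing function_name indicates a different media processing purpose, such as compression or upsampling. The input and output parameters can be set according to the specific parameters involved in the media processing; only the value field in Input:value needs to be replaced, and the rest follows in the same way. The function library upload API works like the Hook API: replacing the corresponding fields allows functions from different users or third parties to be uploaded and used.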
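In the media processing solution provided by the present invention, the cloud system first determines, from the media processing requirements sent by the media source, whether the cloud system contains a function that meets them. By adding a system-level Hook API or a function library upload API to an existing cloud system, personalized media processing based on the client or a third party can be achieved, improving the user's consumption experience and the completeness and robustness of the system.
Those skilled in the art should recognize that the above description covers only two or a few of the many embodiments of the present invention and is not intended to limit it. Any equivalent changes, modifications, or equivalent substitutions of the embodiments described above fall within the scope protected by the claims of the present invention, provided they are consistent with its spirit and scope.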

Claims (10)

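  1. A media processing method, characterized by comprising the following steps:
    receiving description information for media processing, and determining whether a function that meets the requirements of the description information exists in the system;
    if so, selecting from the system a function that meets the requirements of the description information;
    if not, selecting a function from the system according to priority, or selecting from outside the system a function that meets the requirements of the description information.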
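  2. The media processing method according to claim 1, characterized by further comprising:
    updating the functions stored in the system with a function from outside the system.
  3. The media processing method according to claim 1, characterized by further comprising:
    feeding the determination result back to the media source so that the source can modify the description information.
  4. The media processing method according to claim 1, characterized in that
    the descriptors of the function include at least one of the following: a framework descriptor, a general descriptor, an input descriptor, an output descriptor, a processing descriptor, a requirements descriptor, a configuration descriptor, a client assistance descriptor, an assertion descriptor, a variable descriptor, an event descriptor, and a security descriptor.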
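  5. The media processing method according to claim 1, characterized by further comprising the following steps:
    finding, according to the description information for media processing, candidate functions that may meet the requirements of the description information;
    further selecting, from the candidate functions, a function that meets the requirements of the description information, the function containing configuration parameters;
    generating a workflow from the selected function that meets the requirements of the description information, the workflow containing the selected function;
    creating, in combination with the workflow, a configuration for each media processing task, and confirming that the configurations of all tasks have been created successfully;
    notifying the media source that media processing can begin.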
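  6. A media processing system comprising a function library, characterized in that the function library is configured to perform the following operations:
    receiving description information for media processing, and determining whether a function that meets the requirements of the description information exists in the system;
    if so, selecting from the system a function that meets the requirements of the description information;
    if not, selecting a function from the system according to priority, or selecting from outside the system a function that meets the requirements of the description information.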
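  7. The media processing system according to claim 6, characterized by further comprising:
    updating the functions stored in the system with a function from outside the system.
  8. The media processing system according to claim 6, characterized in that
    the function library feeds the determination result back to the media source so that the source can modify the description information.
  9. The media processing system according to claim 6, characterized in that
    the descriptors of the function include at least one of the following: a framework descriptor, a general descriptor, an input descriptor, an output descriptor, a processing descriptor, a requirements descriptor, a configuration descriptor, a client assistance descriptor, an assertion descriptor, a variable descriptor, an event descriptor, and a security descriptor.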
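  10. The media processing system according to claim 6, characterized by further comprising a manager and a processing entity, wherein
    the manager is configured to send to the function library, according to the received description information for media processing, a request to find functions that meet the requirements of the description information; then to further select, from the candidate functions returned by the function library, a function that meets the requirements of the description information, the function containing configuration parameters; to generate a workflow from the selected function, the workflow containing the selected function; to create, in combination with the workflow, a configuration for each media processing task and send the configuration to the processing entity; and, after the configurations of all tasks have been created successfully, to notify the media source that media processing can begin;
    the function library is configured to find, in response to the request, candidate functions that may meet the requirements of the description information, and to return them to the manager;
    the processing entity is configured to confirm that the task configuration has been created successfully and to report this back to the manager.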
PCT/CN2020/100297 2019-07-05 2020-07-04 Media processing method WO2021004411A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP20837305.0A EP3996374A4 (en) 2019-07-05 2020-07-04 MULTIMEDIA PROCESSING PROCESS
US17/597,427 US11973994B2 (en) 2019-07-05 2020-07-04 Media processing method
JP2022500152A JP7336161B2 (ja) 2019-07-05 2020-07-04 Media processing method
KR1020217042745A KR20220012941A (ko) 2019-07-05 2020-07-04 Media processing method

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201910604344 2019-07-05
CN201910604344.6 2019-07-05
CN201910862257.0A CN112188235B (zh) 2019-09-12 2019-09-12 Method for selecting a media processing mode and media processing method
CN201910862257.0 2019-09-12

Publications (1)

Publication Number Publication Date
WO2021004411A1 WO2021004411A1 (zh) 2021-01-14

Family

ID=73919902

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/100297 WO2021004411A1 (zh) 2019-07-05 2020-07-04 Media processing method

Country Status (6)

Country Link
US (1) US11973994B2 (zh)
EP (1) EP3996374A4 (zh)
JP (1) JP7336161B2 (zh)
KR (1) KR20220012941A (zh)
CN (1) CN112188235B (zh)
WO (1) WO2021004411A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230015697A1 (en) * 2021-07-13 2023-01-19 Citrix Systems, Inc. Application programming interface (api) authorization

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
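CN102123269A (zh) * 2010-12-16 2011-07-13 成都市华为赛门铁克科技有限公司 Method and device for acquiring video surveillance data, and video surveillance system
CN103731672A (zh) * 2013-12-16 2014-04-16 乐视致新电子科技(天津)有限公司 Audio and video decoding method and smart television
CN106169065A (zh) * 2016-06-30 2016-11-30 联想(北京)有限公司 Information processing method and electronic device
CN106295489A (zh) * 2015-06-29 2017-01-04 株式会社日立制作所 Information processing method, information processing device, and video surveillance system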
US20180130182A1 (en) * 2011-07-12 2018-05-10 Apple Inc. Multifunctional environment for image cropping

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6216152B1 (en) * 1997-10-27 2001-04-10 Sun Microsystems, Inc. Method and apparatus for providing plug in media decoders
JP2004348437A (ja) 2003-05-22 2004-12-09 Matsushita Electric Ind Co Ltd リソース管理装置、リソース管理方法及び記録媒体
US7957413B2 (en) 2005-04-07 2011-06-07 International Business Machines Corporation Method, system and program product for outsourcing resources in a grid computing environment
JP2006295586A (ja) 2005-04-12 2006-10-26 Hitachi Ltd コンテンツ変換装置及びトランスコードシステム
US9076311B2 (en) * 2005-09-07 2015-07-07 Verizon Patent And Licensing Inc. Method and apparatus for providing remote workflow management
JP2007221401A (ja) 2006-02-16 2007-08-30 Kenwood Corp 再生装置、プログラム、及びネットワーク型コンテンツ再生方法
EP2088780B1 (en) 2006-11-07 2015-11-04 Sony Corporation Electronic device, content reproducing method, and content decoding method
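JP4685040B2 (ja) * 2007-01-24 2011-05-18 パナソニック株式会社 Semiconductor integrated circuit and power supply control method therefor
CN101067924B (zh) * 2007-05-28 2010-05-19 广东威创视讯科技股份有限公司 Video acceleration method based on third-party playback software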
EP2201708A2 (en) * 2007-08-07 2010-06-30 Thomson Licensing Broadcast clip scheduler
CN101339789B (zh) * 2008-08-13 2010-08-18 中兴通讯股份有限公司 Method for implementing a multimedia engine
JP2010166339A 2009-01-15 2010-07-29 Nippon Hoso Kyokai <Nhk> Program file acquisition device
US9009294B2 (en) * 2009-12-11 2015-04-14 International Business Machines Corporation Dynamic provisioning of resources within a cloud computing environment
JP2011166748A (ja) * 2010-01-14 2011-08-25 Canon Inc Image processing apparatus, control method therefor, and program
US8745122B2 (en) * 2011-06-14 2014-06-03 At&T Intellectual Property I, L.P. System and method for providing an adjunct device in a content delivery network
WO2014047867A1 (en) * 2012-09-28 2014-04-03 Intel Corporation Processing video data in a cloud
US9407944B1 (en) * 2015-05-08 2016-08-02 Istreamplanet Co. Resource allocation optimization for cloud-based video processing
CN106358042B (zh) * 2015-07-17 2020-10-09 恩智浦美国有限公司 Parallel decoder using inter-frame prediction of video images
US10033816B2 (en) * 2015-09-30 2018-07-24 Amazon Technologies, Inc. Workflow service using state transfer
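CN109213991A (zh) * 2017-07-05 2019-01-15 中兴通讯股份有限公司 Message processing method, system, cloud platform, and storage medium
CN109685015B (zh) * 2018-12-25 2021-01-08 北京旷视科技有限公司 Image processing method and apparatus, electronic device, and computer storage medium
CN109831636B (zh) * 2019-01-28 2021-03-16 努比亚技术有限公司 Interactive video control method, terminal, and computer-readable storage medium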

Also Published As

Publication number Publication date
JP7336161B2 (ja) 2023-08-31
EP3996374A4 (en) 2023-07-26
JP2022539798A (ja) 2022-09-13
CN112188235A (zh) 2021-01-05
EP3996374A1 (en) 2022-05-11
KR20220012941A (ko) 2022-02-04
US20220272391A1 (en) 2022-08-25
CN112188235B (zh) 2023-03-24
US11973994B2 (en) 2024-04-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20837305

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 20217042745

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2022500152

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2020837305

Country of ref document: EP

Effective date: 20220207