WO2023078204A1 - Data processing method, apparatus, device, readable storage medium and program product - Google Patents

Data processing method, apparatus, device, readable storage medium and program product

Info

Publication number: WO2023078204A1
Authority: WIPO (PCT)
Prior art keywords: frame, encoding, data, encoded, coding
Application number: PCT/CN2022/128561
Other languages: English (en), French (fr)
Inventor: 李志成
Original Assignee: 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2023078204A1
Priority to US18/450,627 (published as US20230396783A1)

Classifications

    • H04N19/30 Coding/decoding of digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/154 Adaptive coding characterised by measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H04N19/172 Adaptive coding where the coding unit is an image region, the region being a picture, frame or field
    • G06F18/22 Pattern recognition; matching criteria, e.g. proximity measures
    • G06V10/761 Image or video pattern matching; proximity, similarity or dissimilarity measures in feature spaces
    • G06V20/30 Scenes; scene-specific elements in albums, collections or shared content, e.g. social network photos or video
    • G06V20/40 Scenes; scene-specific elements in video content
    • H04N21/232 Content retrieval operation locally within server, e.g. reading video streams from disk arrays
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/2343 Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 Reformatting operations by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H04N21/432 Content retrieval operation from a local storage medium, e.g. hard-disk
    • H04N21/44 Processing of video elementary streams at the client, e.g. splicing a video clip retrieved from local storage with an incoming video stream
    • H04N21/44008 Processing of video elementary streams at the client involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/4402 Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 Reformatting operations at the client by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4

Definitions

  • The present application relates to the field of computer technology, and in particular to a data processing method, apparatus, device, readable storage medium, and program product.
  • For multimedia data such as video, music, and text, corresponding encoding may be performed according to different encoding processes, so as to obtain media data with different media qualities, for example media data with different code rates or different resolutions.
  • Embodiments of the present application provide a data processing method, apparatus, device, readable storage medium, and program product, which help to improve encoding efficiency and encoding compression performance.
  • On one aspect, an embodiment of the present application provides a data processing method, which is executed by a computer device, and the method includes:
  • An embodiment of the present application provides a data processing device on the one hand, including:
  • a data acquisition module configured to acquire media data to be encoded, and the first business scenario type to which the media data to be encoded belongs;
  • a template acquiring module configured to acquire a first coding configuration template corresponding to the first business scenario type according to the mapping relationship between at least two coding configuration templates in the configuration template set and at least two business scenario types;
  • a parameter determination module configured to determine frame encoding parameters of the media data to be encoded according to the first encoding configuration template;
  • a data encoding module configured to encode the media data to be encoded according to the determined frame encoding parameters to obtain first media data.
  • An embodiment of the present application provides a computer device, including: a processor and a memory;
  • the memory stores a computer program, and when the computer program is executed by the processor, the processor executes the method in the embodiment of the present application.
  • Embodiments of the present application provide, on the one hand, a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program, and the computer program includes program instructions.
  • when the program instructions are executed by a processor, the method in the embodiments of the present application is executed.
  • One aspect of the present application provides a computer program product, the computer program product includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the computer device executes the method provided in one aspect of the embodiments of the present application.
  • FIG. 1 is a network architecture diagram provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of a data encoding scenario provided by an embodiment of the present application.
  • Fig. 3 is a schematic flow chart of a data processing method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a scenario for configuring a coding configuration template for a business scenario type provided by an embodiment of the present application
  • FIG. 5 is a schematic diagram of an encoding process provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of determining frame encoding parameters according to a first encoding configuration template provided by an embodiment of the present application
  • Fig. 7 is a system flow chart provided by the embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a data processing device provided in an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • FIG. 1 is a schematic structural diagram of a network architecture provided by an embodiment of the present application.
  • the network architecture may include a service server 1000 and a terminal device cluster (that is, a user terminal cluster).
  • the terminal device cluster may include one or more terminal devices, and the number of terminal devices is not limited here.
  • the multiple terminal devices may specifically include a terminal device 100a, a terminal device 100b, a terminal device 100c, ..., a terminal device 100n.
  • The terminal device 100a, terminal device 100b, terminal device 100c, ..., and terminal device 100n can each establish a network connection with the above-mentioned service server 1000, so that each terminal device can perform data interaction with the service server 1000 through that network connection.
  • The network connection here is not limited to a particular connection method; it may be a direct or indirect connection through wired communication, wireless communication, or other methods, which is not limited in this application.
  • A target application can be installed and integrated in each terminal device, and when the target application runs in a terminal device, it can perform data interaction with the service server 1000 shown in FIG. 1 above.
  • the target application may include an application having a function of displaying text, image, audio, video and other data information.
  • The application may include social applications, multimedia applications (for example, video applications), entertainment applications (for example, game applications), education applications, live broadcast applications, and other applications with a media data encoding function (for example, a video encoding function); it may also be another application having functions for displaying data information and encoding video, which will not be enumerated one by one here.
  • the application may be an independent application, or an embedded sub-application integrated in an application (for example, a social application, an educational application, a multimedia application, etc.), which is not limited here.
  • one user terminal may be selected from the multiple user terminals shown in FIG. 1 as the target user terminal.
  • the user terminal 100a shown in FIG. 1 may be used as a target user terminal, and a target application having a video encoding function may be integrated in the target user terminal.
  • the target user terminal can realize data interaction with the service server 1000 through the service data platform corresponding to the application client.
  • The computer device (for example, the user terminal 100a or the service server 1000) may have a media data encoding function (for example, a video encoding function).
  • cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software, and network in a wide area network or a local area network to realize data calculation, storage, processing, and sharing.
  • Cloud technology can be a general term for network technology, information technology, integration technology, management platform technology, application technology, and so on; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support: the background services of technical network systems require a large amount of computing and storage resources, for example video websites, picture websites, and other portal websites. In the future, each item may have its own identification mark, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require strong system support, which can only be realized through cloud computing.
  • the data processing method provided in the embodiment of the present application can be applied to high-resolution, high-frame-rate scenarios such as video viewing scenarios, video call scenarios, video transmission scenarios, cloud conference scenarios, and live broadcast scenarios.
  • cloud conference is an efficient, convenient and low-cost conference form based on cloud computing technology.
  • domestic cloud conferences mainly focus on the service content based on the SaaS (Software as a Service) model, including telephone, network, video and other service forms.
  • Cloud computing-based video conferences are called cloud conferences.
  • the cloud conference system supports multi-server dynamic cluster deployment and provides multiple high-performance servers, which greatly improves the stability, security and availability of the conference.
  • Video conferencing has been widely used in transportation, finance, operators, education, enterprises, and other fields because it can greatly improve communication efficiency, continuously reduce communication costs, and bring about an upgrade of internal management.
  • Cloud computing-based video conferencing will be more attractive in terms of convenience, speed, and ease of use, and will surely stimulate a new wave of video conferencing applications.
  • A computer device with a media data encoding function can encode media data through a media data encoder (for example, a video encoder) to obtain the corresponding data code stream (for example, the video code stream corresponding to video data), thereby improving the transmission efficiency of the media data.
  • the video encoder may be an AV1 video encoder, an H.266 video encoder, an AVS3 video encoder, etc., and no further examples are given here.
  • the video compression standard of the AV1 video encoder is the first-generation video coding standard developed by the Alliance for Open Media (AOM).
  • The media data that needs to be encoded may be referred to as the media data to be encoded, and the service scenario type to which the media data to be encoded belongs may be referred to as the first service scenario type.
  • This application can configure different encoding configuration templates for different business scenario types; if the first business scenario type is different, the first encoding configuration template is also different, and the frame encoding parameters used when encoding the media data to be encoded will differ accordingly.
  • When the service server 1000 acquires the media data to be encoded, it can encode the media data to be encoded (for example, through a media data encoder) to obtain the first media data.
  • Specifically, the service server 1000 may determine the corresponding first encoding configuration template according to the first business scenario type to which the media data to be encoded belongs, then determine the corresponding frame encoding parameters according to the first encoding configuration template, and then encode the media data to be encoded based on the frame encoding parameters, thereby obtaining first media data with a first media quality (the first media quality is adapted to the first service scenario type).
  • the specific implementation manner of determining the first coding configuration template according to the first business scenario type and determining the frame coding parameters of the media data to be coded according to the first coding configuration template can refer to the subsequent description in the embodiment corresponding to FIG. 3 .
  • the method provided in the embodiment of the present application can be executed by a computer device, and the computer device includes but is not limited to a terminal device or a service server.
  • The business server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
  • The above-mentioned computer device may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting multiple nodes through network communication. A peer-to-peer (P2P, Peer To Peer) network can be formed between the nodes, and the P2P protocol is an application-layer protocol that runs on top of the Transmission Control Protocol (TCP).
  • any form of computer equipment such as business servers, terminal equipment and other electronic equipment, can become a node in the blockchain system by joining the peer-to-peer network.
  • Blockchain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms; data is organized and encrypted into a ledger so that it cannot be tampered with or forged, while the data can still be verified, stored, and updated.
  • When the computer device is a blockchain node, the data in this application (such as the media data to be encoded, the first media data after encoding processing, and the frame encoding parameters) has authenticity and security, so that the results obtained after performing relevant data processing based on these data are more reliable.
  • FIG. 2 is a schematic diagram of a data encoding scenario provided by an embodiment of the present application.
  • the terminal device 2a may be a sending terminal for sending video data (for example, the video data 1 shown in FIG. 2 ), and the user corresponding to the terminal device 2a may be user a.
  • the terminal device 2b may be a receiving terminal for receiving video data (eg, video data 1 shown in FIG. 2 ), and the user corresponding to the terminal device 2b may be user b.
  • the service server 200 shown in FIG. 2 may be a server having a network connection relationship with the terminal device 2a, and the service server 200 may be the service server 1000 shown in FIG. 1 above.
  • The terminal device 2a can obtain the video data 1 associated with user a collected by an image collector (for example, a camera); the video data associated with user a needs to be obtained under the authorization of user a. Further, the terminal device 2a may encode the video data 1 through a video encoder (for example, an AV1 video encoder), so as to generate the video code stream 1 associated with the video data 1. At this point, the terminal device 2a may send the video code stream 1 to the service server 200.
  • When the service server 200 receives the video code stream 1, it can decode the video code stream 1 to obtain video data in a pixel image format (also called the YUV format; such data can be called YUV video data, decoded video data, or video data to be encoded), and then the service server 200 may perform encoding processing on the video data to be encoded (for example, through a video encoder).
  • The process for the service server 200 to encode the video data to be encoded can be as follows: the service server 200 can first obtain the business scene type to which the video data to be encoded belongs (which can be called the first business scene type), that is, acquire the first business scene type to which the video data 1 belongs. When the terminal device 2a transmits the video code stream 1 to the service server 200, it can also transmit the first business scene type to which the video data 1 belongs, so that the service server 200 can quickly and accurately acquire the first business scene type of the video data 1.
  • Then, the service server 200 may acquire a configuration template set, where the configuration template set may include mapping relationships between at least two business scenario types and at least two coding configuration templates (one business scenario type may be associated with one coding configuration template), and the service server 200 can obtain the first coding configuration template corresponding to the first business scenario type according to these mapping relationships in the configuration template set.
  • For example, if the configuration template set includes the mapping relationship between business scenario type 1 and coding configuration template 1, the mapping relationship between business scenario type 2 and coding configuration template 2, and the mapping relationship between business scenario type 3 and coding configuration template 3, and the first business scenario type to which the video data 1 belongs is business scenario type 2, then the service server 200 can use coding configuration template 2, which has a mapping relationship with business scenario type 2, as the first coding configuration template.
  • Further, the service server 200 can determine the frame encoding parameters of the video data to be encoded according to the first encoding configuration template, and the service server 200 can then encode the video data to be encoded according to the frame encoding parameters, thus obtaining first video data having a first media quality.
  • the service server 200 can send the first video data to the terminal device 2b waiting to play the video, and the terminal device 2b can decode and output the first video data, and the user b can view the video data 1 through the terminal device 2b.
  • The encoding process of the video data to be encoded can actually be understood as encoding the video frames of the video data to be encoded (which can be called video frames to be encoded), and the above-mentioned frame encoding parameters can include a frame encoding structure and frame encoding quality parameters (such as code rate, resolution, quantization coefficient, etc.); the present application can perform encoding processing on each video frame to be encoded according to the frame encoding structure and frame encoding quality parameters.
  • During encoding, the video frame that needs to be encoded (which may be referred to as the first video frame) can be obtained from the video data to be encoded, and then the unit to be encoded, i.e. the coding unit (Coding Unit, CU for short), can be obtained from the first video frame.
  • Then, the service server 200 may perform prediction processing on the unit to be encoded based on the encoding strategy of the video encoder to obtain the optimal prediction mode corresponding to the unit to be encoded, and may then perform encoding processing on the unit to be encoded based on the optimal prediction mode and the above frame encoding parameters, so as to obtain a compressed code stream corresponding to the unit to be encoded.
  • When the service server 200 completes the encoding processing of each unit to be encoded in the first video frame, the compressed code stream corresponding to each unit to be encoded can be obtained, and these compressed code streams can then be encapsulated into the video code stream associated with the video data to be encoded, which can be referred to as the first video data shown in FIG. 2.
  • The encoding strategy here may include intra-frame prediction modes and inter-frame prediction modes. It can be understood that, in the process of determining the optimal prediction mode, the service server 200 can select the prediction mode corresponding to the optimal rate-distortion cost among the intra-frame prediction modes and use it as the optimal intra-frame prediction mode. Similarly, the service server 200 may select the prediction mode corresponding to the optimal rate-distortion cost from the inter-frame prediction modes and use it as the optimal inter-frame prediction mode. Further, the service server 200 may perform mode selection processing based on the rate-distortion costs of the optimal intra-frame prediction mode and the optimal inter-frame prediction mode to obtain the optimal prediction mode.
  • the service server 200 can select the prediction mode with the smallest rate-distortion cost from the optimal intra-frame prediction mode and the optimal inter-frame prediction mode, and then can use the prediction mode with the smallest rate-distortion cost as the optimal prediction mode.
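  • As an illustration only (the application does not specify a cost formula), the following minimal Python sketch shows mode selection by smallest rate-distortion cost, assuming the classic cost J = D + lambda * R; the mode objects and lambda value are hypothetical.

```python
# Minimal sketch of rate-distortion mode selection, assuming the classic
# cost J = D + lambda * R; mode names, costs and lambda are illustrative.
from dataclasses import dataclass

@dataclass
class PredictionMode:
    name: str          # hypothetical label, e.g. "intra_dc" or "inter_merge"
    distortion: float  # D: reconstruction error under this mode
    bits: float        # R: bits needed to signal this mode and its residual

def rd_cost(mode: PredictionMode, lam: float) -> float:
    return mode.distortion + lam * mode.bits

def select_optimal_mode(intra_modes, inter_modes, lam=10.0):
    # Pick the best intra mode and the best inter mode separately, then
    # choose whichever of the two has the smaller rate-distortion cost.
    best_intra = min(intra_modes, key=lambda m: rd_cost(m, lam))
    best_inter = min(inter_modes, key=lambda m: rd_cost(m, lam))
    return min((best_intra, best_inter), key=lambda m: rd_cost(m, lam))
```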
  • this application can pre-configure a coding configuration template for different business scenario types according to different business scenario requirements, and the coding configuration template can include media data (such as video) under different business scenario types. data) frame coding structure and frame coding quality parameters, thus, when a certain media data to be coded is obtained, the amount of computation in the coding process can be reduced, and the first business scenario to which the media data to be coded belongs can be directly obtained
  • the first encoding configuration template corresponding to the type, so that the frame encoding structure and frame encoding quality parameters corresponding to the to-be-encoded media data can be quickly determined, and then the to-be-encoded media can be encoded according to the frame encoding structure and frame encoding quality parameters
  • the data is encoded, which can greatly improve the encoding efficiency.
  • In addition, the present application can also adaptively adjust the first encoding configuration template according to the network status (or decoding capability) of the decoding end (such as the terminal device 2b). For example, after the first encoding configuration template corresponding to the media data to be encoded is acquired, and the frame encoding quality parameters in the first encoding configuration template include a code rate, if the network status of the terminal device 2b is found to be poor at this time, the code rate can be adaptively reduced, so that the decoding end can still receive video data quickly even when the network status is poor.
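  • A minimal sketch of this adaptive adjustment, assuming a hypothetical network-quality score reported by the decoding end; the threshold and scaling below are illustrative assumptions, not values from the application.

```python
# Sketch: adaptively lower the template's code rate when the decoding end
# reports poor network status. Threshold and scaling are assumptions.
def adapt_code_rate(template_rate_kbps: int, network_quality: float) -> int:
    # network_quality is a hypothetical score in [0.0, 1.0]; lower means
    # a worse network at the decoding end (e.g. terminal device 2b).
    if network_quality < 0.5:
        # Poor network: scale the code rate down so the decoding end can
        # still receive video data quickly.
        return max(1, int(template_rate_kbps * network_quality))
    return template_rate_kbps  # good network: keep the template code rate
```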
  • FIG. 3 is a schematic flowchart of a data processing method provided by an embodiment of the present application.
  • The method can be performed by a computer device; for example, it can be performed by a terminal device (such as any terminal device in the terminal device cluster in the embodiment corresponding to FIG. 1 above, e.g. the terminal device 100a), it can be performed by the service server 1000 in the embodiment corresponding to FIG. 1, or it can be executed jointly by the terminal device and the service server.
  • The method flow may at least include the following steps S101-S104:
  • Step S101: obtain the media data to be encoded and the first business scenario type to which the media data to be encoded belongs.
  • In a data transmission scenario, a computer device such as a terminal device with a media data encoding function (such as a video encoding function) can acquire media data (such as video data) collected by an image collector (such as the camera of the terminal device). Further, the terminal device can encode the media data to obtain the data code stream (such as a video code stream) corresponding to the media data, and the terminal device can send the data code stream to the service server; the service server can decode the data code stream to obtain media data in the YUV format, which can be referred to as the media data to be encoded.
  • When the terminal device transmits the data code stream to the service server, it can also transmit the business scenario type to which the media data belongs; this business scenario type is the business scenario type to which the media data to be encoded belongs, and here it can be referred to as the first business scenario type.
  • The business scenario type to which the media data to be encoded belongs can be determined according to the application type of the target application. For example, if the target application installed in the terminal device is an offline short video application and the terminal device obtains video data through the camera while running the offline short video application, then the business scenario type to which the video data belongs may be the offline short video type, and correspondingly the first business scenario type to which the media data to be encoded belongs is the offline short video type.
  • Similarly, if the target application installed in the terminal device is a live broadcast application and the terminal device obtains live video data through the camera while running the live broadcast application (the live video data is obtained under the authorization of the user corresponding to the terminal device), then the business scenario type to which the live video data belongs may be the live broadcast type, and correspondingly the first business scenario type to which the media data to be encoded belongs is the live broadcast type. That is to say, if the media data to be encoded is obtained through a target application, the first service scenario type to which the media data to be encoded belongs may be the application type of the target application.
  • The target applications here may include social applications, multimedia applications (such as offline short video applications, live broadcast applications, and audio/video applications), entertainment applications (such as game applications), education applications, and other applications with media data encoding functions; certainly, other applications having functions for displaying data information and encoding video may also be used, and no further examples will be given here.
  • the solution provided by the embodiment of the present application may involve machine learning technology of artificial intelligence.
  • Machine learning (Machine Learning, ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how computers simulate or implement human learning behaviors to acquire new knowledge or skills, and how they reorganize existing knowledge structures to continuously improve their own performance.
  • Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and its application pervades all fields of artificial intelligence.
  • Machine learning and deep learning usually include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
  • When the service server determines the first business scene type to which the media data to be encoded belongs, it can generate the scene feature corresponding to the media data to be encoded through a scene prediction model, and the predicted scene type that the scene prediction model outputs for that scene feature can then be used as the first business scene type corresponding to the media data to be encoded.
  • the scene prediction model may be a machine learning model trained according to historical media data with real scene labels, and may be used to infer the predicted scene type to which certain media data belongs.
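  • A minimal sketch of this inference step, assuming a hypothetical feature extractor and a pre-trained classifier interface; the application only states that a model trained on historical media data with real scene labels maps scene features to a predicted scene type.

```python
# Sketch of inferring the first business scene type with a scene
# prediction model. Both callables are hypothetical stand-ins.
def predict_first_scene_type(media_data, extract_scene_feature, scene_model) -> str:
    # 1) Generate the scene feature corresponding to the media data.
    feature = extract_scene_feature(media_data)
    # 2) The predicted scene type output for that feature is used as the
    #    first business scene type of the media data to be encoded.
    return scene_model.predict(feature)
```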
  • Step S102: according to the mapping relationships between at least two coding configuration templates and at least two business scenario types in the configuration template set, obtain the first coding configuration template corresponding to the first business scenario type.
  • the at least two encoding configuration templates include a first encoding configuration template.
  • Specifically, different coding configuration templates can be pre-configured for different business scenario types, and a mapping relationship can be created between each business scenario type and its corresponding coding configuration template, so that a configuration template set containing multiple mapping relationships can be obtained. The first coding configuration template corresponding to the first business scenario type can then be determined according to the mapping relationships between the at least two coding configuration templates and the at least two business scenario types in the configuration template set.
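  • The following sketch shows one possible shape of the configuration template set: a mapping from business scenario type to coding configuration template. The scenario names and template bodies are illustrative; a fuller template structure is sketched later in this section.

```python
# Sketch of a configuration template set: one coding configuration
# template per business scenario type. All names are illustrative.
config_template_set = {
    "offline_short_video": {"template_id": 1},
    "live_broadcast":      {"template_id": 2},
    "cloud_conference":    {"template_id": 3},
}

def get_first_coding_template(first_scenario_type: str) -> dict:
    # Direct lookup via the mapping relationship; the fallback for an
    # unconfigured scenario type is sketched later in this section.
    return config_template_set[first_scenario_type]
```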
  • FIG. 4 is a schematic diagram of a scenario for configuring a coding configuration template for a service scenario type according to an embodiment of the present application.
  • Figure 4 takes the offline short video scene as an example of a business scenario type. First, the encoding requirements of the offline short video scene can be obtained: since the main goal of offline short video is compression performance, while its requirements on processing delay are not high, the scene encoding requirements of the offline short video are high compression performance and low processing-delay requirements. Configuring the template for the offline short video scene can then be understood as configuring the frame type, the frame coding structure, and the frame coding quality parameters for the video frames of the offline short video scene.
  • When configuring video frames, the configuration can be performed in units of frame groups (thus a frame group can be called a unit video frame group). A frame group is a group of continuous video frames in video data (Group of Pictures, GOP for short), and a frame group can include multiple video frames.
  • The frame type of a frame of image may be determined according to the encoding parameter settings and the code rate control policy, and the frame types here may include the first type, the second type, and the third type.
  • The frame type of an intra-coded picture (I frame) may be referred to as the first type. A bi-directional interpolated prediction frame (B frame for short) is a bi-directional difference frame: a B frame records the differences between the current frame and the preceding and following frames, and a B frame may or may not be used as a reference frame for other B frames; this type of frame is called the second type. A forward predictive frame (predictive-frame, P frame for short) represents the difference between this frame and a previous key frame (or P frame); when decoding, this difference needs to be superimposed on the previously cached picture to generate the final picture. The P frame type is called the third type.
  • a GOP can be understood as the interval between two I frames.
  • For example, suppose a piece of video data includes 20 frames, where the first frame is an I frame, the second to eighth frames are B frames, the ninth frame is a P frame, the tenth frame is an I frame, the 11th to 19th frames are B frames, and the 20th frame is a P frame. Then the first frame can be used as the start frame and the ninth frame as the end frame to form one GOP (unit video frame group), and the tenth frame can be used as the start frame and the 20th frame as the end frame to form another GOP (unit video frame group).
  • In some cases, the frame sequence of each GOP is fixed; for example, with a fixed GOP of 120 frames, an I frame is generated every 120 frames, and whether a given frame in the GOP frame sequence is a P frame or a B frame can be determined according to the picture complexity and the related P and B frame generation weights. In other cases, the GOP size and frame sequence are not fixed and can be generated automatically according to the picture texture, the motion complexity, and the I, P, and B frame generation strategy and weight configuration.
  • After the above GOP division process (for example, taking the first frame as the start frame and the ninth frame as the end frame to form a GOP), the embodiment of the present application can use the GOP as the granularity to configure each video frame included in a GOP: the frame type distribution, frame coding structure, and frame coding quality parameters (such as quantization coefficient, code rate, etc.) of each frame can be configured.
  • For example, the I, B, and P frame distribution and the frame coding structure can be configured for the video frames in a GOP (the layered coding structure 40 shown in FIG. 4). After the layered coding structure 40 is determined, quantization parameters (Quantization Parameter, QP) and code rate control parameters (also referred to as the rate control algorithm) can also be configured for each layer in the layered coding structure 40. The above I, B, and P frame type distribution, frame coding structure, per-layer quantization parameters, and code rate control parameters can be referred to as the coding configuration template corresponding to the offline short video scene.
  • The configuration parameters included in the encoding configuration template are of course not limited to the frame type distribution, frame coding structure, per-layer quantization parameters, and rate control parameters described above (the per-layer quantization parameters and rate control parameters can be referred to as frame coding quality parameters). For example, it is also possible to configure the reference group within the GOP (MinGOP, a sub-group into which a GOP is divided, where each frame in the group is only intra-frame and inter-frame referenced within the group). MinGOP can be controlled within a maximum of 16, and different resolutions have different maximum values; for details, refer to the standard RFC documents, such as the level 4.1 specification, under which the maximum values for 720P and 1080P videos are within 9 and 4 respectively, so that rate control and QP adjustment are more accurate and reasonable.
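  • The sketch below collects the template parameters named above (frame type distribution, layered structure, per-layer QP, rate control, MinGOP) into one data structure; all concrete values are illustrative assumptions, not values prescribed by the application.

```python
# Sketch of one coding configuration template with the parameters the
# text names. All concrete values below are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class CodingConfigTemplate:
    gop_size: int = 16                       # one GOP (unit video frame group)
    frame_types: List[str] = field(
        default_factory=lambda: ["I"] + ["B"] * 15 + ["P"])  # frames 0..16
    num_layers: int = 5                      # layered coding structure depth
    layer_qp: List[int] = field(
        default_factory=lambda: [22, 24, 26, 28, 30])  # QP per layer
    rate_control: str = "vbr"                # rate control algorithm (assumed)
    min_gop: int = 9                         # e.g. within 9 for 720P video

offline_short_video_template = CodingConfigTemplate()
```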
  • The layered coding method can maintain the logic of dependencies between a group of frames, so video frames whose dependencies are satisfied can be coded at the same time; coding can therefore be processed in parallel, which can greatly improve coding performance.
  • The layered coding structure shown in FIG. 4 can be divided into 5 layers, where the 0th frame is an I frame, the 16th frame is a P frame, and the remaining frames are B frames. When the encoding of data frame 0 and data frame 16 (a data frame is a data frame to be encoded, such as a video frame) is completed, the service server can perform encoding processing on data frame 8; the frame type of data frame 8 may be called a B frame.
  • Data frame 4 needs to refer to data frame 0 and data frame 8 during the encoding process, and data frame 12 needs to refer to data frame 8 and data frame 16 during the encoding process; therefore, when the encoding of data frame 8 is completed, the service server can perform encoding processing on data frame 4 and data frame 12 respectively. The frame types of data frame 4 and data frame 12 can both be referred to as B frames.
  • Data frame 2 needs to refer to data frame 0 and data frame 4 during the encoding process, and data frame 6 needs to refer to data frame 4 and data frame 8; therefore, when the encoding of data frame 4 is completed, the service server can perform encoding processing on data frame 2 and data frame 6. Likewise, data frame 10 needs to refer to data frame 8 and data frame 12, and data frame 14 needs to refer to data frame 12 and data frame 16; therefore, when the encoding of data frame 12 is completed, encoding processing can be performed on data frame 10 and data frame 14. The frame types of data frame 2, data frame 6, data frame 10, and data frame 14 can all be referred to as B frames.
  • Data frame 1 needs to refer to data frame 0 and data frame 2, and data frame 3 needs to refer to data frame 2 and data frame 4; therefore, when the encoding of data frame 2 is completed, the service server can perform encoding processing on data frame 1 and data frame 3. Data frame 5 needs to refer to data frame 4 and data frame 6, and data frame 7 needs to refer to data frame 6 and data frame 8; therefore, when the encoding of data frame 6 is completed, the computer device can perform encoding processing on data frame 5 and data frame 7. Data frame 9 needs to refer to data frame 8 and data frame 10, and data frame 11 needs to refer to data frame 10 and data frame 12; therefore, when the encoding of data frame 10 is completed, the service server can perform encoding processing on data frame 9 and data frame 11. Data frame 13 needs to refer to data frame 12 and data frame 14, and data frame 15 needs to refer to data frame 14 and data frame 16; therefore, when the encoding of data frame 14 is completed, the service server can perform encoding processing on data frame 13 and data frame 15. Data frame 1, data frame 3, data frame 5, data frame 7, data frame 9, data frame 11, data frame 13, and data frame 15 are not referenced by other frames, and the frame types of these eight non-reference frames can also be called B frames.
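  • The sketch below derives this parallel encoding order from the reference dependencies just listed: a frame can be encoded as soon as every frame it references has been encoded. The dependencies of frame 0 and frame 16 are assumptions consistent with their I/P frame roles.

```python
# Sketch: compute the parallel encoding "waves" of the 5-layer structure
# from the reference dependencies listed above. That frame 0 (I frame)
# has no reference and frame 16 (P frame) refers to frame 0 is assumed.
deps = {0: [], 16: [0], 8: [0, 16], 4: [0, 8], 12: [8, 16],
        2: [0, 4], 6: [4, 8], 10: [8, 12], 14: [12, 16],
        1: [0, 2], 3: [2, 4], 5: [4, 6], 7: [6, 8],
        9: [8, 10], 11: [10, 12], 13: [12, 14], 15: [14, 16]}

def encoding_waves(dependencies: dict) -> list:
    done, waves = set(), []
    while len(done) < len(dependencies):
        # Every frame whose references are already encoded forms the next
        # wave; frames within one wave can be encoded in parallel.
        ready = [f for f in dependencies if f not in done
                 and all(r in done for r in dependencies[f])]
        waves.append(sorted(ready))
        done.update(ready)
    return waves

# encoding_waves(deps) ->
# [[0], [16], [8], [4, 12], [2, 6, 10, 14], [1, 3, 5, 7, 9, 11, 13, 15]]
```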
  • It can be seen that this application can configure different encoding configuration templates for different business scenario types (for example, with one GOP as the granularity, configuring the frame type distribution, the layered encoding structure, and the quantization coefficients and code rate control parameters of each layer). After a template has been configured for each business scenario type, a mapping relationship can be established between each business scenario type and its corresponding coding configuration template, so that a configuration template set containing multiple mapping relationships can be obtained; then, after the media data to be encoded and the first business scenario type to which it belongs are acquired, the first encoding configuration template corresponding to the first business scenario type can be determined according to the configuration template set.
  • A specific implementation of obtaining the first coding configuration template corresponding to the first business scenario type may be as follows: traverse the at least two business scenario types in the configuration template set; if, among the at least two business scenario types, there is a target business scenario type identical to the first business scenario type, then, among the at least two coding configuration templates, the coding configuration template that has a mapping relationship with the target business scenario type is determined to be the first coding configuration template corresponding to the first business scenario type; if none of the at least two business scenario types is identical to the first business scenario type, then the scenario similarity between each of the at least two business scenario types and the first business scenario type can be determined, and the first coding configuration template corresponding to the first business scenario type can be determined according to the at least two scenario similarities.
  • A specific implementation of determining the first coding configuration template corresponding to the first business scenario type according to the at least two scenario similarities may be as follows: the maximum scenario similarity is obtained from the at least two scenario similarities and compared with a scenario similarity threshold; if the maximum scenario similarity is greater than the scenario similarity threshold, then, among the at least two business scenario types, the business scenario type corresponding to the maximum scenario similarity is determined to be the matching business scenario type, and, among the at least two coding configuration templates, the coding configuration template that has a mapping relationship with the matching business scenario type is determined to be the first coding configuration template corresponding to the first business scenario type.
  • the configuration template set can be traversed, if there is a business scenario type identical to the first business scenario type in the configuration template set (that is, the first business scenario type) scenario type is configured with a template), then the coding configuration template corresponding to the business scenario type can be directly determined as the first coding configuration template; and if there is no business scenario type identical to the first business scenario type in the configuration template set (that is, no template is configured for the first business scenario type), then the business scenario type most similar to the first business scenario type in the configuration template set can be determined (that is, the matching business scenario type), if the scenarios between the two are similar degree is greater than or equal to the scenario similarity threshold, then the coding configuration template corresponding to the most similar business scenario type can be used as the first coding configuration template.
  • If the scenario similarity between the most similar business scenario type and the first business scenario type is lower than the scenario similarity threshold, a template can be configured in real time according to the scenario encoding requirements of the first business scenario type, and a mapping relationship between the newly configured template and the first business scenario type can be established and stored in the configuration template set for subsequent use.
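  • The following is a minimal Python sketch of the lookup-with-fallback logic just described. The template set, scenario labels, similarity measure and fallback template contents are all illustrative assumptions; the application does not prescribe a particular similarity metric.

```python
from difflib import SequenceMatcher

SCENARIO_SIMILARITY_THRESHOLD = 0.6  # placeholder threshold

# Hypothetical configuration template set: scenario type -> encoding template.
config_template_set = {
    "offline_short_video": {"gop_size": 16, "layers": 5, "qp_offsets": [0, 1, 2, 3, 4]},
    "live_streaming":      {"gop_size": 60, "layers": 3, "qp_offsets": [0, 2, 4]},
    "video_conference":    {"gop_size": 30, "layers": 2, "qp_offsets": [0, 3]},
}

def scenario_similarity(a: str, b: str) -> float:
    """Stand-in similarity measure between two scenario type labels."""
    return SequenceMatcher(None, a, b).ratio()

def select_template(first_scenario: str) -> dict:
    # Exact match: a template is already configured for this scenario type.
    if first_scenario in config_template_set:
        return config_template_set[first_scenario]
    # Otherwise find the most similar configured scenario type.
    best = max(config_template_set,
               key=lambda s: scenario_similarity(s, first_scenario))
    if scenario_similarity(best, first_scenario) >= SCENARIO_SIMILARITY_THRESHOLD:
        return config_template_set[best]
    # Below threshold: configure a new template in real time and store it.
    new_template = {"gop_size": 16, "layers": 4, "qp_offsets": [0, 1, 2, 3]}
    config_template_set[first_scenario] = new_template
    return new_template
```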
  • Step S103: determine the frame encoding parameters of the media data to be encoded according to the first encoding configuration template.
  • Step S104: encode the media data to be encoded according to the determined frame encoding parameters to obtain first media data.
  • The first media data has a first media quality, and the first media quality matches the first business scenario type.
  • The frame encoding parameters of the media data to be encoded can be determined according to the first encoding configuration template. The frame encoding parameters may include a frame encoding structure and frame encoding quality parameters. The frame encoding structure refers to the encoding structure of the data frames to be encoded (for example, when the media data to be encoded is video data, the data frames to be encoded are the video frames to be encoded), and the frame encoding quality parameters may include the quantization parameter (QP), rate control parameters, resolution and so on. It should be understood that this application can configure the frame encoding structure and frame encoding quality parameters for the data frames to be encoded in each GOP, taking the GOP as the unit. For the specific implementation of determining the frame encoding parameters of the media data to be encoded, see the description in the embodiment corresponding to FIG. 6 below.
  • The media data to be encoded can then be encoded according to the frame encoding parameters; that is, the data frames to be encoded can be encoded according to the frame encoding structure and the frame encoding quality parameters, thereby obtaining the first media data with the first media quality (that is, with a certain resolution, bit rate and compression performance).
  • FIG. 5 is a schematic diagram of an encoding process provided by an embodiment of the present application.
  • When a frame of image (the current frame Fn shown in FIG. 5) is sent to the video encoder, it can be divided into multiple coding tree units (CTUs) according to a block size of 64×64.
  • By further dividing a CTU, coding units (CUs) can be obtained, where each CU can contain a prediction unit (PU) and a transform unit (TU).
  • The degree of content change between the prediction unit and the unit to be encoded can be determined, and based on it the residual between the prediction unit and the unit to be encoded can be determined (for example, intra-frame prediction and inter-frame prediction are performed according to the current frame Fn and the reference frame F'n-1, together with motion estimation (ME) and motion compensation (MC) in the figure). The business server can then transform the residual (for example, by a discrete cosine transform (DCT)) and quantize it to obtain quantized coefficients (also called residual coefficients), and then entropy-encode the quantized coefficients to obtain the compressed stream corresponding to the unit to be encoded (that is, the output arrow shown in FIG. 5).
  • the prediction mode for predicting the coding unit to be processed may include an intra-frame prediction mode and an inter-frame prediction mode.
  • The business server can perform inverse quantization (shown in FIG. 5) and inverse transform (shown in FIG. 5) on the quantized coefficients to obtain the residual value corresponding to the reconstructed image. Based on the residual value and the prediction unit, a reconstructed image can be obtained (corresponding to the reconstructed frame F'n in the figure; for example, the residual value and the predicted value can be added, and a deblocking filter (DB) and sample adaptive offset (SAO) can be applied to obtain the reconstructed image).
  • The business server may then perform filtering (such as in-loop filtering) on the reconstructed image to obtain a filtered image. That is, after the encoding of the current frame Fn is completed, the reconstructed frame corresponding to Fn can be obtained based on the filtered image and placed into the reference frame queue as a reference frame for the next frame, so that encoding proceeds frame by frame.
  • PU prediction can be divided into intra-frame prediction and inter-frame prediction. Different PU partitions within the same prediction type can first be compared to find the optimal partition mode; the intra-frame and inter-frame modes are then compared to find the optimal prediction mode for the current CU. At the same time, a residual quad-tree transform (RQT) based on the quadtree structure can be performed on the CU to find the optimal TU mode. Finally, a frame of image is divided into CUs, and the PUs and TUs corresponding to the CUs.
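  • To make the FIG. 5 loop concrete, the toy Python round trip below runs one 8×8 block through predict → residual → DCT → quantize, then reconstructs it the way the decoder side would (entropy coding omitted). The block size, the flat quantizer and the QP-to-step mapping are illustrative assumptions, not an actual codec's tables.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix, so the inverse transform is its transpose."""
    k = np.arange(n)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def encode_block(current: np.ndarray, prediction: np.ndarray, qp: int):
    step = 2 ** (qp / 6)                      # illustrative QP-to-step mapping
    d = dct_matrix(current.shape[0])
    residual = current.astype(float) - prediction
    coeffs = d @ residual @ d.T               # 2-D DCT of the residual
    quantized = np.round(coeffs / step)       # quantized (residual) coefficients
    # Decoder-side reconstruction, used to build the reference frame:
    recon = prediction + d.T @ (quantized * step) @ d
    return quantized, recon

rng = np.random.default_rng(0)
cur = rng.integers(0, 255, (8, 8))            # current-frame block
pred = rng.integers(0, 255, (8, 8))           # prediction-unit block
q, rec = encode_block(cur, pred, qp=30)
```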
  • In this way, a series of encoding parameters corresponding to the media data to be encoded can be determined quickly according to the template configured for the scenario, so that the encoded media data meets the required rate limit while keeping encoding distortion as small as possible.
  • Rate control belongs to the category of rate-distortion optimization; it mainly determines the quantization coefficients that match the bit rate, which requires a large amount of computation. Because this application performs the configuration in advance according to the scenario, the amount of computation in the encoding process can be effectively reduced and the encoding efficiency improved.
  • When a terminal device collects media data (such as video data), it needs to encode it; the collected media data may likewise be referred to as media data to be encoded.
  • The terminal device may also obtain the first encoding configuration template from the configuration template set according to the first business scenario type, determine the frame encoding parameters of the media data to be encoded according to the first encoding configuration template, and encode the media data according to those parameters. The specific process will not be repeated here.
  • In the embodiments of this application, different encoding configuration templates can be configured for different business scenario types, and a configuration template set can be generated that includes each business scenario type and its corresponding (mapped) encoding configuration template. When media data to be encoded is acquired, the first encoding configuration template corresponding to its first business scenario type can be quickly obtained from the configuration template set; the frame encoding parameters of the media data to be encoded are determined by the first encoding configuration template, and the media data is then encoded according to those parameters to obtain target media data with a target media quality that matches the first business scenario type.
  • Determining the frame encoding parameters through the first encoding configuration template can effectively reduce the amount of computation in the encoding process, reduce the time spent encoding, and improve the encoding efficiency of the computer device for the media data to be encoded. At the same time, adaptively selecting the encoding configuration template according to the business scenario type makes the selected first encoding configuration template match the first business scenario type, so the first media quality of the encoded first media data matches the first business scenario type; that is, the frame encoding parameters determined based on the first encoding configuration template meet the encoding requirements of the first business scenario type and are scenario-adaptive, and the compression performance of the first media data obtained with those parameters is also higher. Therefore, this application can improve encoding efficiency and encoding compression performance.
  • FIG. 6 is a schematic flowchart of determining frame encoding parameters according to a first encoding configuration template provided by an embodiment of the present application.
  • The process may follow on from the embodiment corresponding to FIG. 3 above. The frame encoding parameters in the process may include the frame encoding structure and the frame encoding quality parameters. The process may include at least the following steps S601-S603:
  • Step S601: acquire the frame type distribution and the frame level distribution in the first encoding configuration template.
  • The frame types here may include the first type, the second type and the third type. In the embodiments of this application, the frame type of an intra-coded frame (intra picture, I frame for short) may be referred to as the first type, the frame type of a bi-directional predictive coding frame (bi-directional interpolated prediction frame, B frame for short) as the second type, and the frame type of a forward predictive coding frame (predictive-frame, P frame for short) as the third type.
  • The frame level distribution can refer to the hierarchical encoding structure (the total number of layers, the number of frames in each layer, etc.). For example, the layered encoding structure 40 in the embodiment corresponding to FIG. 4 above can serve as the frame level distribution: its total number of layers is 5; the first layer contains 2 frames (the first frame, an I frame, and the 16th frame, a P frame); the second layer contains 1 frame (the 8th frame, a B frame); the third layer contains 2 frames (the 4th and 12th frames, both B frames); and the structure also includes a fourth and a fifth layer. The layered encoding structure 40 will not be described again here.
  • Step S602: determine the frame encoding structure corresponding to the media data to be encoded according to the frame type distribution and the frame level distribution.
  • This application can configure the frame types and the layered encoding structure at the granularity of a GOP, and can therefore determine, GOP by GOP, the frame encoding structure corresponding to each GOP (referred to here as a unit data frame group) according to the first encoding configuration template.
  • The specific method may be as follows: the unit data frame groups corresponding to the media data to be encoded can be obtained, where a unit data frame group consists of N consecutive data frames to be encoded, the media data to be encoded includes the data frames to be encoded, and N is a positive integer. Subsequently, the frame group frame type distribution corresponding to the unit data frame group can be obtained from the frame type distribution (that is, the frame type distribution configured for the frames within each GOP), and the data frames to be encoded in the unit data frame group can be divided by type according to it to obtain type-divided data frames. Then, the frame group frame level distribution corresponding to the unit data frame group can be obtained from the frame level distribution, the type-divided data frames can be hierarchically divided according to it to obtain the layered encoding structure corresponding to the unit data frame group, and the layered encoding structure is determined as the frame encoding structure corresponding to the media data to be encoded.
  • In other words, step S602 can: obtain each unit data frame group in the media data to be encoded, where each unit data frame group consists of N consecutive data frames to be encoded and N is a positive integer; classify the data frames to be encoded in each unit data frame group according to the frame group frame type distribution to obtain type-divided data frames; hierarchically divide the type-divided data frames according to the frame level distribution to obtain the layered encoding structure corresponding to each unit data frame group; and determine, according to the layered encoding structure corresponding to each unit data frame group, the frame encoding structure corresponding to the media data to be encoded. That is, the layered encoding structure containing every frame may be determined as the frame encoding structure corresponding to the GOP, as sketched below.
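  • A minimal sketch of steps S601-S602 for the GOP-of-16, five-layer example above. The layer-1 to layer-3 assignments follow the text; the layer-4 and layer-5 assignments are filled in with the usual hierarchical-B pattern, which is an assumption since the text does not enumerate them.

```python
from typing import List, Tuple

# Frame-type and frame-level distributions for one unit data frame group.
frame_type_distribution = {0: "I", 16: "P"}        # every other index is a B frame
frame_level_distribution = {
    0: 1, 16: 1,                                   # layer 1: I frame and P frame
    8: 2,                                          # layer 2
    4: 3, 12: 3,                                   # layer 3
    2: 4, 6: 4, 10: 4, 14: 4,                      # layer 4 (assumed)
    1: 5, 3: 5, 5: 5, 7: 5, 9: 5, 11: 5, 13: 5, 15: 5,  # layer 5 (assumed)
}

def build_frame_coding_structure(gop_size: int) -> List[Tuple[int, str, int]]:
    """Assign an (index, type, layer) triple to every frame of one GOP."""
    return [(i,
             frame_type_distribution.get(i, "B"),
             frame_level_distribution[i])
            for i in range(gop_size + 1)]          # frames 0..16

for idx, ftype, layer in build_frame_coding_structure(16):
    print(f"frame {idx:2d}: type={ftype} layer={layer}")
```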
  • Step S603: acquire the encoding quality parameters in the first encoding configuration template, and configure the quality parameters of the media data to be encoded according to the encoding quality parameters to obtain the frame encoding quality parameters corresponding to the media data to be encoded.
  • The frame encoding structure can be a layered encoding structure. Suppose the layered encoding structure includes a first level and a second level, the second level being higher than the first. Frame encoding quality parameters (such as the quantization parameter QP and rate control parameters) are configured for each level. Taking as an example encoding quality parameters that include a first encoding quality parameter corresponding to the first level and a second encoding quality parameter corresponding to the second level, the specific method of obtaining the frame encoding quality parameters corresponding to the media data to be encoded can be as follows: the first data frame to be encoded at the first level of the layered encoding structure, and the second data frame to be encoded at the second level, can be obtained from the media data to be encoded; this application can directly use the first encoding quality parameter as the frame encoding quality parameter corresponding to the first data frame to be encoded, and the second encoding quality parameter as the frame encoding quality parameter corresponding to the second data frame to be encoded.
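  • A minimal sketch of step S603 under these assumptions: each layer of the hierarchical structure carries its own quality parameters, and a frame simply inherits the parameters of its layer. All numbers are placeholders.

```python
# Per-layer frame encoding quality parameters (placeholder values).
layer_quality_params = {
    1: {"qp": 22, "rate_control": "cbr"},   # lowest (base) layer: finest quantization
    2: {"qp": 24, "rate_control": "cbr"},
    3: {"qp": 26, "rate_control": "vbr"},
    4: {"qp": 28, "rate_control": "vbr"},
    5: {"qp": 30, "rate_control": "vbr"},   # highest layer: coarsest quantization
}

def frame_quality_params(frame_layer: int) -> dict:
    """Look up the frame encoding quality parameters for a frame's layer."""
    return layer_quality_params[frame_layer]
```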
  • Optionally, this application can also adjust the template configured for the scenario according to the device information of the decoding end (such as a terminal device waiting to play the media data).
  • The device information can include network status information, terminal decoding capability, and so on. For example, the encoding quality parameters or the number of layers in the layered encoding structure can be adjusted according to the terminal device's network status information and decoding capability.
  • This application then configures the quality parameters of the media data to be encoded according to the (possibly adjusted) encoding quality parameters to obtain the frame encoding quality parameters corresponding to the media data to be encoded.
  • The specific method of judging whether the device index information meets the parameter adjustment condition can be as follows: the network quality parameter can be matched against a network parameter threshold, and the decoding computing power information against a computing power threshold. If the network quality parameter is greater than (or equal to) the network parameter threshold and the decoding computing power information is greater than (or equal to) the computing power threshold, it can be determined that the device index information does not meet the parameter adjustment condition; and if the network quality parameter is smaller than the network parameter threshold, or the decoding computing power information is smaller than the computing power threshold, it can be determined that the device index information meets the parameter adjustment condition. A minimal check is sketched below.
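  • A minimal sketch of this check; the threshold values are placeholders.

```python
NETWORK_PARAM_THRESHOLD = 0.8      # e.g. normalized link quality (assumed scale)
COMPUTE_POWER_THRESHOLD = 0.5      # e.g. normalized decoder capability (assumed scale)

def needs_parameter_adjustment(network_quality: float, decode_power: float) -> bool:
    # Adjustment is required as soon as either the network or the decoder falls
    # below its threshold; only a device that clears both thresholds can use
    # the template parameters unchanged.
    return (network_quality < NETWORK_PARAM_THRESHOLD
            or decode_power < COMPUTE_POWER_THRESHOLD)
```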
  • If the device index information does not meet the parameter adjustment condition, the first encoding quality parameter may be directly determined as the frame encoding quality parameter corresponding to the first data frame to be encoded, and the second encoding quality parameter as the frame encoding quality parameter corresponding to the second data frame to be encoded.
  • If the device index information meets the parameter adjustment condition, the specific implementation of determining the frame encoding quality parameter corresponding to the first data frame to be encoded according to the device index information and the first encoding quality parameter may be as follows: a parameter mapping table may be obtained, which may include mapping relationships between a set of network quality intervals and a set of adapted rate control parameters, one network quality interval mapping to one adapted rate control parameter. Subsequently, for the network quality parameter corresponding to the first terminal, the first network quality interval to which the network quality parameter belongs can be obtained from the set of network quality intervals, along with the network-adapted rate control parameter mapped to that interval; the frame encoding quality parameter corresponding to the first data frame to be encoded can then be determined according to the scene rate control parameter and the network-adapted rate control parameter.
  • One specific implementation may be as follows: a first operation coefficient corresponding to the first business scenario type and a second operation coefficient corresponding to the network quality parameter can be obtained; the first operation coefficient and the scene rate control parameter can then be combined to obtain the first operation rate control parameter, and the second operation coefficient and the network-adapted rate control parameter combined to obtain the second operation rate control parameter; the mean of the first operation rate control parameter and the second operation rate control parameter can be determined, and the mean determined as the frame encoding quality parameter corresponding to the first data frame to be encoded.
  • In other words, when the terminal's network status or decoding capability is limited, the frame encoding quality parameter can be appropriately lowered at this point (such as reducing the bit rate) to match the network status or decoding capability of the terminal device.
  • Different rate control parameters (which can be called network-adapted rate control parameters) can be configured in advance for different network states (such as network quality intervals) or different decoding capabilities.
  • Operation coefficients (which can be understood as weights) can also be configured for the scenario and for the terminal device's network status (or decoding capability).
  • The scene rate control parameter can first be combined with its corresponding first operation coefficient (such as by multiplication) to obtain the first operation rate control parameter (the product); at the same time, the network-adapted rate control parameter is combined with the second operation coefficient (such as by multiplication) to obtain the second operation rate control parameter; the mean of the first and second operation rate control parameters is then determined as the frame encoding quality parameter of the data frame to be encoded.
  • Optionally, another specific implementation of determining the frame encoding quality parameter corresponding to the first data frame to be encoded according to the scene rate control parameter and the network-adapted rate control parameter may be: compare the scene rate control parameter with the network-adapted rate control parameter, and determine the minimum of the two; the minimum rate control parameter can be determined as the frame encoding quality parameter corresponding to the first data frame to be encoded.
  • In other words, the minimum of the scene rate control parameter and the network-adapted rate control parameter can be used as the frame encoding quality parameter of the data frame to be encoded. In this way, it can be ensured that the terminal device can decode smoothly.
  • Optionally, yet another specific implementation may be: after the network-adapted rate control parameter is obtained, it can be directly determined as the frame encoding quality parameter of the data frame to be encoded (that is, the scene rate control parameter is replaced by the network-adapted rate control parameter). The three alternatives are sketched below.
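  • The three alternative combination strategies, sketched under an assumed network-quality lookup table ("parameter mapping table") and assumed coefficients; all numbers are illustrative.

```python
# Assumed mapping: network quality interval -> network-adapted bitrate (kbps).
parameter_mapping_table = [
    ((0.0, 0.3),  800),    # poor network
    ((0.3, 0.7), 2000),    # medium network
    ((0.7, 1.01), 4000),   # good network
]

def network_adapted_rate(network_quality: float) -> int:
    for (lo, hi), rate in parameter_mapping_table:
        if lo <= network_quality < hi:
            return rate
    raise ValueError("network quality parameter out of range")

def combine_weighted_mean(scene_rate: float, net_rate: float,
                          w_scene: float = 1.0, w_net: float = 1.0) -> float:
    # First/second operation coefficients multiplied in, then the mean taken.
    return (w_scene * scene_rate + w_net * net_rate) / 2

def combine_minimum(scene_rate: float, net_rate: float) -> float:
    return min(scene_rate, net_rate)   # guarantees the terminal can decode smoothly

def combine_replace(scene_rate: float, net_rate: float) -> float:
    return net_rate                    # network-adapted parameter wins outright
```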
  • The computing power information in this application may refer to the hardware or network resources that computer equipment (such as business servers and terminal devices) needs to occupy when performing computing tasks, usually including central processing unit (CPU) computing power information, graphics processing unit (GPU) computing power information, memory resources, network bandwidth resources, disk resources, and so on.
  • Optionally, this application can also pre-configure different templates for the same business scenario type according to different network states.
  • For example, three network configuration scenarios can be distinguished: for a scenario with a good network status (such as a network quality parameter greater than a first threshold), the frame encoding quality parameters in the template (such as the bit rate) can be set relatively high; for a scenario with a medium network status (such as a network quality parameter between a second threshold and the first threshold, with the second threshold smaller than the first), they can be set slightly lower; and for a scenario with a poor network status, they can be set lower still.
  • In this case, the network status of the terminal device can be obtained first, and the corresponding first encoding configuration template determined according to both the first business scenario type and the network status, as sketched below.
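  • An illustrative three-tier variant table for one scenario's template; the thresholds, bitrates and layer counts are placeholders.

```python
FIRST_THRESHOLD, SECOND_THRESHOLD = 0.7, 0.4   # assumed; SECOND < FIRST

# Pre-configured variants of one scenario's template for the three network tiers.
templates_by_network = {
    "good":   {"bitrate_kbps": 4000, "layers": 5},   # quality >  FIRST_THRESHOLD
    "medium": {"bitrate_kbps": 2000, "layers": 4},   # between the two thresholds
    "poor":   {"bitrate_kbps":  800, "layers": 3},   # quality <= SECOND_THRESHOLD
}

def pick_network_variant(network_quality: float) -> dict:
    if network_quality > FIRST_THRESHOLD:
        return templates_by_network["good"]
    if network_quality > SECOND_THRESHOLD:
        return templates_by_network["medium"]
    return templates_by_network["poor"]
```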
  • In the embodiments of this application, different encoding configuration templates can be configured for different business scenario types, and a configuration template set can be generated that includes each business scenario type and its corresponding (mapped) encoding configuration template. When media data to be encoded is acquired, the first encoding configuration template corresponding to its first business scenario type can be quickly obtained from the configuration template set; the first encoding configuration template is then appropriately adjusted according to the real-time network status or terminal decoding capability, the frame encoding parameters of the media data to be encoded are finally determined, and the media data is encoded according to those parameters to obtain the first media data with a first media quality that matches the first business scenario type.
  • Adaptively selecting and adjusting the encoding configuration template in this way makes the determined first encoding configuration template match both the first business scenario type and the network status, and the first media quality of the encoded first media data matches both as well; that is, the frame encoding parameters determined based on the first encoding configuration template meet the encoding requirements of the first business scenario type, are scenario-adaptive, also meet the requirements of the network status, and the compression performance of the first media data obtained with those parameters is also higher.
  • FIG. 7 is a system flowchart provided by an embodiment of the present application. As shown in FIG. 7, the process can include at least the following steps S71-S76:
  • Step S71: input the encoding scenario parameters. The encoding scenario parameters here may refer to parameters corresponding to the business scenario type to which the media data to be encoded belongs.
  • Step S72: select a layered template configuration according to the scenario. The encoding configuration template here can refer to a layered encoding configuration template; after the encoding scenario parameters are input, the corresponding layered encoding configuration template can be obtained according to the scenario parameters.
  • Step S73: receive the terminal network status.
  • Step S74: adjust the layered template configuration in real time. The selected layered template configuration can be adjusted in real time according to the terminal network status; for example, the number of layers in the layered encoding structure, the number of frames in each layer, and the frame encoding quality parameters corresponding to each layer (such as quantization coefficients and rate control parameters) can be adjusted.
  • Step S75: perform encoding processing. The data frames to be encoded of the media data to be encoded may be encoded according to the adjusted layered template configuration.
  • Step S76: output the encoding result.
  • For the specific implementation of steps S71-S76, reference may be made to the description of steps S101-S103 in the embodiment corresponding to FIG. 3 above, which will not be repeated here, nor will the beneficial effects it brings.
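  • A standalone sketch of the S71-S76 flow; every helper below is a stub standing in for the mechanisms detailed earlier in this embodiment, and all values are assumptions.

```python
def select_layered_template(scenario_type: str) -> dict:
    """S72 stub: scenario -> layered encoding configuration template."""
    return {"gop_size": 16, "layers": 5, "bitrate_kbps": 4000}

def adjust_template(template: dict, network_quality: float) -> dict:
    """S74 stub: lower layers/bitrate when the terminal network is weak."""
    if network_quality < 0.4:
        return {**template, "layers": 3, "bitrate_kbps": 800}
    return template

def encode_frame(frame, template: dict) -> bytes:
    """S75 stub: placeholder for the real encoder call."""
    return b""

def run_pipeline(frames, scenario_type: str, network_quality: float):
    template = select_layered_template(scenario_type)     # S71-S72
    template = adjust_template(template, network_quality) # S73-S74
    return [encode_frame(f, template) for f in frames]    # S75-S76
```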
  • FIG. 8 is a schematic structural diagram of a data processing device provided by an embodiment of the present application.
  • The data processing device may be a computer program (including program code) running in a computer device; for example, the data processing device is application software. The data processing device may be used to execute the method shown in FIG. 3.
  • the data processing device 1 may include: a data acquisition module 11 , a template acquisition module 12 , a parameter determination module 13 and a data encoding module 14 .
  • a data acquisition module 11 configured to acquire the media data to be encoded and the first business scenario type to which the media data to be encoded belongs;
  • The template acquisition module 12 is configured to obtain, according to the mapping relationships between at least two encoding configuration templates and at least two business scenario types in the configuration template set, the first encoding configuration template corresponding to the first business scenario type; one encoding configuration template has a mapping relationship with one business scenario type, and the at least two encoding configuration templates include the first encoding configuration template.
  • a parameter determination module 13 configured to determine frame encoding parameters of the media data to be encoded according to the first encoding configuration template
  • The data encoding module 14 is configured to encode the media data to be encoded according to the determined frame encoding parameters to obtain first media data; the first media data has a first media quality, and the first media quality matches the first business scenario type.
  • For the specific implementations of the data acquisition module 11, the template acquisition module 12, the parameter determination module 13 and the data encoding module 14, reference may be made to the description of steps S101-S104 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
  • the template acquisition module 12 may include: a type traversal unit 121 and a template determination unit 122 .
  • a type traversal unit 121 configured to traverse at least two business scenario types in the configuration template set
  • The template determination unit 122 is configured to, if a target business scenario type identical to the first business scenario type exists among the at least two business scenario types, determine the encoding configuration template, among the at least two encoding configuration templates, that has a mapping relationship with the first business scenario type as the first encoding configuration template corresponding to the first business scenario type.
  • The template determination unit 122 is also configured to determine the scenario similarity between each of the at least two business scenario types and the first business scenario type if no business scenario type identical to the first business scenario type exists among the at least two business scenario types.
  • the template determining unit 122 is further configured to determine a first encoding configuration template corresponding to the first business scenario type according to at least two scenario similarities.
  • For the specific implementations of the type traversal unit 121 and the template determination unit 122, reference may be made to the description of step S102 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
  • the template determining unit 122 may include: a matching subunit 1221 and a template determining subunit 1222 .
  • the matching subunit 1221 is configured to obtain the maximum scene similarity among at least two scene similarities, and match the maximum scene similarity with the scene similarity threshold;
  • The template determination subunit 1222 is configured to, if the maximum scenario similarity is greater than the scenario similarity threshold, determine the business scenario type corresponding to the maximum scenario similarity, among the at least two business scenario types, as the matching business scenario type, and determine the encoding configuration template, among the at least two encoding configuration templates, that has a mapping relationship with the matching business scenario type as the first encoding configuration template corresponding to the first business scenario type.
  • For the specific implementations of the matching subunit 1221 and the template determination subunit 1222, reference may be made to the description of step S102 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
  • the frame coding parameters include frame coding structure and frame coding quality parameters
  • the parameter determination module 13 may include: a template distribution acquisition unit 131 , a coding structure determination unit 132 and a quality parameter determination unit 133 .
  • a template distribution acquisition unit 131 configured to acquire the frame type distribution and frame level distribution in the first encoding configuration template
  • the encoding structure determining unit 132 is configured to determine the frame encoding structure corresponding to the media data to be encoded according to the frame type distribution and the frame level distribution;
  • the quality parameter determination unit 133 is configured to obtain the encoding quality parameters in the first encoding configuration template, configure the quality parameters of the media data to be encoded according to the encoding quality parameters, and obtain the frame encoding quality parameters corresponding to the media data to be encoded.
  • For the specific implementations of the template distribution acquisition unit 131, the encoding structure determination unit 132 and the quality parameter determination unit 133, please refer to the description of steps S103 and S104 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
  • the coding structure determining unit 132 may include: a unit frame group acquiring subunit 1321 , a type distribution determining subunit 1322 and a coding structure determining subunit 1323 .
  • The unit frame group acquisition subunit 1321 is configured to obtain each unit data frame group in the media data to be encoded; a unit data frame group consists of N consecutive data frames to be encoded, the media data to be encoded includes the data frames to be encoded, and N is a positive integer.
  • the type distribution determination subunit 1322 is used to classify the data frames to be encoded in each unit data frame group according to the frame group frame type distribution, and obtain the type-divided data frames;
  • the encoding structure determination subunit 1323 is used to perform hierarchical division on the type-divided data frames according to the frame level distribution, so as to obtain the hierarchical encoding structure corresponding to each unit data frame group;
  • the encoding structure determining subunit 1323 is further configured to determine the frame encoding structure corresponding to the media data to be encoded according to the layered encoding structure corresponding to each unit data frame group.
  • The frame encoding structure is a hierarchical encoding structure including a first level and a second level, the second level being higher than the first level; the encoding quality parameters include a first encoding quality parameter corresponding to the first level and a second encoding quality parameter corresponding to the second level.
  • the quality parameter determination unit 133 may include: an encoded frame acquisition subunit 1331 , a device information acquisition subunit 1332 , and a first quality parameter determination subunit 1333 .
  • The encoded frame acquisition subunit 1331 is configured to obtain, from the media data to be encoded, the first data frame to be encoded at the first level of the layered encoding structure and the second data frame to be encoded at the second level of the layered encoding structure.
  • the device information acquisition subunit 1332 is used to obtain the device index information of the first terminal; the first terminal refers to the terminal waiting to play the media data to be encoded;
  • The first quality parameter determination subunit 1333 is configured to, if the device index information satisfies the parameter adjustment condition, determine the frame encoding quality parameter corresponding to the first data frame to be encoded according to the device index information and the first encoding quality parameter, and determine the frame encoding quality parameter corresponding to the second data frame to be encoded according to the device index information and the second encoding quality parameter.
  • the device index information includes network quality parameters and decoding computing power information
  • the quality parameter determining unit 133 may further include: a device information matching subunit 1334 and a condition determining subunit 1335 .
  • the device information matching subunit 1334 is configured to match network quality parameters with network parameter thresholds, and match decoding computing power information with computing power thresholds;
  • the condition determination subunit 1335 is configured to determine that the device index information does not meet the parameter adjustment condition if the network quality parameter is greater than the network parameter threshold and the decoding computing power information is greater than the computing power threshold;
  • the condition determination subunit 1335 is further configured to determine that the device index information satisfies the parameter adjustment condition if the network quality parameter is smaller than the network parameter threshold, or the decoding computing power information is smaller than the computing power threshold.
  • the quality parameter determining unit 133 may further include: a second quality parameter determining subunit 1336 .
  • The second quality parameter determination subunit 1336 is configured to, if the device index information does not satisfy the parameter adjustment condition, determine the first encoding quality parameter as the frame encoding quality parameter corresponding to the first data frame to be encoded, and determine the second encoding quality parameter as the frame encoding quality parameter corresponding to the second data frame to be encoded.
  • the first encoding quality parameter includes a scene rate control parameter;
  • the device index information includes a network quality parameter;
  • The first quality parameter determination subunit 1333 is also specifically configured to obtain a parameter mapping table; the parameter mapping table includes mapping relationships between a set of network quality intervals and a set of adapted rate control parameters, with one network quality interval mapping to one adapted rate control parameter.
  • The first quality parameter determination subunit 1333 is also specifically configured to obtain the network quality parameter corresponding to the first terminal, obtain from the set of network quality intervals the first network quality interval to which the network quality parameter belongs, and obtain the network-adapted rate control parameter that has a mapping relationship with the first network quality interval.
  • the first quality parameter determining subunit 1333 is also specifically configured to determine the frame encoding quality parameter corresponding to the first data frame to be encoded according to the scene rate control parameter and the network adaptation rate control parameter.
  • the first quality parameter determination subunit 1333 is further specifically configured to obtain the first operation coefficient corresponding to the first service scenario type, and the second operation coefficient corresponding to the network quality parameter;
  • the first quality parameter determination subunit 1333 is also specifically configured to perform calculation processing on the first operation coefficient and the scene code rate control parameter to obtain the first operation code rate control parameter;
  • the first quality parameter determination subunit 1333 is also specifically configured to perform calculation processing on the second operation coefficient and the network adaptation code rate control parameter to obtain the second operation code rate control parameter;
  • the first quality parameter determination subunit 1333 is also specifically configured to determine the mean value between the first operational code rate control parameter and the second operational code rate control parameter, and determine the mean value as the frame encoding quality parameter corresponding to the first data frame to be encoded .
  • The first quality parameter determination subunit 1333 is also specifically configured to compare the scene rate control parameter with the network-adapted rate control parameter and determine the minimum rate control parameter between the two.
  • the first quality parameter determining subunit 1333 is also specifically configured to determine the minimum code rate control parameter as the frame encoding quality parameter corresponding to the first data frame to be encoded.
  • In the embodiments of this application, different encoding configuration templates can be configured for different business scenario types, and a configuration template set can be generated that includes each business scenario type and its corresponding (mapped) encoding configuration template. When media data to be encoded is acquired, the first encoding configuration template corresponding to its first business scenario type can be quickly obtained from the configuration template set; the first encoding configuration template is then appropriately adjusted according to the real-time network status or terminal decoding capability, the frame encoding parameters of the media data to be encoded are finally determined, and the media data is encoded according to those parameters to obtain the first media data with a first media quality that matches the first business scenario type.
  • Adaptively selecting and adjusting the encoding configuration template in this way makes the determined first encoding configuration template match both the first business scenario type and the network status, and the first media quality of the encoded first media data matches both as well; that is, the frame encoding parameters determined based on the first encoding configuration template meet the encoding requirements of the first business scenario type, are scenario-adaptive, also meet the requirements of the network status, and the compression performance of the first media data obtained with those parameters is also higher.
  • FIG. 9 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • The device 1 in the embodiment corresponding to FIG. 8 above can be applied to the computer device 8000, which can include a processor 8001, a network interface 8004 and a memory 8005. In addition, the computer device 8000 can also include a user interface 8003 and at least one communication bus 8002, where the communication bus 8002 is used to realize connection and communication between these components.
  • the user interface 8003 may include a display screen (Display) and a keyboard (Keyboard), and the optional user interface 8003 may also include a standard wired interface and a wireless interface.
  • the network interface 8004 may include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • The memory 8005 can be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory.
  • Optionally, the memory 8005 may also be at least one storage device located far away from the aforementioned processor 8001.
  • the memory 8005 as a computer-readable storage medium may include an operating system, a network communication module, a user interface module, and a device control application program.
  • In the computer device 8000 shown in FIG. 9, the network interface 8004 can provide a network communication function, the user interface 8003 is mainly used to provide an input interface for the user, and the processor 8001 can be used to call the device control application program stored in the memory 8005 to achieve:
  • acquiring media data to be encoded, and the first business scenario type to which the media data to be encoded belongs; obtaining, according to the mapping relationships between at least two encoding configuration templates and at least two business scenario types in the configuration template set, the first encoding configuration template corresponding to the first business scenario type, where one encoding configuration template has a mapping relationship with one business scenario type and the at least two encoding configuration templates include the first encoding configuration template; determining the frame encoding parameters of the media data to be encoded according to the first encoding configuration template; and encoding the media data to be encoded according to the determined frame encoding parameters to obtain first media data.
  • It should be understood that the computer device 8000 described in this embodiment of the present application can execute the description of the data processing method in the embodiments corresponding to FIG. 3 to FIG. 7 above, and can also execute the description of the data processing device 1 in the embodiment corresponding to FIG. 8 above, which will not be repeated here; nor will the description of the beneficial effects of adopting the same method.
  • The embodiment of the present application also provides a computer-readable storage medium, which stores the computer program executed by the aforementioned data processing computer device 8000, and the computer program includes program instructions. When the processor executes the program instructions, it can execute the description of the data processing method in the embodiments corresponding to FIG. 3 to FIG. 7 above, which will not be repeated here; nor will the description of the beneficial effects of adopting the same method.
  • The above-mentioned computer-readable storage medium may be the data processing apparatus provided in any of the foregoing embodiments, or an internal storage unit of the above-mentioned computer device, such as the hard disk or memory of the computer device.
  • the computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk equipped on the computer device, a smart memory card (smart media card, SMC), a secure digital (secure digital, SD) card, Flash card (flash card), etc.
  • the computer-readable storage medium may also include both an internal storage unit of the computer device and an external storage device.
  • the computer-readable storage medium is used to store the computer program and other programs and data required by the computer device.
  • the computer-readable storage medium can also be used to temporarily store data that has been output or will be output.
  • One aspect of the present application provides a computer program product or computer program, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the computer device executes the method provided in one aspect of the embodiments of the present application.
  • Each flow and/or block of the method flowcharts and/or structural schematic diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions.
  • These computer program instructions may be provided to a general-purpose computer, a special-purpose computer, an embedded processor, or the processor of other programmable data processing equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing equipment produce a device for realizing the functions specified in one or more flows of the flowchart and/or one or more blocks of the structural schematic diagram.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device, and the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the structural schematic diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the structural schematic diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A data processing method, apparatus, device, readable storage medium and program product. A data processing method includes: acquiring media data to be encoded, and the first business scenario type to which the media data to be encoded belongs; obtaining, according to the mapping relationships between at least two encoding configuration templates and at least two business scenario types in a configuration template set, the first encoding configuration template corresponding to the first business scenario type; determining frame encoding parameters of the media data to be encoded according to the first encoding configuration template; and encoding the media data to be encoded according to the determined frame encoding parameters to obtain first media data. The solution of this application can improve encoding efficiency and encoding compression performance.

Description

Data processing method, apparatus, device, readable storage medium and program product

This application claims priority to Chinese patent application No. 202111288823.5, entitled "Data processing method, apparatus, device and readable storage medium", filed with the Chinese Patent Office on November 2, 2021, and to Chinese patent application No. 202111640485.7, entitled "Data processing method, apparatus, device and readable storage medium", filed with the Chinese Patent Office on December 29, 2021, the entire contents of both of which are incorporated herein by reference.
Technical Field

This application relates to the field of computer technology, and in particular to a data processing method, apparatus, device, readable storage medium and program product.
Background

With the rapid development of the mobile Internet and multimedia technology, watching multimedia data (such as video, music and text) has gradually become a daily form of entertainment. The same piece of multimedia data can be encoded through different encoding processes to obtain media data of different media qualities; for example, different encoding processes can output media data at different bit rates (or different definitions, etc.).

At present, when multimedia data is encoded, the same fixed encoding parameters are used for all multimedia data. Since different multimedia data have different quality requirements, this approach severely affects the encoding effect of the media data, and the encoding performance is not high.
Summary

Embodiments of this application provide a data processing method, apparatus, device, readable storage medium and program product, which help to improve encoding efficiency and encoding compression performance.

In one aspect, an embodiment of this application provides a data processing method, executed in a computer device, the method including:

acquiring media data to be encoded, and the first business scenario type to which the media data to be encoded belongs;

obtaining, according to mapping relationships between at least two encoding configuration templates and at least two business scenario types in a configuration template set, the first encoding configuration template corresponding to the first business scenario type;

determining frame encoding parameters of the media data to be encoded according to the first encoding configuration template; and

encoding the media data to be encoded according to the determined frame encoding parameters to obtain first media data.

In one aspect, an embodiment of this application provides a data processing apparatus, including:

a data acquisition module, configured to acquire media data to be encoded and the first business scenario type to which the media data to be encoded belongs;

a template acquisition module, configured to obtain, according to mapping relationships between at least two encoding configuration templates and at least two business scenario types in a configuration template set, the first encoding configuration template corresponding to the first business scenario type;

a parameter determination module, configured to determine frame encoding parameters of the media data to be encoded according to the first encoding configuration template; and

a data encoding module, configured to encode the media data to be encoded according to the determined frame encoding parameters to obtain first media data.

In one aspect, an embodiment of this application provides a computer device, including a processor and a memory; the memory stores a computer program which, when executed by the processor, causes the processor to perform the method in the embodiments of this application.

In one aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program; the computer program includes program instructions which, when executed by a processor, perform the method in the embodiments of this application.

In one aspect, this application provides a computer program product including computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the method provided in one aspect of the embodiments of this application.
Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required for the description of the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a diagram of a network architecture provided by an embodiment of this application;

FIG. 2 is a schematic diagram of a data encoding scenario provided by an embodiment of this application;

FIG. 3 is a schematic flowchart of a data processing method provided by an embodiment of this application;

FIG. 4 is a schematic diagram of a scenario of configuring an encoding configuration template for a business scenario type, provided by an embodiment of this application;

FIG. 5 is a schematic diagram of an encoding process provided by an embodiment of this application;

FIG. 6 is a schematic flowchart of determining frame encoding parameters according to a first encoding configuration template, provided by an embodiment of this application;

FIG. 7 is a system flowchart provided by an embodiment of this application;

FIG. 8 is a schematic structural diagram of a data processing apparatus provided by an embodiment of this application;

FIG. 9 is a schematic structural diagram of a computer device provided by an embodiment of this application.
Detailed Description

The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the drawings in the embodiments of this application. Evidently, the described embodiments are only some rather than all of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of a network architecture provided by an embodiment of this application. As shown in FIG. 1, the network architecture may include a business server 1000 and a terminal device cluster (i.e., a user terminal cluster). The terminal device cluster may include one or more terminal devices, and the number of terminal devices is not limited here. As shown in FIG. 1, the terminal devices may specifically include terminal device 100a, terminal device 100b, terminal device 100c, ..., terminal device 100n, each of which may establish a network connection with the business server 1000 so that it can exchange data with the business server 1000 through that connection. The connection manner is not limited here: it may be a direct or indirect wired connection, a direct or indirect wireless connection, or another manner.

Each terminal device may have a target application installed; when the target application runs in a terminal device, it can exchange data with the business server 1000 shown in FIG. 1. The target application may include applications with functions for displaying data such as text, images, audio and video. It may be a social application, a multimedia application (e.g., a video application), an entertainment application (e.g., a game application), an education application, a live-streaming application or another application with a media data encoding function (such as a video encoding function); it may of course also be another application with data display and video encoding functions, which will not be enumerated one by one here. The application may be a standalone application, or an embedded sub-application integrated in another application (e.g., a social, education or multimedia application), which is not limited here.

For ease of understanding, an embodiment of this application may select one of the user terminals shown in FIG. 1 as the target user terminal. For example, user terminal 100a shown in FIG. 1 may serve as the target user terminal, in which a target application with a video encoding function may be integrated. The target user terminal can then exchange data with the business server 1000 through the business data platform corresponding to the application client.
It should be understood that a computer device with a media data encoding function (e.g., a video encoding function) in the embodiments of this application, such as user terminal 100a or business server 1000, can implement data encoding and data transmission of multimedia data (e.g., video data) through cloud technology. Cloud technology refers to a hosting technology that unifies hardware, software, network and other resources within a wide area network or a local area network to realize the computation, storage, processing and sharing of data.

Cloud technology is a general term for network technology, information technology, integration technology, management platform technology, application technology and so on; it can form a resource pool that is used on demand, flexibly and conveniently, and cloud computing will become an important support. The background services of technical network systems, such as video websites, image websites and other portal websites, require a large amount of computing and storage resources. With the rapid development and application of the Internet industry, every item may have its own identification mark in the future, which will need to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require strong system backing, which can only be realized through cloud computing.

For example, the data processing method provided by the embodiments of this application can be applied to high-resolution, high-frame-rate scenarios such as video watching, video calls, video transmission, cloud conferences and live streaming. A cloud conference is an efficient, convenient and low-cost form of meeting based on cloud computing technology. At present, domestic cloud conferences mainly focus on service content based on the SaaS (Software as a Service) model, including telephone, network and video services; a video conference based on cloud computing is called a cloud conference. Cloud conference systems support dynamic multi-server cluster deployment and provide multiple high-performance servers, greatly improving conference stability, security and availability. In recent years, video conferencing has been widely used in transportation, finance, operators, education, enterprises and other fields because it can greatly improve communication efficiency, continuously reduce communication costs and upgrade internal management. There is no doubt that, with cloud computing, video conferencing is even more attractive in terms of convenience, speed and ease of use, which will surely stimulate a new wave of video conferencing applications.
It should be understood that a computer device with a media data encoding function (e.g., user terminal 100a with a video encoding function) can encode media data through a media data encoder (e.g., a video encoder) to obtain the data stream corresponding to the media data (e.g., the video stream corresponding to video data), thereby improving the transmission efficiency of the media data. When the media data encoder is a video encoder, it may be an AV1 video encoder, an H.266 video encoder, an AVS3 video encoder and so on, which will not be enumerated one by one here. The video compression standard of the AV1 video encoder is the first-generation video coding standard developed by the Alliance for Open Media (AOM).

In the embodiments of this application, the media data to be subjected to encoding processing is referred to as the media data to be encoded, and the business scenario type to which it belongs is referred to as the first business scenario type. It can be understood that, to improve the encoding efficiency and encoding compression performance of media data, this application can configure different encoding configuration templates for different business scenario types: with a different first business scenario type, the first encoding configuration template corresponding to the media data to be encoded differs, and so do the frame encoding parameters used when encoding it. In this application, when the business server 1000 acquires media data to be encoded, it can encode it (e.g., through a media data encoder) to obtain first media data. Before encoding, the business server 1000 can determine the corresponding first encoding configuration template according to the first business scenario type of the media data to be encoded, determine the corresponding frame encoding parameters according to the first encoding configuration template, and then encode the media data to be encoded based on those frame encoding parameters, thereby obtaining first media data with a first media quality (which is adapted to the first business scenario type). For the specific implementation of determining the first encoding configuration template according to the first business scenario type and determining the frame encoding parameters according to the first encoding configuration template, see the description in the embodiment corresponding to FIG. 3 below.

It can be understood that the method provided by the embodiments of this application can be executed by a computer device, including but not limited to a terminal device or a business server. The business server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.

Optionally, the above computer devices (such as the business server 1000 and the terminal devices 100a, 100b, etc.) may be nodes in a distributed system, where the distributed system may be a blockchain system formed by multiple nodes connected through network communication. The nodes may form a peer-to-peer (P2P) network, and the P2P protocol is an application-layer protocol running on top of the Transmission Control Protocol (TCP). In a distributed system, any form of computer device, such as a business server or a terminal device, can become a node of the blockchain system by joining the peer-to-peer network. For ease of understanding, the concept of a blockchain is briefly explained: a blockchain is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms and encryption algorithms, mainly used to organize data in chronological order and encrypt it into a ledger so that it cannot be tampered with or forged, while supporting the verification, storage and updating of data. When the computer device is a blockchain node, the tamper-proof and anti-forgery properties of the blockchain ensure that the data in this application (such as the media data to be encoded, the encoded first media data, the frame encoding parameters, etc.) are authentic and secure, so that the results obtained from data processing based on such data are more reliable.
The embodiments of this application can be applied to various scenarios, including but not limited to cloud technology, artificial intelligence, intelligent transportation, assisted driving, etc. For ease of understanding, refer to FIG. 2, which is a schematic diagram of a data encoding scenario provided by an embodiment of this application. Terminal device 2a may be the sending terminal for video data (e.g., video data 1 shown in FIG. 2), and its user may be user a. Terminal device 2b may be the receiving terminal for the video data (e.g., video data 1 shown in FIG. 2), and its user may be user b. The business server 200 shown in FIG. 2 may be a server with a network connection to terminal device 2a, and may be the business server 1000 shown in FIG. 1 above.

It should be understood that, in a video transmission scenario, terminal device 2a can acquire video data 1 associated with user a, collected by an image collector (e.g., a camera) (video data associated with user a can only be acquired with user a's authorization). Further, terminal device 2a can encode video data 1 through a video encoder (e.g., an AV1 video encoder) to generate video stream 1 associated with video data 1, and then send video stream 1 to the business server 200. Upon receiving video stream 1, the business server 200 can decode it to obtain video data in pixel image format (also called YUV format; this may be called YUV video data, decoded video data, or video data to be encoded), and then encode the video data to be encoded (e.g., through a video encoder).

The process in which the business server 200 encodes the video data to be encoded may be as follows: the business server 200 first obtains the business scenario type to which the video data to be encoded belongs (which may be called the first business scenario type), that is, the first business scenario type to which video data 1 belongs. When transmitting video stream 1 to the business server, terminal device 2a can transmit the first business scenario type of video data 1 together with the stream, so that the business server 200 can obtain the first business scenario type of video data 1 quickly and accurately.

Further, the business server 200 can obtain a configuration template set, which may include mapping relationships between at least two business scenario types and at least two encoding configuration templates (one business scenario type may have a mapping relationship with one encoding configuration template). Based on these mapping relationships, the business server 200 can obtain the first encoding configuration template corresponding to the first business scenario type. For example, as shown in FIG. 2, the configuration template set includes the mapping between business scenario type 1 and encoding configuration template 1, the mapping between business scenario type 2 and encoding configuration template 2, and the mapping between business scenario type 3 and encoding configuration template 3. If the first business scenario type of video data 1 is business scenario type 2, the business server 200 can use encoding configuration template 2, which has a mapping relationship with business scenario type 2, as the first encoding configuration template.

Further, the business server 200 can determine the frame encoding parameters of the video data to be encoded according to the first encoding configuration template, and encode the video data to be encoded according to those frame encoding parameters, thereby obtaining first video data with a first media quality. The business server 200 can send the first video data to terminal device 2b, which is waiting to play the video; terminal device 2b can decode and output the first video data, and user b can then view video data 1 through terminal device 2b.

It can be understood that encoding the video data to be encoded can actually be understood as encoding its video frames (which may be called video frames to be encoded). The above frame encoding parameters may include a frame encoding structure and frame encoding quality parameters (such as bit rate, resolution, quantization coefficients, etc.); this application can encode each video frame to be encoded according to the frame encoding structure and the frame encoding quality parameters. When encoding the video data to be encoded, a video frame that needs to be encoded (which may be called the first video frame) can be obtained from the video data to be encoded, and a coding unit (CU) to be encoded can then be obtained from the first video frame. Further, the business server 200 can perform prediction processing on the coding unit to be encoded based on the encoding strategy of the video encoder to obtain the optimal prediction mode corresponding to the coding unit, and then encode the coding unit based on the optimal prediction mode and the above frame encoding parameters to obtain the compressed stream corresponding to the coding unit. It should be understood that when the business server 200 completes the encoding of every coding unit in the first video frame, the compressed stream corresponding to each coding unit is obtained, and these compressed streams can be packaged into the video stream associated with the video data to be encoded; this video stream is the first video data shown in FIG. 2.
The encoding strategy here may include intra-frame prediction modes and inter-frame prediction modes. It can be understood that, in determining the optimal prediction mode, the business server 200 can select, among the intra-frame prediction modes, the prediction mode with the optimal rate-distortion cost as the optimal intra-frame prediction mode. Similarly, the business server 200 can select, among the inter-frame prediction modes, the prediction mode with the optimal rate-distortion cost as the optimal inter-frame prediction mode. Further, the business server 200 can perform mode selection based on the rate-distortion costs of the optimal intra-frame prediction mode and the optimal inter-frame prediction mode; in other words, it can choose, between the optimal intra-frame prediction mode and the optimal inter-frame prediction mode, the prediction mode with the minimum rate-distortion cost as the optimal prediction mode.
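The following is an illustrative sketch of this rate-distortion mode selection: among the best intra mode and the best inter mode, the one with the smaller cost J = D + λ·R is kept. The candidate modes, distortions, bit counts and λ are all made-up numbers, not values prescribed by this application.

```python
LAMBDA = 0.85  # assumed Lagrange multiplier

def rd_cost(distortion: float, bits: float, lam: float = LAMBDA) -> float:
    """Rate-distortion cost J = D + lambda * R."""
    return distortion + lam * bits

# Hypothetical candidates: mode name -> (distortion, bits).
candidate_modes = {
    "intra_dc":    (120.0,  90.0),
    "intra_plane": (110.0, 105.0),
    "inter_16x16": ( 60.0, 140.0),
    "inter_8x8":   ( 55.0, 180.0),
}

best_intra = min((m for m in candidate_modes if m.startswith("intra")),
                 key=lambda m: rd_cost(*candidate_modes[m]))
best_inter = min((m for m in candidate_modes if m.startswith("inter")),
                 key=lambda m: rd_cost(*candidate_modes[m]))
# Final mode selection between the two winners:
optimal_mode = min((best_intra, best_inter),
                   key=lambda m: rd_cost(*candidate_modes[m]))
```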
It should be understood that, to improve encoding efficiency, this application can pre-configure an encoding configuration template for each business scenario type according to its requirements. The encoding configuration template may include the frame encoding structure and frame encoding quality parameters of media data (such as video data) under the corresponding business scenario type. Thus, when media data to be encoded is acquired, the amount of computation in the encoding process can be reduced: the first encoding configuration template corresponding to the first business scenario type of the media data to be encoded can be obtained directly, the frame encoding structure and frame encoding quality parameters corresponding to the media data can be determined quickly, and the media data can then be encoded according to them, greatly improving encoding efficiency.

Optionally, in a feasible embodiment, this application can also adaptively adjust the first encoding configuration template according to the network state (or decoding capability) of the decoding end (e.g., terminal device 2b). For example, after the first encoding configuration template corresponding to the media data to be encoded is obtained, if its frame encoding quality parameters include the bit rate and the network state of terminal device 2b is found to be poor at that moment, the bit rate can be adaptively reduced, so that the decoding end can still receive the video data quickly even under a poor network state.
进一步地,请参见图3,图3是本申请实施例提供的一种数据处理方法的流程示意图。其中,该方法可由计算机设备执行,例如可由终端设备(如上述图1所对应实施例中终端设备集群中的任一终端设备,如终端设备100a)所执行;该方法也可由业务服务器(如上述图1所对应实施例中的业务服务器1000)所执行;该方法还可由终端设备与业务服务器共同执行。以该方法由业务服务器所执行为例,如图3所示,该方法流程可以至少包括以下步骤S101-步骤S104:
步骤S101,获取待编码媒体数据,以及待编码媒体数据所属的第一业务场景类型。
本申请中,具有媒体数据编码功能(如视频编码功能)的计算机设备(例如,终端设备)可以在数据传输场景中,获取由图像采集器(例如,终端设备的摄像头)所采集到的媒体数据(如视频数据)。进一步地,该终端设备可以对该媒体数据进行编码处理,由此可以得到媒体数据对应的数据码流(如视频码流),而终端设备可以将该数据码流发送至业务服务器,业务服务器可以将该数据码流进行解码处理,得到具有YUV格式的媒体数据,该媒体数据即可称之为待编码媒体数据。
可以理解的是,终端设备在向业务服务器传输数据码流时,还可以一并传输该媒体数据所属的业务场景类型,该媒体数据所属的业务场景类型也就是该待编码媒体数据所属的业务场景类型,这里可以将待编码媒体数据所属的业务场景类型称为第一业务场景类型。可以理解的是,待编码媒体数据所属的业务场景类型可以根据目标应用的应用类型所确定,例如,终端设备中安装的目标应用为离线短视频应用,终端设备是在运行该离线短视频应用时,通过摄像头获取到视频数据,那么该视频数据所属的业务场景类型可以为离线短视频类型,其对应的待编码媒体数据所属的第一业务场景类型为离线短视频类型。例如,终端设备中安装的目标应用为直播应用,终端设备是在运行直播应用时,通过摄像头获取到直播视频数据(在终端设备所对应的用户授权的情况下获取直播视频数据),那么该直播视频数据所属的业务场景类型可以为直播类型,其对应的待编码媒体数据所属的第一业务场景类型为直播类型。也就是说,若待编码媒体数据是通过目标应用所获取得到的,则该待编码媒体数据所属的第一业务场景类型可以为该目标应用的应用类型。
其中,这里的目标应用可以包括社交应用、多媒体应用(如,离线短视频应用、直播应用、音视频应用)、娱乐应用(如,游戏应用)、教育应用等等具有媒体数据编码功能的应用,当然还可以为其他具有显示数据信息功能、视频编码功能的应用,在此不再一一进行举例。
可选的,本申请实施例提供的方案可以涉及人工智能的机器学习技术,机器学习(Machine Learning,ML)是一门多领域交叉学科,涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科。专门研究计算机怎样模拟或实现人类的学习行为,以获取新的知识或技能,重新组织已有的知识结构使之不断改善自身的性能。机器学习是人工智能的核心,是使计算机具有智能的根本途径,其应用遍及人工智能的各个领域。机器学习和深度学习通常包括人工神经网络、置信网络、强化学习、迁移学习、归纳学习、式教学习等技术。具体通过如下实施例进行说明:业务服务器在确定待编码媒体数据所属的第一业务场景类型时,可以根据场景预测模型生成该待编码媒体数据对应的场景特征,然后在场景预测模型中输出与该场景特征对应的预测场景类型,该预测场景类型即可作为该待编码媒体数据对应的第一业务场景类型。其中,场景预测模型可以是根据拥有真实场景标签的历史媒体数据训练得到的机器学习模型,可用于推测某个媒体数据所属的预测场景类型。
步骤S102,根据配置模板集合中的至少两个编码配置模板与至少两个业务场景类型之间的映射关系,获取第一业务场景类型对应的第一编码配置模板。一个编码配置模板与一个业务场景类型之间存在映射关系。至少两个编码配置模板包括第一编码配置模板。
本申请中,可以预先为不同的业务场景类型配置不同的编码配置模板,为每个业务场景类型及其对应的编码配置模板创建一个映射关系,由此可以得到包含多个映射关系的配置模板集合。由此,当确定出待编码媒体数据所属的第一业务场景类型后,可以根据配置模板集合中的至少两个编码配置模板与至少两个业务场景类型之间的映射关系,确定出第一业务场景类型对应的第一编码配置模板。
为便于理解,请参见图4,图4是本申请实施例提供的一种为业务场景类型配置编码配置模板的场景示意图。如图4所示的业务场景类型是以离线短视频场景为例,首先可以获取到该离线短视频场景的编码需求,该离线短视频的主要目标为压缩性能,但是其对处理延时的要求并不高,则该离线短视频的场景编码需求为:压缩性能高,处理时延要求低。那么可以根据该场景编码需求,对该离线短视频的编码配置模板进行配置,其中,我们这里为该离线短视频场景进行模板配置,可以理解为:为离线短视频场景的视频帧配置帧类型、配置帧编码结构、配置帧编码质量参数。
其中,在为视频帧进行配置时,可以以帧组为单位进行配置(由此可以将一个帧组称为单位视频帧组),其中,帧组可以为视频数据中的一组连续的视频帧(Group of Pictures,简称GOP),一个帧组中可以包括多个视频帧。应当理解,一帧图像可以根据编码参数设置和码率控制策略决定该帧的帧类型,这里的帧类型可以包括第一类型,第二类型以及第三类型。其中,本申请实施例可以将帧内编码帧(intra picture,简称I帧)这种帧类型称之为第一类型,将双向预测编码帧(bi-directional interpolated prediction frame,简称B帧,B帧是双向差别帧,也就是B帧记录的是本帧与前后帧的差别,B帧可以作其它B帧的参考帧,也可以不作为其它B帧参考帧)这种帧类型称之为第二类型,将前向预测编码帧(predictive-frame,简称P帧,P帧表示的是这一帧跟之前的一个关键帧(或P帧)的差别,解码时需要用之前缓存的画面叠加上本帧定义的差别,生成最终画面)这种帧类型称之为第三类型。而一个GOP可以理解为两个I帧之间的间隔,例如,视频帧包括20帧,其中,第1帧为I帧,第2-8帧为B帧,第9帧为P帧,第10帧为I帧,第11帧-第19帧为B帧,第20帧为P帧,则可以以第1帧为起始帧,第9帧为结束帧,组成一个GOP(单位视频帧组);也可以以第10帧为起始帧,第20帧为结束帧,组成一个GOP(单位视频帧组)。
需要说明的是,在固定GOP及GOP中的帧序列时,每个GOP的帧序列是固定的,比如固定GOP是120帧,则每隔120帧就可以生成一个I帧,GOP帧序列固定如:I B B P B B P…B B I;在固定GOP的大小但不固定帧序列时,每个GOP的帧序列是不固定的,比如固定GOP是120帧,即每隔120帧就生成一个I帧,GOP帧序列可以根据画面复杂度和相关P、B帧生成权重决定某个帧是P帧还是B帧。在GOP大小及帧序列均不固定时,可以根据画面纹理、运动复杂度以及I、P、B帧生成策略和权重配置自动生成。而对于上述对于GOP的划分过程(例如,以第1帧为起始帧,第9帧为结束帧,组成一个GOP),只是举例说明如何划分GOP,其并不具备实际参考意义。
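As a purely illustrative aid, the following Python sketch generates the fixed I B B P ... sequence described above; the mini-group size and the helper name are assumptions of the example:

```python
def fixed_gop_sequence(gop_size, bframes_per_group=2):
    """Fixed GOP, fixed frame sequence: one I frame, then repeating B..BP
    groups, e.g. I B B P B B P ... for the default group size."""
    seq = ["I"]
    pos = 1
    while pos < gop_size:
        group = min(bframes_per_group + 1, gop_size - pos)
        seq += ["B"] * (group - 1) + ["P"]
        pos += group
    return seq

# fixed_gop_sequence(10) -> ['I', 'B', 'B', 'P', 'B', 'B', 'P', 'B', 'B', 'P']
```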
It can be understood that the embodiments of this application may configure, at GOP granularity, each video frame included in a GOP; for example, the frame type distribution, the frame encoding structure, and the frame encoding quality parameters of each frame (such as quantization parameter and bitrate) may be reconfigured. For example, for the offline short-video scene, taking a GOP size of 16 as an example, the I/B/P frame distribution and the frame encoding structure (such as the layered encoding structure 40 shown in FIG. 4) may be reconfigured for the video frames within a GOP; after the layered encoding structure 40 is determined, a quantization parameter (Quantization Parameter, QP) and rate-control parameters (also called the rate-control algorithm) may be configured for each layer of the structure. The I/B/P frame type distribution, the frame encoding structure, and the per-layer quantization parameters and rate-control parameters above may be called the encoding configuration template for the offline short-video scene. The configuration parameters contained in an encoding configuration template are of course not limited to the frame type distribution, frame encoding structure, and per-layer quantization and rate-control parameters described above (the per-layer quantization and rate-control parameters may be called frame encoding quality parameters). For example, the reference group within a GOP (MinGOP, a group formed within a GOP whose frames use intra- and inter-frame references only within the group) may also be configured; the MinGOP may be kept within a maximum of 16, and its maximum differs by resolution, subject to the standard RFC documents. For instance, under the level 4.1 specification, the maximum values for 720P and 1080P video are within 9 and 4 respectively, so that rate control and QP adjustment are more precise and reasonable.
It can be understood that layered encoding maintains the logic of dependency relationships within a group of frames, so all video frames whose dependencies are satisfied can be encoded at the same time; encoding can thus proceed in parallel, greatly improving encoding performance. For ease of understanding, take the layered encoding structure shown in FIG. 4 as an example: it may be divided into 5 layers, where frame 0 is an I frame, frame 16 is a P frame, and all remaining frames are B frames. As shown in FIG. 4, data frame 8 (a to-be-encoded data frame, such as a video frame) needs to reference data frames 0 and 16 during encoding, so the service server may encode data frame 8 once data frame 16 has been encoded. The frame type of data frame 8 may be called a B frame.
Data frame 4 needs to reference data frames 0 and 8 during encoding, and data frame 12 needs to reference data frames 8 and 16; therefore, once data frame 8 has been encoded, the service server may encode data frames 4 and 12 respectively. The frame types of data frames 4 and 12 may both be called B frames.
Data frame 2 needs to reference data frames 0 and 4 during encoding, and data frame 6 needs to reference data frames 4 and 8; therefore, once data frame 4 has been encoded, the service server may encode data frames 2 and 6 respectively. Data frame 10 needs to reference data frames 8 and 12, and data frame 14 needs to reference data frames 12 and 16; therefore, once data frame 12 has been encoded, data frames 10 and 14 may be encoded. The frame types of data frames 2, 6, 10, and 14 may all be called B frames.
Data frame 1 needs to reference data frames 0 and 2, and data frame 3 needs to reference data frames 2 and 4; therefore, once data frame 2 has been encoded, the service server may encode data frames 1 and 3 respectively. Data frame 5 needs to reference data frames 4 and 6, and data frame 7 needs to reference data frames 6 and 8; therefore, once data frame 6 has been encoded, the computer device may encode data frames 5 and 7 respectively. Data frame 9 needs to reference data frames 8 and 10, and data frame 11 needs to reference data frames 10 and 12; therefore, once data frame 10 has been encoded, the service server may encode data frames 9 and 11 respectively. Data frame 13 needs to reference data frames 12 and 14, and data frame 15 needs to reference data frames 14 and 16; therefore, once data frame 14 has been encoded, the service server may encode data frames 13 and 15 respectively. Data frames 1, 3, 5, 7, 9, 11, 13, and 15 are not referenced by any other frame; the frame types of these eight frames may all be called B frames. A sketch of deriving this parallel encoding order from the reference relationships follows.
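For ease of understanding, the following Python sketch derives the parallel encoding order from the reference relationships above for a GOP of 16; the REFS table transcribes the dependencies listed in this description, while the wave grouping itself is an illustrative simplification (in practice frame 0 is encoded before the P frame 16 that references it):

```python
# frame index -> its two reference frames, transcribed from the description above
REFS = {8: (0, 16), 4: (0, 8), 12: (8, 16),
        2: (0, 4), 6: (4, 8), 10: (8, 12), 14: (12, 16),
        1: (0, 2), 3: (2, 4), 5: (4, 6), 7: (6, 8),
        9: (8, 10), 11: (10, 12), 13: (12, 14), 15: (14, 16)}

def encode_waves():
    """Group frames into waves that can be encoded in parallel: a frame is
    ready as soon as both of its reference frames have been encoded."""
    done = {0, 16}          # shown as one wave for brevity; frame 0 precedes 16
    waves = [[0, 16]]
    while len(done) < 17:
        ready = [f for f, (a, b) in REFS.items()
                 if f not in done and a in done and b in done]
        waves.append(ready)
        done.update(ready)
    return waves

# encode_waves() -> [[0, 16], [8], [4, 12], [2, 6, 10, 14], [1, 3, 5, ..., 15]]
```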
In summary, this application may configure different encoding configuration templates for different business scene types (for example, at the granularity of one GOP: frame type distribution, layered encoding structure, per-layer quantization parameters, rate-control parameters, and so on). After templates are configured for each business scene type, a mapping relationship may be established between each business scene type and its encoding configuration template, yielding a configuration template set containing multiple mapping relationships. Then, after the to-be-encoded media data and its first business scene type are obtained, the first encoding configuration template corresponding to the first business scene type can be determined from the configuration template set.
A specific implementation of obtaining the first encoding configuration template corresponding to the first business scene type according to the mapping relationships between the at least two encoding configuration templates and the at least two business scene types in the configuration template set may be as follows: traverse the at least two business scene types in the configuration template set; if among them there exists a target business scene type identical to the first business scene type, then the encoding configuration template that has a mapping relationship with the first business scene type is determined, among the at least two encoding configuration templates, as the first encoding configuration template corresponding to the first business scene type; if none of the at least two business scene types is identical to the first business scene type, then the scene similarities between each of the at least two business scene types and the first business scene type are determined, and the first encoding configuration template corresponding to the first business scene type is determined according to the at least two scene similarities.
A specific implementation of determining the first encoding configuration template corresponding to the first business scene type according to the at least two scene similarities may be: obtain the maximum scene similarity among the at least two scene similarities and match it against a scene similarity threshold; if the maximum scene similarity is greater than the threshold, determine the business scene type corresponding to the maximum scene similarity, among the at least two business scene types, as the matching business scene type, and determine the encoding configuration template that has a mapping relationship with the matching business scene type, among the at least two encoding configuration templates, as the first encoding configuration template corresponding to the first business scene type.
It should be understood that, after the first business scene type of the to-be-encoded media data is obtained, the configuration template set may be traversed. If a business scene type identical to the first business scene type exists in the set (that is, a template has been configured for the first business scene type), the encoding configuration template corresponding to that business scene type may be determined directly as the first encoding configuration template. If no identical business scene type exists in the set (that is, no template has been configured for the first business scene type), the business scene type in the configuration template set most similar to the first business scene type (the matching business scene type) may be determined; if the scene similarity between the two is greater than the scene similarity threshold, the encoding configuration template corresponding to that most similar business scene type may be taken as the first encoding configuration template. Optionally, when the configuration template set does not contain the first business scene type and the scene similarity between the most similar business scene type and the first business scene type is also below the scene similarity threshold, a template may be configured in real time according to the scene encoding requirements of the first business scene type, a mapping relationship between that configured template and the first business scene type may be established, and both may be stored in the configuration template set for subsequent use. A sketch of this selection logic follows.
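The following minimal Python sketch illustrates this selection logic; similarity_fn and the threshold value are placeholders assumed for the example:

```python
def select_template(scene_type, templates, similarity_fn, threshold=0.8):
    """Pick the template for scene_type: exact match first, then the most
    similar configured scene type if it clears the similarity threshold."""
    if scene_type in templates:
        return templates[scene_type]
    best = max(templates, key=lambda s: similarity_fn(scene_type, s))
    if similarity_fn(scene_type, best) > threshold:
        return templates[best]
    return None  # no close match: configure a new template and store it
```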
Step S103: Determine the frame encoding parameters of the to-be-encoded media data according to the first encoding configuration template.
Step S104: Encode the to-be-encoded media data according to the determined frame encoding parameters to obtain first media data. Here, the first media data has first media quality, and the first media quality matches the first business scene type.
In this application, after the first encoding configuration template corresponding to the first business scene type is determined, the frame encoding parameters of the to-be-encoded media data may be determined from that template. The frame encoding parameters may include a frame encoding structure and frame encoding quality parameters: the frame encoding structure may refer to the encoding structure of the to-be-encoded data frames (when the to-be-encoded media data is video data, a to-be-encoded data frame may refer to a video frame awaiting encoding), and the frame encoding quality parameters may include the quantization parameter QP, rate-control parameters, resolution, and so on. It should be understood that this application may configure, per GOP, the frame encoding structure and frame encoding quality parameters for the to-be-encoded data frames in each GOP; for the specific implementation of determining the frame encoding parameters of the to-be-encoded media data, refer to the description of the embodiment corresponding to FIG. 6 below.
Further, after the frame encoding parameters of the to-be-encoded media data are determined (that is, after the frame encoding structure and frame encoding quality parameters of the to-be-encoded data frames are determined), the to-be-encoded media data may be encoded according to those parameters; in other words, the to-be-encoded data frames may be encoded according to the frame encoding structure and frame encoding quality parameters, thereby obtaining first media data with first media quality (that is, with a certain resolution, bitrate, and compression performance).
For ease of understanding the process of encoding a to-be-encoded data frame, refer to FIG. 5, a schematic diagram of an encoding process provided by an embodiment of this application.
As shown in FIG. 5, an image frame (the current frame Fn shown in FIG. 5) fed into the video encoder may first be partitioned into multiple coding tree units (Coding Tree Unit, CTU) of 64×64 blocks; depth-wise partitioning then yields coding units (Coding Unit, CU), where each CU may contain prediction units (Prediction Unit, PU) and transform units (Transform Unit, TU). Through the video encoder, a to-be-encoded unit (that is, some CU) may be obtained from the current frame Fn and predicted to obtain a prediction unit. The content variation between the prediction unit and the to-be-encoded unit may then be determined, and based on it the residual between the two may be computed (for example, intra-frame and inter-frame prediction may be performed from the current frame Fn and the reference frame F'n-1 shown in FIG. 5, with the ME (motion estimation) and MC (Motion Compensation) shown in the figure). The service server may then transform the residual (for example, by the Discrete Cosine Transform, DCT) and quantize it to obtain quantized coefficients (also called residual coefficients), and may then entropy-code the quantized coefficients to obtain the compressed bitstream corresponding to the to-be-encoded unit (the arrow output shown in FIG. 5). The prediction methods used on the to-be-encoded unit may include intra-frame prediction and inter-frame prediction.
At the same time, the service server may inverse-quantize (the inverse quantization in FIG. 5) and inverse-transform (the inverse transform in FIG. 5) the quantized coefficients to obtain the residual values of the reconstructed image, and may then obtain the reconstructed image from the residual values and the prediction unit (corresponding to the reconstructed frame F'n shown in the figure; for example, the residual values may be added to the predicted values, and the reconstructed image may be obtained after DB (DeBlock Filter, deblocking) and SAO (Sample Adaptive Offset)). Further, the service server may filter the reconstructed image (for example, in-loop filtering) to obtain the filtered image. That is, after the current frame Fn has been encoded, the reconstructed frame corresponding to Fn may be obtained from the filtered image and placed into the reference frame queue to serve as a reference frame for the next frame, so that encoding can proceed frame by frame. A simplified sketch of this transform-quantize-reconstruct loop follows.
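For illustration, a simplified Python sketch of this transform-quantize-reconstruct loop is given below; prediction, entropy coding, and the DB/SAO filtering are omitted, and the flat quantization step qstep stands in for a real quantizer:

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(x):  return dct(dct(x, axis=0, norm="ortho"), axis=1, norm="ortho")
def idct2(x): return idct(idct(x, axis=0, norm="ortho"), axis=1, norm="ortho")

def encode_block(orig, pred, qstep):
    """Residual -> transform -> quantization (the coefficients that would be
    entropy-coded), followed by the inverse path used to build the
    reconstructed reference block."""
    residual = orig.astype(np.float64) - pred
    coeffs = np.round(dct2(residual) / qstep)   # quantized residual coefficients
    recon = pred + idct2(coeffs * qstep)        # inverse quantize + inverse transform
    return coeffs, np.clip(np.round(recon), 0, 255)
```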
During prediction, the process may start from the largest coding unit (Largest Coding Unit, LCU) and, following a quadtree, partition downward layer by layer with recursive computation.
First, partition from the top down. Starting at depth = 0, a 64×64 block is first split into four 32×32 sub-CUs; one of the 32×32 sub-CUs is then further split into four 16×16 sub-CUs, and so on until depth = 3, where the CU size is 8×8.
Then, prune from the bottom up. The sum of the RD costs (Rate Distortion Optimization cost, also called rate-distortion cost) of the four 8×8 CUs (denoted cost1) is compared with the RD cost of the corresponding 16×16 CU one level up (denoted cost2). If cost1 is smaller than cost2, the 8×8 CU partition is kept; otherwise pruning continues upward, comparing layer by layer. Finally, the optimal CU depth partition can be found. A recursive sketch of this prune-by-cost search follows.
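The following Python sketch illustrates this bottom-up pruning; the rd_cost function here is a toy stand-in (variance plus a constant rate term) for the encoder's real rate-distortion computation:

```python
import numpy as np

LAMBDA = 0.1  # illustrative rate weight (assumption)

def rd_cost(block):
    # Toy stand-in: distortion approximated by variance, rate by a constant.
    return float(np.var(block)) + LAMBDA * 16.0

def best_partition(block, depth=0, max_depth=3):
    """Top-down split to 8x8, then bottom-up pruning by RD cost."""
    cost_here = rd_cost(block)              # cost of coding this CU unsplit
    if depth == max_depth:                  # 8x8 leaves cannot split further
        return cost_here, "leaf"
    h, w = block.shape[0] // 2, block.shape[1] // 2
    split_cost, subtrees = 0.0, []
    for sub in (block[:h, :w], block[:h, w:], block[h:, :w], block[h:, w:]):
        cost, tree = best_partition(sub, depth + 1, max_depth)
        split_cost += cost
        subtrees.append(tree)
    if split_cost < cost_here:              # cost1 < cost2: keep the finer split
        return split_cost, subtrees
    return cost_here, "leaf"                # otherwise prune: code the CU whole

# e.g. best_partition(np.random.rand(64, 64) * 255)
```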
It can be understood that PU prediction may be divided into intra-frame and inter-frame prediction. Within the same prediction type, different PUs may first be compared to find the optimal partition mode; the intra-frame and inter-frame modes may then be compared against each other, thereby finding the optimal prediction mode for the current CU. Meanwhile, an adaptive quadtree-based transform (Residual Quad-tree Transform, RQT) may be applied to the CU to find the optimal TU mode. Finally, an image frame can be divided into CUs, together with the PUs and TUs under each CU.
It can be understood that, before encoding, the series of encoding parameters corresponding to the to-be-encoded media data (such as the frame encoding structure and rate-control parameters) can be determined quickly from the template configured for the scene, so that the encoded media data satisfies the required bitrate constraint while keeping encoding distortion as small as possible. Rate control belongs to the domain of rate-distortion optimization; it mainly determines the quantization parameters matching the bitrate and requires substantial computation. By configuring in advance according to the scene, this application can effectively reduce the amount of computation in encoding and improve encoding efficiency.
Optionally, when a terminal device collects media data (such as video data), it needs to encode that media data, and the collected media data may also be called to-be-encoded media data. The terminal device may likewise obtain the first encoding configuration template from the configuration template set according to the first business scene type, determine the frame encoding parameters of the to-be-encoded media data from that template, and encode the media data according to those parameters. The specific process is not repeated here.
In the embodiments of this application, different encoding configuration templates may be configured for different business scene types to generate a configuration template set, which may include each business scene type and its corresponding encoding configuration template (with a mapping relationship). Then, when the to-be-encoded media data is obtained, the first encoding configuration template corresponding to its first business scene type can be obtained quickly from the configuration template set; the frame encoding parameters of the to-be-encoded media data can then be determined from that template, and the to-be-encoded media data can be encoded according to those parameters to obtain first media data whose first media quality matches the first business scene type. In other words, by configuring different encoding configuration templates for different business scene types in advance, the first encoding configuration template for the first business scene type can be found by a simple lookup during encoding, and the frame encoding parameters of the to-be-encoded media data can be determined quickly from it; this effectively reduces the amount of computation in encoding, shortens encoding time, and improves the computer device's encoding efficiency for the to-be-encoded media data. Meanwhile, adaptively selecting the encoding configuration template according to the business scene type ensures that the selected first encoding configuration template matches the first business scene type and that the first media quality of the encoded first media data matches the first business scene type; that is, the frame encoding parameters determined from the first encoding configuration template meet the encoding requirements of the first business scene type and are scene-adapted, and the first media data obtained with those parameters also has high compression performance. In summary, this application can improve encoding efficiency and encoding compression performance.
Further, refer to FIG. 6, a schematic flowchart of determining frame encoding parameters according to the first encoding configuration template provided by an embodiment of this application. This flow may correspond to the flow of determining the frame encoding parameters of the to-be-encoded media data according to the first encoding configuration template in the embodiment corresponding to FIG. 3 above; the frame encoding parameters in this flow may include the frame encoding structure and the frame encoding quality parameters. As shown in FIG. 6, the flow may include at least the following steps S601 to S603:
Step S601: Obtain the frame type distribution and the frame layer distribution in the first encoding configuration template.
Specifically, the frame types here may include the first type, the second type, and the third type. In the embodiments of this application, the intra-coded frame (intra picture, I frame) type is called the first type, the bi-directional predicted frame (bi-directional interpolated prediction frame, B frame) type is called the second type, and the forward predicted frame (predictive frame, P frame) type is called the third type. The frame layer distribution may refer to the layered encoding structure (the total number of layers, the number of frames per layer, and so on). For example, the layered encoding structure 40 in the embodiment corresponding to FIG. 4 above may serve as the frame layer distribution: it has 5 layers in total; layer 1 contains 2 frames (frame 1 and frame 16, where frame 1 is an I frame and frame 16 is a P frame), layer 2 contains 1 frame (frame 8, a B frame), and layer 3 contains 2 frames (frames 4 and 12, both B frames); the structure also includes layers 4 and 5, which are not elaborated further here.
Step S602: Determine the frame encoding structure corresponding to the to-be-encoded media data according to the frame type distribution and the frame layer distribution.
Specifically, this application may configure frame types and the layered encoding structure at GOP granularity, so the frame encoding structure corresponding to each GOP (here called a unit data frame group) may be determined at GOP granularity from the first encoding configuration template. A specific method may be: obtain the unit data frame group corresponding to the to-be-encoded media data, where a unit data frame group consists of N consecutive to-be-encoded data frames, the to-be-encoded media data includes the to-be-encoded data frames, and N is a positive integer; then obtain the frame-group frame type distribution corresponding to the unit data frame group in the frame type distribution (that is, the frame type distribution configured for the frames in each GOP), and divide the to-be-encoded data frames in the unit data frame group by type according to that frame-group frame type distribution to obtain type-divided data frames; then obtain the frame-group frame layer distribution corresponding to the unit data frame group in the frame layer distribution, and divide the type-divided data frames by layer according to the frame layer distribution to obtain the layered encoding structure corresponding to the unit data frame group; finally, determine the layered encoding structure as the frame encoding structure corresponding to the to-be-encoded media data.
In one embodiment, step S602 may obtain each unit data frame group in the to-be-encoded media data, where each unit data frame group consists of N consecutive to-be-encoded data frames and N is a positive integer; divide the to-be-encoded data frames in each unit data frame group by type according to the frame type distribution to obtain type-divided data frames; divide the type-divided data frames by layer according to the frame layer distribution to obtain the layered encoding structure corresponding to each unit data frame group; and determine the frame encoding structure corresponding to the to-be-encoded media data according to the layered encoding structure corresponding to each unit data frame group.
For example, taking the layered encoding structure 40 as the frame layer distribution corresponding to a GOP: after the frames in some GOP are divided by type to obtain the I/B/P distribution, these frames may be filled into the layered encoding structure 40; for instance, if frame 1 is an I frame, frame 1 may be filled into the position of frame 1 in layer 1 of the layered encoding structure 40. After the frame type and layer of every frame are determined, the layered encoding structure including all frames may be determined as the frame encoding structure corresponding to that GOP.
Step S603: Obtain the encoding quality parameters in the first encoding configuration template, and perform quality parameter configuration on the to-be-encoded media data according to the encoding quality parameters to obtain the frame encoding quality parameters corresponding to the to-be-encoded media data.
Specifically, as described above, the frame encoding structure may be a layered encoding structure; here take as an example a layered encoding structure that includes a first layer and a second layer, the second layer being higher than the first layer. This application may also configure frame encoding quality parameters (such as the quantization parameter QP and rate-control parameters) for each layer. Taking encoding quality parameters that include a first encoding quality parameter for the first layer and a second encoding quality parameter for the second layer as an example, a specific method for performing quality parameter configuration on the to-be-encoded media data according to the encoding quality parameters to obtain the frame encoding quality parameters may be: obtain, from the to-be-encoded media data, the first to-be-encoded data frames at the first layer of the layered encoding structure and the second to-be-encoded data frames at the second layer; this application may directly take the first encoding quality parameter as the frame encoding quality parameter of the first to-be-encoded data frames, and the second encoding quality parameter as that of the second to-be-encoded data frames. A sketch of this per-layer assignment follows.
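For illustration, a minimal Python sketch of this per-layer assignment follows; the dictionary layout is an assumption of the example:

```python
def assign_layer_params(layered_gop, layer_qp, layer_rc):
    """Give every frame the quality parameters configured for its layer.
    layered_gop maps a 0-indexed layer to the frame indices it contains;
    layer_qp and layer_rc hold the per-layer QP and rate-control settings."""
    params = {}
    for layer, frames in layered_gop.items():
        for frame in frames:
            params[frame] = {"qp": layer_qp[layer], "rate_control": layer_rc[layer]}
    return params

# e.g. assign_layer_params({0: [0, 16], 1: [8]}, [22, 24], ["vbr", "vbr"])
```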
Optionally, it can be understood that this application may also adjust the template configured for the scene according to the device information of the decoding end (such as the terminal device waiting to play the media data). For example, the device information may include network state information, terminal decoding capability, and so on; this application may then adjust the template (for example, adjust the encoding quality parameters or the number of layers in the layered encoding structure) according to the terminal device's network state information, terminal decoding capability, and the like.
Taking adjusting the encoding quality parameters as an example, another specific method in this application for performing quality parameter configuration on the to-be-encoded media data according to the encoding quality parameters to obtain the frame encoding quality parameters may be: obtain, from the to-be-encoded media data, the first to-be-encoded data frames at the first layer of the layered encoding structure and the second to-be-encoded data frames at the second layer; then obtain the device indicator information of a first terminal, where the first terminal refers to the terminal waiting to play the to-be-encoded media data; if the device indicator information satisfies the parameter adjustment condition, determine the frame encoding quality parameter of the first to-be-encoded data frames from the device indicator information and the first encoding quality parameter, and determine the frame encoding quality parameter of the second to-be-encoded data frames from the device indicator information and the second encoding quality parameter.
Taking device indicator information that includes a network quality parameter and decoding computing-power information as an example, a specific method for judging whether the device indicator information satisfies the parameter adjustment condition may be: match the network quality parameter against a network parameter threshold, and match the decoding computing-power information against a computing-power threshold; if the network quality parameter is greater than (or equal to) the network parameter threshold and the decoding computing-power information is greater than (or equal to) the computing-power threshold, it may be determined that the device indicator information does not satisfy the parameter adjustment condition; if the network quality parameter is smaller than the network parameter threshold, or the decoding computing-power information is smaller than the computing-power threshold, it may be determined that the device indicator information satisfies the parameter adjustment condition.
Optionally, if the device indicator information does not satisfy the parameter adjustment condition, the first encoding quality parameter may be determined directly as the frame encoding quality parameter of the first to-be-encoded data frames, and the second encoding quality parameter as that of the second to-be-encoded data frames.
When the device indicator information satisfies the parameter adjustment condition, a specific implementation of determining the frame encoding quality parameter of the first to-be-encoded data frames from the device indicator information and the first encoding quality parameter may be: obtain a parameter mapping table, which may contain mapping relationships between a set of network quality intervals and a set of adapted rate-control parameters, where one network quality interval has a mapping relationship with one adapted rate-control parameter; then obtain the network quality parameter of the first terminal, obtain from the set of network quality intervals the first network quality interval to which that parameter belongs, and obtain the network-adapted rate-control parameter that has a mapping relationship with the first network quality interval; the frame encoding quality parameter of the first to-be-encoded data frames can then be determined from the scene rate-control parameter and the network-adapted rate-control parameter. A sketch of such an interval lookup follows.
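The following Python sketch illustrates such a parameter mapping table; the intervals and bitrate values are assumptions made for the example:

```python
# (low, high) network-quality interval -> network-adapted bitrate in kbit/s
PARAM_MAP = [((0.0, 0.3), 500), ((0.3, 0.7), 1500), ((0.7, 1.0), 4000)]

def network_adapted_bitrate(quality):
    """Look up the adapted rate-control parameter for a quality score in [0, 1]."""
    for (lo, hi), bitrate in PARAM_MAP:
        if lo <= quality <= hi:   # intervals treated as closed; first match wins
            return bitrate
    raise ValueError("network quality parameter out of range")
```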
In a feasible embodiment, a specific implementation of determining the frame encoding quality parameter of the first to-be-encoded data frames from the scene rate-control parameter and the network-adapted rate-control parameter may be: obtain the first operation coefficient corresponding to the first business scene type and the second operation coefficient corresponding to the network quality parameter; then perform operation processing on the first operation coefficient and the scene rate-control parameter to obtain a first operated rate-control parameter; perform operation processing on the second operation coefficient and the network-adapted rate-control parameter to obtain a second operated rate-control parameter; determine the mean of the first and second operated rate-control parameters, and determine that mean as the frame encoding quality parameter of the first to-be-encoded data frames.
It can be understood that when the device indicator information satisfies the parameter adjustment condition, that is, when the terminal device's network state or decoding capability is poor, the frame encoding quality parameters may be appropriately lowered (for example, the bitrate may be reduced) to match the terminal device's network state or decoding capability. Different rate-control parameters (called network-adapted rate-control parameters) may be pre-configured for different network states (such as network quality intervals) or different decoding capabilities. When it is determined that the device indicator information satisfies the parameter adjustment condition, the network state (or decoding capability) at the terminal device may be obtained, and the corresponding network-adapted rate-control parameter obtained according to that network state; then the mean of the scene rate-control parameter (that is, the rate-control parameter configured for the business scene type) and the network-adapted rate-control parameter may be determined, and that mean may be determined as the frame encoding quality parameter of the to-be-encoded data frames.
In a feasible embodiment, operation coefficients (which may be understood as weights) may also be configured for the scene and for the terminal device's network state (or decoding capability). After the scene rate-control parameter and the network-adapted rate-control parameter are determined, when combining the two, the scene rate-control parameter may first be processed (for example, multiplied) with its first operation coefficient to obtain the first operated rate-control parameter (the product); at the same time, the network-adapted rate-control parameter may be processed (for example, multiplied) with the second operation coefficient to obtain the second operated rate-control parameter; the mean of the first and second operated rate-control parameters is then determined as the frame encoding quality parameter of the to-be-encoded data frames.
Optionally, in a feasible embodiment, a specific implementation of determining the frame encoding quality parameter of the first to-be-encoded data frames from the scene rate-control parameter and the network-adapted rate-control parameter may be: compare the scene rate-control parameter with the network-adapted rate-control parameter to determine the minimum rate-control parameter between the two, and determine that minimum rate-control parameter as the frame encoding quality parameter of the first to-be-encoded data frames.
It should be understood that when the terminal device's network state or decoding capability is poor, the minimum rate-control parameter between the scene rate-control parameter and the network-adapted rate-control parameter may be taken as the frame encoding quality parameter of the to-be-encoded data frames, thereby ensuring that the terminal device can decode smoothly.
Optionally, in a feasible embodiment, a specific implementation of determining the frame encoding quality parameter of the first to-be-encoded data frames from the scene rate-control parameter and the network-adapted rate-control parameter may also be: after the network-adapted rate-control parameter is obtained, directly determine it as the frame encoding quality parameter of the to-be-encoded data frames (that is, replace the scene rate-control parameter with the network-adapted rate-control parameter). The adjustment condition and these combination strategies are sketched together below.
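For illustration, the following Python sketch gathers the adjustment condition above and the three combination strategies (weighted mean, minimum, and direct replacement); the default coefficients are assumptions of the example:

```python
def needs_adjustment(net_quality, decode_power, net_thresh, power_thresh):
    """Parameter-adjustment condition: no adjustment is needed only when both
    indicators clear their thresholds; otherwise adjust (see the rules above)."""
    return net_quality < net_thresh or decode_power < power_thresh

def combine_bitrates(scene_rc, net_rc, w_scene=1.0, w_net=1.0, mode="weighted"):
    """Merge the scene-configured and network-adapted rate-control parameters.
    'weighted': multiply each by its coefficient, then take the mean of the two
    products; 'min': take the smaller (safer) value; 'replace': use the
    network-adapted value directly."""
    if mode == "weighted":
        return (w_scene * scene_rc + w_net * net_rc) / 2
    if mode == "min":
        return min(scene_rc, net_rc)
    return net_rc
```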
The computing-power information in this application may refer to the hardware or network resources that a computer device (such as a service server or terminal device) needs to occupy when executing computing tasks, and usually may include central processing unit (CPU) computing-power information, graphics processing unit (GPU) computing-power information, memory resources, network bandwidth resources, disk resources, and so on.
Optionally, it can be understood that this application may also pre-configure different templates for the same business scene type according to different network states. For example, for the same business scene type, three network configuration scenarios may be distinguished: for a good network state (for example, a network quality parameter greater than a first threshold), the frame encoding quality parameters (such as bitrate) in the template may be set relatively high; for a medium network state (for example, a network quality parameter between a second threshold and the first threshold, where the second threshold is smaller than the first), the frame encoding quality parameters (such as bitrate) in the template may be set slightly lower than in the good-network case; for a poor network state, the frame encoding quality parameters in the template may be set relatively low. Then, after the first business scene type of the to-be-encoded media data is determined, the network state of the terminal device may first be obtained, and the corresponding first encoding configuration template may be determined jointly from the first business scene type and the network state.
In the embodiments of this application, different encoding configuration templates may be configured for different business scene types to generate a configuration template set, which may include each business scene type and its corresponding encoding configuration template (with a mapping relationship). Then, when the to-be-encoded media data is obtained, the first encoding configuration template corresponding to its first business scene type can be obtained quickly from the configuration template set; that template may then be appropriately adjusted according to the real-time network state (or terminal decoding capability), the frame encoding parameters of the to-be-encoded media data finally determined, and the to-be-encoded media data encoded according to those parameters to obtain first media data with first media quality matching the first business scene type. In other words, by configuring different encoding configuration templates for different business scene types in advance, the first encoding configuration template for the first business scene type can be found by a simple lookup during encoding, and the frame encoding parameters of the to-be-encoded media data can be determined quickly from the first encoding configuration template and the network state (or terminal decoding capability); this effectively reduces the amount of computation in encoding, shortens encoding time, and improves the encoding efficiency for the to-be-encoded media data. Meanwhile, adaptively selecting and adjusting the encoding configuration template according to the business scene type and the network state (or terminal decoding capability) ensures that the determined first encoding configuration template matches both the first business scene type and the network state, and that the first media quality of the encoded first media data matches both as well; that is, the frame encoding parameters determined from the first encoding configuration template meet the encoding requirements of the first business scene type, are scene-adapted, and also meet the needs of the network state, and the first media data obtained with those parameters has high compression performance. In summary, this application can improve encoding efficiency and encoding compression performance.
Further, refer to FIG. 7, a system flowchart provided by an embodiment of this application. As shown in FIG. 7, the flow may include at least the following steps S71 to S76:
Step S71: Input the encoding scene parameters.
Specifically, the encoding scene parameters here may refer to the parameters corresponding to the business scene type to which the to-be-encoded media data belongs. After the media data encoder (such as a video encoder) is initialized, the encoding scene parameters may be input.
Step S72: Select the layered template configuration according to the scene.
Specifically, different encoding configuration templates may be configured for different business scene types; the encoding configuration template here may refer to a layered encoding configuration template. After the encoding scene parameters are input, the corresponding layered encoding configuration template may be obtained according to the scene parameters.
Step S73: Receive the terminal network state.
Step S74: Adjust the layered template configuration in real time.
Specifically, the selected layered template configuration may be adjusted in real time according to the terminal network state; for example, the number of layers of the layered encoding structure, the number of frames per layer, and the frame encoding quality parameters of each layer (such as quantization parameters and rate-control parameters) in the template configuration may be adjusted.
Step S75: Perform encoding.
Specifically, the to-be-encoded data frames of the to-be-encoded media data may be encoded according to the adjusted layered template configuration.
Step S76: Output the encoding result. An end-to-end sketch of this flow follows.
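For illustration only, the following Python sketch strings steps S71 to S76 together, reusing the helper sketches introduced earlier (get_template, needs_adjustment, combine_bitrates, network_adapted_bitrate); encode_frame and the threshold/bitrate constants are placeholders assumed for the example:

```python
NET_THRESH, POWER_THRESH = 0.5, 0.5   # illustrative thresholds (assumptions)
SCENE_BITRATE = 3000                  # scene-configured bitrate in kbit/s (assumption)

def encode_frame(frame, template, bitrate):
    # Placeholder for the real encoder call; returns an empty bitstream chunk.
    return b""

def encode_with_scene_template(frames, scene_type, net_quality, decode_power):
    template = get_template(scene_type)                   # S71-S72: scene -> template
    bitrate = SCENE_BITRATE
    if needs_adjustment(net_quality, decode_power,
                        NET_THRESH, POWER_THRESH):        # S73-S74: real-time adjustment
        bitrate = combine_bitrates(SCENE_BITRATE,
                                   network_adapted_bitrate(net_quality),
                                   mode="min")
    return [encode_frame(f, template, bitrate) for f in frames]   # S75-S76
```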
For the specific implementation of steps S71 to S76, refer to the description of steps S101 to S104 in the embodiment corresponding to FIG. 3 above, which is not repeated here; nor are the beneficial effects repeated here.
Further, refer to FIG. 8, a schematic structural diagram of a data processing apparatus provided by an embodiment of this application. The data processing apparatus may be a computer program (including program code) running on a computer device; for example, the data processing apparatus is application software. The apparatus may be used to execute the method shown in FIG. 3. As shown in FIG. 8, the data processing apparatus 1 may include: a data obtaining module 11, a template obtaining module 12, a parameter determining module 13, and a data encoding module 14.
The data obtaining module 11 is configured to obtain to-be-encoded media data and the first business scene type to which it belongs.
The template obtaining module 12 is configured to obtain, according to the mapping relationships between at least two encoding configuration templates and at least two business scene types in a configuration template set, the first encoding configuration template corresponding to the first business scene type. One encoding configuration template has a mapping relationship with one business scene type; the at least two encoding configuration templates include the first encoding configuration template.
The parameter determining module 13 is configured to determine the frame encoding parameters of the to-be-encoded media data according to the first encoding configuration template.
The data encoding module 14 is configured to encode the to-be-encoded media data according to the determined frame encoding parameters to obtain first media data; the first media data has first media quality, and the first media quality matches the first business scene type.
For the specific implementations of the data obtaining module 11, the template obtaining module 12, the parameter determining module 13, and the data encoding module 14, refer to the description of steps S101 to S104 in the embodiment corresponding to FIG. 3 above, which is not repeated here.
In one embodiment, the template obtaining module 12 may include: a type traversing unit 121 and a template determining unit 122.
The type traversing unit 121 is configured to traverse the at least two business scene types in the configuration template set.
The template determining unit 122 is configured to: if, among the at least two business scene types, there exists a business scene type identical to the first business scene type, determine the encoding configuration template that has a mapping relationship with the first business scene type, among the at least two encoding configuration templates, as the first encoding configuration template corresponding to the first business scene type.
The template determining unit 122 is further configured to: if none of the at least two business scene types is identical to the first business scene type, determine the scene similarities between each of the at least two business scene types and the first business scene type.
The template determining unit 122 is further configured to determine the first encoding configuration template corresponding to the first business scene type according to the at least two scene similarities.
For the specific implementations of the type traversing unit 121 and the template determining unit 122, refer to the description of step S102 in the embodiment corresponding to FIG. 3 above, which is not repeated here.
In one embodiment, the template determining unit 122 may include: a matching subunit 1221 and a template determining subunit 1222.
The matching subunit 1221 is configured to obtain the maximum scene similarity among the at least two scene similarities and match it against a scene similarity threshold.
The template determining subunit 1222 is configured to: if the maximum scene similarity is greater than the scene similarity threshold, determine the business scene type corresponding to the maximum scene similarity, among the at least two business scene types, as the matching business scene type, and determine the encoding configuration template that has a mapping relationship with the matching business scene type, among the at least two encoding configuration templates, as the first encoding configuration template corresponding to the first business scene type.
For the specific implementations of the matching subunit 1221 and the template determining subunit 1222, refer to the description of step S102 in the embodiment corresponding to FIG. 3 above, which is not repeated here.
In one embodiment, the frame encoding parameters include a frame encoding structure and frame encoding quality parameters.
The parameter determining module 13 may include: a template distribution obtaining unit 131, an encoding structure determining unit 132, and a quality parameter determining unit 133.
The template distribution obtaining unit 131 is configured to obtain the frame type distribution and the frame layer distribution in the first encoding configuration template.
The encoding structure determining unit 132 is configured to determine the frame encoding structure corresponding to the to-be-encoded media data according to the frame type distribution and the frame layer distribution.
The quality parameter determining unit 133 is configured to obtain the encoding quality parameters in the first encoding configuration template and perform quality parameter configuration on the to-be-encoded media data according to them to obtain the frame encoding quality parameters corresponding to the to-be-encoded media data.
For the specific implementations of the template distribution obtaining unit 131, the encoding structure determining unit 132, and the quality parameter determining unit 133, refer to the description of steps S103 and S104 in the embodiment corresponding to FIG. 3 above, which is not repeated here.
In one embodiment, the encoding structure determining unit 132 may include: a unit frame group obtaining subunit 1321, a type distribution determining subunit 1322, and an encoding structure determining subunit 1323.
The unit frame group obtaining subunit 1321 is configured to obtain each unit data frame group in the to-be-encoded media data; a unit data frame group consists of N consecutive to-be-encoded data frames, the to-be-encoded media data includes the to-be-encoded data frames, and N is a positive integer.
The type distribution determining subunit 1322 is configured to divide the to-be-encoded data frames in each unit data frame group by type according to the frame-group frame type distribution to obtain type-divided data frames.
The encoding structure determining subunit 1323 is configured to divide the type-divided data frames by layer according to the frame layer distribution to obtain the layered encoding structure corresponding to each unit data frame group.
The encoding structure determining subunit 1323 is further configured to determine the frame encoding structure corresponding to the to-be-encoded media data according to the layered encoding structure corresponding to each unit data frame group.
For the specific implementations of the unit frame group obtaining subunit 1321, the type distribution determining subunit 1322, and the encoding structure determining subunit 1323, refer to the description of steps S103 and S104 in the embodiment corresponding to FIG. 3 above, which is not repeated here.
In one embodiment, the frame encoding structure is a layered encoding structure; the layered encoding structure includes a first layer and a second layer, the second layer being higher than the first layer; and the encoding quality parameters include a first encoding quality parameter corresponding to the first layer and a second encoding quality parameter corresponding to the second layer.
The quality parameter determining unit 133 may include: an encoding frame obtaining subunit 1331, a device information obtaining subunit 1332, and a first quality parameter determining subunit 1333.
The encoding frame obtaining subunit 1331 is configured to obtain, from the to-be-encoded media data, the first to-be-encoded data frames at the first layer of the layered encoding structure and the second to-be-encoded data frames at the second layer.
The device information obtaining subunit 1332 is configured to obtain the device indicator information of a first terminal; the first terminal refers to the terminal waiting to play the to-be-encoded media data.
The first quality parameter determining subunit 1333 is configured to: if the device indicator information satisfies the parameter adjustment condition, determine the frame encoding quality parameter of the first to-be-encoded data frames from the device indicator information and the first encoding quality parameter, and determine the frame encoding quality parameter of the second to-be-encoded data frames from the device indicator information and the second encoding quality parameter.
For the specific implementations of the encoding frame obtaining subunit 1331, the device information obtaining subunit 1332, and the first quality parameter determining subunit 1333, refer to the description of steps S103 and S104 in the embodiment corresponding to FIG. 3 above, which is not repeated here.
In one embodiment, the device indicator information includes a network quality parameter and decoding computing-power information.
The quality parameter determining unit 133 may further include: a device information matching subunit 1334 and a condition determining subunit 1335.
The device information matching subunit 1334 is configured to match the network quality parameter against a network parameter threshold and match the decoding computing-power information against a computing-power threshold.
The condition determining subunit 1335 is configured to: if the network quality parameter is greater than the network parameter threshold and the decoding computing-power information is greater than the computing-power threshold, determine that the device indicator information does not satisfy the parameter adjustment condition.
The condition determining subunit 1335 is further configured to: if the network quality parameter is smaller than the network parameter threshold, or the decoding computing-power information is smaller than the computing-power threshold, determine that the device indicator information satisfies the parameter adjustment condition.
For the specific implementations of the device information matching subunit 1334 and the condition determining subunit 1335, refer to the description of steps S103 and S104 in the embodiment corresponding to FIG. 3 above, which is not repeated here.
In one embodiment, the quality parameter determining unit 133 may further include: a second quality parameter determining subunit 1336.
The second quality parameter determining subunit 1336 is configured to: if the device indicator information does not satisfy the parameter adjustment condition, determine the first encoding quality parameter as the frame encoding quality parameter of the first to-be-encoded data frames, and determine the second encoding quality parameter as the frame encoding quality parameter of the second to-be-encoded data frames.
For the specific implementation of the second quality parameter determining subunit 1336, refer to the description of steps S103 and S104 in the embodiment corresponding to FIG. 3 above, which is not repeated here.
In one embodiment, the first encoding quality parameter includes a scene rate-control parameter, and the device indicator information includes a network quality parameter.
The first quality parameter determining subunit 1333 is further specifically configured to obtain a parameter mapping table; the parameter mapping table contains mapping relationships between a set of network quality intervals and a set of adapted rate-control parameters, where one network quality interval has a mapping relationship with one adapted rate-control parameter.
The first quality parameter determining subunit 1333 is further specifically configured to obtain the network quality parameter corresponding to the first terminal, obtain from the set of network quality intervals the first network quality interval to which the network quality parameter belongs, and obtain the network-adapted rate-control parameter that has a mapping relationship with the first network quality interval.
The first quality parameter determining subunit 1333 is further specifically configured to determine the frame encoding quality parameter of the first to-be-encoded data frames from the scene rate-control parameter and the network-adapted rate-control parameter.
In one embodiment, the first quality parameter determining subunit 1333 is further specifically configured to obtain the first operation coefficient corresponding to the first business scene type and the second operation coefficient corresponding to the network quality parameter.
The first quality parameter determining subunit 1333 is further specifically configured to perform operation processing on the first operation coefficient and the scene rate-control parameter to obtain a first operated rate-control parameter.
The first quality parameter determining subunit 1333 is further specifically configured to perform operation processing on the second operation coefficient and the network-adapted rate-control parameter to obtain a second operated rate-control parameter.
The first quality parameter determining subunit 1333 is further specifically configured to determine the mean of the first and second operated rate-control parameters and determine that mean as the frame encoding quality parameter of the first to-be-encoded data frames.
In one embodiment, the first quality parameter determining subunit 1333 is further specifically configured to compare the scene rate-control parameter with the network-adapted rate-control parameter to determine the minimum rate-control parameter between the two.
The first quality parameter determining subunit 1333 is further specifically configured to determine the minimum rate-control parameter as the frame encoding quality parameter of the first to-be-encoded data frames.
In the embodiments of this application, different encoding configuration templates may be configured for different business scene types to generate a configuration template set, which may include each business scene type and its corresponding encoding configuration template (with a mapping relationship). Then, when the to-be-encoded media data is obtained, the first encoding configuration template corresponding to its first business scene type can be obtained quickly from the configuration template set; that template may then be appropriately adjusted according to the real-time network state (or terminal decoding capability), the frame encoding parameters of the to-be-encoded media data finally determined, and the to-be-encoded media data encoded according to those parameters to obtain first media data with first media quality matching the first business scene type. In other words, by configuring different encoding configuration templates for different business scene types in advance, the first encoding configuration template for the first business scene type can be found by a simple lookup during encoding, and the frame encoding parameters of the to-be-encoded media data can be determined quickly from the first encoding configuration template and the network state (or terminal decoding capability); this effectively reduces the amount of computation in encoding, shortens encoding time, and improves the encoding efficiency for the to-be-encoded media data. Meanwhile, adaptively selecting and adjusting the encoding configuration template according to the business scene type and the network state (or terminal decoding capability) ensures that the determined first encoding configuration template matches both the first business scene type and the network state, and that the first media quality of the encoded first media data matches both as well; that is, the frame encoding parameters determined from the first encoding configuration template meet the encoding requirements of the first business scene type, are scene-adapted, and also meet the needs of the network state, and the first media data obtained with those parameters has high compression performance. In summary, this application can improve the encoding efficiency of the computer device and improve encoding compression performance.
Further, refer to FIG. 9, a schematic structural diagram of a computer device provided by an embodiment of this application. As shown in FIG. 9, the apparatus 1 in the embodiment corresponding to FIG. 8 above may be applied to the computer device 8000, which may include: a processor 8001, a network interface 8004, and a memory 8005; in addition, the computer device 8000 may further include a user interface 8003 and at least one communication bus 8002. The communication bus 8002 is used to implement connection and communication between these components. The user interface 8003 may include a display (Display) and a keyboard (Keyboard); optionally, the user interface 8003 may further include standard wired and wireless interfaces. The network interface 8004 may optionally include standard wired and wireless interfaces (such as a WI-FI interface). The memory 8005 may be high-speed RAM or non-volatile memory, for example at least one disk memory; optionally, the memory 8005 may also be at least one storage apparatus located far away from the aforementioned processor 8001. As shown in FIG. 9, the memory 8005, as a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application.
In the computer device 8000 shown in FIG. 9, the network interface 8004 may provide network communication functions, the user interface 8003 is mainly used to provide an input interface for the user, and the processor 8001 may be used to invoke the device control application stored in the memory 8005 to implement:
obtaining to-be-encoded media data and the first business scene type to which it belongs;
obtaining, according to the mapping relationships between at least two encoding configuration templates and at least two business scene types in a configuration template set, the first encoding configuration template corresponding to the first business scene type, where one encoding configuration template has a mapping relationship with one business scene type and the at least two encoding configuration templates include the first encoding configuration template; and
determining the frame encoding parameters of the to-be-encoded media data according to the first encoding configuration template, and encoding the to-be-encoded media data according to those frame encoding parameters to obtain first media data with first media quality, where the first media quality matches the first business scene type.
It should be understood that the computer device 8000 described in this embodiment of this application can execute the description of the data processing method in the embodiments corresponding to FIG. 3 to FIG. 7 above, and can also execute the description of the data processing apparatus 1 in the embodiment corresponding to FIG. 8 above, which is not repeated here; nor is the description of the beneficial effects of the same method repeated here.
In addition, it should be pointed out that an embodiment of this application further provides a computer-readable storage medium storing the computer program executed by the aforementioned computer device 8000 for data processing, where the computer program includes program instructions that, when executed by the processor, can perform the description of the data processing method in the embodiments corresponding to FIG. 3 to FIG. 7 above, which is therefore not repeated here; nor is the description of the beneficial effects of the same method repeated here. For technical details not disclosed in the computer-readable storage medium embodiments of this application, refer to the description of the method embodiments of this application.
The computer-readable storage medium above may be the data processing apparatus provided in any of the foregoing embodiments or an internal storage unit of the computer device above, such as the hard disk or memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device. Further, the computer-readable storage medium may include both the internal storage unit and an external storage device of the computer device. The computer-readable storage medium is used to store the computer program and other programs and data required by the computer device, and may also be used to temporarily store data that has been or will be output.
One aspect of this application provides a computer program product or computer program, including computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to execute the method provided in one aspect of the embodiments of this application.
The terms "first", "second", and the like in the specification, claims, and drawings of the embodiments of this application are used to distinguish different objects, not to describe a particular order. Furthermore, the term "include" and any variants thereof are intended to cover non-exclusive inclusion. For example, a process, method, apparatus, product, or device comprising a series of steps or units is not limited to the listed steps or modules, but optionally further includes steps or modules not listed, or optionally further includes other step units inherent to such processes, methods, apparatuses, products, or devices.
A person of ordinary skill in the art may realize that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described generally by function in the description above. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. Professionals may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
The methods and related apparatuses provided by the embodiments of this application are described with reference to the method flowcharts and/or structural schematic diagrams provided by the embodiments; specifically, each flow and/or block of the method flowcharts and/or structural diagrams, and combinations thereof, may be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the structural diagrams. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the structural diagrams. These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the structural diagrams.
What is disclosed above is merely preferred embodiments of this application, which certainly cannot be used to limit the scope of rights of this application; therefore, equivalent changes made according to the claims of this application still fall within the scope covered by this application.

Claims (16)

  1. A data processing method, executed in a computer device, the method comprising:
    obtaining to-be-encoded media data and a first business scene type to which the to-be-encoded media data belongs;
    obtaining, according to mapping relationships between at least two encoding configuration templates and at least two business scene types in a configuration template set, a first encoding configuration template corresponding to the first business scene type;
    determining frame encoding parameters of the to-be-encoded media data according to the first encoding configuration template; and
    encoding the to-be-encoded media data according to the determined frame encoding parameters to obtain first media data.
  2. The method according to claim 1, wherein the obtaining, according to mapping relationships between at least two encoding configuration templates and at least two business scene types in a configuration template set, a first encoding configuration template corresponding to the first business scene type comprises:
    traversing the at least two business scene types in the configuration template set;
    if, among the at least two business scene types, there exists a business scene type identical to the first business scene type, determining, among the at least two encoding configuration templates, the encoding configuration template having a mapping relationship with the first business scene type as the first encoding configuration template corresponding to the first business scene type; and
    if, among the at least two business scene types, there exists no business scene type identical to the first business scene type, determining scene similarities between each of the at least two business scene types and the first business scene type, and determining the first encoding configuration template corresponding to the first business scene type according to at least two scene similarities.
  3. The method according to claim 2, wherein the determining the first encoding configuration template corresponding to the first business scene type according to at least two scene similarities comprises:
    obtaining a maximum scene similarity among the at least two scene similarities, and matching the maximum scene similarity against a scene similarity threshold; and
    if the maximum scene similarity is greater than the scene similarity threshold, determining, among the at least two business scene types, the business scene type corresponding to the maximum scene similarity as a matching business scene type, and determining the encoding configuration template having a mapping relationship with the matching business scene type as the first encoding configuration template corresponding to the first business scene type.
  4. The method according to claim 1, wherein the frame encoding parameters comprise a frame encoding structure and frame encoding quality parameters;
    the determining frame encoding parameters of the to-be-encoded media data according to the first encoding configuration template comprises:
    obtaining a frame type distribution and a frame layer distribution in the first encoding configuration template;
    determining the frame encoding structure corresponding to the to-be-encoded media data according to the frame type distribution and the frame layer distribution; and
    obtaining encoding quality parameters in the first encoding configuration template, and performing quality parameter configuration on the to-be-encoded media data according to the encoding quality parameters to obtain the frame encoding quality parameters corresponding to the to-be-encoded media data.
  5. The method according to claim 4, wherein the determining the frame encoding structure corresponding to the to-be-encoded media data according to the frame type distribution and the frame layer distribution comprises:
    obtaining each unit data frame group in the to-be-encoded media data, wherein each unit data frame group consists of N consecutive to-be-encoded data frames, N being a positive integer;
    dividing the to-be-encoded data frames in each unit data frame group by type according to the frame type distribution to obtain type-divided data frames;
    dividing the type-divided data frames by layer according to the frame layer distribution to obtain a layered encoding structure corresponding to each unit data frame group; and
    determining the frame encoding structure corresponding to the to-be-encoded media data according to the layered encoding structure corresponding to each unit data frame group.
  6. The method according to claim 4, wherein the frame encoding structure is a layered encoding structure, the layered encoding structure comprises a first layer and a second layer, the second layer being higher than the first layer; and the encoding quality parameters comprise a first encoding quality parameter corresponding to the first layer and a second encoding quality parameter corresponding to the second layer;
    the performing quality parameter configuration on the to-be-encoded media data according to the encoding quality parameters to obtain the frame encoding quality parameters corresponding to the to-be-encoded media data comprises:
    obtaining, from the to-be-encoded media data, first to-be-encoded data frames at the first layer of the layered encoding structure and second to-be-encoded data frames at the second layer of the layered encoding structure;
    obtaining device indicator information of a first terminal, wherein the first terminal refers to a terminal waiting to play the to-be-encoded media data; and
    if the device indicator information satisfies a parameter adjustment condition, determining a frame encoding quality parameter corresponding to the first to-be-encoded data frames according to the device indicator information and the first encoding quality parameter, and determining a frame encoding quality parameter corresponding to the second to-be-encoded data frames according to the device indicator information and the second encoding quality parameter.
  7. The method according to claim 6, wherein the device indicator information comprises a network quality parameter and decoding computing-power information;
    the method further comprises:
    if the network quality parameter is greater than a network parameter threshold and the decoding computing-power information is greater than a computing-power threshold, determining that the device indicator information does not satisfy the parameter adjustment condition; and
    if the network quality parameter is smaller than the network parameter threshold, or the decoding computing-power information is smaller than the computing-power threshold, determining that the device indicator information satisfies the parameter adjustment condition.
  8. The method according to claim 6, wherein the method further comprises:
    if the device indicator information does not satisfy the parameter adjustment condition, determining the first encoding quality parameter as the frame encoding quality parameter corresponding to the first to-be-encoded data frames, and determining the second encoding quality parameter as the frame encoding quality parameter corresponding to the second to-be-encoded data frames.
  9. The method according to claim 6, wherein the first encoding quality parameter comprises a scene rate-control parameter, and the device indicator information comprises a network quality parameter;
    the determining a frame encoding quality parameter corresponding to the first to-be-encoded data frames according to the device indicator information and the first encoding quality parameter comprises:
    obtaining a parameter mapping table, wherein the parameter mapping table contains mapping relationships between a set of network quality intervals and a set of adapted rate-control parameters;
    obtaining the network quality parameter corresponding to the first terminal, obtaining, from the set of network quality intervals, a first network quality interval to which the network quality parameter belongs, and obtaining a network-adapted rate-control parameter having a mapping relationship with the first network quality interval; and
    determining the frame encoding quality parameter corresponding to the first to-be-encoded data frames according to the scene rate-control parameter and the network-adapted rate-control parameter.
  10. The method according to claim 9, wherein the determining the frame encoding quality parameter corresponding to the first to-be-encoded data frames according to the scene rate-control parameter and the network-adapted rate-control parameter comprises:
    obtaining a first operation coefficient corresponding to the first business scene type and a second operation coefficient corresponding to the network quality parameter;
    performing operation processing on the first operation coefficient and the scene rate-control parameter to obtain a first operated rate-control parameter;
    performing operation processing on the second operation coefficient and the network-adapted rate-control parameter to obtain a second operated rate-control parameter; and
    determining a mean of the first operated rate-control parameter and the second operated rate-control parameter, and determining the mean as the frame encoding quality parameter corresponding to the first to-be-encoded data frames.
  11. The method according to claim 9, wherein the determining the frame encoding quality parameter corresponding to the first to-be-encoded data frames according to the scene rate-control parameter and the network-adapted rate-control parameter comprises:
    determining a minimum rate-control parameter between the scene rate-control parameter and the network-adapted rate-control parameter; and
    determining the minimum rate-control parameter as the frame encoding quality parameter corresponding to the first to-be-encoded data frames.
  12. The method according to claim 1, wherein one encoding configuration template has a mapping relationship with one business scene type, and the at least two encoding configuration templates comprise the first encoding configuration template;
    the first media data has first media quality, and the first media quality matches the first business scene type.
  13. A data processing apparatus, comprising:
    a data obtaining module, configured to obtain to-be-encoded media data and a first business scene type to which the to-be-encoded media data belongs;
    a template obtaining module, configured to obtain, according to mapping relationships between at least two encoding configuration templates and at least two business scene types in a configuration template set, a first encoding configuration template corresponding to the first business scene type;
    a parameter determining module, configured to determine frame encoding parameters of the to-be-encoded media data according to the first encoding configuration template; and
    a data encoding module, configured to encode the to-be-encoded media data according to the determined frame encoding parameters to obtain first media data.
  14. A computer device, comprising a processor, a memory, and a network interface;
    wherein the processor is connected to the memory and the network interface, the network interface is configured to provide network communication functions, the memory is configured to store program code, and the processor is configured to invoke the program code to cause the computer device to execute the method according to any one of claims 1 to 12.
  15. A computer-readable storage medium, storing a computer program adapted to be loaded by a processor to execute the method according to any one of claims 1 to 12.
  16. A computer program product, comprising computer instructions stored in a computer-readable storage medium, the computer instructions being adapted to be read and executed by a processor to cause a computer device having the processor to execute the method according to any one of claims 1 to 12.
PCT/CN2022/128561 2021-11-02 2022-10-31 Data processing method, apparatus, device, readable storage medium and program product WO2023078204A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/450,627 US20230396783A1 (en) 2021-11-02 2023-08-16 Data processing method and apparatus, device, and readable storage medium

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202111288823 2021-11-02
CN202111288823.5 2021-11-02
CN202111640485.7A CN116095359A (zh) 2021-11-02 2021-12-29 Data processing method, apparatus and device, and readable storage medium
CN202111640485.7 2021-12-29

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/450,627 Continuation US20230396783A1 (en) 2021-11-02 2023-08-16 Data processing method and apparatus, device, and readable storage medium

Publications (1)

Publication Number Publication Date
WO2023078204A1 true WO2023078204A1 (zh) 2023-05-11

Family

ID=86187478

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/128561 WO2023078204A1 (zh) 2021-11-02 2022-10-31 数据处理方法、装置、设备、可读存储介质及程序产品

Country Status (3)

Country Link
US (1) US20230396783A1 (zh)
CN (1) CN116095359A (zh)
WO (1) WO2023078204A1 (zh)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107846605A (zh) * 2017-01-19 2018-03-27 湖南快乐阳光互动娱乐传媒有限公司 System and method for generating streaming media data at the streamer end, and network live-streaming system and method
US10686859B2 (en) * 2017-12-28 2020-06-16 Intel Corporation Content scenario and network condition based multimedia communication
CN110087142A (zh) * 2019-04-16 2019-08-02 咪咕文化科技有限公司 Video slicing method, terminal and storage medium
CN112543328A (zh) * 2019-09-20 2021-03-23 广州虎牙科技有限公司 Auxiliary encoding method and apparatus, computer device and storage medium
CN112738511A (zh) * 2021-04-01 2021-04-30 杭州微帧信息科技有限公司 Fast mode decision method and apparatus combining video analysis

Also Published As

Publication number Publication date
CN116095359A (zh) 2023-05-09
US20230396783A1 (en) 2023-12-07

Similar Documents

Publication Publication Date Title
WO2018090774A1 (zh) Bitrate control and version selection method and system for dynamic adaptive video streaming
KR101644208B1 (ko) Video encoding using previously calculated motion information
US20230291909A1 (en) Coding video frame key points to enable reconstruction of video frame
WO2020176144A1 (en) Improved entropy coding in image and video compression using machine learning
US11102477B2 (en) DC coefficient sign coding scheme
CN110072119A (zh) Content-aware adaptive video transmission method based on a deep learning network
WO2019228207A1 (zh) Image encoding and decoding method, related apparatus, and storage medium
US11849113B2 (en) Quantization constrained neural image coding
US20230023369A1 (en) Video processing method, video processing apparatus, smart device, and storage medium
WO2022155974A1 (zh) Video encoding/decoding and model training method and apparatus
WO2021249290A1 (zh) Loop filtering method and apparatus
WO2023279961A1 (zh) Video image encoding and decoding method and apparatus
CN112235582B (zh) Video data processing method and apparatus, computer device, and storage medium
Zhu et al. A novel rate control algorithm for low latency video coding base on mobile edge cloud computing
WO2023116173A1 (zh) Data processing method and apparatus, computer device, and storage medium
WO2023193629A1 (zh) Encoding and decoding method and apparatus for a region enhancement layer
WO2023078204A1 (zh) Data processing method, apparatus, device, readable storage medium, and program product
CN103428529A (zh) Video data encoding and transmission method in a media cloud
TW202337211A (zh) Conditional image compression
WO2022100173A1 (zh) Video frame compression and video frame decompression method and apparatus
WO2022269469A1 (en) Method, apparatus and computer program product for federated learning for non independent and non identically distributed data
Safaaldin et al. A new teleconference system with a fast technique in HEVC coding
Zheng et al. A rate control scheme for distributed high performance video encoding in cloud
Cheng et al. NDMP—An emerging MPEG standard for network distributed media processing
WO2023130893A1 (zh) Streaming media transmission method and apparatus, electronic device, and computer-readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22889229

Country of ref document: EP

Kind code of ref document: A1