CN115412731A - Video processing method, device, equipment and storage medium - Google Patents

Video processing method, device, equipment and storage medium

Info

Publication number
CN115412731A
CN115412731A · CN202110511860.1A · CN202110511860A
Authority
CN
China
Prior art keywords
coding
video
estimation result
soft
hard
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110511860.1A
Other languages
Chinese (zh)
Inventor
王彬
严冰
熊征宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202110511860.1A priority Critical patent/CN115412731A/en
Priority to PCT/CN2022/086195 priority patent/WO2022237427A1/en
Publication of CN115412731A publication Critical patent/CN115412731A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Abstract

Embodiments of the present disclosure disclose a video processing method, apparatus, device and storage medium. The method comprises the following steps: acquiring decision information corresponding to a video to be published; determining a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information; inputting the decision information, the soft coding estimation result, the hard coding estimation result and the transparent transmission coding estimation result into a set classification neural network to obtain a target coding mode; and coding the video to be published based on the target coding mode, and uploading the coded video. According to the video processing method provided by the embodiments of the present disclosure, the target coding mode is determined based on the decision information and the three coding estimation results, and the video to be published is coded in the target coding mode, so that the publishing duration of the video can be reduced and the user experience improved.

Description

Video processing method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a video processing method, apparatus, device, and storage medium.
Background
With the rapid development of intelligent terminals, video production has become a very popular form of entertainment. After a user makes a video with an application, uploading the video to a server (that is, publishing the video) is a critical step, and for the user, the faster the upload, the better. However, the user's terminal device, network conditions and video length are objective constraints. To shorten the upload time, the video is processed on the client to reduce its redundant size; yet processing the video also takes time, so making an optimal trade-off between video processing and upload duration is very important.
Disclosure of Invention
The disclosure provides a video processing method, a video processing device, video processing equipment and a storage medium, so as to reduce the publishing duration of a video and improve the user experience.
In a first aspect, an embodiment of the present disclosure provides a video processing method, including:
acquiring decision information corresponding to a video to be published;
determining a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information;
inputting the decision information, the soft coding estimation result, the hard coding estimation result and the transparent transmission coding estimation result into a set classification neural network to obtain a target coding mode;
and coding the video to be published based on the target coding mode, and uploading the coded video.
In a second aspect, an embodiment of the present disclosure further provides a video processing apparatus, including:
the decision information acquisition module is used for acquiring decision information corresponding to a video to be published;
the estimation result determining module is used for determining a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information;
the target coding mode obtaining module is used for inputting the decision information, the soft coding estimation result, the hard coding estimation result and the transparent transmission coding estimation result into a set classification neural network to obtain a target coding mode;
and the video coding module is used for coding the video to be published based on the target coding mode and uploading the coded video.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes:
one or more processing devices;
storage means for storing one or more programs;
when the one or more programs are executed by the one or more processing devices, the one or more processing devices are caused to implement the video processing method according to the embodiment of the present disclosure.
In a fourth aspect, the disclosed embodiments also provide a computer readable medium, on which a computer program is stored, where the program, when executed by a processing device, implements a video processing method according to the disclosed embodiments.
Embodiments of the present disclosure disclose a video processing method, apparatus, device and storage medium: acquiring decision information corresponding to a video to be published; determining a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information; inputting the decision information, the soft coding estimation result, the hard coding estimation result and the transparent transmission coding estimation result into a set classification neural network to obtain a target coding mode; and coding the video to be published based on the target coding mode, and uploading the coded video. The video processing method provided by the embodiments of the present disclosure determines the target coding mode based on the decision information and the three coding estimation results and codes the video to be published in that mode, so that the publishing duration of the video can be reduced and the user experience improved.
Drawings
Fig. 1 is a flow chart of a video processing method in an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of determining a target encoding scheme in an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a video processing apparatus in an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It should be noted that references to "a" or "an" in this disclosure are illustrative rather than limiting, and those skilled in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
In embodiments of the present disclosure, producing and publishing a video involves the following stages: preparing to shoot, previewing, recording, editing and publishing.
Preparing to shoot may be understood as tapping a start button (e.g., "+") in the application. Previewing may be understood as adjusting the individual shooting parameters on the shooting interface. Recording may be understood as tapping the shoot button to start recording. Editing may be understood as post-processing the video, such as adding special effects or background music. Publishing may be understood as uploading the video to a server.
In this embodiment, the publish duration is the maximum of the upload duration and the encoding duration. The upload duration is determined by the upload network speed and the file size, where file size = video duration × code rate, and the video duration and the upload network speed are objective values. The encoding duration depends on the coding mode, which includes soft coding, hard coding and transparent transmission. Transparent transmission can be understood as performing only Pulse Code Modulation (PCM) audio encoding on the video, but its output code rate is high. Hard coding uses hardware such as the CPU or GPU to encode; it is faster than soft coding, but its code rate is also higher than that of soft coding. Soft coding outputs the lowest code rate but takes the longest. Therefore, this embodiment makes an optimal choice between the time spent encoding the video and the time spent uploading it.
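To make the trade-off concrete, the following minimal sketch (illustrative only, not part of the patent text) estimates the publish duration for each coding mode; the encoding durations, code rates and the 20 Mbit/s upload speed are assumed numbers:
```python
def publish_duration(encode_s: float, video_s: float, bitrate_mbps: float, upload_mbps: float) -> float:
    """Publish duration = max(encoding duration, upload duration),
    where upload duration = file size / upload speed and file size = video duration x code rate."""
    file_size_mbit = video_s * bitrate_mbps
    upload_s = file_size_mbit / upload_mbps
    return max(encode_s, upload_s)

# Illustrative comparison of the three coding modes for a 60 s video on a 20 Mbit/s uplink.
modes = {
    "soft coding":              publish_duration(encode_s=40.0, video_s=60.0, bitrate_mbps=4.0,  upload_mbps=20.0),
    "hard coding":              publish_duration(encode_s=15.0, video_s=60.0, bitrate_mbps=8.0,  upload_mbps=20.0),
    "transparent transmission": publish_duration(encode_s=5.0,  video_s=60.0, bitrate_mbps=16.0, upload_mbps=20.0),
}
print(min(modes, key=modes.get))  # the mode with the shortest estimated publish duration
```
With these assumed numbers, hard coding wins: soft coding is encoder-bound and transparent transmission is upload-bound, which is exactly the tension the decision mechanism below resolves.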
Fig. 1 is a flowchart of a video processing method provided in an embodiment of the present disclosure, where the present embodiment is applicable to a case of processing a video to be distributed to a server, and the method may be executed by a video processing apparatus, where the apparatus may be composed of hardware and/or software, and may be generally integrated in a device with a video processing function, where the device may be an electronic device such as a server, a mobile terminal, or a server cluster. As shown in fig. 1, the method specifically includes the following steps:
and step 110, obtaining decision information corresponding to the video to be released.
The decision information may include: uploading network speed, hard coding performance characteristics, soft coding performance characteristics, pre-coding duration and video information; the video information comprises video duration, whether the video uses a special effect, whether music is added to the video, the number of video segments and a video acquisition mode.
The upload network speed can be understood as the speed at which the client uploads data to the server. The hard coding performance characteristic can be understood as the device's ability to encode video in hardware, measured as the number of megapixels processed per second. The soft coding performance characteristic can likewise be understood as the ability to encode video in software, also measured in megapixels processed per second. The pre-coding duration can be understood as the time required to encode a segment of a set duration from the video to be published in software. The video acquisition mode may include shooting, downloading from a network, and the like.
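Purely for illustration, the decision information described above could be grouped as in the following sketch; the field names and units are assumptions rather than identifiers from the patent:
```python
from dataclasses import dataclass

@dataclass
class VideoInfo:
    duration_s: float       # video duration
    uses_effects: bool      # whether the video uses special effects
    has_music: bool         # whether music is added to the video
    segment_count: int      # number of video segments
    acquisition: str        # video acquisition mode, e.g. "shot" or "downloaded"

@dataclass
class DecisionInfo:
    upload_mbps: float      # upload network speed
    hard_perf_mpx_s: float  # hard coding performance characteristic (megapixels per second)
    soft_perf_mpx_s: float  # soft coding performance characteristic (megapixels per second)
    precode_s: float        # pre-coding duration
    video: VideoInfo        # video information
```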
Specifically, the upload network speed corresponding to the video to be published may be acquired by: sending a blank file of a set size to a plurality of servers, determining the fastest network speed as the upload network speed, and determining the server corresponding to the fastest network speed as the target server; or acquiring the downlink network speed of the client and determining the upload network speed according to the downlink network speed.
For one application, multiple servers are typically deployed. In this embodiment, the blank file with the set size is simultaneously sent to a plurality of servers, the server with the highest network speed is selected as the target server, and the network speed from the client to the target server is determined as the final uploading network speed.
The downlink network speed can be understood as the speed of downloading data by the client. In this embodiment, the downlink network speed may be directly determined as the upload network speed.
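A hedged sketch of the first strategy (send a blank file of a set size to several servers and keep the fastest) is given below; the use of the requests library and the example URLs are assumptions for illustration:
```python
import time
import requests  # assumed HTTP client; the endpoint URLs below are hypothetical

def measure_upload_mbps(server_urls, size_bytes: int = 1_000_000):
    """Send a blank file of a set size to each server and return the fastest
    upload speed (Mbit/s) together with the corresponding target server."""
    blank = b"\x00" * size_bytes
    best_speed, target_server = 0.0, None
    for url in server_urls:
        start = time.monotonic()
        requests.post(url, data=blank, timeout=10)
        elapsed = time.monotonic() - start
        speed_mbps = size_bytes * 8 / elapsed / 1e6
        if speed_mbps > best_speed:
            best_speed, target_server = speed_mbps, url
    return best_speed, target_server

# upload_mbps, target = measure_upload_mbps(["https://upload1.example.com", "https://upload2.example.com"])
```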
Specifically, the hard coding performance characteristic of the video to be published may be obtained as follows: acquiring a first average performance characteristic of the terminal device on which the video to be published is located when processing images at the set pixel level in a hard coding mode within a set historical period; and determining the first average performance characteristic as the hard coding performance characteristic.
The set historical period may be the last week, the last month, the last half year, or the like. The set pixel level may be the megapixel level, i.e., the number of megapixels processed per second. The hard coding performance characteristic may be determined by the following formula: hard coding performance characteristic = avg(video width × video length × video duration / encoding duration / 1,000,000), where avg() denotes the average over historical samples, and the result represents the hardware's ability to process megapixels per second.
Specifically, the soft coding performance characteristic of the video to be published may be obtained as follows: acquiring a second average performance characteristic of the terminal device on which the video to be published is located when processing images at the set pixel level in a soft coding mode within a set historical period; and determining the second average performance characteristic as the soft coding performance characteristic.
The set historical period may be the last week, the last month, the last half year, or the like. The soft coding performance characteristic may be determined by the following formula: soft coding performance characteristic = avg(video width × video length × video duration / encoding duration / 1,000,000), where avg() denotes the average over historical samples, and the result represents the software's ability to process megapixels per second.
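Since the same formula is used for both performance characteristics, a single sketch suffices; it assumes the historical encodes are available as (width, length, video duration, encoding duration) tuples, which is an assumption about data layout rather than part of the patent:
```python
def coding_performance_mpx_per_s(samples):
    """avg(video width x video length x video duration / encoding duration / 1,000,000)
    over historical encodes; each sample is (width_px, length_px, video_s, encode_s)."""
    values = [w * l * video_s / encode_s / 1_000_000 for (w, l, video_s, encode_s) in samples]
    return sum(values) / len(values)

# Computed over hardware encodes this yields the hard coding performance characteristic,
# over software encodes the soft coding performance characteristic.
```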
Specifically, the pre-coding duration of the video to be published may be obtained as follows: encoding data of a set duration from the video to be published in a soft coding mode, and determining the time required to complete the encoding as the pre-coding duration.
The set duration may be, for example, the first 1 to 5 seconds of the video to be published. In this embodiment, the first 1 to 5 seconds of the video to be published are pre-encoded in a software coding mode, and the time required for the pre-encoding is determined as the pre-coding duration.
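As an illustration, the pre-coding duration could be measured as sketched below; invoking ffmpeg with the libx264 software encoder is an assumption, since the text only specifies "a soft coding mode":
```python
import subprocess
import time

def precoding_duration_s(video_path: str, seconds: float = 3.0) -> float:
    """Software-encode the first few seconds of the video and return the time it took."""
    start = time.monotonic()
    subprocess.run(
        ["ffmpeg", "-y", "-t", str(seconds), "-i", video_path,
         "-c:v", "libx264", "-an", "-f", "null", "-"],  # encode video only, discard output
        check=True, capture_output=True,
    )
    return time.monotonic() - start
```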
And step 120, determining a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information.
The soft coding estimation result may include a soft coding duration and a soft coding code rate, and may be determined from the soft coding performance characteristic, the pre-coding duration and the video information. The hard coding estimation result may include a hard coding duration, which may be determined from the hard coding performance characteristic and the video information. The transparent transmission coding estimation result may include a transparent transmission coding duration, which may be determined from the video duration.
Specifically, the process of determining the soft coding estimation result according to the decision information may be: and inputting the performance characteristics of the soft coding, the pre-coding duration and the video information into the first neural network to obtain a soft coding estimation result.
The first neural network may be composed of 2 deep neural networks (DNNs), and each DNN is composed of 2 fully-connected layers.
Specifically, the manner of determining the hard-coded estimation result according to the decision information may be: and inputting the hard coding performance characteristic and the video information into a second neural network to obtain a hard coding estimation result.
Wherein the second neural network may be composed of 2 DNN networks, each DNN network being composed of 2 fully-connected layers. The code rate of the hard coding output is a set value.
Specifically, the transparent transmission coding estimation result may be determined from the decision information as follows: the transparent transmission coding estimation result is determined according to the video duration.
The transparent transmission coding estimation result determined according to the video duration may be calculated by the following formula: transparent transmission coding duration = video duration / 3.562.
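The three estimators could be sketched as follows (PyTorch is used here purely for illustration); the input feature dimensions, hidden width and ReLU activation are assumptions, while the two-fully-connected-layer structure and the duration/3.562 formula follow the text:
```python
import torch
import torch.nn as nn

class TwoLayerDNN(nn.Module):
    """A DNN with 2 fully-connected layers, as described for the estimation networks."""
    def __init__(self, in_dim: int, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# First neural network: two DNNs predicting the soft coding duration and the soft coding
# code rate from the soft coding performance characteristic, pre-coding duration and video information.
soft_duration_net = TwoLayerDNN(in_dim=7)   # 7 input features is an assumption
soft_bitrate_net = TwoLayerDNN(in_dim=7)

# Second neural network: predicts the hard coding duration from the hard coding performance
# characteristic and video information (the hard coding output code rate is a set value).
hard_duration_net = TwoLayerDNN(in_dim=6)   # 6 input features is an assumption

def transparent_transmission_duration(video_duration_s: float) -> float:
    """Transparent transmission coding duration = video duration / 3.562 (formula from the text)."""
    return video_duration_s / 3.562
```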
Step 130, inputting the decision information, the soft-coding estimation result, the hard-coding estimation result and the transparent transmission estimation result into a set classification neural network to obtain a target coding mode.
The set classification neural network may be a three-class classification neural network. The target coding mode may be soft coding, hard coding or transparent transmission coding. The set classification neural network may consist of a DNN with 3 fully-connected layers, the last of which is followed by a softmax (logistic regression) layer.
Illustratively, fig. 2 is a schematic diagram of determining the target coding mode in an embodiment of the present disclosure. As shown in fig. 2, the decision information (upload network speed, hard coding performance characteristic, soft coding performance characteristic, pre-coding duration, video information, etc.) is first grouped into features; soft coding estimation, hard coding estimation and transparent transmission coding estimation are then performed on the grouped decision information to obtain the soft coding estimation result, the hard coding estimation result and the transparent transmission coding estimation result. Finally, these estimation results together with the decision information are input into the classification neural network to obtain the target coding mode, as sketched below.
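A hedged sketch of the set classification neural network (3 fully-connected layers followed by a softmax over the three coding modes); the input dimension and hidden width are assumptions:
```python
import torch
import torch.nn as nn

class CodingModeClassifier(nn.Module):
    """Three-class classification network: 3 fully-connected layers, softmax over the coding modes."""
    MODES = ("soft coding", "hard coding", "transparent transmission")

    def __init__(self, in_dim: int = 11, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, len(self.MODES)),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features = decision information concatenated with the three estimation results
        return torch.softmax(self.net(features), dim=-1)

# probs = CodingModeClassifier()(features)
# target_mode = CodingModeClassifier.MODES[int(torch.argmax(probs, dim=-1))]
```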
And 140, coding the video to be issued based on the target coding mode, and uploading the coded video.
Specifically, if the target coding mode is soft coding, the video to be published is encoded in software into a video at the code rate estimated for soft coding; the file size is then determined from that code rate and the video duration, and the file is uploaded to the server at the upload network speed.
If the target coding mode is hard coding, the video to be published is encoded in hardware into a video at the fixed (set) code rate; the file size is then determined from the fixed code rate and the video duration, and the file is uploaded to the server at the upload network speed.
If the target coding mode is transparent transmission coding, the video to be published is encoded using PCM audio encoding to obtain a video at a certain code rate; the file size is then determined from that code rate and the video duration, and the file is uploaded to the server at the upload network speed.
In the technical solution of this embodiment, decision information corresponding to the video to be published is acquired; a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result are determined according to the decision information; the decision information, the soft coding estimation result, the hard coding estimation result and the transparent transmission coding estimation result are input into a set classification neural network to obtain the target coding mode; and the video to be published is coded based on the target coding mode and the coded video is uploaded to the server. The video processing method provided by this embodiment determines the target coding mode based on the decision information and the three estimation results and codes the video to be published in that mode, which reduces the publishing duration of the video and improves the user experience.
Fig. 3 is a schematic structural diagram of a video processing apparatus according to an embodiment of the disclosure. As shown in fig. 3, the apparatus includes:
the decision information obtaining module 210 is configured to obtain decision information corresponding to a video to be published;
the estimation result determining module 220 is configured to determine a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information;
the target coding mode obtaining module 230 is configured to input the decision information, the soft coding estimation result, the hard coding estimation result and the transparent transmission coding estimation result into a set classification neural network, so as to obtain a target coding mode;
and the video encoding module 240 is configured to encode the video to be published based on the target coding mode and upload the encoded video.
Optionally, the decision information includes: uploading network speed, hard coding performance characteristics, soft coding performance characteristics, pre-coding duration and video information; the video information comprises video duration, whether the video uses a special effect, whether music is added to the video, the number of video segments and a video acquisition mode.
Optionally, the decision information obtaining module 210 is further configured to:
sending blank files with set sizes to a plurality of servers;
determining the fastest network speed as the uploading network speed, and determining a server corresponding to the fastest network speed as a target server; or,
acquiring the downlink network speed of a client;
and determining the uploading network speed according to the downlink network speed.
Optionally, the decision information obtaining module 210 is further configured to:
acquiring a first average performance characteristic of the terminal device on which the video to be published is located when processing images at the set pixel level in a hard coding mode within a set historical period;
the first average performance characteristic is determined to be a hard coded performance characteristic.
Optionally, the decision information obtaining module 210 is further configured to:
acquiring a second average performance characteristic of the terminal device on which the video to be published is located when processing images at the set pixel level in a soft coding mode within a set historical period;
the second average performance characteristic is determined as a soft coding performance characteristic.
Optionally, the decision information obtaining module 210 is further configured to:
encoding data of a set duration of the video to be published in a soft coding mode;
and determining the time required to complete the encoding as the pre-coding duration.
Optionally, the estimation result determining module 220 is further configured to:
inputting the soft coding performance characteristic, the pre-coding duration and the video information into a first neural network to obtain a soft coding estimation result; wherein the soft coding estimation result comprises a soft coding duration and a soft coding code rate;
inputting the hard coding performance characteristic and the video information into a second neural network to obtain a hard coding estimation result; wherein the hard coding estimation result comprises a hard coding duration;
determining a transparent transmission coding estimation result according to the video duration; wherein the transparent transmission coding estimation result comprises a transparent transmission coding duration.
The device can execute the methods provided by all the embodiments of the disclosure, and has corresponding functional modules and beneficial effects for executing the methods. For details of the technology not described in detail in this embodiment, reference may be made to the methods provided in all the foregoing embodiments of the disclosure.
Referring now to FIG. 4, a block diagram of an electronic device 300 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle mounted terminal (e.g., a car navigation terminal), and the like, and a fixed terminal such as a digital TV, a desktop computer, and the like, or various forms of servers such as an independent server or a server cluster. The electronic device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 4, the electronic device 300 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 301, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 302 or a program loaded from a storage device 308 into a random access memory (RAM) 303. Various programs and data necessary for the operation of the electronic device 300 are also stored in the RAM 303. The processing device 301, the ROM 302 and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
Generally, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage devices 308 including, for example, magnetic tape, hard disk, etc.; and a communication device 309. The communication means 309 may allow the electronic device 300 to communicate with other devices, wireless or wired, to exchange data. While fig. 4 illustrates an electronic device 300 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated by the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 309, or installed from the storage means 308, or installed from the ROM 302. When executed by the processing device 301, the computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire decision information corresponding to a video to be published; determine a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information; input the decision information, the soft coding estimation result, the hard coding estimation result and the transparent transmission coding estimation result into a set classification neural network to obtain a target coding mode; and code the video to be published based on the target coding mode, and upload the coded video to a server.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, a video processing method is disclosed in an embodiment of the present disclosure, including:
acquiring decision information corresponding to a video to be published;
determining a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information;
inputting the decision information, the soft coding estimation result, the hard coding estimation result and the transparent transmission coding estimation result into a set classification neural network to obtain a target coding mode;
and coding the video to be published based on the target coding mode, and uploading the coded video to a server.
Further, the decision information includes: uploading network speed, hard coding performance characteristics, soft coding performance characteristics, pre-coding duration and video information; the video information comprises video duration, whether the video uses a special effect or not, the number of video segments and a video acquisition mode.
Further, acquiring an uploading network speed corresponding to the video to be published, including:
sending blank files with set sizes to a plurality of servers;
determining the fastest network speed as the uploading network speed, and determining a server corresponding to the fastest network speed as a target server; or,
acquiring the downlink network speed of the client;
and determining the uploading network speed according to the downlink network speed.
Further, acquiring the hard coding performance characteristics of the video to be published includes:
acquiring a first average performance characteristic of the terminal device on which the video to be published is located when processing images at the set pixel level in a hard coding mode within a set historical period;
determining the first average performance characteristic as a hard-coded performance characteristic.
Further, acquiring the soft coding performance characteristics of the video to be published includes:
acquiring a second average performance characteristic of the terminal device on which the video to be published is located when processing images at the set pixel level in a soft coding mode within a set historical period;
determining the second average performance characteristic as a soft coding performance characteristic.
Further, acquiring the pre-coding duration of the video to be published includes:
encoding data of a set duration of the video to be published in a soft coding mode;
and determining the time required to complete the encoding as the pre-coding duration.
Further, determining a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information comprises the following steps:
inputting the soft coding performance characteristic, the pre-coding duration and the video information into a first neural network to obtain a soft coding estimation result; the soft coding estimation result comprises a soft coding duration and a soft coding code rate;
inputting the hard coding performance characteristic and the video information into a second neural network to obtain a hard coding estimation result; wherein the hard coding estimation result comprises a hard coding duration;
determining a transparent transmission coding estimation result according to the video duration; and the transparent transmission coding estimation result comprises a transparent transmission coding duration.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present disclosure and the technical principles employed. Those skilled in the art will appreciate that the present disclosure is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions will now be apparent to those skilled in the art without departing from the scope of the disclosure. Therefore, although the present disclosure has been described in greater detail with reference to the above embodiments, the present disclosure is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present disclosure, the scope of which is determined by the scope of the appended claims.

Claims (10)

1. A video processing method, comprising:
acquiring decision information corresponding to a video to be published;
determining a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information;
inputting the decision information, the soft coding estimation result, the hard coding estimation result and the transparent transmission coding estimation result into a set classification neural network to obtain a target coding mode;
and coding the video to be published based on the target coding mode, and uploading the coded video.
2. The method of claim 1, wherein the decision information comprises: uploading network speed, hard coding performance characteristics, soft coding performance characteristics, pre-coding duration and video information; the video information comprises video duration, whether the video uses a special effect, whether music is added to the video, the number of video segments and a video acquisition mode.
3. The method according to claim 2, wherein the obtaining of the uploading network speed corresponding to the video to be published comprises:
sending blank files with set sizes to a plurality of servers;
determining the fastest network speed as the uploading network speed, and determining a server corresponding to the fastest network speed as a target server; or,
acquiring the downlink network speed of the client;
and determining the uploading network speed according to the downlink network speed.
4. The method of claim 2, wherein obtaining the hard coding performance characteristic of the video to be published comprises:
acquiring a first average performance characteristic of the terminal device on which the video to be published is located when processing images at the set pixel level in a hard coding mode within a set historical period;
determining the first average performance characteristic as a hard-coded performance characteristic.
5. The method of claim 2, wherein obtaining the soft coding performance characteristic of the video to be published comprises:
acquiring a second average performance characteristic of the terminal device on which the video to be published is located when processing images at the set pixel level in a soft coding mode within a set historical period;
determining the second average performance characteristic as a soft coding performance characteristic.
6. The method of claim 2, wherein obtaining the pre-coding duration of the video to be published comprises:
encoding data of a set duration of the video to be published in a soft coding mode;
and determining the time required to complete the encoding as the pre-coding duration.
7. The method of claim 2, wherein determining a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information comprises:
inputting the soft coding performance characteristic, the pre-coding duration and the video information into a first neural network to obtain a soft coding estimation result; the soft coding estimation result comprises a soft coding duration and a soft coding code rate;
inputting the hard coding performance characteristic and the video information into a second neural network to obtain a hard coding estimation result; wherein the hard coding estimation result comprises a hard coding duration;
determining a transparent transmission coding estimation result according to the video duration; and the transparent transmission coding estimation result comprises a transparent transmission coding duration.
8. A video processing apparatus, comprising:
the decision information acquisition module is used for acquiring decision information corresponding to the video to be published;
the estimation result determining module is used for determining a soft coding estimation result, a hard coding estimation result and a transparent transmission coding estimation result according to the decision information;
a target coding mode obtaining module, configured to input the decision information, the soft-coding estimation result, the hard-coding estimation result, and the transparent transmission estimation result into a set classification neural network, so as to obtain a target coding mode;
and the video coding module is used for coding the video to be published based on the target coding mode and uploading the coded video.
9. An electronic device, characterized in that the electronic device comprises:
one or more processing devices;
storage means for storing one or more programs;
when executed by the one or more processing devices, cause the one or more processing devices to implement the video processing method of any of claims 1-7.
10. A computer-readable medium, on which a computer program is stored, characterized in that the program, when being executed by processing means, carries out the video processing method according to any one of claims 1 to 7.
CN202110511860.1A 2021-05-11 2021-05-11 Video processing method, device, equipment and storage medium Pending CN115412731A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110511860.1A CN115412731A (en) 2021-05-11 2021-05-11 Video processing method, device, equipment and storage medium
PCT/CN2022/086195 WO2022237427A1 (en) 2021-05-11 2022-04-12 Video processing method and apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110511860.1A CN115412731A (en) 2021-05-11 2021-05-11 Video processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115412731A true CN115412731A (en) 2022-11-29

Family

ID=84027954

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110511860.1A Pending CN115412731A (en) 2021-05-11 2021-05-11 Video processing method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN115412731A (en)
WO (1) WO2022237427A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110110417A1 (en) * 2008-07-16 2011-05-12 Atsushi Tabuchi Encoding apparatus of video and audio data, encoding method thereof, and video editing system
CN104159113A (en) * 2014-06-30 2014-11-19 北京奇艺世纪科技有限公司 Method and device for selecting video coding mode for Android system
CN110870310A (en) * 2018-09-04 2020-03-06 深圳市大疆创新科技有限公司 Image encoding method and apparatus
CN111447447A (en) * 2020-04-03 2020-07-24 北京三体云联科技有限公司 Live broadcast encoding method and device and electronic equipment
CN111630570A (en) * 2019-05-31 2020-09-04 深圳市大疆创新科技有限公司 Image processing method, apparatus and computer-readable storage medium
CN111837140A (en) * 2018-09-18 2020-10-27 谷歌有限责任公司 Video coded field consistent convolution model
CN112165623A (en) * 2020-09-30 2021-01-01 广州光锥元信息科技有限公司 Soft and hard combined audio and video coding and decoding device

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110110417A1 (en) * 2008-07-16 2011-05-12 Atsushi Tabuchi Encoding apparatus of video and audio data, encoding method thereof, and video editing system
CN104159113A (en) * 2014-06-30 2014-11-19 北京奇艺世纪科技有限公司 Method and device for selecting video coding mode for Android system
CN110870310A (en) * 2018-09-04 2020-03-06 深圳市大疆创新科技有限公司 Image encoding method and apparatus
CN111837140A (en) * 2018-09-18 2020-10-27 谷歌有限责任公司 Video coded field consistent convolution model
CN111630570A (en) * 2019-05-31 2020-09-04 深圳市大疆创新科技有限公司 Image processing method, apparatus and computer-readable storage medium
WO2020237646A1 (en) * 2019-05-31 2020-12-03 深圳市大疆创新科技有限公司 Image processing method and device, and computer-readable storage medium
CN111447447A (en) * 2020-04-03 2020-07-24 北京三体云联科技有限公司 Live broadcast encoding method and device and electronic equipment
CN112165623A (en) * 2020-09-30 2021-01-01 广州光锥元信息科技有限公司 Soft and hard combined audio and video coding and decoding device

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023174254A1 (en) * 2022-03-14 2023-09-21 百果园技术(新加坡)有限公司 Video posting method and apparatus, and device and storage medium

Also Published As

Publication number Publication date
WO2022237427A1 (en) 2022-11-17

Similar Documents

Publication Publication Date Title
WO2019141902A1 (en) An apparatus, a method and a computer program for running a neural network
CN112153415B (en) Video transcoding method, device, equipment and storage medium
CN114401447A (en) Video stuck prediction method, device, equipment and medium
CN112954354B (en) Video transcoding method, device, equipment and medium
CN113642673A (en) Image generation method, device, equipment and storage medium
CN110781150A (en) Data transmission method and device and electronic equipment
CN114257815A (en) Video transcoding method, device, server and medium
CN114861790B (en) Method, system and device for optimizing federal learning compression communication
CN111263220B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN115412731A (en) Video processing method, device, equipment and storage medium
CN114040257B (en) Self-adaptive video stream transmission playing method, device, equipment and storage medium
CN111385574A (en) Code rate control method and device in video coding, mobile terminal and storage medium
CN111478916B (en) Data transmission method, device and storage medium based on video stream
CN115103191A (en) Image processing method, device, equipment and storage medium
CN115378878B (en) CDN scheduling method, device, equipment and storage medium
CN114187177A (en) Method, device and equipment for generating special effect video and storage medium
CN111815508A (en) Image generation method, device, equipment and computer readable medium
CN116760992B (en) Video encoding, authentication, encryption and transmission methods, devices, equipment and media
KR101637022B1 (en) Apparatus and method for transmitting and receiving content
CN113076195B (en) Object shunting method and device, readable medium and electronic equipment
CN115833847B (en) Polar code decoding method, polar code decoding device, communication equipment and storage medium
CN114979644A (en) Video encoding method, device, equipment and storage medium
CN111083196B (en) Information forwarding method and device and electronic equipment
CN114257870A (en) Short video playing method, device, equipment and storage medium
CN114679586A (en) Intra-frame coding frame transmission method, device, electronic device and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination