CN112565887B - Video processing method, device, terminal and storage medium


Info

Publication number
CN112565887B
CN112565887B · CN202011368232.4A
Authority
CN
China
Prior art keywords
super
resolution
video stream
processing
processed
Prior art date
Legal status
Active
Application number
CN202011368232.4A
Other languages
Chinese (zh)
Other versions
CN112565887A (en)
Inventor
胡均浩
葛维
李振中
戴婵媛
李倩茹
Current Assignee
Unisoc Chongqing Technology Co Ltd
Original Assignee
Unisoc Chongqing Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Unisoc Chongqing Technology Co Ltd filed Critical Unisoc Chongqing Technology Co Ltd
Priority to CN202011368232.4A
Publication of CN112565887A
Application granted
Publication of CN112565887B
Legal status: Active

Classifications

    • H04N21/440263: Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
    • G06N3/045: Computing arrangements based on biological models; neural networks; combinations of networks
    • G06N3/08: Computing arrangements based on biological models; neural networks; learning methods
    • H04N21/23418: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/234363: Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the spatial resolution, e.g. for clients with a lower screen resolution
    • H04N21/44008: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N7/0135: Conversion of standards involving interpolation processes

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Television Systems (AREA)

Abstract

The embodiment of the application provides a video processing method, a device, a terminal and a storage medium, wherein the method comprises the following steps: acquiring a video stream to be processed, and extracting features of the video stream to be processed; selecting a super-resolution strategy according to the extracted video feature information; and performing super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution strategy to obtain a video stream after super-resolution processing. By means of the embodiment of the application, the super-resolution strategy can be adaptively adjusted to perform super-resolution reconstruction of the video stream, so that the video quality is effectively improved.

Description

Video processing method, device, terminal and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a video processing method, a video processing device, a terminal, and a storage medium.
Background
Super-Resolution (SR) refers to reconstructing a corresponding high-resolution image from an observed low-resolution image. With the development of SR technology and television chip technology, SR has also been applied to smart televisions. A smart television equipped with SR technology can reconstruct a low-resolution image into a high-resolution image during video playing. However, at present only one SR scheme is set in a smart television, and the effect achievable by a fixed SR scheme is limited, so the real-time requirements of complex scenes cannot be met. For example, when facing complex and changeable scenes, a single SR scheme cannot properly balance the hardware overhead against the visual effect of the image, so the reconstructed high-resolution image generally cannot achieve the optimal effect.
Disclosure of Invention
The embodiment of the invention provides a video processing method, a device, a terminal and a storage medium, which can adaptively adjust a super-resolution strategy to perform super-resolution reconstruction of a video stream, thereby effectively improving video quality.
In one aspect, an embodiment of the present invention provides a video processing method, including:
acquiring a video stream to be processed, and extracting features of the video stream to be processed;
selecting a super-resolution strategy according to the extracted video feature information, wherein the selected super-resolution strategy is a single super-resolution strategy or a fusion super-resolution strategy;
and performing super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution strategy to obtain a video stream after super-resolution processing.
Optionally, the selecting a super-resolution policy according to the extracted video feature information includes:
detecting whether the extracted video characteristic information meets a specified condition;
if the video characteristic information does not meet the specified condition, selecting the single super-resolution strategy;
and if the video characteristic information meets the specified condition, selecting the fusion super-resolution strategy.
Optionally, the performing super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution policy to obtain a video stream after super-resolution processing includes:
when the selected super-resolution strategy is a single super-resolution strategy, determining the video type of the video stream to be processed according to the video characteristic information;
selecting a super-resolution mode from a plurality of preset super-resolution modes according to the video type;
and performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution mode to obtain a video stream subjected to the super-resolution processing.
Optionally, the performing super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution policy to obtain a video stream after super-resolution processing includes:
when the selected super-resolution strategy is a fusion super-resolution strategy, determining at least two super-resolution modes corresponding to the fusion super-resolution strategy;
and performing super-resolution processing on the image frames in the video stream to be processed by using the at least two super-resolution modes to obtain a video stream subjected to super-resolution processing.
Optionally, the performing super-resolution processing on the image frames in the video stream to be processed by using the at least two super-resolution modes to obtain a video stream after super-resolution processing includes:
for any image frame in the video stream to be processed, performing super-resolution processing on brightness values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes;
performing super-resolution processing on chromaticity values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes;
and determining the super-resolution processed image frame according to the luminance value and the chrominance value after the super-resolution processing, and generating a super-resolution processed video stream according to the super-resolution processed image frame.
Optionally, the at least two super-resolution modes include a first super-resolution mode, a second super-resolution mode and a third super-resolution mode;
the performing super-resolution processing on the luminance value included in the pixel point in any image frame by using one or more super-resolution modes of the at least two super-resolution modes respectively includes:
Performing super-resolution processing on brightness values included in pixel points in any image frame by using the first super-resolution mode and the second super-resolution mode to obtain brightness values after the super-resolution processing;
the performing super-resolution processing on the chrominance values included in the pixel points in any one of the image frames by using one or more of the at least two super-resolution modes respectively includes:
and performing super-resolution processing on the chromaticity values included in the pixel points in any image frame by using the third super-resolution mode to obtain the chromaticity values after the super-resolution processing.
Optionally, the video feature information includes resolution, and the selecting a super-resolution policy according to the extracted video feature information includes:
if the resolution is smaller than or equal to a preset resolution threshold, selecting the fusion super-resolution strategy;
the super-resolution processing is carried out on the image frames in the video stream to be processed according to the indication of the selected super-resolution strategy, so as to obtain a video stream after the super-resolution processing, which comprises the following steps:
determining a fourth super-resolution mode and a fifth super-resolution mode corresponding to the fused super-resolution strategy;
For any image frame in the video stream to be processed, performing super-resolution processing on that image frame using the fourth super-resolution mode to obtain a reference image frame after super-resolution processing;
and performing super-resolution processing on the reference image frame in the fifth super-resolution mode to obtain a target image frame subjected to super-resolution processing, and generating a video stream subjected to super-resolution processing according to the target image frame.
In another aspect, an embodiment of the present invention provides a video processing apparatus, including:
the device comprises an acquisition unit and a processing unit, wherein the acquisition unit is configured to acquire a video stream to be processed and perform feature extraction on the video stream to be processed;
the processing unit is used for selecting a super-resolution strategy according to the extracted video feature information, wherein the selected super-resolution strategy is a single super-resolution strategy or a fusion super-resolution strategy;
the processing unit is further configured to perform super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution strategy to obtain a video stream after super-resolution processing.
Optionally, the processing unit is specifically configured to:
detecting whether the extracted video characteristic information meets a specified condition;
If the video characteristic information does not meet the specified condition, selecting the single super-resolution strategy;
and if the video characteristic information meets the specified condition, selecting the fusion super-resolution strategy.
Optionally, the processing unit is specifically configured to:
when the selected super-resolution strategy is a single super-resolution strategy, determining the video type of the video stream to be processed according to the video characteristic information;
selecting a super-resolution mode from a plurality of preset super-resolution modes according to the video type;
and performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution mode to obtain a video stream subjected to the super-resolution processing.
Optionally, the processing unit is specifically configured to:
when the selected super-resolution strategy is a fusion super-resolution strategy, determining at least two super-resolution modes corresponding to the fusion super-resolution strategy;
and performing super-resolution processing on the image frames in the video stream to be processed by using the at least two super-resolution modes to obtain a video stream subjected to super-resolution processing.
Optionally, the processing unit is specifically configured to:
For any image frame in the video stream to be processed, performing super-resolution processing on brightness values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes;
performing super-resolution processing on chromaticity values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes;
and determining the super-resolution processed image frame according to the luminance value and the chrominance value after the super-resolution processing, and generating a super-resolution processed video stream according to the super-resolution processed image frame.
Optionally, the processing unit is specifically configured to:
if the resolution is smaller than or equal to a preset resolution threshold, selecting the fusion super-resolution strategy;
and performing super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution strategy to obtain a video stream after super-resolution processing.
Optionally, the processing unit is specifically configured to:
determining a fourth super-resolution mode and a fifth super-resolution mode corresponding to the fused super-resolution strategy;
For any image frame in the video stream to be processed, performing super-resolution processing on that image frame using the fourth super-resolution mode to obtain a reference image frame after super-resolution processing;
and performing super-resolution processing on the reference image frame in the fifth super-resolution mode to obtain a target image frame subjected to super-resolution processing, and generating a video stream subjected to super-resolution processing according to the target image frame.
In yet another aspect, an embodiment of the present invention provides an intelligent terminal, which includes a processor, a communication interface, and a memory that are connected to each other; the memory is configured to store a computer program comprising program instructions, and the processor is configured to invoke the program instructions to perform the video processing method described above.
Accordingly, an embodiment of the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the video processing method described above.
Accordingly, embodiments of the present invention also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the video processing method described above.
The embodiment of the invention selects a super-resolution strategy by acquiring a video stream to be processed and using the video feature information obtained by performing feature extraction on the video stream to be processed: when the video feature information does not meet the specified condition, a single super-resolution strategy is selected; when the video feature information meets the specified condition, a fusion super-resolution strategy is selected; and when the resolution included in the video feature information is smaller than or equal to a preset resolution threshold, the fusion super-resolution strategy is likewise selected. Finally, super-resolution processing is performed on the image frames in the video stream to be processed according to the indication of the selected super-resolution strategy to obtain the video stream after super-resolution processing. According to the embodiment of the application, the super-resolution strategy can be adaptively adjusted according to the video feature information to perform super-resolution reconstruction of the video stream, so that the video quality is effectively improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a video processing system according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a video processing method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a super-resolution model structure based on a neural network model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a super-resolution method based on interpolation according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another super-resolution method based on interpolation according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a fused super-resolution strategy according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of another fused super-resolution strategy according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of an interface of a resolution configuration according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of an intelligent terminal according to an embodiment of the present invention.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the descriptions of "first," "second," and the like in the embodiments of the present application are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a technical feature defining "first", "second" may include at least one such feature, either explicitly or implicitly.
As shown in fig. 1, the embodiment of the present application provides a video processing system, which may be integrated in an electronic device such as a terminal or a server. For example, the video processing system may be integrated in a terminal. The terminal may be a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal computer (PC), a television or another smart player, which is not limited in this application. For another example, the video processing system may be integrated in a server. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN, Content Delivery Network), and basic cloud computing services such as big data and artificial intelligence platforms. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited herein.
It will be appreciated that the video processing method of this embodiment may be executed on the terminal, may be executed on the server, or may be executed by both the terminal and the server.
In one embodiment, a terminal is taken as an example to perform a video processing method. The video processing system includes a terminal 101 and a server 102, and the terminal 101 and the server 102 are connected through a network, for example, a wireless network connection or the like. The terminal 101 obtains a video stream to be processed sent by the server 102 through a network, performs feature extraction on image frames of the video stream to be processed, selects a single super-resolution strategy or a fused super-resolution strategy according to the extracted video feature information, and performs super-resolution processing on the image frames in the video stream to be processed by using the selected super-resolution strategy to obtain a video stream after super-resolution processing, so that the super-resolution reconstruction can be performed on the video stream by adaptively adjusting the super-resolution strategy according to the video feature information, and the video quality is effectively improved.
In one embodiment, the super resolution modes include a first super resolution mode, a second super resolution mode, a third super resolution mode, a fourth super resolution mode, and a fifth super resolution mode, wherein the first super resolution mode, the second super resolution mode, the third super resolution mode, the fourth super resolution mode, and the fifth super resolution mode are any one of a super resolution mode based on nearest neighbor interpolation (Nearest neighbor interpolation), a super resolution mode based on bilinear interpolation (bilinear interpolation), a super resolution mode based on bicubic interpolation (bicubic interpolation), and a super resolution mode based on a neural network model, respectively.
It may be understood that the architecture schematic diagram of the system described in the embodiments of the present application is for more clearly describing the technical solution of the embodiments of the present application, and does not constitute a limitation on the technical solution provided in the embodiments of the present application, and those skilled in the art can know that, with the evolution of the architecture of the system and the appearance of a new service scenario, the technical solution provided in the embodiments of the present application is equally applicable to similar technical problems.
In one embodiment, as shown in fig. 2, a video processing method according to an embodiment of the present invention is provided based on the video processing system of fig. 1. The terminal here is taken to be the terminal 101 mentioned in fig. 1. The method of the embodiment of the present invention is described below with reference to fig. 2.
S201, obtaining a video stream to be processed, and extracting features of the video stream to be processed.
In one embodiment, the video stream to be processed is video data that is sent by the server to the terminal through the network and played on the terminal. The terminal may be a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal computer (PC), a television or another smart player, which is not limited in this application. In the embodiment of the invention, the terminal is taken as a smart television as an example: when a user watches a video program through the smart television, one or more image frames of the video program at any moment can be obtained and subjected to subsequent feature extraction. The extracted video feature information can be used for determining one or more of the brightness change, the pixel difference, and the types of objects included in the video frames, and the feature extraction method can be determined according to actual requirements, which is not limited in this application.
In one embodiment, the video feature information may be further determined according to the set specified condition and according to actual requirements, which is not limited in this application.
S202, selecting a super-resolution strategy according to the extracted video feature information, wherein the selected super-resolution strategy is a single super-resolution strategy or a fusion super-resolution strategy.
In one embodiment, whether a specified condition is met can be determined according to the extracted video feature information. For example, if obvious brightness changes occur in an image frame, or the image frame includes objects such as animals or buildings, or objects move and change repeatedly across a plurality of consecutive image frames, the image frame can be considered a complex scene; the specified condition is met, and the fusion super-resolution strategy is selected. For another example, if only the sky, the ground, and the like appear in the image frame, the image frame can be considered a simple scene; the specified condition is not met, and a single super-resolution strategy is selected. Alternatively, a corresponding super-resolution strategy may be selected according to the resolution in the video feature information.
A single super-resolution strategy performs super-resolution reconstruction on the image frames in the video stream using one super-resolution mode, while a fusion super-resolution strategy performs super-resolution reconstruction on the image frames using at least two super-resolution modes.
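As an illustration of this selection logic, the following Python sketch chooses between the two strategies; the feature names and thresholds are assumptions for demonstration, not values from the patent.

```python
# Illustrative sketch of selecting a super-resolution strategy from
# extracted video feature information. The keys and the thresholds
# below are assumptions, not the patent's exact criteria.
def select_strategy(features):
    complex_scene = (
        features.get("brightness_change", 0.0) > 0.3      # assumed threshold
        or len(features.get("object_classes", [])) > 2    # many objects present
    )
    if complex_scene:
        return "fusion"   # reconstruct with at least two SR modes
    return "single"       # reconstruct with one SR mode
```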
S203, performing super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution strategy to obtain a video stream after super-resolution processing.
In one embodiment, image frames in a video stream to be processed can be acquired at intervals of a period of time, and a super-resolution strategy is determined through the selected image frames in the period of time, so that super-resolution processing is performed on each image frame in the video stream in the period of time according to the selected super-resolution strategy; feature extraction can also be performed on each image frame, and a corresponding super-resolution strategy is selected to perform super-resolution reconstruction according to video feature information of each image frame, which is not limited in the application.
In the embodiment of the application, the super-resolution strategy is selected by acquiring the video stream to be processed and utilizing the video feature information obtained by extracting the features of the video stream to be processed, the image frames in the video stream to be processed are subjected to super-resolution processing by utilizing the indication of the selected super-resolution strategy, and the video stream after the super-resolution processing is obtained, so that the super-resolution reconstruction can be performed on the video stream by adaptively adjusting the super-resolution strategy according to the video feature information, and the video quality is effectively improved.
In one embodiment, the selecting the super-resolution strategy according to the extracted video feature information includes: detecting whether the extracted video characteristic information meets a specified condition; if the video characteristic information does not meet the specified condition, selecting the single super-resolution strategy; and if the video characteristic information meets the specified condition, selecting the fusion super-resolution strategy.
In one embodiment, whether the specified condition is satisfied is determined according to the extracted video feature information, so that the super-resolution strategy of the video stream is determined according to whether the specified condition is satisfied, and the specified condition can be determined according to actual application and requirements, which is not limited in the application.
In one embodiment, one or more of the brightness change of an image frame and the classes of objects included in the frame are determined according to the extracted video feature information, and whether the image frame belongs to a complex scene is then judged from these features. For example, if the image frame includes a plurality of objects and the brightness changes between those objects are obvious, the image frame is determined to be a complex scene; in that case the specified condition is considered to be satisfied and the fusion super-resolution strategy is selected, and otherwise the specified condition is considered not to be satisfied and a single super-resolution strategy is selected. Whether an image frame is a complex scene can be judged according to actual requirements and applications, which is not limited in this application.
In one possible embodiment, the image frames obtained from the video stream to be processed are analysed, for example by edge detection or a sum of absolute differences, to obtain a target area and a background area in each image frame, and the pixel difference between the target area and the background area is determined; if the pixel difference is greater than or equal to a pixel difference threshold, one or more super-resolution modes of the fusion super-resolution strategy are selected to reconstruct the target area and the background area respectively.
In one embodiment, the performing super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution policy to obtain a video stream after super-resolution processing includes: when the selected super-resolution strategy is a single super-resolution strategy, determining the video type of the video stream to be processed according to the video characteristic information; selecting a super-resolution mode from a plurality of preset super-resolution modes according to the video type; and performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution mode to obtain a video stream after super-resolution processing.
In one embodiment, when the selected super-resolution strategy is a single super-resolution strategy, video characteristic information of image frames in the video stream is obtained, and the video type of the video stream is determined according to the video characteristic information, so that different super-resolution modes are selected according to the video type to perform super-resolution reconstruction.
In one embodiment, the video type can be determined according to the image frames in the video stream to be processed, and complexity division is performed according to the video type, for example, when the video source is a chess game, the complexity of the image frames to be processed can be considered as simple, and super-resolution reconstruction is performed on the image frames in the video stream to be processed by adopting a super-resolution mode based on nearest neighbor interpolation or a super-resolution mode based on bilinear interpolation; when the video source is a music channel, the complexity of the image frames to be processed can be considered as general, and the super-resolution reconstruction is carried out on the image frames in the video stream to be processed by adopting a super-resolution mode based on bicubic interpolation; when the video source is a movie channel or a sports channel, the complexity of the image frames to be processed can be considered as complex, and super-resolution reconstruction is performed on the image frames in the video stream to be processed in a super-resolution mode based on a neural network model. Specifically, the complexity of the video type determined by the image frame of the video stream to be processed may be divided according to the actual requirement, which is not limited in this application.
In one embodiment, the image recognition technology may be used to identify object classes included in the image frames in the video stream to be processed, and then the object classes are classified according to complexity, for example, when the object classes in the image frames in the video stream to be processed are sky, road, grassland, beach, or other scenes with low resolution requirements, the complexity of the image frames may be considered as simple, and super-resolution reconstruction is performed on the image frames by adopting a super-resolution mode based on nearest neighbor interpolation or a super-resolution mode based on bilinear interpolation; when the object category in the image frame in the video stream to be processed comprises scenes such as simple drawing objects, characters and the like, the complexity of the image frame can be considered as general, and super-resolution reconstruction is carried out on the image frame by adopting a super-resolution mode based on bicubic interpolation; when the object types in the image frames in the video stream to be processed are a plurality of characters, animals, human faces or buildings, the complexity of the image frames can be considered to be complex, and super-resolution reconstruction is carried out on the image frames in the video stream to be processed in a super-resolution mode based on a neural network model. Specifically, the complexity of the object categories included in the image frames in the video stream to be processed may be divided according to actual requirements, which is not limited in this application.
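A dispatch table makes the mapping in the two preceding paragraphs concrete. The sketch below selects one mode under the single super-resolution strategy; the category names and the fallback are assumptions, while the example pairings follow the text above.

```python
# Illustrative mapping from scene/video-type complexity to a single
# super-resolution mode, following the examples above.
SR_MODE_BY_COMPLEXITY = {
    "simple":  "nearest_or_bilinear",   # chess programs, sky, roads, grass
    "general": "bicubic",               # music channels, simple drawings, text
    "complex": "neural_network",        # movies, sports, faces, buildings
}

def pick_single_mode(complexity):
    # Fall back to bicubic for unknown categories (assumed default).
    return SR_MODE_BY_COMPLEXITY.get(complexity, "bicubic")
```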
In the embodiment of the application, the super-resolution mode based on nearest neighbor interpolation and the super-resolution mode based on bilinear interpolation can obtain a smooth effect when being applied to static and simple scenes, and the super-resolution mode based on bicubic interpolation can keep more high-frequency components and more image details in image frames, and meanwhile, the super-resolution mode based on a neural network model can better process complex and changeable scenes, so that different super-resolution modes can be adopted for different video types to better improve the visual effect of video streams and balance hardware cost.
In one embodiment, any image frame in the video stream to be processed is an image in YUV color space, and of course, in addition to YUV format, super-resolution policy may be used to reconstruct super-resolution of the image frame to be processed in any color format, such as RGB and HSL, etc., which is not limited in this embodiment.
In one embodiment, as shown in fig. 3, for the neural network model used by the neural-network-based super-resolution mode of the embodiment of the present application, before a low-resolution image frame is input into the model, the image in RGB color space needs to be converted into an image in YUV color space, and the YUV image is input into the neural network model. The present application introduces depthwise separable convolution into the neural network model. A depthwise separable convolution applies a different convolution kernel to each channel of the input image, and its operation can be split into a depthwise convolution (Depthwise) and a pointwise convolution (Pointwise). Because a standard convolution operates on all channels of the input simultaneously, the depthwise separable convolution requires less computation than the standard convolution structure; this speeds up network training, allows the network width to be increased so that more feature information can propagate through the network, and improves the reconstruction quality of the network. For example, assume the input feature map has size $d \times d$ with $c_d$ channels, the convolution kernel has size $k \times k$ with $c_k$ kernels, and the output feature map has the same spatial size as the input, its channel count matching the number of kernels. The number of operations of the standard convolution is:

$$n = k \times k \times d \times d \times c_k \times c_d$$

Further, the depthwise separable convolution splits the one-step convolution into a depthwise convolution followed by a pointwise convolution, where the $k \times k$ depthwise kernel operates over the $c_d$ input channels and the pointwise convolution uses a $1 \times 1$ kernel with $c_k$ channels. The number of operations is:

$$n_{dsc} = k \times k \times d \times d \times c_d + d \times d \times c_d \times c_k$$

Compared with the standard convolution, the computation is therefore reduced by the factor

$$\frac{n_{dsc}}{n} = \frac{1}{c_k} + \frac{1}{k^2}$$
The neural network model also includes residual blocks, into which depthwise separable convolution is likewise introduced; the kernel of the depthwise convolution there is 1×5. Pixel Shuffle is used to up-sample the feature map, yielding the high-resolution image frame after super-resolution reconstruction.
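The following is a minimal PyTorch sketch of the building blocks just described (depthwise separable convolution, a residual block built from it, and Pixel Shuffle up-sampling). The channel counts, kernel sizes, and number of blocks are illustrative assumptions, not the patent's exact architecture.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, c_in, c_out, kernel_size=3, padding=1):
        super().__init__()
        # Depthwise: one k x k kernel per input channel (groups=c_in).
        self.depthwise = nn.Conv2d(c_in, c_in, kernel_size,
                                   padding=padding, groups=c_in)
        # Pointwise: 1 x 1 convolution mixing channels to c_out.
        self.pointwise = nn.Conv2d(c_in, c_out, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            DepthwiseSeparableConv(channels, channels),
            nn.ReLU(inplace=True),
            DepthwiseSeparableConv(channels, channels),
        )

    def forward(self, x):
        return x + self.body(x)  # skip connection

class TinySRNet(nn.Module):
    def __init__(self, channels=32, scale=2, num_blocks=4):
        super().__init__()
        self.head = nn.Conv2d(1, channels, 3, padding=1)   # Y channel in
        self.body = nn.Sequential(*[ResidualBlock(channels)
                                    for _ in range(num_blocks)])
        # Expand channels by scale^2, then PixelShuffle rearranges them
        # into a (scale*H, scale*W) spatial grid.
        self.tail = nn.Sequential(
            nn.Conv2d(channels, scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, y):   # y: (N, 1, H, W) luminance plane
        return self.tail(self.body(self.head(y)))
```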
In one embodiment, the performing super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution policy to obtain a video stream after super-resolution processing includes: when the selected super-resolution strategy is a fusion super-resolution strategy, determining at least two super-resolution modes corresponding to the fusion super-resolution strategy; and performing super-resolution processing on the image frames in the video stream to be processed by using the at least two super-resolution modes to obtain a video stream after super-resolution processing.
In one embodiment, after it is determined that the image frames in the video stream to be processed use the fusion super-resolution strategy, at least two super-resolution modes corresponding to the fusion super-resolution strategy need to be determined, and super-resolution processing is performed on the image frames in the video stream to be processed by using the at least two super-resolution modes, so as to obtain a video stream after super-resolution processing. The super-resolution modes include a super-resolution mode based on nearest neighbor interpolation, a super-resolution mode based on bilinear interpolation, a super-resolution mode based on bicubic interpolation, a super-resolution mode based on a neural network model, and the like, which is not limited in this application.
Referring to fig. 4, fig. 4 is a schematic diagram of an interpolation-based super-resolution method. The super-resolution mode based on nearest neighbor interpolation finds the known point $P_{x,y}$ closest to the target point $P_{x_p,y_p}$ and assigns its value directly to $P_{x_p,y_p}$.
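A one-function NumPy sketch of this nearest-neighbor mode (the array layout and integer scale factor are assumptions):

```python
import numpy as np

def nearest_neighbor_upscale(img, scale):
    # Every output pixel copies the value of the closest source pixel.
    h, w = img.shape[:2]
    ys = np.arange(h * scale) // scale   # source row for each output row
    xs = np.arange(w * scale) // scale   # source column for each output column
    return img[ys][:, xs]
```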
In one embodiment, the super-resolution mode based on bilinear interpolation takes the pixel values of the two adjacent points in each of the X and Y directions and computes their weighted average, using the distance to the target point as the weight; taking fig. 4 as an example, the specific algorithm is shown in the following formula (1):

$$P_{x_p,y_p} = W_{x,y}P_{x,y} + W_{x+1,y}P_{x+1,y} + W_{x,y+1}P_{x,y+1} + W_{x+1,y+1}P_{x+1,y+1} \quad (1)$$

where

$$W_{x,y} = (x_{offset}-1)(y_{offset}-1), \quad W_{x+1,y} = -x_{offset}(y_{offset}-1),$$
$$W_{x,y+1} = -(x_{offset}-1)\,y_{offset}, \quad W_{x+1,y+1} = x_{offset}\,y_{offset},$$
$$x_{offset} = x_p - x, \quad y_{offset} = y_p - y.$$
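The weights in formula (1) translate directly into code; a sketch for a single target point, with boundary handling omitted:

```python
import numpy as np

def bilinear_sample(img, xp, yp):
    # img is indexed as img[row, column], i.e. img[y, x].
    x, y = int(np.floor(xp)), int(np.floor(yp))
    x_off, y_off = xp - x, yp - y
    w = {
        (0, 0): (x_off - 1) * (y_off - 1),   # W_{x,y}
        (1, 0): -x_off * (y_off - 1),        # W_{x+1,y}
        (0, 1): -(x_off - 1) * y_off,        # W_{x,y+1}
        (1, 1): x_off * y_off,               # W_{x+1,y+1}
    }
    return sum(w[i, j] * img[y + j, x + i] for (i, j) in w)
```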
In one embodiment, the super-resolution mode based on bicubic interpolation fits the data by selecting the 4×4 pixel points adjacent to the pixel to be solved and applying an interpolation basis function, as shown in fig. 5. Since the adjacent pixels lie at different distances from the unknown pixel, the pixels that are closer receive higher weights in the calculation. The interpolation basis function is the cubic convolution kernel shown in formula (2), where $a$ is the kernel parameter (commonly $a = -0.5$):

$$W(t) = \begin{cases} (a+2)|t|^3 - (a+3)|t|^2 + 1, & |t| \le 1 \\ a|t|^3 - 5a|t|^2 + 8a|t| - 4a, & 1 < |t| < 2 \\ 0, & \text{otherwise} \end{cases} \quad (2)$$

For the pixel $(x, y)$ to be interpolated, a 4×4 neighborhood $(x_i, y_j)$, $i, j = 0, 1, 2, 3$, is taken, and interpolation is performed according to formula (3):

$$f(x, y) = \sum_{i=0}^{3}\sum_{j=0}^{3} f(x_i, y_j)\, W(x - x_i)\, W(y - y_j) \quad (3)$$

where $x$ and $y$ are the row and column positions of the pixel to be interpolated.
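A direct transcription of formulas (2) and (3) for one output pixel; the choice a = -0.5 and the edge clipping are assumptions:

```python
import numpy as np

def cubic_kernel(t, a=-0.5):
    # Formula (2): the cubic convolution basis function.
    t = abs(t)
    if t <= 1:
        return (a + 2) * t**3 - (a + 3) * t**2 + 1
    if t < 2:
        return a * t**3 - 5 * a * t**2 + 8 * a * t - 4 * a
    return 0.0

def bicubic_sample(img, xp, yp):
    # Formula (3): weighted sum over the 4 x 4 neighborhood of (xp, yp).
    x0, y0 = int(np.floor(xp)), int(np.floor(yp))
    h, w = img.shape[:2]
    value = 0.0
    for j in range(-1, 3):               # neighborhood rows
        for i in range(-1, 3):           # neighborhood columns
            px = img[min(max(y0 + j, 0), h - 1),
                     min(max(x0 + i, 0), w - 1)]
            value += px * cubic_kernel(xp - (x0 + i)) * cubic_kernel(yp - (y0 + j))
    return value
```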
In one embodiment, the super-resolution mode based on a neural network model needs to be convenient for hardware implementation, such as SRCNN (Image Super-Resolution Using Deep Convolutional Networks), WDSR (Wide Activation for Efficient and Accurate Image Super-Resolution), SRGAN (Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network), and RTSR (Perceptual Losses for Real-Time Style Transfer and Super-Resolution), which is not limited in this application.
In one embodiment, the performing super-resolution processing on the image frames in the video stream to be processed by using the at least two super-resolution modes to obtain a video stream after super-resolution processing includes: for any image frame in the video stream to be processed, performing super-resolution processing on brightness values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes; performing super-resolution processing on chromaticity values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes; and determining the super-resolution processed image frame according to the luminance value and the chrominance value after the super-resolution processing, and generating a super-resolution processed video stream according to the super-resolution processed image frame.
In one embodiment, the at least two super-resolution modes include a first super-resolution mode, a second super-resolution mode, and a third super-resolution mode, wherein the first super-resolution mode, the second super-resolution mode, and the third super-resolution mode are any one of a super-resolution mode based on nearest neighbor interpolation, a super-resolution mode based on bilinear interpolation, a super-resolution mode based on bicubic interpolation, and a super-resolution mode based on a neural network model, respectively.
In one embodiment, super-resolution processing is performed on the brightness values included in the pixel points in any image frame by using the first super-resolution mode and the second super-resolution mode to obtain brightness values after super-resolution processing; super-resolution processing is performed on the chromaticity values included in the pixel points in the image frame by using the third super-resolution mode to obtain chromaticity values after super-resolution processing; and the super-resolution processed image frame is determined according to the luminance value and the chrominance value after super-resolution processing, and a super-resolution processed video stream is generated according to the super-resolution processed image frames.
In one embodiment, any image frame in the video stream to be processed is an image in YUV color space; the pixel points of the frame form a luminance image (Y) representing luminance values and a chrominance image (UV) representing chrominance values. For example, for video data in YUV format, Y represents luminance (Luma), i.e., the gray value, and UV represents chrominance (Chroma): the pixels of the Y channel constitute the luminance image, and the pixels of the U and V channels constitute the chrominance image. Of course, the image frame to be processed may also be input in any color format other than YUV, such as RGB or HSL, which is not limited in this embodiment. In this embodiment, an image in YUV color space is preferably used; when the image frame to be processed is an image in RGB color space, the conversion formulas are given in the following formulas (4) to (6):

Y = 0.299R + 0.587G + 0.114B (4)

U = 0.492(B - Y) (5)

V = 0.877(R - Y) (6)

where Y represents the luminance image and UV represents the chrominance image.
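A vectorized NumPy sketch of formulas (4) to (6); the array layout with channels in R, G, B order is an assumption:

```python
import numpy as np

def rgb_to_yuv(rgb):
    # rgb: float array of shape (..., 3) with channels in R, G, B order.
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b    # formula (4)
    u = 0.492 * (b - y)                      # formula (5)
    v = 0.877 * (r - y)                      # formula (6)
    return np.stack([y, u, v], axis=-1)
```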
In one embodiment, as shown in fig. 6, super-resolution reconstruction is performed on the luminance image by using both a super-resolution mode based on a neural network model and a super-resolution mode based on an interpolation method, where the neural network model may be any neural network model that is convenient for hardware implementation, which is not limited in this application. Since the human eye is less sensitive to the chrominance image, super-resolution reconstruction is performed on the chrominance image by using only an interpolation-based super-resolution mode, which may be one or more of the super-resolution modes based on nearest neighbor interpolation, bilinear interpolation, and bicubic interpolation. The luminance values obtained by the neural-network-based reconstruction and by the interpolation-based reconstruction are combined by weighted summation, and the result is merged, after time-sequence matching, with the chrominance image obtained by the interpolation-based reconstruction, so as to obtain the super-resolution processed image frame.
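A sketch of this fused pipeline; the blending weight and the function signatures are assumptions, with `sr_net` and `interp` standing for the neural-network and interpolation modes:

```python
# Fused strategy of fig. 6: the luminance plane is reconstructed by
# both modes and blended by a weighted sum; the chrominance planes,
# to which the eye is less sensitive, use interpolation only.
def fused_super_resolve(y, uv, sr_net, interp, scale=2, w_net=0.7):
    y_net = sr_net(y)                # NN-based SR of luminance
    y_interp = interp(y, scale)      # interpolation SR of luminance
    # sr_net and interp are assumed to produce equal-sized outputs.
    y_out = w_net * y_net + (1.0 - w_net) * y_interp   # assumed weight
    uv_out = interp(uv, scale)       # interpolation SR of chrominance
    return y_out, uv_out             # merged downstream into a YUV frame
```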
In one embodiment, the video feature information includes resolution, and the selecting a super-resolution strategy according to the extracted video feature information includes: if the resolution is smaller than or equal to a preset resolution threshold, selecting the fusion super-resolution strategy. In an embodiment, the performing super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution policy to obtain a video stream after super-resolution processing includes: determining a fourth super-resolution mode and a fifth super-resolution mode corresponding to the fusion super-resolution strategy; for any image frame in the video stream to be processed, performing super-resolution processing on that image frame using the fourth super-resolution mode to obtain a reference image frame after super-resolution processing; and performing super-resolution processing on the reference image frame using the fifth super-resolution mode to obtain a target image frame after super-resolution processing, and generating a video stream after super-resolution processing according to the target image frame.
Specifically, the fourth super-resolution mode and the fifth super-resolution mode are any one of a super-resolution mode based on nearest neighbor interpolation, a super-resolution mode based on bilinear interpolation, a super-resolution mode based on bicubic interpolation, and a super-resolution mode based on a neural network model, respectively. Performing super-resolution processing on any image frame in the video stream in a fourth super-resolution mode to obtain a reference image frame after super-resolution processing; and then performing super-resolution processing on the reference image frame in a fifth super-resolution mode to obtain a target image frame after super-resolution processing, and generating a video stream after super-resolution processing according to the target image frame.
In the embodiment of the application, any image frame in the video stream is subjected to super-resolution processing mainly by using any one of a super-resolution mode based on nearest neighbor interpolation, a super-resolution mode based on bilinear interpolation and a super-resolution mode based on bicubic interpolation, so as to obtain a reference image frame after super-resolution processing; and performing super-resolution processing on the reference image frame by using a super-resolution mode based on the neural network model to obtain a target image frame after super-resolution processing, and generating a video stream after super-resolution processing according to the target image frame.
In another embodiment, the performing super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution policy to obtain a video stream after super-resolution processing includes: determining a fourth super-resolution mode and a fifth super-resolution mode corresponding to the fusion super-resolution strategy, and acquiring the magnification; for any image frame in the video stream to be processed, performing super-resolution processing on that image frame using the fourth super-resolution mode to obtain a reference image frame after super-resolution processing; and performing super-resolution processing on the reference image frame using the fifth super-resolution mode to obtain a target image frame after super-resolution processing, and generating a video stream after super-resolution processing according to the target image frame. The resolution of the video stream after super-resolution processing is determined by the fusion super-resolution strategy selected according to the magnification.
In one embodiment, when the resolution of the image frames of the video stream to be processed is lower than the preset resolution threshold, for example 720p (1280×720) or lower, the magnification is large, that is, an ultra-high-resolution output is required; in that case a combination of different modes can be selected within the fusion super-resolution strategy according to the magnification. For example, if the resolution of the input video stream is 480p (800×400) and the output resolution needs to be 2K, as shown in fig. 7, an interpolation-based super-resolution mode can first be used to magnify the frame to an intermediate resolution, yielding a reference image frame after super-resolution processing; the interpolation-based mode may be any one of the super-resolution modes based on nearest neighbor interpolation, bilinear interpolation, and bicubic interpolation. The reference image frame is then processed with a super-resolution mode based on a neural network model, raising the resolution to the ultra-high-resolution output and obtaining the target image frame after super-resolution processing, thereby achieving a finer effect.
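A compact sketch of this two-stage combination; the intermediate scale factor and the helpers `interp` and `sr_net` are placeholders:

```python
# Two-stage fusion of fig. 7: interpolation lifts the 480p frame to an
# intermediate resolution (the reference image frame), then a
# neural-network mode lifts it to the 2K output (the target image frame).
def two_stage_super_resolve(frame_480p, interp, sr_net, mid_scale=1.5):
    reference = interp(frame_480p, mid_scale)   # assumed intermediate scale
    return sr_net(reference)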
In one embodiment, when the resolution of the input video stream is 480p and the output resolution is greater than 2K, the image frames in the video stream to be processed may be detected to obtain the target area and the background area in each image frame.
Specifically, the image frame may be processed using image segmentation or edge detection, for example with SPG-Net (Segmentation Prediction and Guidance Network) or PAN (Path Aggregation Network), to obtain at least one target area and the background area of the image frame.
Further, super-resolution reconstruction is performed on the target area of the image frame using the super-resolution mode based on the neural network model to obtain a reference image frame after super-resolution processing; super-resolution reconstruction is then performed on the background area of the image frame using an interpolation-based super-resolution mode, and the two results are combined to obtain the target image frame after super-resolution processing.
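A sketch of this region-split processing is given below, reusing the hypothetical run_sr_network stub from the earlier sketch; the mask input (e.g. from a segmentation network) and the compositing step are assumptions of this illustration:

```python
import cv2
import numpy as np

def region_split_sr(frame: np.ndarray, target_mask: np.ndarray,
                    scale: int) -> np.ndarray:
    # target_mask: uint8 mask, 255 inside the detected target area.
    # Background path: interpolation-based SR over the whole frame.
    background_sr = cv2.resize(frame, None, fx=scale, fy=scale,
                               interpolation=cv2.INTER_LINEAR)
    # Target path: neural-network-based SR (hypothetical stub above).
    target_sr = run_sr_network(frame, scale)
    # Upscale the mask with nearest-neighbor to keep it binary.
    mask_sr = cv2.resize(target_mask, None, fx=scale, fy=scale,
                         interpolation=cv2.INTER_NEAREST)
    # Composite: network output inside the target area, interpolation
    # output for the background.
    out = background_sr.copy()
    out[mask_sr > 0] = target_sr[mask_sr > 0]
    return out
```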
In one embodiment, when the resolution of the input video stream is already high, for example 2K, and the video stream is determined to be a simple scene according to the video feature information of its image frames, super-resolution reconstruction can be performed using the super-resolution mode based on bicubic interpolation under the single super-resolution strategy.
In one embodiment, as shown in fig. 8, a resolution configuration interface is provided, and the user can flexibly switch between configuration modes according to viewing preference.
In one embodiment, the neural network model used for super-resolution reconstruction in the super-resolution mode based on the neural network model may be determined according to a preset super-resolution magnification, and that model is then used to perform super-resolution reconstruction on the video stream. For example, the super-resolution magnifications may include 1, 2, 4, and so on; for each magnification, a neural network model is trained on a large number of low-resolution images paired with high-resolution images at that magnification, yielding a neural network model for each magnification. When the super-resolution magnification is 1, the neural network model further repairs the super-resolution-reconstructed image frame, improving the viewing effect. When the user performs resolution configuration, the neural network model of the corresponding magnification is determined according to the user's setting and used to perform super-resolution reconstruction on the video stream.
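A possible shape for this per-magnification model selection is a simple registry keyed by magnification; the model file names below are illustrative assumptions, not artifacts of the application:

```python
# Hypothetical registry mapping super-resolution magnification to a
# trained model; names are illustrative only.
SR_MODELS = {
    1: "sr_restore_x1.onnx",  # 1x: repair/enhancement without upscaling
    2: "sr_net_x2.onnx",
    4: "sr_net_x4.onnx",
}

def model_for(magnification: int) -> str:
    # Resolve the model matching the user's configured magnification.
    if magnification not in SR_MODELS:
        raise ValueError(f"no model trained for {magnification}x")
    return SR_MODELS[magnification]
```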
As shown in fig. 9, which is a schematic structural diagram of a video processing apparatus according to an embodiment of the present application, the apparatus includes:
An obtaining unit 901, configured to obtain a video stream to be processed, and perform feature extraction on the video stream to be processed;
the processing unit 902 is configured to select a super-resolution strategy according to the extracted video feature information, where the selected super-resolution strategy is a single super-resolution strategy or a fused super-resolution strategy;
the processing unit 902 is further configured to perform super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution strategy, so as to obtain a super-resolution-processed video stream.
In one embodiment, the processing unit 902 is specifically configured to:
detecting whether the extracted video characteristic information meets a specified condition;
if the video characteristic information does not meet the specified condition, selecting the single super-resolution strategy;
and if the video characteristic information meets the specified condition, selecting the fusion super-resolution strategy.
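Sketched in Python, this strategy-selection step might look as follows; the feature keys and the concrete condition are assumptions for illustration, since the specified condition is left to the implementation:

```python
def select_strategy(features: dict) -> str:
    # Sketch of the "specified condition": a complex scene or a low
    # input resolution selects the fused strategy; otherwise the single
    # strategy is used. The feature keys here are assumed names.
    if features.get("complex_scene") or features.get("height", 1080) <= 720:
        return "fused"
    return "single"
```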
In one embodiment, the processing unit 902 is specifically configured to:
when the selected super-resolution strategy is a single super-resolution strategy, determining the video type of the video stream to be processed according to the video characteristic information;
selecting a super-resolution mode from a plurality of preset super-resolution modes according to the video type;
And performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution mode to obtain a video stream subjected to the super-resolution processing.
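For the single strategy, the mode selection could be a plain lookup from video type (itself determined from the video source) to a preset super-resolution mode; the type and mode names below are assumed for illustration:

```python
# Assumed mapping from video type to a preset super-resolution mode;
# the keys and values are illustrative, not fixed by the application.
MODE_BY_TYPE = {
    "animation": "bicubic",
    "film": "neural_network",
    "sports": "bilinear",
    "screen_recording": "nearest",
}

def mode_for_type(video_type: str) -> str:
    # Fall back to bicubic when the type is unknown.
    return MODE_BY_TYPE.get(video_type, "bicubic")
```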
In one embodiment, the processing unit 902 is specifically configured to:
when the selected super-resolution strategy is a fusion super-resolution strategy, determining at least two super-resolution modes corresponding to the fusion super-resolution strategy;
and performing super-resolution processing on the image frames in the video stream to be processed by using the at least two super-resolution modes to obtain a video stream subjected to super-resolution processing.
In one embodiment, the processing unit 902 is specifically configured to:
for any image frame in the video stream to be processed, performing super-resolution processing on brightness values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes;
performing super-resolution processing on chromaticity values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes;
and determining the super-resolution processed image frame according to the luminance value and the chrominance value after the super-resolution processing, and generating a super-resolution processed video stream according to the super-resolution processed image frame.
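The luminance/chrominance split can be sketched as below, again reusing the hypothetical run_sr_network stub; routing the luminance plane through the network path and the chrominance planes through interpolation is an assumed (though common) assignment, since the eye is most sensitive to luminance detail:

```python
import cv2
import numpy as np

def luma_chroma_sr(frame_bgr: np.ndarray, scale: int) -> np.ndarray:
    # Split the frame into luminance (Y) and chrominance (Cr, Cb) planes.
    y, cr, cb = cv2.split(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb))
    # Luminance: network-based super-resolution (hypothetical stub).
    y_sr = run_sr_network(y, scale)
    # Chrominance: interpolation-based super-resolution.
    cr_sr = cv2.resize(cr, None, fx=scale, fy=scale,
                       interpolation=cv2.INTER_CUBIC)
    cb_sr = cv2.resize(cb, None, fx=scale, fy=scale,
                       interpolation=cv2.INTER_CUBIC)
    # Recombine the processed planes into the super-resolved frame.
    return cv2.cvtColor(cv2.merge([y_sr, cr_sr, cb_sr]),
                        cv2.COLOR_YCrCb2BGR)
```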
In one embodiment, where the at least two super-resolution modes include a first super-resolution mode, a second super-resolution mode, and a third super-resolution mode, the processing unit 902 is specifically configured to:
perform super-resolution processing on the luminance values included in the pixel points in any image frame by using the first super-resolution mode and the second super-resolution mode, to obtain the luminance values after super-resolution processing; and
perform super-resolution processing on the chrominance values included in the pixel points in any image frame by using the third super-resolution mode, to obtain the chrominance values after super-resolution processing.
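How the first and second modes jointly produce one luminance result is not specified here; a minimal sketch assuming a plain average of the two outputs (purely an illustrative choice) follows, with run_sr_network again being the hypothetical stub:

```python
import cv2
import numpy as np

def fused_luma_sr(y: np.ndarray, scale: int) -> np.ndarray:
    # First mode: bicubic interpolation on the luminance plane.
    a = cv2.resize(y, None, fx=scale, fy=scale,
                   interpolation=cv2.INTER_CUBIC).astype(np.float32)
    # Second mode: network-based super-resolution (hypothetical stub).
    b = run_sr_network(y, scale).astype(np.float32)
    # Combination rule assumed for illustration: a plain average.
    return ((a + b) / 2.0).astype(np.uint8)
```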
In one embodiment, the processing unit 902 is specifically configured to:
if the resolution of the image frames in the video stream to be processed is smaller than or equal to a preset resolution threshold, selecting the fusion super-resolution strategy;
and performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution strategy instruction to obtain a video stream after super-resolution processing.
In one embodiment, the processing unit 902 is specifically configured to:
determining a fourth super-resolution mode and a fifth super-resolution mode corresponding to the fused super-resolution strategy;
For any image frame in the video stream to be processed, performing super-resolution processing on the any image frame in the fourth super-resolution mode to obtain a reference image frame after super-resolution processing;
and performing super-resolution processing on the reference image frame in the fifth super-resolution mode to obtain a target image frame subjected to super-resolution processing, and generating a video stream subjected to super-resolution processing according to the target image frame.
In the embodiment of the application, a video stream to be processed is acquired, a super-resolution strategy is selected using the video feature information extracted from the video stream, and the image frames in the video stream are super-resolution processed according to the indication of the selected strategy to obtain the super-resolution-processed video stream. The super-resolution strategy can thus be adjusted adaptively according to the video feature information when performing super-resolution reconstruction on the video stream, effectively improving video quality.
As shown in fig. 10, which is a schematic structural diagram of an intelligent terminal provided in an embodiment of the present application, the internal structure of the intelligent terminal includes: one or more processors 1001, a memory 1002, a communication interface 1003, and a user interface 1004. The processor 1001, the memory 1002, the communication interface 1003, and the user interface 1004 may be connected by a bus 1005 or otherwise; the embodiments of the present application take connection by the bus 1005 as an example.
The processor 1001 (or CPU, Central Processing Unit) is the computing core and control core of the intelligent terminal; it can parse various instructions within the intelligent terminal and process the terminal's data. For example, the CPU can parse a power-on/off instruction sent by the user to the intelligent terminal and control the terminal to perform the power-on/off operation; for another example, the CPU can transfer various interactive data between the internal components of the intelligent terminal. The communication interface 1003 may optionally include a standard wired interface or a wireless interface (e.g., Wi-Fi or a mobile communication interface), and is controlled by the processor 1001 for transmitting and receiving data. The user interface 1004 may include a display (Display), and optionally may also include standard wired or wireless interfaces. The memory 1002 (Memory) is the storage device of the intelligent terminal, used to store programs and data. It will be appreciated that the memory 1002 here may include the built-in memory of the intelligent terminal, and may also include extended memory supported by the intelligent terminal.
In one embodiment, the processor 1001 performs the following by executing executable program code in the memory 1002:
Acquiring a video stream to be processed, and extracting features of the video stream to be processed;
selecting a super-resolution strategy according to the extracted video feature information, wherein the selected super-resolution strategy is a single super-resolution strategy or a fusion super-resolution strategy;
and performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution strategy instruction to obtain a video stream after super-resolution processing.
In one embodiment, the processor 1001 is specifically configured to:
detecting whether the extracted video characteristic information meets a specified condition;
if the video characteristic information does not meet the specified condition, selecting the single super-resolution strategy;
and if the video characteristic information meets the specified condition, selecting the fusion super-resolution strategy.
In one embodiment, the processor 1001 is specifically configured to:
when the selected super-resolution strategy is a single super-resolution strategy, determining the video type of the video stream to be processed according to the video characteristic information;
selecting a super-resolution mode from a plurality of preset super-resolution modes according to the video type;
and performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution mode to obtain a video stream subjected to the super-resolution processing.
In one embodiment, the processor 1001 is specifically configured to:
when the selected super-resolution strategy is a fusion super-resolution strategy, determining at least two super-resolution modes corresponding to the fusion super-resolution strategy;
and performing super-resolution processing on the image frames in the video stream to be processed by using the at least two super-resolution modes to obtain a video stream subjected to super-resolution processing.
In one embodiment, the processor 1001 is specifically configured to:
for any image frame in the video stream to be processed, performing super-resolution processing on brightness values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes;
performing super-resolution processing on chromaticity values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes;
and determining the super-resolution processed image frame according to the luminance value and the chrominance value after the super-resolution processing, and generating a super-resolution processed video stream according to the super-resolution processed image frame.
In one embodiment, where the at least two super-resolution modes include a first super-resolution mode, a second super-resolution mode, and a third super-resolution mode, the processor 1001 is specifically configured to:
perform super-resolution processing on the luminance values included in the pixel points in any image frame by using the first super-resolution mode and the second super-resolution mode, to obtain the luminance values after super-resolution processing; and
perform super-resolution processing on the chrominance values included in the pixel points in any image frame by using the third super-resolution mode, to obtain the chrominance values after super-resolution processing.
In one embodiment, the processor 1001 is specifically configured to:
if the resolution of the image frames in the video stream to be processed is smaller than or equal to a preset resolution threshold, selecting the fusion super-resolution strategy;
and performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution strategy instruction to obtain a video stream after super-resolution processing.
In one embodiment, the processor 1001 is specifically configured to:
determining a fourth super-resolution mode and a fifth super-resolution mode corresponding to the fused super-resolution strategy;
For any image frame in the video stream to be processed, performing super-resolution processing on the any image frame in the fourth super-resolution mode to obtain a reference image frame after super-resolution processing;
and performing super-resolution processing on the reference image frame in the fifth super-resolution mode to obtain a target image frame subjected to super-resolution processing, and generating a video stream subjected to super-resolution processing according to the target image frame.
In the embodiment of the application, a video stream to be processed is acquired, a super-resolution strategy is selected using the video feature information extracted from the video stream, and the image frames in the video stream are super-resolution processed according to the indication of the selected strategy to obtain the super-resolution-processed video stream. The super-resolution strategy can thus be adjusted adaptively according to the video feature information when performing super-resolution reconstruction on the video stream, effectively improving video quality.
One or more embodiments of the present application further provide a computer-readable storage medium storing a computer program, where the computer program includes program instructions that, when run on a computer, cause the computer to perform the video processing method according to the embodiments of the present application; for the specific implementation, reference may be made to the foregoing description, which is not repeated here.
Those skilled in the art will appreciate that all or part of the processes in the above embodiment methods may be implemented by a computer program stored on a computer-readable storage medium; when executed, the program may include the processes of the embodiments of the video processing method described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
One or more embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the steps performed in the embodiments of the methods described above.
The above examples merely represent several embodiments of the present application, and while their descriptions are specific and detailed, they are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art may make various modifications and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (9)

1. A method of video processing, the method comprising:
acquiring a video stream to be processed, and extracting features of image frames in the video stream to be processed to obtain video feature information;
determining target information of the image frame according to the video characteristic information, wherein the target information comprises one or more of the following: brightness change information, pixel difference information, category information of the contained object, resolution; the pixel difference information is determined according to pixel differences between the image frame and a background area in the image frame, and the brightness change information is determined according to brightness differences between objects in the image frame;
selecting a super-resolution strategy according to the target information, wherein the super-resolution strategy comprises the following steps: if the image frame is determined to belong to a simple scene according to the target information, determining that the selected super-resolution strategy is a single super-resolution strategy; if the image frame is determined to belong to a complex scene according to the target information, determining that the selected super-resolution strategy is a fusion super-resolution strategy; the single super-resolution strategy indicates that a super-resolution mode is selected according to the video type of the video stream to be processed to perform super-resolution processing, and the video type is determined according to the video source of the video stream to be processed; the fused super-resolution strategy indicates that the image frames are sequentially subjected to multiple super-resolution processing according to at least two super-resolution modes, or the brightness values and the chromaticity values of the pixel points in the image frames are respectively subjected to super-resolution processing according to at least two super-resolution modes;
And performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution strategy instruction to obtain a video stream after super-resolution processing.
2. The method according to claim 1, wherein the performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution strategy instruction to obtain a super-resolution processed video stream includes:
when the selected super-resolution strategy is a single super-resolution strategy, determining the video type of the video stream to be processed according to the video characteristic information;
selecting a super-resolution mode from a plurality of preset super-resolution modes according to the video type;
and performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution mode to obtain a video stream subjected to the super-resolution processing.
3. The method according to claim 1, wherein the performing super-resolution processing on the image frames in the video stream to be processed according to the selected super-resolution strategy instruction to obtain a super-resolution processed video stream includes:
when the selected super-resolution strategy is a fusion super-resolution strategy, determining at least two super-resolution modes corresponding to the fusion super-resolution strategy;
And performing super-resolution processing on the image frames in the video stream to be processed by using the at least two super-resolution modes to obtain a video stream subjected to super-resolution processing.
4. The method according to claim 3, wherein performing super-resolution processing on the image frames in the video stream to be processed by using the at least two super-resolution modes to obtain a super-resolution processed video stream includes:
for any image frame in the video stream to be processed, performing super-resolution processing on brightness values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes;
performing super-resolution processing on chromaticity values included in pixel points in any image frame by using one or more super-resolution modes of the at least two super-resolution modes;
and determining the super-resolution processed image frame according to the luminance value and the chrominance value after the super-resolution processing, and generating a super-resolution processed video stream according to the super-resolution processed image frame.
5. The method of claim 4, wherein the at least two super-resolution modes include a first super-resolution mode, a second super-resolution mode, and a third super-resolution mode;
The performing super-resolution processing on the luminance value included in the pixel point in any image frame by using one or more super-resolution modes of the at least two super-resolution modes respectively includes:
performing super-resolution processing on brightness values included in pixel points in any image frame by using the first super-resolution mode and the second super-resolution mode to obtain brightness values after the super-resolution processing;
the performing super-resolution processing on the chrominance values included in the pixel points in any one of the image frames by using one or more of the at least two super-resolution modes respectively includes:
and performing super-resolution processing on the chrominance values included in the pixel points in any image frame by using the third super-resolution mode to obtain the chrominance values after the super-resolution processing.
6. The method of claim 1, wherein selecting a super resolution strategy based on the target information comprises:
if the resolution is smaller than or equal to a preset resolution threshold, selecting the fusion super-resolution strategy;
the super-resolution processing is carried out on the image frames in the video stream to be processed according to the indication of the selected super-resolution strategy, so as to obtain a video stream after the super-resolution processing, which comprises the following steps:
Determining a fourth super-resolution mode and a fifth super-resolution mode corresponding to the fused super-resolution strategy;
for any image frame in the video stream to be processed, performing super-resolution processing on the any image frame in the fourth super-resolution mode to obtain a reference image frame after super-resolution processing;
and performing super-resolution processing on the reference image frame in the fifth super-resolution mode to obtain a target image frame subjected to super-resolution processing, and generating a video stream subjected to super-resolution processing according to the target image frame.
7. A video processing apparatus, the apparatus comprising:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a video stream to be processed and extracting the characteristics of image frames in the video stream to be processed to obtain video characteristic information;
a processing unit, configured to determine target information of the image frame according to the video feature information, where the target information includes one or more of the following: brightness change information, pixel difference information, category information of the contained object, resolution; the pixel difference information is determined according to pixel differences between the image frame and a background area in the image frame, and the brightness change information is determined according to brightness differences between objects in the image frame;
The processing unit is further configured to select a super-resolution policy according to the target information, and includes: if the image frame is determined to belong to a simple scene according to the target information, determining that the selected super-resolution strategy is a single super-resolution strategy; if the image frame is determined to belong to a complex scene according to the target information, determining that the selected super-resolution strategy is a fusion super-resolution strategy; the single super-resolution strategy indicates that a super-resolution mode is selected according to the video type of the video stream to be processed to perform super-resolution processing, and the video type is determined according to the video source of the video stream to be processed; the fused super-resolution strategy indicates that the image frames are sequentially subjected to multiple super-resolution processing according to at least two super-resolution modes, or the brightness values and the chromaticity values of the pixel points in the image frames are respectively subjected to super-resolution processing according to at least two super-resolution modes;
the processing unit is further configured to perform super-resolution processing on the image frames in the video stream to be processed according to the indication of the selected super-resolution strategy, so as to obtain a video stream after super-resolution processing.
8. An intelligent terminal comprising a memory, a communication interface, and a processor, wherein the memory, the communication interface, and the processor are interconnected, the memory stores computer program code, and the processor invokes the computer program code stored in the memory for performing the video processing method of any one of claims 1-6.
9. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the video processing method of any one of claims 1 to 6.