CN116416134A - Image super processing method, system, device, storage medium, and program product


Info

Publication number
CN116416134A
Authority
CN
China
Prior art keywords
image
terminal
resolution
target
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310363275.0A
Other languages
Chinese (zh)
Inventor
郑美松
襄成
陈颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202310363275.0A priority Critical patent/CN116416134A/en
Publication of CN116416134A publication Critical patent/CN116416134A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/54Extraction of image or video features relating to texture
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The application provides an image super-resolution processing method, system, device, storage medium, and program product. The method comprises: acquiring an original image to be processed and a target resolution, wherein the resolution of the original image is smaller than the target resolution; performing feature extraction on the original image through a neural network processing unit of the terminal to obtain a feature map of the original image; and upsampling the feature map according to the target resolution through a graphics processor of the terminal to obtain a target image. By separating the feature-extraction stage from the upsampling stage and assigning each to different terminal hardware resources, image super-resolution is performed on the terminal itself: server compute is saved, transmission bandwidth consumption is reduced, the terminal's hardware resources are used in a targeted manner, the problem of limited on-terminal compute is mitigated, and the super-resolution quality on the terminal is improved.

Description

Image super-resolution processing method, system, device, storage medium, and program product
Technical Field
The present invention relates to the field of image processing technology, and in particular to an image super-resolution processing method, system, device, storage medium, and program product.
Background
Image super-resolution refers to recovering a high-resolution image from a low-resolution image or image sequence. In the e-commerce scenario, merchant pictures are stored in OSS (Object Storage Service) in their original form; when a client user requests access, the original image is super-resolved to the resolution requested by the user through real-time transcoding, and the transcoded picture is then sent to the user.
The requested resolutions for each original image in practice are often varied; current image super-resolution approaches involve a large amount of computation, so super-resolution transcoding incurs substantial compute cost.
Disclosure of Invention
The main purpose of the embodiments of the present application is to provide an image super-resolution processing method, system, device, storage medium, and program product, which make targeted use of the terminal's hardware resources during image super-resolution, mitigate the problem of limited on-terminal compute, and improve the super-resolution quality.
In a first aspect, an embodiment of the present application provides an image super-resolution processing method, including: acquiring an original image to be processed and a target resolution, wherein the resolution of the original image is smaller than the target resolution; performing feature extraction on the original image through a neural network processing unit of the terminal to obtain a feature map of the original image; and upsampling the feature map according to the target resolution through a graphics processor of the terminal to obtain a target image.
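The three steps of this first aspect can be sketched end-to-end in Python. The sketch is illustrative only: the box-filter "feature extractor" and nearest-neighbor "upsampler" are toy stand-ins for the NPU model and GPU interpolation described in the patent, and all function names here are invented for the example:

```python
import numpy as np

def extract_features(image: np.ndarray) -> np.ndarray:
    """Stand-in for the NPU feature-extraction stage: a 3x3 box filter
    that keeps the feature map at the original (low) resolution."""
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")
    feat = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            feat += padded[dy:dy + h, dx:dx + w]
    return feat / 9.0

def upsample(feature_map: np.ndarray, target_hw: tuple) -> np.ndarray:
    """Stand-in for the GPU upsampling stage: nearest-neighbor
    interpolation to the requested target resolution."""
    th, tw = target_hw
    h, w = feature_map.shape
    rows = np.arange(th) * h // th
    cols = np.arange(tw) * w // tw
    return feature_map[rows][:, cols]

def super_resolve(original: np.ndarray, target_hw: tuple) -> np.ndarray:
    """Post-upsampling pipeline: extract features at low resolution
    first, upsample to the target resolution last."""
    assert original.shape[0] < target_hw[0] and original.shape[1] < target_hw[1]
    return upsample(extract_features(original), target_hw)
```

On a real terminal the two stages would run on separate hardware (NPU, then GPU), with the feature map copied between their memories; here both run on the CPU purely to show the data flow.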
In an embodiment, performing feature extraction on the original image through the neural network processing unit of the terminal to obtain a feature map of the original image includes: invoking the neural network processing unit of the terminal, loading a preset feature extraction model through the neural network processing unit, inputting the original image into the preset feature extraction model, and outputting a feature map of the original image, wherein the feature map includes texture features of the original image.
In an embodiment, invoking the neural network processing unit of the terminal includes: invoking the neural network processing unit of the terminal through a deep-learning inference engine.
In an embodiment, upsampling the feature map according to the target resolution through the graphics processor of the terminal to obtain a target image includes: invoking the graphics processor of the terminal, and interpolating the feature map according to the target resolution through the graphics processor to obtain a target image corresponding to the image request, wherein the resolution of the target image is the target resolution.
In an embodiment, invoking the graphics processor of the terminal includes: invoking the graphics processor of the terminal through a deep-learning inference engine.
In an embodiment, after the feature map is upsampled according to the target resolution by the graphics processor of the terminal to obtain the target image, the method further includes: rendering the target image and displaying it on a user interface.
In an embodiment, acquiring the original image to be processed and the target resolution includes: in response to an image request, acquiring an image identifier and a target resolution carried by the image request; and requesting the server, according to the image identifier, for the original image corresponding to the image identifier.
In a second aspect, an embodiment of the present application provides a commodity image super-resolution processing method, including: in response to a user's query request for a commodity, acquiring the original commodity image corresponding to the commodity from a server, wherein the query request carries a target resolution and the resolution of the original commodity image is smaller than the target resolution; performing feature extraction on the original commodity image through a neural network processing unit of the terminal to obtain a feature map of the original commodity image; upsampling the feature map according to the target resolution through a graphics processor of the terminal to obtain the target commodity image corresponding to the query request; and rendering the target commodity image and displaying it on a user interface.
In a third aspect, an embodiment of the present application provides a method for generating an image super-resolution model, including: obtaining a sample image set comprising a first sample image and a second sample image, the second sample image being the first sample image with reduced resolution; establishing a super-resolution model structure comprising a preset feature extraction module and an upsampling module, wherein the preset feature extraction module is deployed in a neural network processing unit of a terminal and the upsampling module is deployed in a graphics processor of the terminal; and training the super-resolution model structure with the sample image set to obtain a trained image super-resolution model.
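The sample pairs of this third aspect (a first sample image, and a second sample image that is the first with reduced resolution) can be generated programmatically. A minimal sketch follows; average pooling as the downscaling method is an assumption, since the text does not specify how the resolution is reduced:

```python
import numpy as np

def make_training_pair(first_sample: np.ndarray, scale: int = 2):
    """Return (HR, LR): the second sample image (LR) is the first sample
    image (HR) downscaled by `scale` in each dimension via average pooling
    (an assumed downscaling method)."""
    h, w = first_sample.shape
    assert h % scale == 0 and w % scale == 0, "dimensions must divide evenly"
    lr = first_sample.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    return first_sample, lr
```

Pairs built this way give the super-resolution model a ground-truth high-resolution target for each low-resolution input during training.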
In a fourth aspect, an embodiment of the present application provides an image super-resolution processing apparatus, including:
the acquisition module is used for acquiring an original image to be processed and a target resolution, wherein the resolution of the original image is smaller than the target resolution;
the extraction module is used for extracting the characteristics of the original image through a neural network processing unit of the terminal to obtain a characteristic diagram of the original image;
and the upsampling module is used for upsampling the feature map according to the target resolution through a graphics processor of the terminal to obtain a target image.
In an embodiment, the extraction module is configured to invoke the neural network processing unit of the terminal, load a preset feature extraction model through the neural network processing unit, input the original image into the preset feature extraction model, and output a feature map of the original image, wherein the feature map includes texture features of the original image.
In an embodiment, invoking the neural network processing unit of the terminal includes: invoking the neural network processing unit of the terminal through a deep-learning inference engine.
In an embodiment, the upsampling module is configured to invoke the graphics processor of the terminal and interpolate the feature map according to the target resolution through the graphics processor to obtain a target image corresponding to the image request, wherein the resolution of the target image is the target resolution.
In an embodiment, invoking the graphics processor of the terminal includes: invoking the graphics processor of the terminal through a deep-learning inference engine.
In an embodiment, the apparatus further includes a rendering module, configured to render the target image after the feature map has been upsampled according to the target resolution by the graphics processor of the terminal, and to display the target image on a user interface.
In an embodiment, the acquisition module is configured to respond to an image request by acquiring the image identifier and target resolution carried by the image request, and to request the server, according to the image identifier, for the original image corresponding to the image identifier.
In a fifth aspect, embodiments of the present application provide an image super-resolution processing system, including:
an acquisition unit configured to acquire an original image to be processed and a target resolution, wherein the resolution of the original image is smaller than the target resolution;
the neural network processing unit is used for extracting the characteristics of the original image to obtain a characteristic diagram of the original image;
and the image processor is used for carrying out up-sampling processing on the feature map according to the target resolution to obtain a target image.
In a sixth aspect, an embodiment of the present application provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause the electronic device to perform the method of any of the above aspects.
In a seventh aspect, an embodiment of the present application provides a cloud device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause the cloud device to perform the method of any of the above aspects.
In an eighth aspect, embodiments of the present application provide a computer-readable storage medium having stored therein computer-executable instructions that, when executed by a processor, implement the method of any one of the above aspects.
In a ninth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements the method of any of the above aspects.
According to the image super-resolution processing method, system, device, storage medium, and program product provided by the present application, feature extraction is performed on the original image by the terminal's neural network processing unit, and the feature map is then upsampled by the terminal's graphics processor, yielding a target image that matches the target resolution.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the principles of the application. It will be apparent to those of ordinary skill in the art that the drawings in the following description show only some embodiments of the invention, and that other drawings may be derived from them without inventive effort.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 2 is a schematic diagram of an application scenario of an image super-resolution processing scheme provided in an embodiment of the present application;
Fig. 3A is a schematic diagram of an image super-resolution processing system according to an embodiment of the present application;
Fig. 3B is a schematic diagram of a pre-upsampling super-resolution network structure according to an embodiment of the present application;
Fig. 3C is a schematic diagram of a post-upsampling super-resolution network structure according to an embodiment of the present application;
Fig. 3D is a schematic diagram of an image super-resolution processing scheme applied to a mobile phone according to an embodiment of the present application;
Fig. 4 is a flowchart of an image super-resolution processing method according to an embodiment of the present application;
Fig. 5 is a flowchart of an image super-resolution processing method according to an embodiment of the present application;
Fig. 6 is a flowchart of a commodity image super-resolution processing method according to an embodiment of the present application;
Fig. 7 is a flowchart of an image super-resolution model generating method according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an image super-resolution processing apparatus according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of a cloud device according to an embodiment of the present application.
Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. Where the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application.
The term "and/or" is used herein to describe an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist together, or B exists alone.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
In order to clearly describe the technical solutions of the embodiments of the present application, the terms used herein are first explained:
SR: Super Resolution.
LR: Low Resolution.
Bicubic: bicubic interpolation, a common image interpolation method.
APP: Application, an application program.
NPU: Neural Processing Unit, a processor that accelerates common machine learning workloads.
GPU: Graphics Processing Unit.
CPU: Central Processing Unit.
CNN: Convolutional Neural Network.
MNN: Mobile Neural Network, a lightweight deep neural network engine that supports inference and training of deep learning models and runs on a variety of devices such as servers, personal computers, mobile phones, and embedded devices.
OSS: Object Storage Service.
AI: Artificial Intelligence.
iOS: a mobile operating system developed by Apple Inc.
Core ML: Apple's framework for on-device machine learning, which runs models without an internet connection.
Android: a mobile operating system developed by Google.
API: Application Programming Interface.
NNAPI: Android Neural Networks API, an Android interface designed for running computationally intensive machine learning operations on Android devices; it provides a base functional layer for higher-level machine learning frameworks used to build and train neural networks.
As shown in fig. 1, the present embodiment provides an electronic device 1, including: at least one processor 11 and a memory 12 (one processor is shown in fig. 1 as an example). The processor 11 and the memory 12 are connected by a bus 10. The memory 12 stores instructions executable by the processor 11; when executed by the processor 11, these instructions enable the electronic device 1 to perform all or part of the methods in the following embodiments, so as to make targeted use of the terminal's hardware resources during image super-resolution, mitigate the problem of limited on-terminal compute, and improve the super-resolution quality.
In an embodiment, the electronic device 1 may be a mobile phone, a tablet computer, a notebook computer, a desktop computer, or a large computing system composed of a plurality of computers.
Fig. 2 is a schematic diagram of an application scenario 200 of an image super-resolution processing system according to an embodiment of the present application. As shown in fig. 2, the system includes a server 210 and terminals 220, wherein:
The server 210 may be a data platform that provides image services, such as an e-commerce shopping platform. In a practical scenario one e-commerce shopping platform may have multiple servers 210; fig. 2 shows one server 210 as an example.
The terminal 220 may be a computer, mobile phone, tablet, or other device used when the user logs in to the e-commerce shopping platform; there may be multiple terminals 220, and fig. 2 shows two terminals 220 as an example.
Information is transmitted between the terminals 220 and the server 210 over the internet, so the terminals 220 can access data on the server 210. The terminal 220 and/or the server 210 may be implemented by the electronic device 1.
This image super-resolution processing approach can be applied to any field that requires image resolution conversion.
Image super-resolution refers to recovering a high-resolution image from a low-resolution image or image sequence. In the e-commerce scenario, merchant pictures are stored in the server's OSS in their original form; when a client user requests access, the original image is super-resolved to the user-requested resolution through real-time transcoding, and the transcoded picture is then sent to the user's client.
The requested resolutions for each original picture in practice are often varied. In the related art the server must transcode for every requested resolution; the huge amount of computation incurs a large server compute cost, and growing picture traffic additionally consumes substantial transmission bandwidth.
To solve the above problems, the embodiments of the present application provide an image super-resolution processing scheme: feature extraction is performed on the original image by the terminal's neural network processing unit, and the feature map is then upsampled by the terminal's graphics processor, yielding a target image that matches the target resolution.
The above image super-resolution processing scheme may be deployed on the server 210, on the terminal 220, or partly on each. The deployment can be chosen based on actual requirements, which this embodiment does not limit.
When the scheme is deployed wholly or partly on the server 210, an invocable interface may be opened to the terminal 220 to provide algorithmic support to the terminal 220.
Fig. 3A shows the architecture of an image super-resolution processing system 300 according to an embodiment of the present application, which includes: an acquisition unit 301, a neural network processing unit 302, and an image processor 303, wherein:
an acquisition unit 301, configured to acquire an original image to be processed and a target resolution, where the resolution of the original image is smaller than the target resolution.
The neural network processing unit 302 is configured to perform feature extraction on the original image, so as to obtain a feature map of the original image.
And the image processor 303 is configured to perform upsampling processing on the feature map according to the target resolution, so as to obtain a target image corresponding to the image request.
In this embodiment, the neural network processing unit 302 may be implemented by an NPU of the terminal 220, and the image processor 303 may be implemented by a GPU of the terminal 220.
In a related approach, the super-resolved target image is obtained by first upsampling the original image and then performing feature extraction on the upsampled image; this incurs a very large amount of computation and is therefore not well suited to deployment on mobile terminals.
As shown in fig. 3B, a pre-upsampling super-resolution network structure according to an embodiment of the present application works as follows: the low-resolution original image (LR) is first upsampled via bicubic interpolation to obtain an upsampled image (SR'); the upsampled image (SR') is then input to a pre-trained feature extraction model (CNN), which performs feature extraction and image enhancement to obtain the super-resolved high-resolution target image (SR).
In this embodiment, suppose the resolution of the original image is 64x64 and the target resolution for 2x super-resolution is 128x128. With the pre-upsampling approach, the feature map resolution throughout the neural network matches the upsampled size, i.e. 128x128, so the network's computation grows with the target resolution.
As shown in fig. 3C, a post-upsampling super-resolution network structure according to an embodiment of the present application works as follows: the low-resolution original image (LR) is input to a pre-trained feature extraction model (CNN), which performs feature extraction and image enhancement; the resulting feature map (LR') is then upsampled via bicubic interpolation to obtain the super-resolved high-resolution target image (SR).
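Figs. 3B and 3C differ only in the order of the two stages. A minimal sketch of both orders, with nearest-neighbor repetition standing in for bicubic interpolation and an identity function standing in for the pre-trained CNN (which, like the real model, preserves its input's resolution):

```python
import numpy as np

def interp_up(img: np.ndarray, scale: int) -> np.ndarray:
    """Placeholder upsampler (nearest-neighbor); figs. 3B/3C use bicubic."""
    return np.repeat(np.repeat(img, scale, axis=0), scale, axis=1)

def cnn_stub(x: np.ndarray) -> np.ndarray:
    """Placeholder for the pre-trained feature-extraction CNN."""
    return x

def pre_upsampling_sr(lr: np.ndarray, scale: int = 2) -> np.ndarray:
    # fig. 3B: LR -> SR' -> SR; the CNN runs at HIGH resolution
    return cnn_stub(interp_up(lr, scale))

def post_upsampling_sr(lr: np.ndarray, scale: int = 2) -> np.ndarray:
    # fig. 3C: LR -> LR' -> SR; the CNN runs at LOW resolution
    return interp_up(cnn_stub(lr), scale)
```

Both variants produce an output at the target resolution; the difference is the resolution at which the expensive CNN stage executes.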
In this embodiment, a feature-extraction-first, upsampling-last approach is adopted, so during feature extraction the feature map resolution throughout the network matches the original image, i.e. 64x64. Since the computation of the neural network scales with the square of the feature map resolution, for 2x super-resolution the pre-upsampling method costs 4 times as much as the post-upsampling method. Compared with the pre-upsampling method of the related art, the post-upsampling method of the embodiments of the present application therefore saves computation and is better suited to mobile applications.
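The 4x figure follows from the quadratic scaling. A hypothetical per-layer cost model (multiply-accumulates proportional to feature-map area) makes this concrete; the channel and kernel sizes are arbitrary placeholders:

```python
def conv_flops(height: int, width: int, c_in: int, c_out: int, kernel: int = 3) -> int:
    """Rough multiply-accumulate count for one conv layer: proportional
    to the feature-map area (height * width)."""
    return height * width * c_in * c_out * kernel * kernel

lr_side, scale = 64, 2
pre = conv_flops(lr_side * scale, lr_side * scale, 32, 32)   # fig. 3B: 128x128 feature maps
post = conv_flops(lr_side, lr_side, 32, 32)                  # fig. 3C: 64x64 feature maps
ratio = pre // post   # 2x super-resolution -> pre-upsampling costs 4x more per layer
```

The ratio equals scale squared regardless of the chosen channel and kernel sizes, since those factors cancel.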
In an embodiment, the system may be applied to a mobile terminal equipped with an NPU and a GPU. Taking a mobile phone as the mobile terminal, fig. 3D shows an application of the system of fig. 3A and/or fig. 3C in a mobile phone. Using a mobile e-commerce APP as an example, the model can be deployed through MNN; the e-commerce APP can run on different operating systems and performs client-side super-resolution by invoking the corresponding model engine through MNN. Fig. 3D takes the Android and iOS systems as examples.
A user can initiate a picture access request to the server from the e-commerce APP (client) on the mobile phone, and the server returns a low-resolution original image (LR). Assuming the user's phone runs Android, the e-commerce APP invokes NNAPI via the MNN engine, so that feature extraction on the original image is performed on the phone's NPU, yielding the feature map of the original image. The feature map is read out of the NPU's memory and passed to the phone's GPU, which upsamples it according to the target resolution to obtain the target image corresponding to the image request. Image super-resolution at the required scale is thereby completed on the client, and the result is rendered.
Assuming the user's phone runs iOS, the e-commerce APP invokes Core ML via the MNN engine, so that feature extraction on the original image is performed on the phone's NPU, yielding the feature map of the original image. The feature map is read out of the NPU's memory and passed to the phone's GPU, which upsamples it according to the target resolution to obtain the target image corresponding to the image request. Image super-resolution at the required scale is thereby completed on the client, and the result is rendered.
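The platform split in fig. 3D reduces to a simple dispatch: on Android the engine reaches the NPU via NNAPI, on iOS via Core ML. The function below is a hypothetical illustration of that mapping only; it is not MNN's actual API:

```python
def npu_backend(os_name: str) -> str:
    """Hypothetical mapping from operating system to the platform API
    through which the inference engine reaches the NPU (per fig. 3D)."""
    backends = {"android": "NNAPI", "ios": "Core ML"}
    try:
        return backends[os_name.lower()]
    except KeyError:
        raise ValueError(f"no NPU backend known for OS {os_name!r}")
```

Keeping the dispatch in one place lets the same client-side super-resolution code run unchanged on both operating systems.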
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. In the case where there is no conflict between the embodiments, the following embodiments and features in the embodiments may be combined with each other. In addition, the sequence of steps in the method embodiments described below is only an example and is not strictly limited.
Please refer to fig. 4, which shows an image super-resolution processing method according to an embodiment of the present application. The method may be executed by the electronic device 1 shown in fig. 1 and applied in the image super-resolution scenarios shown in figs. 2-3D, so as to make targeted use of the terminal's hardware resources during super-resolution, mitigate the problem of limited on-terminal compute, and improve the super-resolution quality. In this embodiment, the terminal 220 is taken as the executing device; the method includes the following steps:
Step 401: and acquiring an original image to be processed and a target resolution, wherein the resolution of the original image is smaller than the target resolution.
In this step, the original image refers to an image with a resolution lower than the target resolution, where the image may be a single image or a sequence of pictures. The original image may be a version of the image requested by the user that is stored in a database. Taking an e-commerce scene as an example, the original image may be a commodity image stored in its original form in the OSS (Object Storage Service) of a server. In an actual scene, the volume of e-commerce commodity images is large, so in order to save storage resources, the resolution of commodity images stored in the OSS is generally low and the user viewing experience is accordingly poor; therefore, when a user requests access to a commodity image, a high-resolution image can be recovered from the low-resolution commodity image or image sequence. The target resolution refers to the resolution specified for the high-resolution image to be restored, and thus the original image needs to be super-processed only when the resolution of the original image is lower than the target resolution. The target resolution can be selected by the user based on actual requirements, or can be automatically determined according to the requirements of the terminal display.
In one embodiment, the step 401 may specifically include: and responding to the image request, and acquiring an image identifier and a target resolution carried by the image request. And according to the image identification, requesting the server to acquire an original image corresponding to the image identification.
In this embodiment, the user may initiate an image request through the client, where the image request may carry the identifier of the requested image and a specified target resolution; the client then requests the original image corresponding to the image identifier from the server. The server may be a data server of the e-commerce platform, and the image identifier may be an identification code of the image or a thumbnail of the original image. Taking an e-commerce scene as an example, a user browsing a commodity at the e-commerce client is shown a low-resolution small image of the commodity; when the user clicks on commodity details, an image request for the commodity detail image can be triggered to request a high-resolution large commodity image from the server. The server may store each commodity detail image at different resolutions, e.g., the original image resolution of the commodity may be 350x350 or 650x650. When receiving the request of the client, assuming that the target resolution designated by the user is 1200x1200, the server issues, for example, an original image with a resolution of 650x650, and the terminal then super-divides the 650x650 original image into a 1200x1200 target image; compared with the server directly issuing an original image with a resolution of 1200x1200, access bandwidth can be saved.
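As an illustrative sketch of the resolution-selection logic described above (the function name and the exact selection policy are assumptions for illustration, not something specified by this embodiment):

```python
def pick_source_resolution(stored, target):
    """Pick the stored resolution to issue for a requested target resolution.

    Prefers the largest stored image that is still below the target, so the
    terminal super-divides it up to the target and bandwidth is saved; if every
    stored version already meets the target, the smallest sufficient one is
    issued as-is (no super-division needed).
    """
    below = [r for r in stored if r < target]
    if below:
        return max(below)  # e.g. stored [350, 650], target 1200 -> issue 650
    return min(r for r in stored if r >= target)

# Example from the embodiment: stored 350x350 and 650x650, user requests 1200x1200.
print(pick_source_resolution([350, 650], 1200))  # 650
```

The point of the policy is the bandwidth trade-off described above: issuing 650x650 and super-dividing on the terminal is cheaper than issuing 1200x1200 directly.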
Step 402: and extracting the characteristics of the original image through a neural network processing unit of the terminal to obtain a characteristic diagram of the original image.
In this step, the neural network processing unit (NPU) of the terminal is a hardware resource of the terminal. Its working principle is to simulate human neurons and synapses at the circuit level and to directly process large-scale neurons and synapses with a deep learning instruction set, where one instruction completes the processing of a group of neurons. The NPU realizes the integration of storage and computation through synaptic weights, thereby improving operation efficiency. It can be used to accelerate neural network operations and solve the low efficiency of traditional chips in such operations. Since the NPU is constructed by imitating a biological neural network, a CPU or GPU needs several thousand instructions to complete the processing of a neuron, whereas the NPU can complete it with only one or a few instructions, so it has an obvious advantage in deep learning processing efficiency. When super-processing a low-resolution original image, feature extraction and/or image enhancement needs to be performed on the original image, and feature extraction can generally be realized with a neural network model; therefore, the feature extraction process can be computed by the NPU of the terminal, which reasonably and fully uses the hardware resources of the terminal and improves feature extraction efficiency.
In one embodiment, step 402 may specifically include: and calling a neural network processing unit of the terminal, loading a preset feature extraction model through the neural network processing unit, inputting the original image into the preset feature extraction model, and outputting a feature map of the original image, wherein the feature map comprises texture features of the original image.
In this embodiment, the preset feature extraction model may be a pre-trained neural network model, for example a CNN model, whose parameters may be obtained through machine learning. The feature extraction model may also be a model with another structure, which is not limited here. The NPU of the terminal may load the preset feature extraction model and the related learned parameters to extract the feature map of the original image, where the feature map may include at least texture features of the original image, so that the high-resolution image can be recovered from the texture features. Loading the feature extraction model on the NPU for calculation makes full use of the hardware resources of the terminal and improves calculation efficiency.
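The feature-extraction step can be illustrated with a toy stand-in (a minimal sketch; the fixed Laplacian kernel below is an assumption standing in for a trained CNN and is not the embodiment's actual model):

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive 'same'-padded 2-D convolution; stands in for one CNN layer."""
    h, w = img.shape
    kh, kw = kernel.shape
    pad = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(pad[i:i + kh, j:j + kw] * kernel)
    return out

def extract_features(img):
    """Toy 'feature extraction model': a fixed Laplacian kernel that responds
    to texture/edges. The real model would be a trained CNN loaded on the NPU."""
    laplacian = np.array([[0, 1, 0],
                          [1, -4, 1],
                          [0, 1, 0]], dtype=float)
    return conv2d_same(img.astype(float), laplacian)

img = np.arange(16, dtype=float).reshape(4, 4)
fmap = extract_features(img)
print(fmap.shape)  # (4, 4) — same spatial size; up-sampling happens later on the GPU
```

Note that the feature map keeps the input's spatial size: in this scheme, resolution is only increased afterwards by the up-sampling step.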
In one embodiment, the process of invoking the neural network processing unit of the terminal in step 402 includes: calling the neural network processing unit of the terminal through a deep learning inference engine.
In this embodiment, the deep learning inference engine is a basic component of artificial intelligence software; by parsing the deep learning model into a computational graph, the inference engine implements optimization of the model and supports accelerated inference when running on different devices. When the hardware resources of the terminal are called, a deep learning inference engine can be adopted; for example, MNN can be used to call the NPU of the terminal to load the CNN model for feature extraction and extract the texture features of the original image. MNN is a lightweight deep neural network inference engine that loads a deep neural network model on the end side for inference and prediction. MNN focuses on acceleration and optimization at inference time and solves the efficiency problem in the model deployment stage, so that the service behind the model is realized more efficiently on the mobile end.
Step 403: and carrying out up-sampling processing on the feature map according to the target resolution by a graphic processor of the terminal to obtain a target image.
In this step, after the feature map of the original image is obtained, the high-resolution image may be obtained by up-sampling the feature map. Specifically, up-sampling can be performed on the feature map of the original image according to the target resolution specified by the user, so as to obtain a target image whose resolution is the target resolution. Assuming that the user's mobile phone display is 1080P (a video display format in which 'P' stands for progressive scanning) and the resolution of the original image issued by the server is 650x650, the original image is subjected to feature extraction and image enhancement and then up-sampled, so as to be super-divided into a 1080x1080 target image. The method adopts feature extraction first and up-sampling afterwards to realize super-division at any scale; compared with up-sampling first as in the related art, the computation of the deep learning neural network can be greatly reduced, which is more suitable for mobile terminal applications.
The GPU, namely the graphics processor, is a hardware resource of the terminal. It is a massively parallel computing architecture consisting of a large number of arithmetic units; it was separated from the CPU early on, is specially used for parallel image computation, and is designed to process multiple parallel computing tasks simultaneously, so the GPU is more suitable than the CPU for the large amounts of training data, matrix operations, and convolution operations in deep learning. By separating the up-sampling operation from the feature extraction model and using the GPU to implement the up-sampling process, the existing hardware resources of the terminal can be utilized in a targeted manner, and super-division processing efficiency is improved.
In one embodiment, step 403 may specifically include: calling a graphics processor of the terminal, and performing interpolation processing on the feature map according to the target resolution through the graphics processor to obtain a target image corresponding to the image request, wherein the resolution of the target image is the target resolution.
In this embodiment, the up-sampling process may be implemented by an arbitrary-scale interpolation operator; for example, the feature map is up-sampled by bicubic interpolation, which involves relatively little computation, consumes few video memory resources, and is easy to deploy, compared with a scheme in the related art in which meta-learning is implemented with an additional network.
In actual scenes, the NPU providing AI computing power on the mobile end generally does not support arbitrary-scale interpolation operators well. This scheme therefore separates the up-sampling operation from the feature extraction model structure: at deployment, the NPU is adopted to carry out the forward pass of the feature extraction model, and the GPU of the terminal is called to carry out arbitrary-scale interpolation calculation, so as to obtain the final high-resolution output image. The super-resolution process adopts a post-up-sampling method and uses a single model to realize super-resolution at any scale; compared with pre-up-sampling or meta-learning methods in the related art, it can save computation and memory space and reduce the forward time consumption of the model.
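The arbitrary-scale interpolation step can be sketched as follows (bilinear interpolation is used here for brevity where the embodiment mentions bicubic; the function name is an assumption, and on a real terminal this computation would run on the GPU):

```python
import numpy as np

def upsample_bilinear(fmap, out_h, out_w):
    """Arbitrary-scale up-sampling of a 2-D feature map by bilinear
    interpolation; the output size need not be an integer multiple of
    the input size."""
    in_h, in_w = fmap.shape
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, in_h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    # Blend the four neighbouring samples for every output pixel.
    top = fmap[np.ix_(y0, x0)] * (1 - wx) + fmap[np.ix_(y0, x1)] * wx
    bot = fmap[np.ix_(y1, x0)] * (1 - wx) + fmap[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

# Super-divide a 2x2 map to a non-integer-multiple 3x3 target — the
# "arbitrary scale" case that mobile NPUs typically do not support.
fmap = np.array([[0.0, 1.0],
                 [2.0, 3.0]])
print(upsample_bilinear(fmap, 3, 3))
```

Because the operator only blends neighbouring samples, it needs no learned parameters, which is why it can be split off from the feature extraction model and executed on the GPU.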
In one embodiment, the process of invoking the graphics processor of the terminal in step 403 includes: invoking the graphics processor of the terminal through the deep learning inference engine.
In this step, when the GPU hardware resource of the terminal is called, a deep learning inference engine may be adopted; for example, MNN may be used to call the GPU of the terminal: the feature map processed by the NPU is taken out of memory and sent to the GPU of the terminal, and the GPU performs the interpolation calculation, so that real-time super-resolution at any scale can be realized on the terminal.
Step 404: rendering the target image and displaying the target image on a user interface.
In this step, the target image is rendered after super-division is finished, so that it can be displayed in the user interface through the display of the terminal and the user can view it in time. Assuming that the user's mobile phone display is 1080P (the target resolution) and the resolution of the original image issued by the server is 650x650, the original image is subjected to feature extraction and image enhancement, then up-sampled and super-divided into a 1080x1080 target image, and the rendering module of the terminal then renders and displays the target image.
According to the image super-processing method of this embodiment, feature extraction is performed on the original image by the neural network processing unit of the terminal, and up-sampling is performed on the feature map by the graphics processor of the terminal, so as to obtain a target image conforming to the target resolution.
When a client user requests access to an image, the server issues a low-resolution original image, and the original image is super-resolved on the terminal up to the target resolution requested by the user, which can save transcoding computing power and bandwidth cost while achieving a better display effect on the terminal.
Specifically, the method adopts feature extraction first and up-sampling afterwards to realize super-division at any scale, so compared with up-sampling first as in the related art, computation is saved and the method is more suitable for mobile terminal applications. Up-sampling and network feature extraction are separated so that the two parts are executed by different dedicated hardware, realizing real-time super-resolution at any scale on the terminal, making reasonable use of the terminal's hardware resources, and improving computing efficiency.
In actual scenes, the dedicated AI hardware (NPU) of mobile terminal devices generally does not support non-integer-multiple interpolation operations. This scheme separates up-sampling from network feature extraction: the deep learning neural network part is executed by the dedicated AI hardware (NPU), and the up-sampling part can be executed by the GPU, thereby realizing real-time super-resolution at any scale on the terminal, improving computing efficiency, and making reasonable use of the terminal's hardware resources.
Please refer to fig. 5, which is an image super-processing method according to an embodiment of the present application, the method may be executed by the electronic device 1 shown in fig. 1, and may be applied to the application scenario of the image super-processing shown in fig. 2-3D, so as to implement targeted utilization of hardware resources of a terminal in the image super-processing process, solve the problem of limited computing power on the terminal, and improve the image super-processing effect. In this embodiment, taking the terminal 220 as an executing terminal as an example, the method includes the following steps:
Step 501: and responding to the image request, and acquiring an image identifier and a target resolution carried by the image request.
Step 502: and according to the image identification, requesting the server to acquire an original image corresponding to the image identification.
Step 503: and calling an NPU of the terminal through the MNN, loading a preset feature extraction model through the NPU, inputting the original image into the preset feature extraction model, and outputting a feature map of the original image, wherein the feature map comprises texture features of the original image.
Step 504: and calling a GPU of the terminal through the MNN, and carrying out interpolation processing on the feature map through the GPU according to the target resolution to obtain a target image corresponding to the image request, wherein the resolution of the target image is the target resolution.
Step 505: rendering the target image and displaying the target image on a user interface.
The details of each step of the above image super processing method can be referred to the related description of the above embodiment, and will not be repeated here.
Please refer to fig. 6, which is an embodiment of a commodity image super-processing method of the present application. The method may be executed by the electronic device 1 shown in fig. 1 and may be applied to the application scenario of the image super-processing shown in fig. 2-3D, so as to achieve targeted utilization of the hardware resources of a terminal in the image super-processing process, solve the problem of limited computing power on the terminal, and improve the image super-processing effect. In this embodiment, the terminal 220 is taken as the executing terminal; compared with the previous embodiments, this embodiment takes browsing commodity images in an e-commerce scene as an example. The method includes the following steps:
Step 601: and responding to a query request of a user for the commodity, acquiring an original commodity image corresponding to the commodity from a server, wherein the query request carries a target resolution, and the resolution of the original commodity image is smaller than the target resolution.
In this step, the commodity may be one sold by the e-commerce platform, and the user can log in to the e-commerce shopping platform through the terminal to view information about the commodity; for example, the user can click the identifier of a commodity of interest, triggering a query request for commodity details. Here, the commodity identifier may be a commodity code, text guide information, or a thumbnail of the commodity. For example, a user browsing a commodity at the e-commerce client is shown a low-resolution small image of the commodity; when the user clicks the text guide information 'commodity details', an image request for the commodity detail image can be triggered to request a high-resolution large image of the commodity from the server. The server may store each commodity detail image at different resolutions, e.g., the original commodity image resolution may be 350x350 or 650x650. When receiving the request of the client, assuming that the target resolution designated by the user is 1200x1200, the server issues, for example, an original commodity image with a resolution of 650x650, and the terminal then super-divides it into a 1200x1200 target commodity image; compared with the server directly issuing an original commodity image with a resolution of 1200x1200, access bandwidth can be saved.
Step 602: and extracting the characteristics of the original commodity image through a neural network processing unit of the terminal to obtain a characteristic diagram of the original commodity image.
Step 603: and carrying out up-sampling processing on the feature map according to the target resolution through a graphic processor of the terminal to obtain a target commodity image corresponding to the query request.
Step 604: rendering the target commodity image and displaying the target commodity image on a user interface.
According to the commodity image super-processing method of this embodiment, arbitrary-scale super-division on the mobile terminal is adopted: when a client user requests access to a commodity image, the server issues a low-resolution original commodity image, and the original commodity image is super-resolved on the terminal to the target resolution requested by the user, so transcoding computing power and bandwidth cost can be saved and the display effect on the terminal is better. Specifically, the method adopts feature extraction first and up-sampling afterwards to realize super-division at any scale, so compared with up-sampling first as in the related art, computation is saved and the method is more suitable for mobile terminal applications. Up-sampling and network feature extraction are separated so that the two parts are executed by different dedicated hardware, realizing real-time super-resolution at any scale on the terminal, making reasonable use of the terminal's hardware resources, and improving computing efficiency. In the e-commerce scene, the computing power and transmission bandwidth cost of the server can be saved, and the commodity image display effect on the terminal is better.
The details of each step of the above commodity image super processing method can be referred to the related description of the above embodiment, and will not be repeated here.
Please refer to fig. 7, which is a method for generating an image super-resolution model according to an embodiment of the present application, wherein the method can be executed by the electronic device 1 shown in fig. 1 and can be applied to the application scenario of the image super-resolution process shown in fig. 2-3D, so as to train and deploy the image super-resolution model. In this embodiment, taking the terminal 220 as an executing terminal as an example, the method includes the following steps:
step 701: acquiring a sample image set, the sample image set comprising: a first sample image and a second sample image, the second sample image being an image of the first sample image reduced in resolution.
In this step, the first sample images form a high-resolution sample set and the second sample images form a low-resolution sample set; that is to say, the sample image set comprises a high-resolution sample set and a low-resolution sample set, where each high-resolution sample image corresponds to a low-resolution sample image and the sample images are marked with their resolutions, forming the sample image set.
In an actual scene, the sample image set can be obtained from public data sets, by self-collection, and so on. To enrich the content of the sample image set, the sample data set may contain images of the corresponding application scenario; for example, for e-commerce scenarios, the sample image set may contain high-quality images of various goods, from which low-resolution goods sample images are then generated. For example, a high-definition image (first sample image) is cropped and then down-sampled at different scales to reduce the resolution, so as to obtain low-resolution second sample images.
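The sample-pair generation described above can be sketched as follows (the crop-then-box-downsample recipe and function name are illustrative assumptions; a real pipeline may use other crops, scales, and degradations):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_lr_hr_pair(hr_image, crop, scale):
    """Crop a patch from a high-definition image (first sample image) and
    down-sample it by an integer factor to produce the low-resolution
    second sample image. Box (block-mean) down-sampling stands in for
    whatever degradation the real pipeline uses."""
    h, w = hr_image.shape
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    hr_patch = hr_image[top:top + crop, left:left + crop]
    lr = hr_patch.reshape(crop // scale, scale, crop // scale, scale).mean(axis=(1, 3))
    return lr, hr_patch

hr = rng.random((64, 64))  # stand-in for a high-quality goods image
lr_patch, hr_patch = make_lr_hr_pair(hr, crop=32, scale=2)
print(lr_patch.shape, hr_patch.shape)  # (16, 16) (32, 32)
```

Repeating this over many images and several values of `scale` yields the resolution-marked pairs that make up the sample image set.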
Step 702: the method comprises the steps of establishing a superdivision model structure, wherein the superdivision model structure comprises a preset feature extraction module and an up-sampling module, the preset feature extraction module is deployed in a neural network processing unit of a terminal, and the up-sampling module is deployed in a graphic processor of the terminal.
In this step, the super-division model structure adopts feature extraction first and up-sampling afterwards: the preset feature extraction module can be realized by a feature extraction model based on a deep learning neural network, and the up-sampling module can be realized by interpolation; the network structure is shown in fig. 3C. Compared with up-sampling first as in the related art, computation is saved and the structure is more suitable for mobile terminal applications. The up-sampling module and the network feature extraction model are deployed separately: the feature extraction model part based on the deep learning neural network is deployed on the dedicated AI hardware (NPU), and the up-sampling module is deployed on the terminal GPU, so the two parts are executed by different dedicated hardware, realizing real-time super-resolution at any scale on the terminal, making reasonable use of the terminal's hardware resources, and improving computing efficiency.
Step 703: and training the super-division model structure by adopting a sample image set to obtain a trained image super-division model.
In this step, the sample image set can be input into the built super-division model structure for training, and the training parameters are adjusted based on actual scene requirements. For example, a low-resolution second sample image is input into the super-division model structure, a super-divided sample image is output, the loss between the super-divided sample image and the high-resolution first sample image is calculated, and the loss function is optimized so that the difference between the super-divided sample image and the high-resolution first sample image decreases until the loss function converges, yielding the trained image super-division model.
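The training loop can be sketched in miniature as follows (a toy model with a single learnable gain stands in for the super-division model structure; everything here is an illustrative assumption, not the embodiment's actual network or optimizer):

```python
import numpy as np

rng = np.random.default_rng(0)

def upsample2x(x):
    # Nearest-neighbour 2x up-sampling (stand-in for the up-sampling module).
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

# Toy "super-division model": one learnable gain w applied to the up-sampled
# input. A real model would be a CNN feature extractor plus interpolation.
hr = rng.random((8, 8))                    # first sample image (high resolution)
lr = hr.reshape(4, 2, 4, 2).mean((1, 3))   # second sample image (down-sampled)

w = 0.0
lr_up = upsample2x(lr)
for step in range(200):
    sr = w * lr_up                     # forward pass of the toy model
    loss = np.mean((sr - hr) ** 2)     # loss vs. the high-resolution sample
    grad = np.mean(2 * (sr - hr) * lr_up)
    w -= 0.5 * grad                    # gradient descent until convergence

print(round(w, 3))  # 1.0 — the gain converges to reproduce the HR scale
```

The loop mirrors the description above: forward pass on the low-resolution sample, loss against the high-resolution sample, and parameter updates until the loss converges.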
According to the image super-division model generation method of this embodiment, a super-division model with up-sampling separated from network feature extraction is trained end to end; the trained image super-division model can be deployed on a mobile terminal, and real-time super-division at any scale is realized on the terminal by the method of any of the above embodiments. This effectively solves the problem of arbitrary-scale super-division of pictures under the limited computing power on the terminal.
The details of each step of the image superdivision model generating method can be referred to the related description of the above embodiment, which is not repeated here.
Please refer to fig. 8, which illustrates an image super-processing apparatus 800 according to an embodiment of the present application, which may be applied to the electronic device 1 illustrated in fig. 1 and may be applied to the application scenario of the image super-processing illustrated in fig. 2-3D, so as to achieve targeted utilization of hardware resources of a terminal in the image super-processing process, solve the problem of limited computing power on the terminal, and improve the image super-processing effect. The device comprises: the functional principles of the acquisition module 801, the extraction module 802 and the upsampling module 803 are as follows:
An obtaining module 801, configured to obtain an original image to be processed and a target resolution, where the resolution of the original image is less than the target resolution.
The extracting module 802 is configured to perform feature extraction on the original image through a neural network processing unit of the terminal, so as to obtain a feature map of the original image.
And the up-sampling module 803 is configured to perform up-sampling processing on the feature map according to the target resolution by using a graphics processor of the terminal, so as to obtain a target image.
In an embodiment, the extracting module 802 is configured to invoke a neural network processing unit of the terminal, load a preset feature extraction model through the neural network processing unit, input an original image into the preset feature extraction model, and output a feature map of the original image, where the feature map includes texture features of the original image.
In an embodiment, invoking the neural network processing unit of the terminal includes: calling the neural network processing unit of the terminal through the deep learning inference engine.
In an embodiment, the up-sampling module 803 is configured to invoke the graphics processor of the terminal, and perform interpolation processing on the feature map according to the target resolution through the graphics processor of the terminal to obtain a target image corresponding to the image request, wherein the resolution of the target image is the target resolution.
In one embodiment, invoking a graphics processor of a terminal includes: invoking the graphics processor of the terminal through the deep learning inference engine.
In one embodiment, the apparatus further comprises: a rendering module, configured to, after the graphics processor of the terminal performs up-sampling processing on the feature map according to the target resolution to obtain the target image, render the target image and display it on a user interface.
In one embodiment, the obtaining module 801 is configured to obtain, in response to an image request, an image identifier and a target resolution carried by the image request. And according to the image identification, requesting the server to acquire an original image corresponding to the image identification.
For a detailed description of the image super processing apparatus 800, please refer to the description of the related method steps in the above embodiment, the implementation principle and technical effects are similar, and the detailed description of this embodiment is omitted here.
Fig. 9 is a schematic structural diagram of a cloud device 90 according to an exemplary embodiment of the present application. The cloud device 90 may be used to run the methods provided in any of the embodiments described above. As shown in fig. 9, the cloud device 90 may include: memory 904 and at least one processor 905, one for example in fig. 9.
The memory 904 is used for storing computer programs and may be configured to store various other data to support operations on the cloud device 90. The memory 904 may be an Object Storage Service (OSS).
The memory 904 may be implemented by any type of volatile or nonvolatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The processor 905 is coupled to the memory 904, and is configured to execute the computer program in the memory 904, so as to implement the solutions provided by any of the method embodiments described above, and specific functions and technical effects that can be implemented are not described herein.
Further, as shown in fig. 9, the cloud device further includes: firewall 901, load balancer 902, communication component 906, power component 903, and other components. Only some components are schematically shown in fig. 9, which does not mean that the cloud device only includes the components shown in fig. 9.
In one embodiment, the communication component 906 of FIG. 9 is configured to facilitate wired or wireless communication between the device in which the communication component 906 is located and other devices. The device in which the communication component 906 is located can access a wireless network based on a communication standard, such as WiFi, a 2G, 3G, 4G/LTE, or 5G mobile communication network, or a combination thereof. In one exemplary embodiment, the communication component 906 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 906 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In one embodiment, the power supply 903 of fig. 9 provides power to the various components of the device in which the power supply 903 is located. The power components 903 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the devices in which the power components reside.
The embodiment of the application further provides a computer readable storage medium, wherein computer executable instructions are stored in the computer readable storage medium, and when the processor executes the computer executable instructions, the method of any of the foregoing embodiments is implemented.
Embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the method of any of the preceding embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, e.g., the division of modules is merely a logical function division, and there may be additional divisions of actual implementation, e.g., multiple modules may be combined or integrated into another system, or some features may be omitted or not performed.
An integrated module implemented in the form of a software functional module may be stored in a computer-readable storage medium. Such a software functional module is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform some of the steps of the methods of the various embodiments of the present application.
It should be appreciated that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of a method disclosed in connection with the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution. The memory may include high-speed RAM and may further include non-volatile memory (NVM), such as at least one magnetic disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, an optical disk, or the like.
The storage medium may be implemented by any type of volatile or non-volatile memory device, or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The processor and the storage medium may also reside as discrete components in an electronic device or a main control device.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises that element.
The foregoing embodiment numbers of the present application are merely for description and do not represent the relative merits of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware, although in many cases the former is preferred. Based on this understanding, the technical solution of the present application, essentially or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk) and comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the methods of the embodiments of the present application.
In the technical solution of the present application, the collection, storage, use, processing, transmission, provision, and disclosure of information such as user data all comply with the requirements of relevant laws and regulations and do not violate public order and good morals.
The foregoing description covers only preferred embodiments of the present application and is not intended to limit the scope of the claims. All equivalent structures or equivalent process transformations made using the description and drawings of the present application, and any direct or indirect applications in other related technical fields, likewise fall within the scope of protection of the claims of the present application.

Claims (14)

1. An image super-resolution processing method, the method comprising:
acquiring an original image to be processed and a target resolution, wherein the resolution of the original image is lower than the target resolution;
performing feature extraction on the original image through a neural network processing unit of a terminal to obtain a feature map of the original image; and
performing upsampling processing on the feature map according to the target resolution through a graphics processor of the terminal to obtain a target image.
2. The method according to claim 1, wherein the performing feature extraction on the original image through the neural network processing unit of the terminal to obtain the feature map of the original image comprises:
invoking the neural network processing unit of the terminal, loading a preset feature extraction model through the neural network processing unit, inputting the original image into the preset feature extraction model, and outputting the feature map of the original image, wherein the feature map comprises texture features of the original image.
3. The method according to claim 2, wherein the invoking the neural network processing unit of the terminal comprises:
invoking the neural network processing unit of the terminal through a deep learning inference engine.
4. The method according to claim 1, wherein the performing upsampling processing on the feature map according to the target resolution through the graphics processor of the terminal to obtain the target image comprises:
invoking the graphics processor of the terminal, and performing interpolation processing on the feature map through the graphics processor according to the target resolution to obtain the target image corresponding to an image request, wherein the resolution of the target image is the target resolution.
5. The method according to claim 4, wherein the invoking the graphics processor of the terminal comprises:
invoking the graphics processor of the terminal through a deep learning inference engine.
6. The method according to claim 1, wherein after the performing upsampling processing on the feature map according to the target resolution through the graphics processor of the terminal to obtain the target image, the method further comprises:
rendering the target image, and displaying the target image on a user interface.
7. The method according to claim 1, wherein the acquiring the original image to be processed and the target resolution comprises:
in response to an image request, acquiring an image identifier and the target resolution carried by the image request; and
requesting, according to the image identifier, a server to acquire the original image corresponding to the image identifier.
8. A commodity image super-resolution processing method, comprising:
in response to a query request of a user for a commodity, acquiring an original commodity image corresponding to the commodity from a server, wherein the query request carries a target resolution, and the resolution of the original commodity image is lower than the target resolution;
performing feature extraction on the original commodity image through a neural network processing unit of a terminal to obtain a feature map of the original commodity image;
performing upsampling processing on the feature map according to the target resolution through a graphics processor of the terminal to obtain a target commodity image corresponding to the query request; and
rendering the target commodity image, and displaying the target commodity image on a user interface.
9. An image super-resolution model generation method, comprising:
obtaining a sample image set, the sample image set comprising a first sample image and a second sample image, wherein the second sample image is an image obtained by reducing the resolution of the first sample image;
establishing a super-resolution model structure, wherein the super-resolution model structure comprises a preset feature extraction module and an upsampling module, the preset feature extraction module is deployed on a neural network processing unit of a terminal, and the upsampling module is deployed on a graphics processor of the terminal; and
training the super-resolution model structure with the sample image set to obtain a trained image super-resolution model.
10. An image super-resolution processing system, comprising:
an acquisition unit configured to acquire an original image to be processed and a target resolution, wherein the resolution of the original image is lower than the target resolution;
a neural network processing unit configured to perform feature extraction on the original image to obtain a feature map of the original image; and
a graphics processor configured to perform upsampling processing on the feature map according to the target resolution to obtain a target image.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause the electronic device to perform the method of any one of claims 1-9.
12. A cloud device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause the cloud device to perform the method of any of claims 1-9.
13. A computer readable storage medium having stored therein computer executable instructions which, when executed by a processor, implement the method of any of claims 1-9.
14. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-9.
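The two-stage pipeline of claims 1-6 (feature extraction on a neural network processing unit, then interpolation-based upsampling on a graphics processor) can be sketched as follows. This NumPy sketch is illustrative only: the fixed 3x3 edge-enhancing kernel stands in for the patent's preset (learned) feature extraction model, and the bilinear resampler stands in for the GPU-side interpolation; neither operator is specified by the claims.

```python
import numpy as np

def extract_features(img):
    """Stand-in for the NPU feature-extraction step: a 3x3 edge-enhancing
    convolution applied per channel (the patent's preset model is a learned
    network; this fixed kernel is only an illustrative substitute)."""
    k = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float64)
    h, w, _ = img.shape
    padded = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += k[dy, dx] * padded[dy:dy + h, dx:dx + w, :]
    return out

def upsample_bilinear(feat, target_h, target_w):
    """Stand-in for the GPU interpolation step: bilinear resampling of the
    feature map to the requested target resolution."""
    h, w, _ = feat.shape
    ys = np.linspace(0, h - 1, target_h)
    xs = np.linspace(0, w - 1, target_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]; wx = (xs - x0)[None, :, None]
    top = feat[y0][:, x0] * (1 - wx) + feat[y0][:, x1] * wx
    bot = feat[y1][:, x0] * (1 - wx) + feat[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

low = np.random.rand(90, 120, 3)             # original image below target resolution
target = upsample_bilinear(extract_features(low), 360, 480)
print(target.shape)                          # (360, 480, 3)
```

In the claimed arrangement, the first function would run on the terminal's neural network processing unit and the second on its graphics processor, both dispatched through a deep learning inference engine.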
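The sample-pair construction in the model generation method of claim 9 (a second sample image that is the first sample image with reduced resolution) can be sketched as follows; the 2x2 box-average degradation and the factor of 2 are illustrative assumptions, since the claim does not specify how the resolution is reduced.

```python
import numpy as np

def make_sample_pair(first_sample, factor=2):
    """Build a training pair per claim 9: the second sample image is the
    first sample image with reduced resolution (box averaging assumed)."""
    h, w, c = first_sample.shape
    h2, w2 = (h // factor) * factor, (w // factor) * factor
    first = first_sample[:h2, :w2]           # crop so the factor divides evenly
    second = first.reshape(h2 // factor, factor,
                           w2 // factor, factor, c).mean(axis=(1, 3))
    return first, second

first, second = make_sample_pair(np.random.rand(64, 64, 3))
print(first.shape, second.shape)             # (64, 64, 3) (32, 32, 3)
```

Training would then drive the super-resolution model structure to reconstruct each first sample image from its reduced-resolution counterpart.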
CN202310363275.0A 2023-04-04 2023-04-04 Image super processing method, system, device, storage medium, and program product Pending CN116416134A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310363275.0A CN116416134A (en) 2023-04-04 2023-04-04 Image super processing method, system, device, storage medium, and program product


Publications (1)

Publication Number Publication Date
CN116416134A true CN116416134A (en) 2023-07-11

Family

ID=87050927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310363275.0A Pending CN116416134A (en) 2023-04-04 2023-04-04 Image super processing method, system, device, storage medium, and program product

Country Status (1)

Country Link
CN (1) CN116416134A (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110136055A (en) * 2018-02-02 2019-08-16 腾讯科技(深圳)有限公司 Super-resolution method and device, storage medium, the electronic device of image
CN110740350A (en) * 2019-10-31 2020-01-31 北京金山云网络技术有限公司 Image processing method, image processing device, terminal equipment and computer readable storage medium
CN111078918A (en) * 2019-12-04 2020-04-28 视联动力信息技术股份有限公司 Image processing method and device, electronic equipment and storage medium
CN111191769A (en) * 2019-12-25 2020-05-22 中国科学院苏州纳米技术与纳米仿生研究所 Self-adaptive neural network training and reasoning device
CN111681167A (en) * 2020-06-03 2020-09-18 腾讯科技(深圳)有限公司 Image quality adjusting method and device, storage medium and electronic equipment
CN111737547A (en) * 2020-06-17 2020-10-02 北京三快在线科技有限公司 Merchant information acquisition system, method, device, equipment and storage medium
CN112116528A (en) * 2020-09-25 2020-12-22 北京达佳互联信息技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN113627411A (en) * 2021-10-14 2021-11-09 广州市玄武无线科技股份有限公司 Super-resolution-based commodity identification and price matching method and system
CN113888410A (en) * 2021-09-30 2022-01-04 北京百度网讯科技有限公司 Image super-resolution method, apparatus, device, storage medium, and program product
CN114066726A (en) * 2020-08-04 2022-02-18 深圳市中兴微电子技术有限公司 Image processing method, device, equipment and storage medium
CN114240749A (en) * 2021-12-13 2022-03-25 网易(杭州)网络有限公司 Image processing method, image processing device, computer equipment and storage medium
CN114298900A (en) * 2020-09-21 2022-04-08 华为技术有限公司 Image super-resolution method and electronic equipment
CN114584805A (en) * 2020-11-30 2022-06-03 华为技术有限公司 Video transmission method, server, terminal and video transmission system
CN114827666A (en) * 2021-01-27 2022-07-29 阿里巴巴集团控股有限公司 Video processing method, device and equipment


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DU Peng; DING Shifei: "DGA Domain Name Detection Method Based on a Hybrid Word Vector Deep Learning Model", Journal of Computer Research and Development, no. 02, 15 February 2020 (2020-02-15) *

Similar Documents

Publication Publication Date Title
CN111681167B (en) Image quality adjusting method and device, storage medium and electronic equipment
CN111462281A (en) Poster generation method, device, equipment and storage medium
CN108235116B (en) Feature propagation method and apparatus, electronic device, and medium
US11423253B2 (en) Systems and methods for generating graphical user interfaces
CN106453572B (en) Method and system based on Cloud Server synchronous images
CN112651475B (en) Two-dimensional code display method, device, equipment and medium
WO2020062494A1 (en) Image processing method and apparatus
CN110675465A (en) Method and apparatus for generating image
CN115409716B (en) Video processing method, device, storage medium and equipment
CN111967598A (en) Neural network compression method, device, equipment and computer readable storage medium
CN117011397A (en) Data processing method, apparatus, device, readable storage medium, and program product
CN111680799A (en) Method and apparatus for processing model parameters
CN111507466A (en) Data processing method and device, electronic equipment and readable medium
CN111311486A (en) Method and apparatus for processing image
CN111859210B (en) Image processing method, device, equipment and storage medium
CN116434218A (en) Check identification method, device, equipment and medium suitable for mobile terminal
CN116416134A (en) Image super processing method, system, device, storage medium, and program product
CN111369475A (en) Method and apparatus for processing video
CN114866706A (en) Image processing method, image processing device, electronic equipment and storage medium
CN114332678A (en) Data processing method, device, storage medium and equipment
CN114299105A (en) Image processing method, image processing device, computer equipment and storage medium
CN116680434B (en) Image retrieval method, device, equipment and storage medium based on artificial intelligence
CN113986850B (en) Storage method, device, equipment and computer readable storage medium of electronic volume
CN113111035B (en) Special effect video generation method and equipment
CN114140363B (en) Video deblurring method and device and video deblurring model training method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination