CN110033406B - Method and apparatus for processing image - Google Patents

Method and apparatus for processing image

Info

Publication number
CN110033406B
CN110033406B (granted publication of application CN201910289762.0A; earlier publication CN110033406A)
Authority
CN
China
Prior art keywords
image processing
processing function
computing architecture
target
target computing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910289762.0A
Other languages
Chinese (zh)
Other versions
CN110033406A (en)
Inventor
窦倩
张争艳
苏昊天
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910289762.0A priority Critical patent/CN110033406B/en
Publication of CN110033406A publication Critical patent/CN110033406A/en
Application granted granted Critical
Publication of CN110033406B publication Critical patent/CN110033406B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/28 Indexing scheme for image data processing or generation, in general involving image processing hardware

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Stored Programmes (AREA)

Abstract

The embodiment of the application discloses a method and a device for processing images. One embodiment of the method comprises the following steps: determining whether a graphics processor (GPU) supporting a target computing architecture exists; updating a flag variable based on the determination result; and, in response to receiving an image sequence, invoking an image processing function in a target image processing function library based on the flag variable to process the image sequence. This embodiment accelerates the image processing functions in the target image processing function library by utilizing the target computing architecture, thereby improving the efficiency of image processing.

Description

Method and apparatus for processing image
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a method and a device for processing images.
Background
Image processing is a technique in which images are analyzed by a computer to achieve a desired result; it is also known as picture processing. Image processing generally refers to digital image processing. A digital image is a large two-dimensional array obtained by capturing a scene with an industrial camera, video camera, scanner, or similar device; the elements of this array are called pixels. Common image processing methods include image transformation, image encoding and compression, image enhancement and restoration, image segmentation, image description, and image classification.
Currently, image processing mainly uses functions in OpenCV to implement basic algorithms such as color-to-grayscale conversion, morphological operations, thresholding, and edge extraction. To preserve the cross-platform versatility of OpenCV, current image processing algorithms still execute on the CPU.
Disclosure of Invention
The embodiment of the application provides a method and a device for processing an image.
In a first aspect, an embodiment of the present application provides a method for processing an image, including: determining whether a graphics processor GPU supporting the target computing architecture exists; updating the flag variable based on the determination result; and in response to receiving the image sequence, invoking an image processing function in the target image processing function library based on the flag variable to process the image sequence.
In some embodiments, updating the flag variable based on the determination result includes: if a GPU supporting the target computing architecture exists, updating the flag variable to information indicating that the accelerated image processing function is to be called; and invoking an image processing function in the target image processing function library to process the image sequence based on the flag variable includes: calling the image processing function accelerated by the target computing architecture in the target image processing function library to process the image sequence.
In some embodiments, before updating the flag variable based on the determination result, the method further comprises: if a GPU supporting the target computing architecture exists, invoking the image processing function accelerated by the target computing architecture in the target image processing function library once.
In some embodiments, updating the flag variable based on the determination result includes: if no GPU supporting the target computing architecture exists, updating the flag variable to information indicating that the un-accelerated image processing function is to be called; and invoking an image processing function in the target image processing function library to process the image sequence based on the flag variable includes: calling an un-accelerated image processing function in the target image processing function library to process the image sequence.
In some embodiments, determining whether a graphics processor GPU supporting the target computing architecture exists includes: calling a function in the target image processing function library that obtains the count of target-computing-architecture-enabled devices, to obtain the number of GPUs supporting the target computing architecture.
In some embodiments, if target computing architecture support was not enabled when the target image processing function library was compiled, or there is no GPU supporting the target computing architecture, the number of GPUs supporting the target computing architecture is zero.
In some embodiments, the method further comprises: when the source codes of the target image processing function library are recompiled by utilizing a cross-platform compiling tool Cmake, compiling options supporting the target computing architecture are added in command line parameters.
In some embodiments, the target computing architecture comprises a unified computing device architecture CUDA, and the target image processing function library comprises an open source computer vision library OpenCV or an open computing language OpenCL.
In a second aspect, an embodiment of the present application provides an apparatus for processing an image, including: a determining unit configured to determine whether there is a graphics processor GPU supporting the target computing architecture; an updating unit configured to update the flag variable based on the determination result; and a processing unit configured to call the image processing function in the target image processing function library to process the image sequence based on the flag variable in response to receiving the image sequence.
In some embodiments, the update unit is further configured to: if a GPU supporting the target computing architecture exists, update the flag variable to information indicating that the accelerated image processing function is to be called; and the processing unit is further configured to: call the image processing function accelerated by the target computing architecture in the target image processing function library to process the image sequence.
In some embodiments, the apparatus further comprises: a calling unit configured to call the image processing function accelerated by the target computing architecture in the target image processing function library once if a GPU supporting the target computing architecture exists.
In some embodiments, the update unit is further configured to: if no GPU supporting the target computing architecture exists, update the flag variable to information indicating that the un-accelerated image processing function is to be called; and the processing unit is further configured to: call an un-accelerated image processing function in the target image processing function library to process the image sequence.
In some embodiments, the determining unit is further configured to: call a function in the target image processing function library that obtains the count of target-computing-architecture-enabled devices, to obtain the number of GPUs supporting the target computing architecture.
In some embodiments, if target computing architecture support was not enabled when the target image processing function library was compiled, or there is no GPU supporting the target computing architecture, the number of GPUs supporting the target computing architecture is zero.
In some embodiments, the apparatus further comprises: and an adding unit configured to add a compiling option supporting the target computing architecture in the command line parameters when the source codes of the target image processing function library are recompiled using a cross-platform compiling tool Cmake.
In some embodiments, the target computing architecture comprises a unified computing device architecture CUDA, and the target image processing function library comprises an open source computer vision library OpenCV or an open computing language OpenCL.
In a third aspect, an embodiment of the present application provides a server, including: one or more processors; a storage device having one or more programs stored thereon; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect.
The method and the device for processing an image provided by the embodiments of the application first determine whether a GPU supporting the target computing architecture exists, to obtain a determination result; then update the flag variable based on the determination result; and finally, when an image sequence is received, call an image processing function in the target image processing function library based on the flag variable to process the image sequence. The image processing functions in the target image processing function library are accelerated by using the target computing architecture, which improves the efficiency of image processing. Moreover, since the image processing functions execute on the GPU, CPU occupancy is greatly reduced.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a method for processing an image according to the present application;
FIG. 3 is a flow chart of yet another embodiment of a method for processing an image according to the present application;
FIG. 4 is a schematic diagram of an embodiment of an apparatus for processing images in accordance with the present application;
FIG. 5 is a schematic diagram of a computer system suitable for use with a server implementing an embodiment of the application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the present application are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
Fig. 1 shows an exemplary system architecture 100 to which an embodiment of the method for processing an image or the apparatus for processing an image of the present application may be applied.
As shown in fig. 1, a terminal device 101, a network 102, and a server 103 may be included in a system architecture 100. Network 102 is the medium used to provide communication links between terminal device 101 and server 103. Network 102 may include various connection types such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the server 103 via the network 102 using the terminal device 101 to receive or send messages or the like. Various client software, such as an image processing class application, etc., may be installed on the terminal device 101.
The terminal device 101 may be hardware or software. When the terminal apparatus 101 is hardware, it may be various electronic apparatuses supporting an image capturing function. Including but not limited to cameras, video cameras, smart phones, tablet computers, and the like. When the terminal apparatus 101 is software, it may be installed in the above-described electronic apparatus. Which may be implemented as a plurality of software or software modules, or as a single software or software module. The present application is not particularly limited herein.
The server 103 may be a server providing various services. Such as an image processing server. The image processing server may analyze the acquired data such as the image sequence and generate a processing result (for example, a processed image sequence).
The server 103 may be hardware or software. When the server 103 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 103 is software, it may be implemented as a plurality of software or software modules (for example, to provide distributed services), or may be implemented as a single software or software module. The present application is not particularly limited herein.
It should be noted that, the method for processing an image provided by the embodiment of the present application is generally performed by the server 103, and accordingly, the device for processing an image is generally disposed in the server 103.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for processing an image in accordance with the present application is shown. The method for processing an image comprises the steps of:
in step 201, it is determined whether there is a GPU supporting the target computing architecture.
In this embodiment, the execution subject of the method for processing an image (e.g., the server 103 shown in fig. 1) may determine whether a GPU (Graphics Processing Unit) supporting the target computing architecture exists, to obtain a determination result. The determination result may be used to indicate whether a GPU supporting the target computing architecture is present.
The target computing architecture may include a pre-specified computing architecture. For example, the target computing architecture may include CUDA (Compute Unified Device Architecture). CUDA is a general-purpose parallel computing platform introduced by graphics card manufacturers that enables the GPU to solve complex computational problems. It contains the CUDA instruction set architecture (ISA) and the parallel computing engine inside the GPU. Developers can write programs for CUDA using the C language, one of the most widely used high-level programming languages, and such programs can run with very high performance on CUDA-enabled processors. The GPU, also called the display core, visual processor, or display chip, is a microprocessor dedicated to image computation on personal computers, workstations, game consoles, and some mobile devices (such as tablet computers and smartphones).
In some optional implementations of this embodiment, the executing entity may call the function in the target image processing function library that obtains the count of target-computing-architecture-enabled devices, to obtain the number of GPUs supporting the target computing architecture. The target image processing function library may be a pre-specified image processing function library. For example, the target image processing function library may include OpenCV (Open Source Computer Vision Library) or OpenCL (Open Computing Language).
OpenCV is a cross-platform computer vision library released under the BSD license (open source) that runs on the Linux, Windows, Android, and Mac OS operating systems. It is lightweight and efficient, consists of a series of C functions and a small number of C++ classes, provides interfaces for Python, Ruby, MATLAB, and other languages, and implements many general-purpose algorithms for image processing and computer vision. In this case, the function in OpenCV that obtains the count of CUDA-enabled devices is getCudaEnabledDeviceCount. Specifically, if CUDA support was not enabled when OpenCV was compiled, the execution body calls getCudaEnabledDeviceCount and the number of GPUs supporting CUDA is zero. If CUDA support was enabled when OpenCV was compiled, the execution body calls getCudaEnabledDeviceCount and obtains the actual number of GPUs supporting CUDA. It should be noted that if CUDA support was enabled when OpenCV was compiled but no CUDA-capable graphics card is installed in the system, there is still no GPU supporting CUDA, and calling getCudaEnabledDeviceCount still returns zero.
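For illustration only (this sketch is not part of the original disclosure), the detection step described above might be written roughly as follows in C++ with OpenCV; the standalone main() wrapper and the exit-code convention are assumptions added for the example:

#include <opencv2/core/cuda.hpp>
#include <iostream>

int main() {
    // Returns 0 if OpenCV was built without CUDA support, or if
    // no CUDA-capable GPU is installed in the system.
    int gpu_count = cv::cuda::getCudaEnabledDeviceCount();
    std::cout << "CUDA-enabled GPUs detected: " << gpu_count << std::endl;
    return gpu_count > 0 ? 0 : 1;
}

If this count is greater than zero, the flag variable described in step 202 below would be updated to indicate that the accelerated functions should be called.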
OpenCL is the first open, royalty-free standard for general-purpose parallel programming of heterogeneous systems, and also a unified programming environment. It lets software developers write efficient, portable code for high-performance computing servers, desktop computing systems, and handheld devices, and is broadly applicable to multi-core processors (CPUs), graphics processors (GPUs), Cell-type architectures, and other parallel processors such as digital signal processors (DSPs).
In general, before performing step 201, the execution body may enable target computing architecture support when the target image processing function library is compiled. Specifically, when recompiling the source code of the target image processing function library with Cmake (cross-platform make, a cross-platform compilation tool), the execution body may add compilation options supporting the target computing architecture to the command-line parameters.
In some optional implementations of the present embodiment, when the target image processing function library is OpenCV, the compiling options thereof may include:
-D WITH_CUDA=ON
-D CUDA_ARCH_PTX=""
-D CUDA_FAST_MATH=ON
-D WITH_CUBLAS=ON
-D WITH_NVCUVID=ON
The WITH_CUDA option may be used to specify whether CUDA support is enabled, with a value of ON indicating enabled and OFF indicating not enabled. When its value is OFF, the OpenCV code that uses CUDA to process image data in parallel will not be compiled.
The CUDA_ARCH_PTX option may be used to specify the graphics card compute capabilities for which PTX (Parallel Thread Execution) code is generated in the corresponding library files. PTX code is an intermediate form of compiled GPU code that can be recompiled into native GPU microcode. The CUDA compiler generates different code for different graphics card compute capabilities; no value is specified here, so the CUDA compiler automatically detects the compute capability of the current graphics card.
The CUDA_FAST_MATH option may be used to specify whether to use the fast math library in the CUDA Toolkit; a value of ON indicates that it is used. The fast math library provides function interfaces specially optimized for speed, generally implemented in assembly language.
The WITH_CUBLAS option may be used to specify whether the code in OpenCV that involves linear algebra computation uses the interface provided by the cuBLAS library; a value of ON indicates that it is used. cuBLAS is a BLAS (Basic Linear Algebra Subprograms) library implemented on the CUDA runtime to accelerate computation with the GPU. Mainstream BLAS libraries currently include, but are not limited to, ATLAS BLAS, OpenBLAS, cuBLAS, clBLAS, and BLIS.
The WITH_NVCUVID option may be used to specify whether the code in OpenCV that involves video decoding uses the interface provided by the NVCUVID library; a value of ON indicates that it is used. NVCUVID is a set of function libraries that use GPU hardware to accelerate video decoding.
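Putting these options together, a recompilation of OpenCV might be invoked roughly as follows; the source directory path (../opencv) and the -j8 parallelism level are assumptions for the example, not values taken from the patent:

cmake ../opencv \
    -D WITH_CUDA=ON \
    -D CUDA_ARCH_PTX="" \
    -D CUDA_FAST_MATH=ON \
    -D WITH_CUBLAS=ON \
    -D WITH_NVCUVID=ON
make -j8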
Step 202, updating the flag variable based on the determination result.
In this embodiment, the execution body may update the flag variable based on the determination result. Wherein the flag variable may be information indicating that an accelerated image processing function or an un-accelerated image processing function is called. For example, the value of the flag variable may be updated to 0 or 1. If there is a GPU supporting the target computing architecture, the execution body may update the value of the flag variable to 1, which indicates that the accelerated image processing function is invoked. If there is no GPU supporting the target computing architecture, the execution body may update the value of the flag variable to 0, which indicates that the un-accelerated image processing function is invoked.
Step 203, in response to receiving the image sequence, the image processing function in the target image processing function library is invoked based on the flag variable to process the image sequence.
In this embodiment, a terminal device supporting image capture may capture an image sequence and send it to the execution subject. When an image sequence is received, the execution body may first determine the value of the flag variable and then, based on that value, call an image processing function in the target image processing function library to process the image sequence. The image processing functions in the target image processing function library may include functions accelerated by the target computing architecture and functions that are not accelerated. For example, if the value of the flag variable is 1, indicating that an accelerated image processing function is to be invoked, the image processing function accelerated by the target computing architecture in the target image processing function library is invoked. If the value of the flag variable is 0, indicating that a non-accelerated image processing function is to be invoked, the non-accelerated image processing function in the target image processing function library is invoked.
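As a rough C++ sketch of this flag-based dispatch (an illustration only, under the assumption that the target computing architecture is CUDA, the target library is OpenCV, and resize is the image processing function being dispatched; the function and variable names are made up for the example):

#include <opencv2/opencv.hpp>
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudawarping.hpp>

static int use_cuda_flag = 0;   // the flag variable: 1 = call accelerated functions, 0 = call CPU functions

void init_flag() {
    // steps 201/202: detect CUDA-capable GPUs and update the flag variable
    use_cuda_flag = (cv::cuda::getCudaEnabledDeviceCount() > 0) ? 1 : 0;
}

cv::Mat process_frame(const cv::Mat& frame) {
    cv::Mat result;
    if (use_cuda_flag == 1) {
        // accelerated path: upload to the GPU, call the CUDA-accelerated function, download the result
        cv::cuda::GpuMat gpu_src(frame), gpu_dst;
        cv::cuda::resize(gpu_src, gpu_dst, cv::Size(), 0.5, 0.5, cv::INTER_LINEAR);
        gpu_dst.download(result);
    } else {
        // non-accelerated path: the plain CPU implementation
        cv::resize(frame, result, cv::Size(), 0.5, 0.5, cv::INTER_LINEAR);
    }
    return result;
}

Each frame of the received image sequence would then be passed through process_frame() in turn.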
In general, an image sequence may include a set of images captured in any scene. For example, in an autonomous driving scenario, an autonomous vehicle relies on sensors (radar, cameras, laser ranging, GPS, etc.) working together with artificial intelligence, visual computing, and monitoring devices so that a computer system can operate the motor vehicle automatically and safely without human intervention. In that case, the image sequence may be a set of images captured by a camera of an autonomous car.
In practice, the processing of the image mainly uses functions in OpenCV to implement some basic algorithms in image processing, including color conversion to gray scale, morphological operation, thresholding, edge extraction, and the like. If CUDA support is enabled at OpenCV compilation time, then these image processing functions may be accelerated by CUDA.
Taking image scaling in image transformation as an example, the main function used is resize(), whose function prototype is:
void cv::resize(InputArray src, OutputArray dst, Size dsize, double fx=0, double fy=0, int interpolation=INTER_LINEAR)
Here, cv denotes the namespace defined by OpenCV. src is an input parameter of type InputArray and represents the input image, i.e., the image to be scaled. dst is an output parameter of type OutputArray and represents the output image, i.e., the scaled image. dsize, of type Size, represents the size of the output image. fx represents the scaling factor along the x axis, and fy the scaling factor along the y axis. interpolation specifies the interpolation algorithm and defaults to INTER_LINEAR (bilinear interpolation); commonly used alternatives are INTER_NEAREST (nearest-neighbor interpolation), INTER_CUBIC (bicubic interpolation), INTER_AREA (area interpolation), and INTER_LANCZOS4 (Lanczos interpolation). Typically, shrinking an image works best with cv::INTER_AREA, and enlarging an image with cv::INTER_CUBIC or cv::INTER_LINEAR; INTER_CUBIC gives better quality but is slower, while INTER_LINEAR is relatively fast with acceptable quality.
In general, if dsize is not 0, the original image is scaled to the size specified by this parameter; if dsize is 0, the size of the scaled image is computed by the following formula:
dsize=Size(round(fx*src.cols),round(fy*src.rows))
Here round() is a function that rounds a decimal to the nearest integer. src.cols and src.rows represent the number of columns and rows of the input image. If fx in resize() is 0, its value is computed as (double)dsize.width/src.cols, where dsize.width is the width of the scaled image. If fy in resize() is 0, its value is computed as (double)dsize.height/src.rows, where dsize.height is the height of the scaled image.
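A short usage sketch (the file name and the 0.5 scaling factors are assumptions chosen for the example):

cv::Mat src = cv::imread("input.jpg");   // image to be scaled
cv::Mat dst;
// dsize is cv::Size(), i.e. 0, so the output size is computed as
// Size(round(0.5 * src.cols), round(0.5 * src.rows))
cv::resize(src, dst, cv::Size(), 0.5, 0.5, cv::INTER_AREA);   // INTER_AREA suits shrinking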
The method for processing an image provided by this embodiment of the application first determines whether a GPU supporting the target computing architecture exists, to obtain a determination result; then updates the flag variable based on the determination result; and finally, when an image sequence is received, calls an image processing function in the target image processing function library based on the flag variable to process the image sequence. The image processing functions in the target image processing function library are accelerated by the target computing architecture, which improves image processing efficiency. Moreover, since the image processing functions execute on the GPU, CPU occupancy is greatly reduced.
With further reference to fig. 3, a flow 300 is shown illustrating yet another embodiment of a method for processing an image in accordance with the present application. The method for processing an image comprises the steps of:
step 301 determines whether there is a GPU supporting the target computing architecture.
In this embodiment, an execution subject of a method for processing an image (e.g., the server 103 shown in fig. 1) may determine whether there is a GPU supporting a target computing architecture. If there is a GPU supporting the target computing architecture, step 302 is performed, and if there is no GPU supporting the target computing architecture, step 305 is performed.
Step 302, call a target computing architecture accelerated image processing function in a target image processing function library.
In this embodiment, if there is a GPU supporting the target computing architecture, the execution body may call the image processing function accelerated by the target computing architecture in the target image processing function library once, and continue to execute step 303.
In general, the first invocation of the image processing function accelerated by the target computing architecture may take a relatively long time. To avoid this first, slow invocation happening only after the image sequence has been received, when it would directly process the first frame of the sequence, the accelerated image processing function is called once before any image sequence is received, which initializes the target computing architecture context. When the image sequence later arrives, the context does not need to be initialized again, saving processing time for the image sequence. It should be noted that the accelerated image processing function called here processes an image that the program builds in memory, whose pixels are typically all black.
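A possible C++ sketch of this warm-up call (assuming, as above, that CUDA is the target computing architecture, OpenCV is the target library, and resize is the accelerated function being warmed up; the 1280x720 size is an arbitrary choice for the example):

#include <opencv2/opencv.hpp>
#include <opencv2/cudawarping.hpp>

void warm_up_cuda_context() {
    // program-built, in-memory image whose pixels are all black
    cv::Mat dummy = cv::Mat::zeros(720, 1280, CV_8UC3);
    cv::cuda::GpuMat gpu_src(dummy), gpu_dst;
    // one throwaway call; its only purpose is to trigger CUDA context initialization
    cv::cuda::resize(gpu_src, gpu_dst, cv::Size(), 0.5, 0.5, cv::INTER_LINEAR);
}

Called once at startup, the later frames of the real image sequence no longer pay the context-initialization cost.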
Step 303, updating the flag variable to information indicating that the accelerated image processing function is invoked.
In this embodiment, if there is a GPU supporting the target computing architecture, the execution body may update the flag variable to information indicating that the accelerated image processing function is invoked, and continue to execute step 304. For example, the value of the flag variable is updated to 1.
Step 304, in response to receiving the image sequence, invoking an image processing function accelerated by the target computing architecture in the target image processing function library to process the image sequence.
In this embodiment, when the image sequence is received, the electronic device may first determine the value of the flag variable; if the value is 1, indicating that an accelerated image processing function is to be invoked, it invokes the image processing function accelerated by the target computing architecture in the target image processing function library to process the image sequence.
Step 305 updates the flag variable to information indicating that the un-accelerated image processing function is invoked.
In this embodiment, if there is no GPU supporting the target computing architecture, the execution body may update the flag variable to information indicating that the un-accelerated image processing function is invoked, and continue to execute step 306.
Step 306, in response to receiving the image sequence, the un-accelerated image processing function in the target image processing function library is invoked to process the image sequence.
In this embodiment, when the image sequence is received, the electronic device may first determine the value of the flag variable; if the value is 0, indicating that an un-accelerated image processing function is to be invoked, it calls the un-accelerated image processing function in the target image processing function library to process the image sequence.
As can be seen from fig. 3, compared with the embodiment corresponding to fig. 2, the flow 300 of the method for processing an image in this embodiment highlights the step of calling the image processing function. In the solution described in this embodiment, when a GPU supporting the target computing architecture is present, the image processing function accelerated by the target computing architecture in the target image processing function library is called once before the image sequence is received. This initializes the application and lets the target image processing function library complete initialization of the target computing architecture context, so that the processing time of the first frame is shortened when the image sequence is received, further improving image processing efficiency.
With further reference to fig. 4, as an implementation of the method shown in the above figures, the present application provides an embodiment of an apparatus for processing an image, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 4, the apparatus 400 for processing an image of the present embodiment may include: a determining unit 401, an updating unit 402, and a processing unit 403. Wherein the determining unit 401 is configured to determine whether there is a graphics processor GPU supporting the target computing architecture; an updating unit 402 configured to update the flag variable based on the determination result; the processing unit 403 is configured to invoke the image processing functions in the target image processing function library for processing the image sequence based on the flag variable in response to receiving the image sequence.
In the present embodiment, in the apparatus 400 for processing an image: the specific processes of the determining unit 401, the updating unit 402 and the processing unit 403 and the technical effects thereof may refer to the descriptions related to step 201, step 202 and step 203 in the corresponding embodiment of fig. 2, and are not repeated herein.
In some optional implementations of the present embodiment, the updating unit 402 is further configured to: if a GPU supporting the target computing architecture exists, update the flag variable to information indicating that the accelerated image processing function is to be called; and the processing unit 403 is further configured to: call the image processing function accelerated by the target computing architecture in the target image processing function library to process the image sequence.
In some optional implementations of this embodiment, the apparatus 400 for processing an image further includes: a calling unit (not shown in the figure) configured to call the image processing function accelerated by the target computing architecture in the target image processing function library once if there is a GPU supporting the target computing architecture.
In some optional implementations of the present embodiment, the updating unit 402 is further configured to: if no GPU supporting the target computing architecture exists, update the flag variable to information indicating that the un-accelerated image processing function is to be called; and the processing unit 403 is further configured to: call the un-accelerated image processing function in the target image processing function library to process the image sequence.
In some optional implementations of the present embodiment, the determining unit 401 is further configured to: call a function in the target image processing function library that obtains the count of target-computing-architecture-enabled devices, to obtain the number of GPUs supporting the target computing architecture.
In some optional implementations of this embodiment, if target computing architecture support was not enabled when the target image processing function library was compiled, or there is no GPU supporting the target computing architecture, the number of GPUs supporting the target computing architecture is zero.
In some optional implementations of the present embodiment, wherein the apparatus 400 for processing an image further includes: an adding unit (not shown in the figure) is configured to add a compiling option supporting the target computing architecture in the command line parameters when recompiling the source code of the target image processing function library with a cross-platform compiling tool Cmake.
In some optional implementations of this embodiment, the target computing architecture comprises a unified computing device architecture CUDA, and the target image processing function library comprises an open source computer vision library OpenCV or an open computing language OpenCL.
Referring now to FIG. 5, there is illustrated a schematic diagram of a computer system 500 suitable for use with a server (e.g., server 103 of FIG. 1) for implementing an embodiment of the present application. The server illustrated in fig. 5 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present application.
As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the system 500 are also stored. The CPU 501, ROM 502, and RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output portion 507 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker, and the like; a storage portion 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. The drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as needed so that a computer program read therefrom is mounted into the storage section 508 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 509, and/or installed from the removable media 511. The above-described functions defined in the method of the present application are performed when the computer program is executed by a Central Processing Unit (CPU) 501. It should be noted that the computer readable medium according to the present application may be a computer readable signal medium or a computer readable medium, or any combination of the two. The computer readable medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, smalltalk, C ++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor, for example described as: a processor including a determining unit, an updating unit, and a processing unit. The names of these units do not in some cases constitute a limitation on the units themselves; for example, the determining unit may also be described as "a unit that determines whether there is a graphics processor GPU supporting the target computing architecture".
As another aspect, the present application also provides a computer-readable medium that may be contained in the server described in the above embodiment; or may exist alone without being assembled into the server. The computer readable medium carries one or more programs which, when executed by the server, cause the server to: determining whether a graphics processor GPU supporting the target computing architecture exists; updating the flag variable based on the determination result; and in response to receiving the image sequence, invoking an image processing function in the target image processing function library based on the flag variable to process the image sequence.
The above description is only of preferred embodiments of the present application and an explanation of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the application is not limited to technical solutions formed by the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of those technical features or their equivalents without departing from the inventive concept described above, for example technical solutions formed by replacing the above features with technical features of similar functions disclosed in the present application (but not limited thereto).

Claims (16)

1. A method for processing an image, comprising:
determining whether a graphics processor GPU supporting the target computing architecture exists;
if the GPU supporting the target computing architecture exists, calling an image processing function accelerated by the target computing architecture in a target image processing function library once, wherein the called image processing function accelerated by the target computing architecture processes an image with all black pixels constructed in a memory by a program;
updating the flag variable based on the determination result;
and in response to receiving an image sequence, invoking an image processing function in the target image processing function library based on the flag variable to process the image sequence.
2. The method of claim 1, wherein the updating the flag variable based on the determination result comprises:
if the GPU supporting the target computing architecture exists, updating the flag variable to information indicating that an accelerated image processing function is to be called; and
the step of calling the image processing function in the target image processing function library based on the flag variable to process the image sequence comprises the following steps:
and calling the image processing function accelerated by the target computing architecture in the target image processing function library to process the image sequence.
3. The method of claim 2, wherein the updating the flag variable based on the determination result comprises:
if the GPU supporting the target computing architecture does not exist, updating the flag variable to information indicating that the un-accelerated image processing function is called; and
the step of calling the image processing function in the target image processing function library based on the flag variable to process the image sequence comprises the following steps:
and calling an un-accelerated image processing function in the target image processing function library to process the image sequence.
4. The method of claim 1, wherein the determining whether a graphics processor GPU supporting a target computing architecture is present comprises:
and calling a function in the target image processing function library that obtains the count of target-computing-architecture-enabled devices, to obtain the number of GPUs supporting the target computing architecture.
5. The method of claim 4, wherein the number of GPUs supporting the target computing architecture is zero if target computing architecture support is not enabled or there are no GPUs supporting the target computing architecture at compile time of the target image processing function library.
6. The method of claim 5, wherein the method further comprises:
when the source codes of the target image processing function library are recompiled by utilizing a cross-platform compiling tool Cmake, compiling options supporting the target computing architecture are added in command line parameters.
7. The method of one of claims 1-6, wherein the target computing architecture comprises a unified computing device architecture, CUDA, and the target image processing function library comprises an open source computer vision library, openCV, or an open computing language, openCL.
8. An apparatus for processing an image, comprising:
a determining unit configured to determine whether there is a graphics processor GPU supporting the target computing architecture;
the calling unit is configured to call the image processing function accelerated by the target computing architecture in the target image processing function library once if the GPU supporting the target computing architecture exists, wherein the called image processing function accelerated by the target computing architecture processes an image with all black pixels constructed in a memory by a program;
an updating unit configured to update the flag variable based on the determination result;
and a processing unit configured to, in response to receiving an image sequence, call an image processing function in the target image processing function library based on the flag variable to process the image sequence.
9. The apparatus of claim 8, wherein the updating unit is further configured to:
if the GPU supporting the target computing architecture exists, updating the flag variable to information indicating that an accelerated image processing function is to be called; and
the processing unit is further configured to:
and calling the image processing function accelerated by the target computing architecture in the target image processing function library to process the image sequence.
10. The apparatus of claim 9, wherein the updating unit is further configured to:
if the GPU supporting the target computing architecture does not exist, updating the flag variable to information indicating that the un-accelerated image processing function is called; and
the processing unit is further configured to:
and calling an un-accelerated image processing function in the target image processing function library to process the image sequence.
11. The apparatus of claim 8, wherein the determination unit is further configured to:
and calling a function in the target image processing function library that obtains the count of target-computing-architecture-enabled devices, to obtain the number of GPUs supporting the target computing architecture.
12. The apparatus of claim 11, wherein the number of GPUs supporting the target computing architecture is zero if target computing architecture support is not enabled or there are no GPUs supporting the target computing architecture at compile time of the target image processing function library.
13. The apparatus of claim 12, wherein the apparatus further comprises:
an adding unit configured to add a compiling option supporting the target computing architecture in a command line parameter when recompiling the source code of the target image processing function library using a cross-platform compiling tool Cmake.
14. The apparatus of one of claims 8-13, wherein the target computing architecture comprises a unified computing device architecture, CUDA, and the target image processing function library comprises an open source computer vision library, openCV, or an open computing language, openCL.
15. A server, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.
16. A computer readable medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method of any of claims 1-7.
CN201910289762.0A 2019-04-11 2019-04-11 Method and apparatus for processing image Active CN110033406B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910289762.0A CN110033406B (en) 2019-04-11 2019-04-11 Method and apparatus for processing image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910289762.0A CN110033406B (en) 2019-04-11 2019-04-11 Method and apparatus for processing image

Publications (2)

Publication Number Publication Date
CN110033406A CN110033406A (en) 2019-07-19
CN110033406B true CN110033406B (en) 2023-08-29

Family

ID=67237920

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910289762.0A Active CN110033406B (en) 2019-04-11 2019-04-11 Method and apparatus for processing image

Country Status (1)

Country Link
CN (1) CN110033406B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111694532B (en) * 2020-06-11 2021-06-04 翱捷科技股份有限公司 Display control method of single-chip heterogeneous system and wearable device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006345026A (en) * 2005-06-07 2006-12-21 Rikogaku Shinkokai Image processor and image processing method employing function expression
CN103927721A (en) * 2014-04-10 2014-07-16 哈尔滨工程大学 Moving object edge enhancement method based on GPU
CN106204669A (en) * 2016-07-05 2016-12-07 电子科技大学 A kind of parallel image compression sensing method based on GPU platform

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006345026A (en) * 2005-06-07 2006-12-21 Rikogaku Shinkokai Image processor and image processing method employing function expression
CN103927721A (en) * 2014-04-10 2014-07-16 哈尔滨工程大学 Moving object edge enhancement method based on GPU
CN106204669A (en) * 2016-07-05 2016-12-07 电子科技大学 A kind of parallel image compression sensing method based on GPU platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
OpenCV+CUDA tutorial, part 2: an introduction to the GPU module; 原我归来是少年; https://blog.csdn.net/DumpDoctorWang/article/details/81052597; 2018-07-18; pp. 1-2 *

Also Published As

Publication number Publication date
CN110033406A (en) 2019-07-19

Similar Documents

Publication Publication Date Title
CN111581555B (en) Document loading method, device, equipment and storage medium
US10180825B2 (en) System and method for using ubershader variants without preprocessing macros
CN111338623B (en) Method, device, medium and electronic equipment for developing user interface
US11016769B1 (en) Method and apparatus for processing information
US20140198110A1 (en) Reducing the number of sequential operations in an application to be performed on a shared memory cell
CN111324376B (en) Function configuration method, device, electronic equipment and computer readable medium
US11443173B2 (en) Hardware-software co-design for accelerating deep learning inference
CN110033406B (en) Method and apparatus for processing image
US20160371061A1 (en) Read-only communication operator
CN112416303B (en) Software development kit hot repair method and device and electronic equipment
CN112416533A (en) Method and device for running application program on browser and electronic equipment
CN114756334B (en) Server and server-based graphics rendering method
US20130159680A1 (en) Systems, methods, and computer program products for parallelizing large number arithmetic
KR20170108412A (en) Apparatus for operating canvas image
CN113407259B (en) Scene loading method, device, equipment and storage medium
CN111552478B (en) Apparatus, method and storage medium for generating CUDA program
CN112988194B (en) Program optimization method and device based on equipment information, electronic equipment and storage medium
CN114881235A (en) Inference service calling method and device, electronic equipment and storage medium
CN111309323B (en) Parameter initialization method and device and electronic equipment
CN109308194B (en) Method and apparatus for storing data
CN114418824A (en) Image processing method, device and storage medium
CN111596972B (en) Neural network model storage method, loading method, device, equipment and storage medium
CN114820277A (en) OpenCL-based image processing method and device, computing equipment and medium
CN116416355A (en) Shader script generation method and device, electronic equipment and storage medium
CN116170634A (en) Multimedia processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant