CN116129249B - Image processing method, device, electronic equipment and storage medium - Google Patents

Publication number
CN116129249B
Authority
CN
China
Prior art keywords
data
image data
image
bit
bits
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310348700.9A
Other languages
Chinese (zh)
Other versions
CN116129249A (en)
Inventor
孙明亭
吴德辉
贾明桥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Suiyuan Technology Co ltd
Original Assignee
Shanghai Enflame Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Enflame Technology Co ltd
Priority to CN202310348700.9A
Publication of CN116129249A
Application granted
Publication of CN116129249B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/96 Management of image or video recognition tasks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/72 Data preparation, e.g. statistical preprocessing of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/94 Hardware or software architectures specially adapted for image or video understanding
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an image processing method, an image processing device, electronic equipment and a storage medium, and relates to the technical field of image processing. The method is executed by a central processing unit (CPU) and comprises the following steps: normalizing initial image data to obtain first image data with a larger number of data bits; extracting first valid data at specified bit positions in the first image data, and obtaining, according to the first valid data, second image data with a smaller number of data bits; and sending the second image data to a graphics processor (GPU), so that the graphics processor performs image processing operations based on the second image data. According to this technical scheme, the second image data retains the high-precision data type while the amount of data transmitted between the CPU and the GPU is reduced, which improves data transmission efficiency. In addition, compared with compressing the transmitted data through structured sparsity and similar approaches, there is no loss of precision and no additional sparsity algorithm needs to be introduced, which simplifies the data transmission flow.

Description

Image processing method, device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing method, an image processing device, an electronic device, and a storage medium.
Background
As graphics display and rendering scenarios become more complex, more and more graphics processors (graphics processing units, GPUs) analyze and process images using deep learning approaches such as neural network (NN) models.
In the prior art, in order to improve the image recognition precision, a central processing unit (Central Processing Unit, CPU) converts an acquired low-precision original image into a data type with higher precision, and transmits the high-precision image data to a GPU, so that the GPU processes the high-precision image data based on a trained neural network model.
However, in such a transmission manner, if the accuracy of the type of data transmitted to the GPU by the CPU is high, the amount of data to be transmitted is too large, which affects the data transmission efficiency, and if the accuracy is low, the image recognition result of the GPU is greatly deviated.
Disclosure of Invention
The invention provides an image processing method, an image processing device, electronic equipment and a storage medium, which are used for solving the problem of excessive data transmission quantity between a CPU and a GPU.
According to an aspect of the present invention, there is provided an image processing method applied to a central processing unit, including:
Normalizing the initial image data to obtain processed first image data; wherein the data bit number of the first image data is greater than the data bit number of the initial image data;
extracting first valid data of a designated digit in the first image data according to the data type of the initial image data so as to acquire second image data according to the first valid data; wherein the number of data bits of the second image data is less than the number of data bits of the first image data;
and sending the second image data to a graphics processor, so that the graphics processor executes image processing operation according to the second image data.
According to another aspect of the present invention, there is provided an image processing method applied to a graphic processor, including:
acquiring second image data sent by a central processing unit; wherein the second image data is generated based on first valid data of a specified digit in first image data generated based on normalization processing of initial image data, the first image data having a data bit number greater than that of the initial image data and greater than that of the second image data;
Inputting the second image data into a first neural network model to perform an image processing operation; wherein the first neural network model matches a data type of the first image data.
According to another aspect of the present invention, there is provided an image processing apparatus applied to a central processing unit, including:
the normalization processing execution module is used for carrying out normalization processing on the initial image data to obtain processed first image data; wherein the data bit number of the first image data is greater than the data bit number of the initial image data;
the effective data extraction module is used for extracting first effective data of appointed digits in the first image data according to the data type of the initial image data so as to acquire second image data according to the first effective data; wherein the number of data bits of the second image data is less than the number of data bits of the first image data;
and the image data transmitting module is used for transmitting the second image data to the graphic processor so that the graphic processor can execute image processing operation according to the second image data.
According to another aspect of the present invention, there is provided an image processing apparatus applied to a graphic processor, including:
The image data acquisition module is used for acquiring second image data sent by the central processing unit; wherein the second image data is generated based on first valid data of a specified digit in first image data generated based on normalization processing of initial image data, the first image data having a data bit number greater than that of the initial image data and greater than that of the second image data;
an image processing execution module for inputting the second image data into a first neural network model to execute an image processing operation; wherein the first neural network model matches a data type of the first image data.
According to another aspect of the present invention, there is provided an electronic apparatus including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the image processing method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute the image processing method according to any one of the embodiments of the present invention.
According to the technical scheme, the initial image data is normalized to obtain first image data with a larger number of data bits; then, according to the data type of the initial image data, first valid data at specified bit positions in the first image data is extracted to obtain second image data, which has fewer data bits than the first image data but the same data precision; the second image data is then sent to the graphics processor, so that the graphics processor executes image processing operations according to the second image data. In this way, the second image data keeps the high-precision data type while the amount of data transmitted between the CPU and the GPU is reduced, which lowers the data movement cost and improves data transmission efficiency. In addition, compared with compressing the transmitted data through structured sparsity and similar approaches, there is no loss of precision and no additional sparsity algorithm needs to be introduced, which simplifies the data transmission flow.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of an image processing method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of an image processing method according to a second embodiment of the present invention;
Fig. 3 is a schematic structural view of an image processing apparatus according to a third embodiment of the present invention;
Fig. 4 is a schematic structural view of an image processing apparatus according to a fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device implementing an image processing method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of an image processing method according to a first embodiment of the present invention, where the method may be applied to image data acquisition and transmission, and the method may be performed by an image processing apparatus according to a third embodiment, where the image processing apparatus may be implemented in hardware and/or software, and the image processing apparatus may be configured in a central processor of an electronic device. As shown in fig. 1, the method includes:
S101, normalizing initial image data to obtain processed first image data; wherein the number of data bits of the first image data is greater than the number of data bits of the initial image data.
After the CPU acquires an original image, it decodes the original image to obtain RGB (red, green, blue) data or BGR (blue, green, red) data of the original image, namely the initial image data. The data precision of each element in the initial image data is low; for example, each element may be signed 8-bit integer (int8) data, unsigned 8-bit integer (uint8) data, signed 10-bit integer (int10) data, unsigned 10-bit integer (uint10) data, or the like.
Taking initial image data of type int8 as an example: when the int8 initial image data is normalized to values in the interval (0, 1), the precision (the spacing between adjacent representable values) is int8(0, 1) = 1/256 = 0.00390625; when the int8 initial image data is converted into uint8 data and then normalized to values in the interval (-1, 1), the precision is int8(-1, 1) = 1/128 = 0.0078125. After each normalized floating-point value (float) is converted into a 64-bit (FP64), 32-bit (FP32) or 16-bit (FP16) binary representation, the first image data is obtained. For example, initial image data of 32 (width) × 32 (height) × 3 (RGB three colors) × 8 bits is converted into first image data of 32 (width) × 32 (height) × 3 (RGB three colors) × 32 bits. Compared with the initial image data of a low-precision data type, the first image data has a high-precision data type, and the number of data bits of each element in the first image data is greater than that of each element in the initial image data.
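The arithmetic in this step can be checked with a short sketch. It assumes that the step size is simply the width of the target interval divided by the number of 8-bit levels, which reproduces the values 1/256 and 1/128 quoted above; the function names are illustrative and do not come from the source.

```python
# Illustrative sketch: step sizes and payload sizes for the example above.
# Assumption: step size = (interval width) / (number of integer levels).

def step_size(bits: int, lo: float, hi: float) -> float:
    """Spacing between adjacent normalized values of a `bits`-wide integer."""
    return (hi - lo) / (1 << bits)

def payload_bits(width: int, height: int, channels: int, bits_per_element: int) -> int:
    """Total number of bits needed to hold one image."""
    return width * height * channels * bits_per_element

step_unit = step_size(8, 0.0, 1.0)    # 8-bit data over (0, 1)
step_sym = step_size(8, -1.0, 1.0)    # 8-bit data over (-1, 1)

initial_bits = payload_bits(32, 32, 3, 8)   # 8-bit initial image data
first_bits = payload_bits(32, 32, 3, 32)    # FP32 first image data

print(step_unit, step_sym, initial_bits, first_bits)
```

For the 32 × 32 × 3 example, converting to FP32 multiplies the payload by four, which is the overhead the remaining steps aim to reduce.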
S102, extracting first effective data of appointed digits in the first image data according to the data type of the initial image data so as to acquire second image data according to the first effective data; wherein the number of data bits of the second image data is smaller than the number of data bits of the first image data.
The low-bit-width initial image data is normalized and converted into high-bit-width first image data. However, because the first image data is derived from the initial image data, the distribution of the high-bit-width data after conversion still follows the distribution characteristics of the low-bit-width data. Taking the above technical solution as an example, the data precision of each element in the first image data is single precision (FP32), but the valid data is still represented with int8 data precision. Specifically, as shown in Table 1, for FP32 data, bit 31 is the sign bit, bits 23-30 are the exponent bits, and bits 0-22 are the mantissa bits.
When uint8 data is converted into FP32 data, bit 31 of the binary result is fixed to 0 because the data is unsigned; bits 27-30 are four fixed exponent bits whose value is fixed to 0111; bits 16-26 are variable bits determined by the actual value of the current uint8 data, of which bits 23-26 are exponent bits (the variable exponent bits) and bits 16-22 are mantissa bits (the variable mantissa bits); bits 0-15 are fixed mantissa bits whose values are all fixed to 0.
When int8 data is converted into FP32 data, bit 31 of the binary result is the sign bit, determined by the actual value of the current int8 data; bits 27-30 are four fixed exponent bits whose value is likewise 0111; bits 17-26 are variable bits determined by the actual value of the current int8 data, of which bits 23-26 are the variable exponent bits and bits 17-22 are the variable mantissa bits; bits 0-16 are fixed mantissa bits whose values are all fixed to 0.
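The fixed and variable bit positions described above can be verified by enumerating every 8-bit input. The sketch below is only an illustration: it assumes unsigned values are normalized as v / 256 and signed values as v / 128 (the source does not give the divisor explicitly), and all helper names are hypothetical.

```python
import struct

def fp32_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def upper_exponent(bits: int) -> int:
    """Bits 27-30: the four upper exponent bits."""
    return (bits >> 27) & 0xF

# Assumed normalization: v / 256 for unsigned, v / 128 for signed 8-bit values.
# Zero is left out on purpose; see the note after the block.
unsigned_patterns = [fp32_bits(v / 256) for v in range(1, 256)]
signed_patterns = [fp32_bits(v / 128) for v in range(-128, 128) if v != 0]

# Unsigned: sign bit 0, bits 27-30 equal 0111, bits 0-15 all zero.
unsigned_ok = all(
    (b >> 31) == 0 and upper_exponent(b) == 0b0111 and (b & 0xFFFF) == 0
    for b in unsigned_patterns
)
# Signed: bits 27-30 equal 0111, bits 0-16 all zero.
signed_ok = all(
    upper_exponent(b) == 0b0111 and (b & 0x1FFFF) == 0
    for b in signed_patterns
)
# Exponent field (bits 23-30) of 0.0: all zeros rather than 0111xxxx.
zero_exponent = (fp32_bits(0.0) >> 23) & 0xFF

print(unsigned_ok, signed_ok, zero_exponent)
```

Under these assumptions both checks hold for every non-zero input. The value 0.0 has an all-zero exponent field in IEEE 754, so it does not follow the 0111 pattern; how zero is handled is not described in the text.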
As can be seen from the above, when int8 data or uint8 data is converted into FP32 data, at most 12 bits are actually variable, namely the sign bit (bit 31), the exponent bits 23-26, and the mantissa bits 16-22. The valid data at these 12 specified bit positions is therefore extracted for each element of the first image data, and the second image data is formed from the first valid data of each element. Continuing the example above, from first image data of 32 (width) × 32 (height) × 3 (RGB three colors) × 32 bits, second image data of 32 (width) × 32 (height) × 3 (RGB three colors) × 12 bits is obtained after the first valid data is extracted. In Table 1, EF12 denotes the first valid data extracted from the 12 specified bit positions when the initial image data is int8 data or uint8 data.
Table 1: Bit layout of FP32 data converted from uint8 and int8 data
[Figure SMS_1: table provided as an image in the original publication]
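A minimal sketch of the EF12 extraction and its inverse is given below. The order of the fields inside the 12-bit value (sign, then exponent bits, then mantissa bits) and the names `pack_ef12` and `unpack_ef12` are assumptions made for illustration; the source only specifies which FP32 bit positions are kept.

```python
import struct

def fp32_bits(x: float) -> int:
    """FP32 bit pattern of x as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_fp32(b: int) -> float:
    """Inverse of fp32_bits."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def pack_ef12(b: int) -> int:
    """Keep bit 31, bits 23-26 and bits 16-22 of an FP32 pattern (12 bits)."""
    sign = (b >> 31) & 0x1
    exp_low = (b >> 23) & 0xF      # variable exponent bits
    man_high = (b >> 16) & 0x7F    # variable mantissa bits
    return (sign << 11) | (exp_low << 7) | man_high

def unpack_ef12(code: int) -> int:
    """Rebuild the FP32 pattern by restoring the fixed bits (0111 and zeros)."""
    sign = (code >> 11) & 0x1
    exp_low = (code >> 7) & 0xF
    man_high = code & 0x7F
    return (sign << 31) | (0b0111 << 27) | (exp_low << 23) | (man_high << 16)

code = pack_ef12(fp32_bits(0.5))
restored = bits_fp32(unpack_ef12(code))
print(hex(code), restored)
```

Because the discarded bits are constant for the inputs considered here, the round trip reproduces the original FP32 pattern exactly, which is the sense in which the extraction loses no precision.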
Similarly, consider the conversion of uint10 data or int10 data into FP32 data. As shown in Table 2, when uint10 data is converted into FP32 data, bit 31 of the binary result is fixed to 0 because the data is unsigned; bits 27-30 are four fixed exponent bits whose value is fixed to 0111; bits 14-26 are variable bits determined by the actual value of the current uint10 data, of which bits 23-26 are the variable exponent bits and bits 14-22 are the variable mantissa bits; bits 0-13 are fixed mantissa bits whose values are all fixed to 0.
When int10 data is converted into FP32 data, bit 31 of the binary result is the sign bit, determined by the actual value of the current int10 data; bits 27-30 are four fixed exponent bits whose value is likewise 0111; bits 15-26 are variable bits determined by the actual value of the current int10 data, of which bits 23-26 are the variable exponent bits and bits 15-22 are the variable mantissa bits; bits 0-14 are fixed mantissa bits whose values are all fixed to 0.
As can be seen from the above, when int10 data or uint10 data is converted into FP32 data, at most 14 bits are actually variable, namely the sign bit (bit 31), the exponent bits 23-26, and the mantissa bits 14-22. The valid data at these 14 specified bit positions is therefore extracted for each element of the first image data, and the second image data is formed from the first valid data of each element. Continuing the example above, from first image data of 32 (width) × 32 (height) × 3 (RGB three colors) × 32 bits, second image data of 32 (width) × 32 (height) × 3 (RGB three colors) × 14 bits is obtained after the first valid data is extracted. In Table 2, EF14 denotes the first valid data extracted from the 14 specified bit positions when the initial image data is int10 data or uint10 data.
Table 2: Bit layout of FP32 data converted from uint10 and int10 data
[Figure SMS_2: table provided as an image in the original publication]
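The 10-bit case differs from the 8-bit case only in how many mantissa bits are kept, so one parameterized routine can cover both. The sketch below is illustrative: the helper names, the field order inside the packed value, and the normalization divisors (1024 for uint10, 512 for int10) are assumptions rather than details given in the source.

```python
import struct

def fp32_bits(x: float) -> int:
    """FP32 bit pattern of x as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def pack_ef(b: int, mantissa_bits: int) -> int:
    """Keep the sign bit, exponent bits 23-26 and the top `mantissa_bits`
    mantissa bits. mantissa_bits=7 gives 12 bits, mantissa_bits=9 gives 14."""
    sign = (b >> 31) & 0x1
    exp_low = (b >> 23) & 0xF
    man_high = (b >> (23 - mantissa_bits)) & ((1 << mantissa_bits) - 1)
    return (sign << (4 + mantissa_bits)) | (exp_low << mantissa_bits) | man_high

def unpack_ef(code: int, mantissa_bits: int) -> int:
    """Restore the FP32 pattern, re-inserting 0111 in bits 27-30."""
    sign = (code >> (4 + mantissa_bits)) & 0x1
    exp_low = (code >> mantissa_bits) & 0xF
    man_high = code & ((1 << mantissa_bits) - 1)
    return ((sign << 31) | (0b0111 << 27) | (exp_low << 23)
            | (man_high << (23 - mantissa_bits)))

# Round trip for every non-zero 10-bit input (assumed divisors 1024 and 512).
uint10_ok = all(
    unpack_ef(pack_ef(fp32_bits(v / 1024), 9), 9) == fp32_bits(v / 1024)
    for v in range(1, 1024)
)
int10_ok = all(
    unpack_ef(pack_ef(fp32_bits(v / 512), 9), 9) == fp32_bits(v / 512)
    for v in range(-512, 512) if v != 0
)
print(uint10_ok, int10_ok)
```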
S103, the second image data are sent to a graphics processor, so that the graphics processor executes image processing operation according to the second image data.
Because the second image data comprises all the variable digits of the first image data, the data precision is consistent with that of the first image data, but compared with the first image data, the data transmission quantity between the CPU and the GPU is reduced, the data carrying cost is reduced, and the data transmission efficiency is improved; meanwhile, compared with the mode of compressing and transmitting data through structural sparsity and the like, no precision loss exists, no extra sparse algorithm is needed to be introduced, and the data transmission flow is simplified.
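To realize the reduction in transferred data, the 12-bit values have to be stored without padding. The source does not describe the transfer layout; the sketch below assumes a dense big-endian bit packing in which two 12-bit values occupy three bytes, and the function names are hypothetical.

```python
def pack_12bit(codes):
    """Pack 12-bit integers densely into bytes (two codes per three bytes)."""
    out = bytearray()
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently in acc
    for c in codes:
        acc = (acc << 12) | (c & 0xFFF)
        nbits += 12
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1
    if nbits:    # pad the final partial byte with zeros
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)

def unpack_12bit(data, count):
    """Recover `count` 12-bit integers from densely packed bytes."""
    codes = []
    acc = 0
    nbits = 0
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        if nbits >= 12 and len(codes) < count:
            nbits -= 12
            codes.append((acc >> nbits) & 0xFFF)
            acc &= (1 << nbits) - 1
    return codes

elements = 32 * 32 * 3                       # the example image from above
fp32_bytes = elements * 4                    # 32 bits per element
ef12_bytes = len(pack_12bit([0] * elements)) # 12 bits per element
print(fp32_bytes, ef12_bytes, ef12_bytes / fp32_bytes)
```

For the 32 × 32 × 3 example this gives 4608 bytes instead of 12288, that is, 37.5% of the FP32 payload.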
Optionally, in an embodiment of the present invention, extracting the first valid data at the specified bit positions in the first image data according to the data type of the initial image data, so as to obtain the second image data according to the first valid data, includes: judging whether the data type of the first image data is a signed type; and, if it is a signed type, extracting second valid data at specified bit positions in the first image data, so as to obtain third image data according to the second valid data; wherein the number of data bits of the third image data is smaller than the number of data bits of the second image data.
Specifically, so that the same extraction works for both signed and unsigned integer data, the first valid data obtained above actually covers more bits than vary in either case alone. Taking int8 as an example, only 11 bits actually vary, namely the sign bit (bit 31), the exponent bits 23-26 and the mantissa bits 17-22; EF11A denotes the second valid data extracted from these 11 specified bit positions when the initial image data is int8 data, and the third image data can be obtained from EF11A. Taking int10 as an example, 13 bits actually vary, namely the sign bit (bit 31), the exponent bits 23-26 and the mantissa bits 15-22; EF13A denotes the second valid data extracted from these 13 specified bit positions when the initial image data is int10 data, and the third image data can be obtained from EF13A. Therefore, if the data type of the first image data is determined to be signed, the third image data is obtained from the first image data, which keeps the data precision consistent with the first image data while further reducing the amount of data transmitted between the CPU and the GPU, lowering the data movement cost and improving data transmission efficiency.
Optionally, in an embodiment of the present invention, after judging whether the data type of the first image data is a signed type, the method further includes: if it is an unsigned type, extracting third valid data at specified bit positions in the first image data, so as to obtain fourth image data according to the third valid data; wherein the number of data bits of the fourth image data is smaller than the number of data bits of the second image data, and the third valid data is different from the second valid data.
Specifically, taking uint8 as an example, 11 bits actually vary, namely the exponent bits 23-26 and the mantissa bits 16-22; EF11B denotes the third valid data extracted from these 11 specified bit positions when the initial image data is uint8 data, and the fourth image data can be obtained from EF11B. Taking uint10 as an example, 13 bits actually vary, namely the exponent bits 23-26 and the mantissa bits 14-22; EF13B denotes the third valid data extracted from these 13 specified bit positions when the initial image data is uint10 data, and the fourth image data can be obtained from EF13B. Therefore, if the data type of the first image data is determined to be unsigned, the fourth image data is obtained from the first image data, which keeps the data precision consistent with the first image data while further reducing the amount of data transmitted between the CPU and the GPU, lowering the data movement cost and improving data transmission efficiency.
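The two type-specific 11-bit variants for 8-bit data can be sketched as follows. The field order inside each packed value, the helper names, and the normalization divisors (128 for int8, 256 for uint8) are assumptions for illustration only.

```python
import struct

def fp32_bits(x: float) -> int:
    """FP32 bit pattern of x as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def pack_ef11a(b: int) -> int:
    """Signed source data: keep bit 31, bits 23-26 and bits 17-22."""
    sign = (b >> 31) & 0x1
    exp_low = (b >> 23) & 0xF
    man_high = (b >> 17) & 0x3F
    return (sign << 10) | (exp_low << 6) | man_high

def unpack_ef11a(code: int) -> int:
    sign = (code >> 10) & 0x1
    exp_low = (code >> 6) & 0xF
    man_high = code & 0x3F
    return (sign << 31) | (0b0111 << 27) | (exp_low << 23) | (man_high << 17)

def pack_ef11b(b: int) -> int:
    """Unsigned source data: keep bits 16-26 (no sign bit needed)."""
    return (b >> 16) & 0x7FF

def unpack_ef11b(code: int) -> int:
    return (0b0111 << 27) | ((code & 0x7FF) << 16)

signed_ok = all(
    unpack_ef11a(pack_ef11a(fp32_bits(v / 128))) == fp32_bits(v / 128)
    for v in range(-128, 128) if v != 0
)
unsigned_ok = all(
    unpack_ef11b(pack_ef11b(fp32_bits(v / 256))) == fp32_bits(v / 256)
    for v in range(1, 256)
)
print(signed_ok, unsigned_ok)
```

Choosing the variant by signedness saves one more bit per element compared with the 12-bit form that covers both cases.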
Optionally, in an embodiment of the present invention, if the data type of the initial image data is eight-bit integer data and the data type of the first image data is floating-point data, extracting the first valid data at the specified bit positions in the first image data according to the data type of the initial image data includes: extracting the sign bit data, four consecutive exponent bit data, and seven consecutive mantissa bit data adjacent to the four consecutive exponent bit data in the first image data. If the data type of the initial image data is ten-bit integer data and the data type of the first image data is floating-point data, extracting the first valid data at the specified bit positions in the first image data according to the data type of the initial image data includes: extracting the sign bit data, four consecutive exponent bit data, and nine consecutive mantissa bit data adjacent to the four consecutive exponent bit data in the first image data.
Specifically, when the data type of the first image data is a floating-point type, the specified bits can be extracted in the same manner regardless of whether the type is FP16, FP32, FP64 or another floating-point type. Taking Table 3 as an example, for FP16 data, bit 15 is the sign bit, bits 10-14 are the exponent bits, and bits 0-9 are the mantissa bits.
When uint8 data is converted into FP16 data, bit 15 of the binary result is fixed to 0 because the data is unsigned; bit 14 is a fixed exponent bit whose value is fixed to 0; bits 3-13 are variable bits determined by the actual value of the current uint8 data, of which bits 10-13 are the variable exponent bits and bits 3-9 are the variable mantissa bits; bits 0-2 are fixed mantissa bits whose values are all fixed to 0.
When int8 data is converted into FP16 data, bit 15 of the binary result is the sign bit, determined by the actual value of the current int8 data; bit 14 is a fixed exponent bit whose value is fixed to 0; bits 4-13 are variable bits determined by the actual value of the current int8 data, of which bits 10-13 are the variable exponent bits and bits 4-9 are the variable mantissa bits; bits 0-3 are fixed mantissa bits whose values are all fixed to 0.
According to the above, when 8-bit integer (int8 or uint8) data is converted into FP16 data, at most 12 bits are variable, namely the sign bit (bit 15), the exponent bits 10-13 and the mantissa bits 3-9. These are likewise the sign bit data, four consecutive exponent bit data, and seven consecutive mantissa bit data adjacent to the four consecutive exponent bit data in the first image data.
Table 3: Bit layout of FP16 data converted from uint8 and int8 data
[Figure SMS_3: table provided as an image in the original publication]
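The FP16 case can be checked in the same way using the half-precision format code `e` of Python's `struct` module. As before, the normalization divisors (256 and 128), the field order in the packed value, and the helper names are assumptions for illustration.

```python
import struct

def fp16_bits(x: float) -> int:
    """IEEE 754 half-precision bit pattern of x as an unsigned 16-bit integer."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def pack_ef12_fp16(b: int) -> int:
    """Keep bit 15, bits 10-13 and bits 3-9 of an FP16 pattern (12 bits)."""
    sign = (b >> 15) & 0x1
    exp_low = (b >> 10) & 0xF
    man_high = (b >> 3) & 0x7F
    return (sign << 11) | (exp_low << 7) | man_high

def unpack_ef12_fp16(code: int) -> int:
    """Rebuild the FP16 pattern; bit 14 and bits 0-2 are restored as zeros."""
    sign = (code >> 11) & 0x1
    exp_low = (code >> 7) & 0xF
    man_high = code & 0x7F
    return (sign << 15) | (exp_low << 10) | (man_high << 3)

uint8_ok = all(
    unpack_ef12_fp16(pack_ef12_fp16(fp16_bits(v / 256))) == fp16_bits(v / 256)
    for v in range(1, 256)
)
int8_ok = all(
    unpack_ef12_fp16(pack_ef12_fp16(fp16_bits(v / 128))) == fp16_bits(v / 128)
    for v in range(-128, 128) if v != 0
)
print(uint8_ok, int8_ok)
```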
Similarly, as shown in Table 4, when uint10 data is converted into FP16 data, bit 15 of the binary result is fixed to 0 because the data is unsigned; bit 14 is a fixed exponent bit whose value is fixed to 0; bits 1-13 are variable bits determined by the actual value of the current uint10 data, of which bits 10-13 are the variable exponent bits and bits 1-9 are the variable mantissa bits; bit 0 is a fixed mantissa bit whose value is fixed to 0.
When int10 data is converted into FP16 data, bit 15 of the binary result is the sign bit, determined by the actual value of the current int10 data; bit 14 is a fixed exponent bit whose value is fixed to 0; bits 2-13 are variable bits determined by the actual value of the current int10 data, of which bits 10-13 are the variable exponent bits and bits 2-9 are the variable mantissa bits; bits 0-1 are fixed mantissa bits whose values are all fixed to 0.
According to the above, when 10-bit integer (int10 or uint10) data is converted into FP16 data, at most 14 bits are variable, namely the sign bit (bit 15), the exponent bits 10-13 and the mantissa bits 1-9. These are likewise the sign bit data, four consecutive exponent bit data, and nine consecutive mantissa bit data adjacent to the four consecutive exponent bit data in the first image data.
Table 4: Bit layout of FP16 data converted from uint10 and int10 data
[Figure SMS_4: table provided as an image in the original publication]
Therefore, when the initial image data is eight-bit integer data or ten-bit integer data, whether the first image data is floating point data such as FP64, FP32, FP16 and the like or other types of floating point data, the second image data with the data bit number smaller than that of the first image data can be directly obtained through extracting the designated digits, so that the transmission data amount between the CPU and the GPU is reduced, and the data transmission efficiency is improved.
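The statement that the extraction works the same way for FP16, FP32 and FP64 can be illustrated with a format-parameterized sketch. The format table, the function names, and the normalization divisors are assumptions; the rule implemented is the one stated above (sign bit, the four lowest exponent bits, and the adjacent top mantissa bits).

```python
import struct

FORMATS = {
    # name: (struct float code, struct integer code, exponent bits, mantissa bits)
    "FP16": ("<e", "<H", 5, 10),
    "FP32": ("<f", "<I", 8, 23),
    "FP64": ("<d", "<Q", 11, 52),
}

def to_bits(x: float, fmt: str) -> int:
    """Bit pattern of x in the given floating-point format."""
    fcode, icode, _, _ = FORMATS[fmt]
    return struct.unpack(icode, struct.pack(fcode, x))[0]

def extract(x: float, fmt: str, keep_mantissa: int) -> int:
    """Sign bit + four lowest exponent bits + top `keep_mantissa` mantissa bits."""
    _, _, e_bits, m_bits = FORMATS[fmt]
    b = to_bits(x, fmt)
    sign = b >> (e_bits + m_bits)
    exp_low = (b >> m_bits) & 0xF
    man_high = (b >> (m_bits - keep_mantissa)) & ((1 << keep_mantissa) - 1)
    return (sign << (4 + keep_mantissa)) | (exp_low << keep_mantissa) | man_high

# 8-bit source data (assumed v / 256): 7 mantissa bits, 12 bits in total.
same_for_8bit = all(
    len({extract(v / 256, f, 7) for f in FORMATS}) == 1 for v in range(1, 256)
)
# 10-bit source data (assumed v / 512, signed): 9 mantissa bits, 14 bits in total.
same_for_10bit = all(
    len({extract(v / 512, f, 9) for f in FORMATS}) == 1
    for v in range(-512, 512) if v != 0
)
print(same_for_8bit, same_for_10bit)
```

In this sketch the extracted value is identical across the three formats, because the exponent biases 15, 127 and 1023 all end in the same four low bits (1111). That observation follows from the assumed normalization and is not a claim made in the source.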
According to the technical scheme, the initial image data is normalized to obtain first image data with a larger number of data bits; then, according to the data type of the initial image data, first valid data at specified bit positions in the first image data is extracted to obtain second image data, which has fewer data bits than the first image data but the same data precision; the second image data is then sent to the graphics processor, so that the graphics processor executes image processing operations according to the second image data. In this way, the second image data keeps the high-precision data type while the amount of data transmitted between the CPU and the GPU is reduced, which lowers the data movement cost and improves data transmission efficiency. In addition, compared with compressing the transmitted data through structured sparsity and similar approaches, there is no loss of precision and no additional sparsity algorithm needs to be introduced, which simplifies the data transmission flow.
Example two
Fig. 2 is a flowchart of an image processing method according to a second embodiment of the present invention. The method is applicable to image processing based on a neural network model and may be performed by the image processing apparatus of the fourth embodiment; the apparatus may be implemented in hardware and/or software and may be configured in a graphics processor of an electronic device. As shown in Fig. 2, the method includes:
S201, acquiring second image data sent by a central processing unit; wherein the second image data is generated based on first valid data of a specified digit in first image data generated based on normalization processing of initial image data, the first image data having a data bit number greater than that of the initial image data and greater than that of the second image data.
S202, inputting the second image data into a first neural network model to execute image processing operation; wherein the first neural network model matches a data type of the first image data.
The first neural network model matches the first image data of the high-precision data type and can be constructed and trained in advance based on convolutional neural networks (CNN), recurrent neural networks (RNN), and/or other techniques. Existing neural network models are usually trained in FP16 or FP32. In the embodiment of the present invention, because EF12 or EF14 data has substantially the same data structure as FP32 data and reflects the same data characteristics, the second image data can be fed directly into an existing FP32 or FP16 model as input data. The second image data thus keeps the same image characteristics as the first image data, while the lower complexity of the input data improves the speed at which the first neural network model processes it.
Optionally, in an embodiment of the present invention, the inputting the second image data into the first neural network model to perform an image processing operation includes: constructing the first image data according to the second image data, and inputting the first image data into the first neural network model to execute the image processing operation; or acquiring a second neural network model according to the first neural network model, and inputting the second image data into the second neural network model to execute the image processing operation; wherein the second neural network model matches a data type of the second image data.
Specifically, since the second image data and the first image data reflect the same image characteristics, the existing first neural network model can be converted into a second neural network model that matches the second image data, and the second image data is then input into the second neural network model; for example, EF12 data is input into an EF12 model, or EF14 data into an EF14 model. This improves the computational efficiency of the GPU and reduces the complexity of the model. Alternatively, the second image data can be restored to the first image data by inserting the known fixed bits, and the reconstructed first image data is then input into the first neural network model. The input data then matches the first neural network model exactly, which avoids calculation errors caused by differences between the second image data and the first image data that would otherwise affect the image recognition accuracy of the GPU.
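Restoring the first image data by inserting the known fixed bits can be sketched as follows. The sketch assumes FP16 as the first image data type and assumes that the omitted bits (the top exponent bit and the low mantissa bits) are always zero; the function name `restore_fp16` is hypothetical.

```python
import struct

def restore_fp16(packed: int, mant_bits: int) -> float:
    """Rebuild an FP16 value from a field holding sign, 4 exponent bits and mant_bits mantissa bits."""
    sign = (packed >> (4 + mant_bits)) & 0x1
    exp4 = (packed >> mant_bits) & 0xF
    mant = packed & ((1 << mant_bits) - 1)
    # Re-insert the fixed zero bits: exponent bit 14 and the low mantissa bits.
    bits = (sign << 15) | (exp4 << 10) | (mant << (10 - mant_bits))
    return struct.unpack("<e", struct.pack("<H", bits))[0]

# 14-bit field: sign 0, exponent 1110, mantissa 000000001
restored = restore_fp16(0b01110000000001, mant_bits=9)
print(restored)
```

Because only bits with a known constant value are re-inserted, the restored value is bit-for-bit identical to the FP16 value the field was taken from, as long as the assumption about the fixed bits holds.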
According to the technical scheme of this embodiment, after the graphics processor acquires the second image data, which has the same data precision as the first image data but fewer data bits, it inputs the second image data into the first neural network model, which matches the data type of the first image data. The second image data thus keeps the same image characteristics as the first image data, while the lower complexity of the input data improves the speed at which the first neural network model processes it.
Example III
Fig. 3 is a block diagram of an image processing apparatus according to a third embodiment of the present invention, the image processing apparatus specifically including:
a normalization processing execution module 301, configured to perform normalization processing on the initial image data to obtain first image data after processing is completed; wherein the data bit number of the first image data is greater than the data bit number of the initial image data;
a valid data extraction module 302, configured to extract first valid data of a specified digit in the first image data according to a data type of the initial image data, so as to obtain second image data according to the first valid data; wherein the number of data bits of the second image data is less than the number of data bits of the first image data;
an image data sending module 303, configured to send the second image data to a graphics processor, so that the graphics processor performs an image processing operation according to the second image data.
According to the technical scheme of this embodiment, the initial image data is normalized to obtain first image data with more data bits; then, according to the data type of the initial image data, first valid data of specified digits in the first image data is extracted to obtain second image data that has fewer data bits than the first image data but the same data precision. The second image data is sent to the graphics processor, which performs the image processing operation on it. The second image data thus keeps a high-precision data type while the amount of data transmitted between the CPU and the GPU is reduced, which lowers the data movement cost and improves data transmission efficiency. In addition, compared with compressing the transmitted data through structured sparsity or similar methods, there is no precision loss and no extra sparsity algorithm needs to be introduced, which simplifies the data transmission flow.
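As a rough illustration of how modules 301 to 303 could fit together, the sketch below normalizes uint8 pixels, extracts 12-bit fields, and concatenates them into a byte string for transfer. Division by 256, FP16 as the first image data type, the dense bit packing, and all function names are assumptions made for this example; the patent text does not specify them.

```python
import struct

def normalize_u8(pixels):
    """Module 301 (sketch): uint8 -> float values; division by 256 is an assumption."""
    return [p / 256 for p in pixels]

def extract_ef12(values):
    """Module 302 (sketch): keep sign, 4 exponent bits and 7 mantissa bits of each FP16 value."""
    fields = []
    for v in values:
        bits = struct.unpack("<H", struct.pack("<e", v))[0]
        sign = bits >> 15
        exp4 = (bits >> 10) & 0xF
        mant7 = (bits >> 3) & 0x7F
        fields.append((sign << 11) | (exp4 << 7) | mant7)
    return fields

def pack_for_transfer(fields):
    """Module 303 (sketch): concatenate 12-bit fields into a byte string, zero-padded at the end."""
    acc = 0
    for f in fields:
        acc = (acc << 12) | f
    nbits = 12 * len(fields)
    pad = (-nbits) % 8
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")

payload = pack_for_transfer(extract_ef12(normalize_u8([0, 1, 128, 255])))
print(len(payload))  # 6 bytes, versus 8 bytes for four FP16 values
```

Four pixels occupy 48 bits in this layout, compared with 64 bits as FP16 and 128 bits as FP32, which is the reduction in transmitted data that the embodiment describes.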
Optionally, the valid data extraction module 302 is specifically configured to determine whether the data type of the first image data is a signed type; if it is a signed type, extract second valid data of specified digits in the first image data, so as to obtain third image data according to the second valid data; wherein the number of data bits of the third image data is smaller than the number of data bits of the second image data.
Optionally, the valid data extraction module 302 is further configured to, if the data type is not a signed type, extract third valid data of specified digits in the first image data, so as to obtain fourth image data according to the third valid data; wherein the fourth image data has fewer data bits than the second image data, and the fourth image data is different from the third image data.
Optionally, if the data type of the initial image data is eight-bit integer data and the data type of the first image data is floating point data, the valid data extracting module 302 is specifically further configured to extract sign bit data, four consecutive exponent bit data, and seven consecutive mantissa bit data adjacent to the four consecutive exponent bit data in the first image data.
Optionally, if the data type of the initial image data is ten-bit integer data and the data type of the first image data is floating point data, the valid data extracting module 302 is specifically further configured to extract sign bit data, four consecutive exponent bit data, and nine consecutive mantissa bit data adjacent to the four consecutive exponent bit data in the first image data.
The image processing apparatus provided by this embodiment can execute the image processing method provided by the first embodiment of the present invention, and has the functional modules and beneficial effects corresponding to that method. For technical details not described in this embodiment, refer to the image processing method provided by the first embodiment of the present invention.
Example IV
Fig. 4 is a block diagram of an image processing apparatus according to a fourth embodiment of the present invention, the image processing apparatus specifically including:
an image data acquisition module 401, configured to acquire second image data sent by the central processing unit; wherein the second image data is generated based on first valid data of a specified digit in first image data generated based on normalization processing of initial image data, the first image data having a data bit number greater than that of the initial image data and greater than that of the second image data;
an image processing execution module 402, configured to input the second image data into a first neural network model to perform an image processing operation; wherein the first neural network model matches a data type of the first image data.
According to the technical scheme of this embodiment, after the graphics processor acquires the second image data, which has the same data precision as the first image data but fewer data bits, it inputs the second image data into the first neural network model, which matches the data type of the first image data. The second image data thus keeps the same image characteristics as the first image data, while the lower complexity of the input data improves the speed at which the first neural network model processes it.
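The receiving side handled by modules 401 and 402 can be sketched in the same spirit. The example assumes the second image data arrives as densely packed 12-bit fields (big-endian, zero-padded at the end) and that the first image data type is FP16; `unpack_fields`, `restore_ef12`, and the stand-in `run_model` are hypothetical names, and `run_model` is only a placeholder for the first neural network model.

```python
import struct

def unpack_fields(payload: bytes, count: int, width: int = 12):
    """Module 401 (sketch): split a packed byte string into `count` fields of `width` bits."""
    acc = int.from_bytes(payload, "big")
    acc >>= len(payload) * 8 - count * width  # drop trailing padding bits
    mask = (1 << width) - 1
    return [(acc >> (width * (count - 1 - i))) & mask for i in range(count)]

def restore_ef12(field: int) -> float:
    """Rebuild the FP16 value by re-inserting the fixed zero bits (assumption)."""
    sign = field >> 11
    exp4 = (field >> 7) & 0xF
    mant7 = field & 0x7F
    bits = (sign << 15) | (exp4 << 10) | (mant7 << 3)
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def run_model(inputs):
    """Module 402 stand-in: placeholder for the first neural network model (hypothetical)."""
    return sum(inputs) / len(inputs)

fields = unpack_fields(bytes.fromhex("00038070077f"), count=4)
values = [restore_ef12(f) for f in fields]
print(values, run_model(values))
```

This corresponds to the first option described for the graphics processor, in which the first image data is reconstructed before it is fed to a model trained on that data type.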
The image processing apparatus provided by this embodiment can execute the image processing method provided by the second embodiment of the present invention, and has the functional modules and beneficial effects corresponding to that method. For technical details not described in this embodiment, refer to the image processing method provided by the second embodiment of the present invention.
Example five
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the respective methods and processes described above, for example, an image processing method.
In some embodiments, the image processing method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as a storage unit. In some embodiments, part or all of the computer program may be loaded and/or installed onto the heterogeneous hardware accelerator via the ROM and/or the communication unit. One or more of the steps of the image processing method described above may be performed when the computer program is loaded into RAM and executed by a processor. Alternatively, in other embodiments, the processor may be configured to perform the image processing method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a heterogeneous hardware accelerator having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or a trackball) through which a user can provide input to the heterogeneous hardware accelerator. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. An image processing method, applied to a central processing unit, comprising:
normalizing the initial image data to obtain processed first image data; wherein the data bit number of the first image data is greater than the data bit number of the initial image data;
extracting first valid data of a designated digit in the first image data according to the data type of the initial image data so as to acquire second image data according to the first valid data; wherein the number of data bits of the second image data is less than the number of data bits of the first image data;
and sending the second image data to a graphics processor, so that the graphics processor executes image processing operation according to the second image data.
2. The method of claim 1, wherein extracting first valid data of a specified digit in the first image data based on the data type of the initial image data to obtain second image data based on the first valid data, comprises:
determining whether the data type of the first image data is a signed type;
if it is a signed type, extracting second valid data of specified digits in the first image data, so as to acquire third image data according to the second valid data; wherein the number of data bits of the third image data is smaller than the number of data bits of the second image data.
3. The method according to claim 2, further comprising, after determining whether the data type of the first image data is a signed type:
if it is not a signed type, extracting third valid data of specified digits in the first image data, so as to obtain fourth image data according to the third valid data; wherein the fourth image data has a number of data bits less than the number of data bits of the second image data, and the fourth image data is different from the third image data.
4. The method according to claim 1, wherein if the data type of the initial image data is eight-bit integer data and the data type of the first image data is floating point data, the extracting the first valid data of the specified digit in the first image data according to the data type of the initial image data includes:
extracting sign bit data, four consecutive exponent bit data and seven consecutive mantissa bit data adjacent to the four consecutive exponent bit data in the first image data;
if the data type of the initial image data is ten-bit integer data and the data type of the first image data is floating point data, extracting the first valid data of the specified digit in the first image data according to the data type of the initial image data includes:
extracting sign bit data, four consecutive exponent bit data, and nine consecutive mantissa bit data adjacent to the four consecutive exponent bit data in the first image data.
5. An image processing method, applied to a graphics processor, comprising:
acquiring second image data sent by a central processing unit; wherein the second image data is generated based on first valid data of a specified digit in first image data generated based on normalization processing of initial image data, the first image data having a data bit number greater than that of the initial image data and greater than that of the second image data;
inputting the second image data into a first neural network model to perform an image processing operation; wherein the first neural network model matches a data type of the first image data.
6. The method of claim 5, wherein the inputting the second image data into the first neural network model to perform an image processing operation comprises:
constructing the first image data according to the second image data, and inputting the first image data into the first neural network model to execute image processing operation;
or acquiring a second neural network model according to the first neural network model, and inputting the second image data to the second neural network model to execute image processing operation; wherein the second neural network model matches a data type of the second image data.
7. An image processing apparatus, applied to a central processing unit, comprising:
the normalization processing execution module is used for carrying out normalization processing on the initial image data to obtain processed first image data; wherein the data bit number of the first image data is greater than the data bit number of the initial image data;
the valid data extraction module is used for extracting first valid data of specified digits in the first image data according to the data type of the initial image data, so as to acquire second image data according to the first valid data; wherein the number of data bits of the second image data is less than the number of data bits of the first image data;
and the image data transmitting module is used for transmitting the second image data to the graphic processor so that the graphic processor can execute image processing operation according to the second image data.
8. An image processing apparatus, applied to a graphics processor, comprising:
the image data acquisition module is used for acquiring second image data sent by the central processing unit; wherein the second image data is generated based on first valid data of a specified digit in first image data generated based on normalization processing of initial image data, the first image data having a data bit number greater than that of the initial image data and greater than that of the second image data;
an image processing execution module for inputting the second image data into a first neural network model to execute an image processing operation; wherein the first neural network model matches a data type of the first image data.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the image processing method of any one of claims 1-4 or to perform the image processing method of claim 5 or 6.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions for causing a processor to implement the image processing method of any one of claims 1 to 4 or the image processing method of claim 5 or 6 when executed.
CN202310348700.9A 2023-04-04 2023-04-04 Image processing method, device, electronic equipment and storage medium Active CN116129249B (en)

Publications (2)

Publication Number Publication Date
CN116129249A CN116129249A (en) 2023-05-16
CN116129249B true CN116129249B (en) 2023-07-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Room a-522, 188 Yesheng Road, Lingang New District, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai, 201306

Patentee after: Shanghai Suiyuan Technology Co.,Ltd.

Country or region after: China

Address before: Room a-522, 188 Yesheng Road, Lingang New District, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai, 201306

Patentee before: SHANGHAI ENFLAME TECHNOLOGY Co.,Ltd.

Country or region before: China
