WO2023131252A1 - Image size adjustment structure, adjustment method, and image scaling method and device based on a data stream architecture - Google Patents

Image size adjustment structure, adjustment method, and image scaling method and device based on a data stream architecture

Info

Publication number
WO2023131252A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
row
column
data
Prior art date
Application number
PCT/CN2023/070762
Other languages
English (en)
French (fr)
Inventor
梁监天
蔡权雄
牛昕宇
Original Assignee
深圳鲲云信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202210007701.2A (CN114022366B)
Priority claimed from CN202210057376.0A (CN114092336B)
Application filed by 深圳鲲云信息科技有限公司
Priority to US18/301,985 (US20230252600A1)
Publication of WO2023131252A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation

Definitions

  • Embodiments of the present invention relate to the technical field of data processing, and in particular to an image size adjustment structure, adjustment method, and image scaling method and device based on a data stream architecture.
  • Embodiments of the present invention provide an image size adjustment structure, adjustment method, and image scaling method and device based on a data stream architecture, so as to relieve the calculation pressure of the CPU and realize extremely high-efficiency calculations.
  • an embodiment of the present invention provides an image size adjustment structure based on a data flow architecture, the structure including: a first multiplication unit, a second multiplication unit, a first data register unit, a second data register unit, a first addition unit and a second addition unit; wherein,
  • the first input terminal of the first multiplication unit is used to receive the image data to be calculated, and its second input terminal is used to receive the first image position coefficient;
  • the output terminal of the first multiplication unit is connected to both the input terminal of the first data register unit and the third input terminal of the first addition unit;
  • the output terminal of the first data register unit is connected to the fourth input terminal of the first addition unit;
  • the output terminal of the first addition unit is connected to the fifth input terminal of the second multiplication unit;
  • the sixth input terminal of the second multiplication unit is used to receive the second image position coefficient;
  • the output terminal of the second multiplication unit is connected to both the input terminal of the second data register unit and the seventh input terminal of the second addition unit;
  • the output terminal of the second data register unit is connected to the eighth input terminal of the second addition unit, and the output terminal of the second addition unit is used to output the calculation result of the image data;
  • the first data register unit is used to store the previous operation result of the first multiplication unit in the serial operation, and the second data register unit is used to store the previous operation result of the second multiplication unit in the serial operation.
  • between the first addition unit and the second multiplication unit, the structure further includes a first data selection unit and a first data distribution unit, and between the first data selection unit and the first data distribution unit it further includes a third data register unit and a third addition unit; wherein,
  • the output terminal of the first addition unit is connected to the input terminal of the first data selection unit; the output terminal of the first data distribution unit is connected to the input terminal of the second multiplication unit; the first output terminal of the first data selection unit is directly connected to the ninth input terminal of the first data distribution unit; the tenth input terminal of the first data distribution unit is connected to the output terminal of the third addition unit; the second output terminal of the first data selection unit is connected to both the input terminal of the third data register unit and the eleventh input terminal of the third addition unit; and the output terminal of the third data register unit is connected to the twelfth input terminal of the third addition unit;
  • after the second addition unit, the structure further includes a second data selection unit and a second data distribution unit, and between the second data selection unit and the second data distribution unit it further includes a fourth data register unit and a fourth addition unit; wherein,
  • the output terminal of the second addition unit is connected to the input terminal of the second data selection unit; the output terminal of the second data distribution unit is used to output the calculation result of the image data; the third output terminal of the second data selection unit is directly connected to the thirteenth input terminal of the second data distribution unit; the fourteenth input terminal of the second data distribution unit is connected to the output terminal of the fourth addition unit; the fourth output terminal of the second data selection unit is connected to both the input terminal of the fourth data register unit and the fifteenth input terminal of the fourth addition unit; and the output terminal of the fourth data register unit is connected to the sixteenth input terminal of the fourth addition unit.
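  • when the first data selection unit and the first data distribution unit select the data flow direction from the first output terminal to the ninth input terminal, and the second data selection unit and the second data distribution unit select the data flow direction from the third output terminal to the thirteenth input terminal, the image size adjustment structure adopts a bilinear interpolation algorithm.
  • when the first data selection unit and the first data distribution unit select the data flow direction from the second output terminal to the tenth input terminal, and the second data selection unit and the second data distribution unit select the data flow direction from the fourth output terminal to the fourteenth input terminal, the image size adjustment structure adopts a cubic interpolation method.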
  • an embodiment of the present invention also provides an image resizing method, applied to the image resizing structure based on the data flow architecture provided by any embodiment of the present invention, including:
  • acquiring the image data to be resized and the required image position coefficients, where the image position coefficients include a first image position coefficient in the row direction and a second image position coefficient in the column direction;
  • controlling the data flow in the image size adjustment structure according to the selected image size adjustment algorithm, and obtaining the scaled calculation result through the image size adjustment structure.
  • controlling the data flow in the image resizing structure according to the selected image resizing algorithm, and obtaining the scaled calculation result through the image resizing structure, includes:
  • if the image size adjustment algorithm is a bilinear interpolation algorithm, the following formula is used for calculation: $D_0 = u_1 (v_1 Q_0 + v_0 Q_1) + u_0 (v_1 Q_2 + v_0 Q_3)$,
  • where $D_0$ represents the pixel data of the target image, $v_0$ and $v_1$ represent the first image position coefficients, $u_0$ and $u_1$ represent the second image position coefficients, and $Q_0$, $Q_1$, $Q_2$ and $Q_3$ represent 4 pixel data of the source image.
  • controlling the data flow in the image resizing structure according to the selected image resizing algorithm, and obtaining the scaled calculation result through the image resizing structure, includes:
  • $D_0$ represents the pixel data of the target image
  • $v_0$, $v_1$, $v_2$ and $v_3$ represent the first image position coefficients
  • $u_0$, $u_1$, $u_2$ and $u_3$ represent the second image position coefficients
  • $Q_0$, $Q_1$, ..., $Q_{15}$ represent the 16 pixel data of the source image.
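  • acquiring the required image position coefficients includes: calculating the image position coefficients in a front-end module according to the preset image scaling ratio and the image size adjustment algorithm.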
  • an embodiment of the present invention provides an image scaling method based on a bilinear interpolation algorithm, in which:
  • for each target pixel in the target image, the corresponding target row access address, target column access address, target row interpolation coefficient and target column interpolation coefficient are determined from all the row access addresses, column access addresses, row interpolation coefficients and column interpolation coefficients according to the pixel coordinates of the target pixel; the source pixels required by the corresponding algorithm are determined in the source image according to the target row access address and the target column access address; and the pixel value of the target pixel is then determined, based on the bilinear interpolation algorithm, according to the pixel values of the source pixels, the target row interpolation coefficient and the target column interpolation coefficient.
  • before the corresponding target row access address, target column access address, target row interpolation coefficient and target column interpolation coefficient are determined from all the row access addresses, column access addresses, row interpolation coefficients and column interpolation coefficients, the method further includes: storing the integer part and the fractional part of each accumulation result of the row-direction deformation coefficient in a first lookup table in sequence, and storing the integer part and the fractional part of each accumulation result of the column-direction deformation coefficient in a second lookup table in sequence, where the index values of the first lookup table and the second lookup table correspond to the row coordinates and column coordinates of the target pixels respectively;
  • determining the corresponding target row access address, target column access address, target row interpolation coefficient and target column interpolation coefficient according to the pixel coordinates of the target pixel then includes: looking them up in the first lookup table and the second lookup table according to the row coordinate and column coordinate of the target pixel;
  • the second size includes a width size and a height size; before the accumulation results are stored in the first lookup table and the second lookup table, the method further includes setting up the lookup table space, where the lookup table space of the first lookup table has a width of 2 and a depth equal to the width size, and the lookup table space of the second lookup table has a width of 2 and a depth equal to the height size.
  • the rounding operation is rounding down, and determining the source pixels required by the corresponding algorithm in the source image according to the target row access address and the target column access address includes:
  • determining, based on the bilinear interpolation algorithm, the pixel value of the target pixel according to the pixel values of the source pixels, the target row interpolation coefficient and the target column interpolation coefficient includes:
  • $(X_{dst}, Y_{dst})$ represents the pixel coordinates of the target pixel
  • $u_{x1}$ represents the target row interpolation coefficient
  • $v_{x1}$ represents the target column interpolation coefficient
  • $Q_{x11}$ represents the pixel value of the first source pixel
  • $Q_{x12}$ represents the pixel value of the second source pixel
  • $Q_{x21}$ represents the pixel value of the third source pixel
  • $Q_{x22}$ represents the pixel value of the fourth source pixel.
  • acquiring the first size of the source image and the second size of the target image, and determining the row-direction deformation coefficient and the column-direction deformation coefficient according to the first size and the second size, includes:
  • scale_x represents the deformation coefficient in the row direction
  • scale_y represents the deformation coefficient in the column direction
  • src_rows represents the width size in the first size
  • dst_rows represents the width size in the second size
  • src_cols represents the height size in the first size
  • dst_cols represents the height size in the second size.
  • the fixed-point accumulator includes a first accumulator and a second accumulator, and inputting the row-direction deformation coefficient and the column-direction deformation coefficient into the fixed-point accumulator for cumulative calculation includes: initializing the first input terminal and the second input terminal of the first accumulator, and the third input terminal and the fourth input terminal of the second accumulator;
  • the output terminal of the first accumulator is connected to the first input terminal, and the row-direction deformation coefficient is input through the second input terminal; the output terminal of the second accumulator is connected to the third input terminal, and the column-direction deformation coefficient is input through the fourth input terminal;
  • the first accumulator is controlled to perform a first number of accumulations and the second accumulator is controlled to perform a second number of accumulations, where the first number is the number of pixels of the target image in the row direction and the second number is the number of pixels of the target image in the column direction.
  • an embodiment of the present invention provides an image scaling device based on a bilinear interpolation algorithm, which is applied to an image size adjustment method, including:
  • a deformation coefficient determination module, used to obtain the first size of the source image and the second size of the target image, and to determine a row-direction deformation coefficient and a column-direction deformation coefficient according to the first size and the second size;
  • an accumulative calculation module, configured to input the row-direction deformation coefficient and the column-direction deformation coefficient into a fixed-point accumulator for cumulative calculation;
  • an interpolation parameter determination module, used to perform a rounding operation on each accumulation result to obtain its integer part and fractional part, to use the integer part of each accumulation result of the row-direction deformation coefficient as a row access address and the fractional part as a row interpolation coefficient, and to use the integer part of each accumulation result of the column-direction deformation coefficient as a column access address and the fractional part as a column interpolation coefficient;
  • a target pixel determination module, used, for each target pixel in the target image, to determine the corresponding target row access address, target column access address, target row interpolation coefficient and target column interpolation coefficient from all the row access addresses, column access addresses, row interpolation coefficients and column interpolation coefficients according to the pixel coordinates of the target pixel, to determine the source pixels required by the corresponding algorithm in the source image according to the target row access address and the target column access address, and then to determine the pixel value of the target pixel, based on the bilinear interpolation algorithm, according to the pixel values of the source pixels, the target row interpolation coefficient and the target column interpolation coefficient.
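The module chain described above (deformation coefficient, accumulation, integer/fraction split, per-pixel interpolation) can be sketched in software as follows. This is only an illustrative model, not the patented hardware: the names `build_table` and `resize_bilinear` are hypothetical, the deformation coefficient is assumed to be the source size divided by the target size, the accumulator is assumed to start at 0, and border pixels are clamped, none of which is stated in the text.

```python
import math

def build_table(scale, count):
    """Accumulate `scale` `count` times and split each result into
    (access address, interpolation coefficient)."""
    table, acc = [], 0.0                      # assumed initial accumulator value
    for _ in range(count):
        addr = math.floor(acc)                # integer part -> access address
        table.append((addr, acc - addr))      # fractional part -> coefficient
        acc += scale
    return table

def resize_bilinear(src, dst_w, dst_h):
    """Scale a 2-D list of pixel values to dst_w x dst_h."""
    src_h, src_w = len(src), len(src[0])
    x_table = build_table(src_w / dst_w, dst_w)   # assumed scale = source / target
    y_table = build_table(src_h / dst_h, dst_h)
    dst = []
    for y in range(dst_h):
        r, fy = y_table[y]
        r1 = min(r + 1, src_h - 1)            # clamp at the border (assumption)
        out_row = []
        for x in range(dst_w):
            c, fx = x_table[x]
            c1 = min(c + 1, src_w - 1)
            top = (1 - fx) * src[r][c] + fx * src[r][c1]
            bottom = (1 - fx) * src[r1][c] + fx * src[r1][c1]
            out_row.append((1 - fy) * top + fy * bottom)
        dst.append(out_row)
    return dst
```

Because both tables depend only on the image sizes, they are built once per resize and then indexed by the target pixel's coordinates, which mirrors the lookup-table arrangement described in the text.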
  • An embodiment of the present invention provides an image size adjustment structure based on a data stream architecture, including a first multiplication unit, a second multiplication unit, a first data register unit, a second data register unit, a first addition unit, and a second addition unit.
  • The input and output ports of each unit are connected according to the set data flow direction. By constructing a dedicated unit for adjusting the image size in the data flow architecture and controlling the image data to be calculated to flow through the different calculation units in turn, the structure avoids calling the CPU for instruction-set-based calculation, achieves fast calculation of image data, and relieves the calculation pressure on the CPU, thereby enabling rapid scaling of video and image streams and effectively alleviating the bottleneck of inefficient pre-processing and post-processing on AI chips.
  • FIG. 1 is a schematic structural diagram of an image resizing structure based on a data stream architecture provided by Embodiment 1 of the present invention
  • FIG. 2 is a schematic structural diagram of another image resizing structure based on a data stream architecture provided by Embodiment 1 of the present invention
  • FIG. 3 is a flow chart of an image size adjustment method provided in Embodiment 2 of the present invention.
  • FIG. 4 is a flowchart of an image scaling method based on a bilinear interpolation algorithm provided by Embodiment 3 of the present invention.
  • FIG. 5 is a schematic structural diagram of an image scaling device based on a bilinear interpolation algorithm provided in Embodiment 4 of the present invention.
  • FIG. 6 is a schematic structural diagram of a computer device provided by Embodiment 5 of the present invention.
  • The terms “first”, “second”, etc. may be used herein to describe various directions, actions, steps or elements, etc., but these directions, actions, steps or elements are not limited by these terms. These terms are only used to distinguish a first direction, action, step or element from another direction, action, step or element.
  • For example, a first input could be termed a second input, and, similarly, a second input could be termed a first input, without departing from the scope of the present application. Both the first input and the second input are inputs, but they are not the same input.
  • the terms “first”, “second”, etc. should not be understood as indicating or implying relative importance or implying the number of technical features indicated. Thus, a feature defined as “first” or “second” may explicitly or implicitly include one or more of these features.
  • “plurality” means at least two, such as two, three, etc., unless otherwise specifically defined.
  • FIG. 1 is a schematic structural diagram of an image size adjustment structure based on a data stream architecture provided by Embodiment 1 of the present invention.
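  • the image size adjustment structure includes: a first multiplication unit 11, a second multiplication unit 12, a first data register unit 13, a second data register unit 14, a first addition unit 15 and a second addition unit 16; wherein the first input terminal 111 of the first multiplication unit 11 is used to receive the image data to be calculated, and the second input terminal 112 is used to receive the first image position coefficient;
  • the output terminal 113 of the first multiplication unit 11 is connected to both the input terminal 131 of the first data register unit 13 and the third input terminal 151 of the first addition unit 15, and the output terminal 132 of the first data register unit 13 is connected to the fourth input terminal 152 of the first addition unit 15;
  • the output terminal 153 of the first addition unit 15 is connected to the fifth input terminal 121 of the second multiplication unit 12, and the sixth input terminal 122 of the second multiplication unit 12 is used to receive the second image position coefficient;
  • the output terminal of the second multiplication unit 12 is connected to both the input terminal of the second data register unit 14 and the seventh input terminal of the second addition unit 16, the output terminal of the second data register unit 14 is connected to the eighth input terminal of the second addition unit 16, and the output terminal 163 of the second addition unit 16 is used to output the calculation result of the image data.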
  • the image data to be calculated and the image position coefficients can be input through the first input terminal 111 and the second input terminal 112 of the first multiplication unit 11 and the sixth input terminal 122 of the second multiplication unit 12 according to the agreed timing requirements.
  • the structure can use a bilinear interpolation algorithm to calculate the input image data to obtain a scaled calculation result.
  • the first image position coefficient and the second image position coefficient corresponding to the row and column directions may be calculated in advance by the front-end module according to the preset image scaling ratio.
  • the bilinear interpolation algorithm can be calculated using the following formula: $D_0 = u_1 (v_1 Q_0 + v_0 Q_1) + u_0 (v_1 Q_2 + v_0 Q_3)$
  • $D_0$ represents the calculation result, that is, the pixel data of the target image
  • $v_0$ and $v_1$ represent the first image position coefficients
  • $u_0$ and $u_1$ represent the second image position coefficients
  • $Q_0$, $Q_1$, $Q_2$ and $Q_3$ represent the corresponding 4 pixel data required in the source image.
  • $v_1 \times Q_0$, $v_0 \times Q_1$, $v_1 \times Q_2$ and $v_0 \times Q_3$ can be calculated by the first multiplication unit 11; $v_1 Q_0 + v_0 Q_1$ and $v_1 Q_2 + v_0 Q_3$ are then obtained by the first addition unit 15; the second multiplication unit 12 yields $u_1 (v_1 Q_0 + v_0 Q_1)$ and $u_0 (v_1 Q_2 + v_0 Q_3)$; finally $D_0$ is calculated and output by the second addition unit 16. Each addition is made possible by storing the previous operation result of the preceding unit in the first data register unit 13 and the second data register unit 14 respectively. By controlling the image data to be calculated to enter and flow through each calculation unit in turn, the calculation of the entire image can be completed.
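The serial flow through the multiply, register and add units can be mimicked in a few lines of Python. This is a behavioural sketch only: `bilinear_datapath` is a hypothetical name, the registers are modelled as plain variables, and no timing or fixed-point width from the actual hardware is represented.

```python
def bilinear_datapath(q, v, u):
    """Serially evaluate D0 = u1*(v1*Q0 + v0*Q1) + u0*(v1*Q2 + v0*Q3).

    q: (Q0, Q1, Q2, Q3) source pixels, v: (v0, v1), u: (u0, u1).
    """
    v_seq = [v[1], v[0], v[1], v[0]]   # coefficient paired with Q0..Q3
    u_seq = [u[1], u[0]]               # coefficient paired with each row sum

    reg1 = None    # first data register: previous product of multiplier 1
    reg2 = None    # second data register: previous product of multiplier 2
    row = 0
    result = None
    for i, pixel in enumerate(q):
        prod1 = pixel * v_seq[i]           # first multiplication unit
        if i % 2 == 0:
            reg1 = prod1                   # hold until the partner product arrives
            continue
        row_sum = reg1 + prod1             # first addition unit
        prod2 = row_sum * u_seq[row]       # second multiplication unit
        row += 1
        if reg2 is None:
            reg2 = prod2                   # hold the first row's weighted sum
        else:
            result = reg2 + prod2          # second addition unit outputs D0
    return result
```

For example, `bilinear_datapath((10, 20, 30, 40), (0.25, 0.75), (0.5, 0.5))` evaluates the two row sums 12.5 and 32.5 and combines them into 22.5.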
  • between the first addition unit 15 and the second multiplication unit 12, the structure further includes a first data selection unit 21 and a first data distribution unit 22, with a third data register unit 23 and a third addition unit 24 between them; wherein the output terminal 153 of the first addition unit 15 is connected to the input terminal 211 of the first data selection unit 21; the output terminal 223 of the first data distribution unit 22 is connected to the input terminal 121 of the second multiplication unit 12; the first output terminal 212 of the first data selection unit 21 is directly connected to the ninth input terminal 221 of the first data distribution unit 22; the tenth input terminal 222 of the first data distribution unit 22 is connected to the output terminal 243 of the third addition unit 24; the second output terminal 213 of the first data selection unit 21 is connected to both the input terminal 231 of the third data register unit 23 and the eleventh input terminal 241 of the third addition unit 24; and the output terminal 232 of the third data register unit 23 is connected to the twelfth input terminal 242 of the third addition unit 24;
  • after the second addition unit 16, the structure further includes a second data selection unit 25 and a second data distribution unit 26, with a fourth data register unit 27 and a fourth addition unit 28 between them; wherein the output terminal 163 of the second addition unit 16 is connected to the input terminal 251 of the second data selection unit 25; the output terminal 263 of the second data distribution unit 26 is used to output the calculation result of the image data; the third output terminal 252 of the second data selection unit 25 is directly connected to the thirteenth input terminal 261 of the second data distribution unit 26; the fourteenth input terminal 262 of the second data distribution unit 26 is connected to the output terminal 283 of the fourth addition unit 28; the fourth output terminal 253 of the second data selection unit 25 is connected to both the input terminal 271 of the fourth data register unit 27 and the fifteenth input terminal of the fourth addition unit 28; and the output terminal of the fourth data register unit 27 is connected to the sixteenth input terminal of the fourth addition unit 28.
  • when the first data selection unit 21 and the first data distribution unit 22 select the data flow direction from the first output terminal 212 to the ninth input terminal 221, and the second data selection unit 25 and the second data distribution unit 26 select the data flow direction from the third output terminal 252 to the thirteenth input terminal 261, the image size adjustment structure adopts a bilinear interpolation algorithm.
  • when the first data selection unit 21 and the first data distribution unit 22 select the data flow direction from the second output terminal 213 to the tenth input terminal 222, and the second data selection unit 25 and the second data distribution unit 26 select the data flow direction from the fourth output terminal 253 to the fourteenth input terminal 262, the image size adjustment structure adopts a cubic interpolation method.
  • each pair of data selection unit and data distribution unit can cooperate to bypass the register and addition units between them, so that the image size adjustment structure provided by this embodiment can process the input image data with either a bilinear interpolation algorithm or a cubic interpolation method and obtain the corresponding scaled calculation result.
  • with both bypass paths selected (first output terminal 212 to ninth input terminal 221, and third output terminal 252 to thirteenth input terminal 261), the structure calculates the input image data with the bilinear interpolation algorithm. The specific calculation process is as described above and will not be repeated here.
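The role of one data selection / data distribution pair can be illustrated with a small behavioural model. The function name `select_stage` and the boolean flag are hypothetical; the sketch only shows that the bypass path forwards values unchanged while the other path sums each value with the one held in the intermediate register.

```python
def select_stage(values, use_adder):
    """Model one selection/distribution pair with its optional register + adder.

    use_adder=False: bypass path, values pass straight through (bilinear mode).
    use_adder=True:  adder path, consecutive pairs are summed (cubic mode).
    """
    if not use_adder:
        return list(values)
    out, reg = [], None
    for x in values:
        if reg is None:
            reg = x                    # data register holds the previous value
        else:
            out.append(reg + x)        # adder combines it with the current value
            reg = None
    return out
```

With `use_adder=True`, four incoming partial sums such as `[1, 2, 3, 4]` leave the stage as `[3, 7]`; with `use_adder=False` they leave unchanged.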
  • the structure can use the cubic interpolation method to calculate the input image data, specifically the following formula can be used:
  • $D_0$ represents the calculation result, that is, the pixel data of the target image
  • $v_0$, $v_1$, $v_2$ and $v_3$ represent the first image position coefficients
  • $u_0$, $u_1$, $u_2$ and $u_3$ represent the second image position coefficients
  • $Q_0$, $Q_1$, ..., $Q_{15}$ represent the corresponding 16 pixel data required in the source image.
  • $v_0 \times Q_0$, $v_1 \times Q_1$, $v_2 \times Q_2$, $v_3 \times Q_3$, $v_0 \times Q_4$, $v_1 \times Q_5$, $v_2 \times Q_6$, $v_3 \times Q_7$, $v_0 \times Q_8$, $v_1 \times Q_9$, $v_2 \times Q_{10}$, $v_3 \times Q_{11}$, $v_0 \times Q_{12}$, $v_1 \times Q_{13}$, $v_2 \times Q_{14}$ and $v_3 \times Q_{15}$ can be serially calculated by the first multiplication unit 11; the first addition unit 15 then adds adjacent products in series to obtain $v_0 Q_0 + v_1 Q_1$, $v_2 Q_2 + v_3 Q_3$, $v_0 Q_4 + v_1 Q_5$, $v_2 Q_6 + v_3 Q_7$, and likewise for the remaining adjacent products.
  • each addition is realized by storing the previous operation result of the preceding unit in the first data register unit 13, the second data register unit 14, the third data register unit 23 and the fourth data register unit 27 respectively, wherein the third data register unit 23 can be used to store the previous operation result of the first addition unit 15 in the serial calculation, and the fourth data register unit 27 can be used to store the previous operation result of the second addition unit 16 in the serial calculation.
  • then, by controlling the image data to be calculated to enter and flow through each calculation unit in sequence, the calculation of the entire image can be completed.
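A behavioural sketch of the cubic mode follows, with every multiply and add stage written out in order. It assumes the 16 source pixels arrive row by row, that $v_j$ weights the $j$-th pixel within each group of four (as the product list above indicates), and that $u_i$ weights the $i$-th group sum; the pairing of $u_i$ with groups is an assumption, since the text here does not spell it out. The names `pair_sums` and `cubic_datapath` are hypothetical.

```python
def pair_sums(values):
    """Add adjacent values, as a register + adder pair does in serial operation."""
    return [values[i] + values[i + 1] for i in range(0, len(values), 2)]

def cubic_datapath(q, v, u):
    """q: 16 source pixels, v: 4 first coefficients, u: 4 second coefficients."""
    prod1 = [q[i] * v[i % 4] for i in range(16)]   # first multiplication unit
    half_groups = pair_sums(prod1)                 # first addition unit: 8 sums
    groups = pair_sums(half_groups)                # third addition unit: 4 sums
    prod2 = [groups[i] * u[i] for i in range(4)]   # second multiplication unit
                                                   # (assumed pairing u_i <-> group i)
    halves = pair_sums(prod2)                      # second addition unit: 2 sums
    return pair_sums(halves)[0]                    # fourth addition unit: D0
```

Setting `v = [1, 0, 0, 0]` and `u = [0, 1, 0, 0]` picks out the single pixel `q[4]`, which is a quick way to check the index pairing in the sketch.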
  • the image size adjustment structure based on the data flow architecture includes a first multiplication unit, a second multiplication unit, a first data register unit, a second data register unit, a first addition unit and a second addition unit.
  • the input and output ports of each unit are connected according to the set data flow direction. By constructing a dedicated unit for adjusting the image size in the data flow architecture and controlling the image data to be calculated to flow through the different calculation units in sequence, the structure avoids calling the CPU for instruction-set-based calculation, achieves fast calculation of image data, and relieves the calculation pressure on the CPU, thereby enabling rapid scaling of video and image streams and effectively alleviating the bottleneck of inefficient pre-processing and post-processing on AI chips.
  • FIG. 3 is a flowchart of an image size adjustment method provided by Embodiment 2 of the present invention. This embodiment is applicable to pre-processing and post-processing calculations on video and image stream data. The method can be applied to the image size adjustment structure based on the data stream architecture provided by any embodiment of the present invention, and has the corresponding functions and beneficial effects of that structure. As shown in FIG. 3, it specifically includes the following steps:
  • controlling the data flow in the image resizing structure according to the selected image resizing algorithm, and obtaining the scaled calculation result through the image resizing structure, includes:
  • if the image size adjustment algorithm is a bilinear interpolation algorithm, the following formula is used for calculation: $D_0 = u_1 (v_1 Q_0 + v_0 Q_1) + u_0 (v_1 Q_2 + v_0 Q_3)$,
  • where $D_0$ represents the pixel data of the target image, $v_0$ and $v_1$ represent the first image position coefficients, $u_0$ and $u_1$ represent the second image position coefficients, and $Q_0$, $Q_1$, $Q_2$ and $Q_3$ represent 4 pixel data of the source image.
  • controlling the data flow in the image resizing structure according to the selected image resizing algorithm, and obtaining the scaled calculation result through the image resizing structure, includes:
  • $D_0$ represents the pixel data of the target image
  • $v_0$, $v_1$, $v_2$ and $v_3$ represent the first image position coefficients
  • $u_0$, $u_1$, $u_2$ and $u_3$ represent the second image position coefficients
  • $Q_0$, $Q_1$, ..., $Q_{15}$ represent the 16 pixel data of the source image.
  • acquiring the required image position coefficients includes: the front-end module calculating the image position coefficients according to the preset image scaling ratio and the image size adjustment algorithm.
  • specifically, the front-end module of the structure can calculate the required image position coefficients according to the preset image scaling ratio and the selected image size adjustment algorithm; the acquired image data to be resized can then be input to the corresponding port of the image size adjustment structure according to the agreed timing requirements. For example, the image data can be input to the first multiplication unit in the structure.
  • the data flow in the structure is then controlled according to the selected resizing algorithm, specifically by controlling which input and output ports are enabled on the first data selection unit, the first data distribution unit, the second data selection unit and the second data distribution unit, so that the structure performs the calculation.
  • the calculation process may adopt a bilinear interpolation algorithm or a cubic interpolation method; for the specific calculation process, refer to the description above, which will not be repeated here.
  • FIG. 4 is a flowchart of an image scaling method based on a bilinear interpolation algorithm according to Embodiment 3 of the present invention.
  • This embodiment is applicable to pre-processing calculations on video and image stream data.
  • This method can be executed by the image scaling device based on the bilinear interpolation algorithm provided by the embodiment of the present invention.
  • the device can be implemented by hardware and/or software, and can generally be integrated into computer equipment. As shown in FIG. 4, it specifically includes the following steps:
  • the source image is the original image to be scaled, and the target image is the desired image after scaling.
  • once the source image is obtained, its first size can be obtained; when the source image needs to be scaled, the second size of the target image is usually known, for example a size that matches the subsequent module.
  • acquiring the first size of the source image and the second size of the target image, and determining the row-direction deformation coefficient and the column-direction deformation coefficient according to the first size and the second size, includes:
  • scale_x represents the deformation coefficient in the row direction
  • scale_y represents the deformation coefficient in the column direction
  • src_rows represents the width size in the first size
  • dst_rows represents the width size in the second size
  • src_cols represents the height size in the first size
  • dst_cols represents the height size in the second size.
  • both the first size and the second size may include a width size (that is, the size of the image in the row direction) and a height size (that is, the size of the image in the column direction), after obtaining the first size and the second size , the deformation coefficient in the row direction can be calculated through the proportional relationship between the width dimension of the source image and the width dimension of the target image, and the deformation coefficient in the column direction can be calculated through the proportional relationship between the height dimension of the source image and the height dimension of the target image .
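The formula for the two deformation coefficients did not survive in the text above. The following is a minimal sketch, assuming each coefficient is the source size divided by the target size, so that scale_x = src_rows / dst_rows and scale_y = src_cols / dst_cols. This direction is inferred from the later use of the accumulated coefficient as a source-image address. The function name and the use of exact fractions are illustrative choices, not part of the original.

```python
from fractions import Fraction


def deformation_coefficients(src_rows, src_cols, dst_rows, dst_cols):
    """Return (scale_x, scale_y) as exact source/target size ratios.

    Assumption: coefficient = source size / target size.
    """
    if dst_rows <= 0 or dst_cols <= 0:
        raise ValueError("target size must be positive")
    scale_x = Fraction(src_rows, dst_rows)  # row-direction coefficient
    scale_y = Fraction(src_cols, dst_cols)  # column-direction coefficient
    return scale_x, scale_y
```

Under this assumption, enlarging an image gives coefficients below 1 and shrinking it gives coefficients above 1.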
  • the row-direction deformation coefficient and the column-direction deformation coefficient can be respectively input into a fixed-point accumulator for cumulative calculation, so as to obtain the result of each accumulation.
  • the accumulation calculation process is a clock-level pipeline calculation, which can greatly improve the bilinear interpolation processing speed when applied to the data flow architecture AI chip.
  • the fixed-point accumulator includes a first accumulator and a second accumulator, and inputting the row-direction deformation coefficient and the column-direction deformation coefficient into the fixed-point accumulator for cumulative calculation includes: initializing the first input terminal and the second input terminal of the first accumulator, and the third input terminal and the fourth input terminal of the second accumulator, where the output terminal of the first accumulator is connected to the first input terminal, the row-direction deformation coefficient is input through the second input terminal, the output terminal of the second accumulator is connected to the third input terminal, and the column-direction deformation coefficient is input through the fourth input terminal; and controlling the first accumulator to perform a first number of accumulations and the second accumulator to perform a second number of accumulations, where the first number is the number of pixels of the target image in the row direction and the second number is the number of pixels of the target image in the column direction.
  • the process of accumulating the deformation coefficients in the row direction and the deformation coefficients in the column direction can be performed simultaneously using different fixed-point accumulators to improve calculation efficiency, and of course the same fixed-point accumulator can also be used to perform accumulation successively.
  • the row-direction deformation coefficients can be accumulated through the first accumulator, and the column-direction deformation coefficients can be accumulated through the second accumulator.
  • the first accumulator and the second accumulator can first be initialized; the result of the first accumulation of the first accumulator is then the row-direction deformation coefficient itself. That result is fed back as the input of the second accumulation and added to the row-direction deformation coefficient again, so the result of the second accumulation is twice the row-direction deformation coefficient, and so on. Each accumulation result of the first accumulator is therefore a regularly increasing integer multiple of the row-direction deformation coefficient, and similarly each accumulation result of the second accumulator is a regularly increasing integer multiple of the column-direction deformation coefficient.
  • the numbers of accumulations of the first accumulator and the second accumulator can be fixed in advance, that is, the first accumulator performs only the first number of accumulations and the second accumulator performs only the second number of accumulations, so that the amount of calculation is limited to what is actually needed and the storage space required for the calculation results is saved.
  • the first number is the number of pixels of the target image in the row direction, and the second number is the number of pixels of the target image in the column direction, which satisfies the interpolation calculation requirements for all target pixels in the subsequent target image.
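The feedback accumulation described above can be sketched as follows. The fixed-point format (16 fractional bits) and the names `to_fixed` and `accumulate` are assumptions for illustration; the text does not give a bit width.

```python
FRAC_BITS = 16  # assumed fractional width; not specified in the source text


def to_fixed(num, den, frac_bits=FRAC_BITS):
    """Quantise the ratio num/den to a non-negative fixed-point integer."""
    return (num << frac_bits) // den


def accumulate(step_fixed, count):
    """Model an accumulator whose output is fed back to its first input.

    Returns the result of each of `count` accumulations:
    step, 2*step, 3*step, ...
    """
    results = []
    acc = 0  # initialised state
    for _ in range(count):
        acc += step_fixed  # one addition per accumulation, no multiplier
        results.append(acc)
    return results
```

For example, `accumulate(to_fixed(1, 2), 4)` models four accumulations of a coefficient of 0.5. In this sketch `count` plays the role of the first or second number, that is, the pixel count of the target image in the corresponding direction.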
  • the integer part of each accumulation result obtained by accumulating the row-direction deformation coefficient can be used as the row access address (that is, the row coordinate of the pixel in the source image), and the fractional part can be used as the row interpolation coefficient.
  • the integer part of each accumulation result obtained by accumulating the column-direction deformation coefficient can be used as the column access address (that is, the column coordinate of the pixel in the source image), and the fractional part can be used as the column interpolation coefficient.
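Splitting an accumulation result into an access address and an interpolation coefficient can be sketched with a shift and a mask. This assumes a non-negative fixed-point value with 16 fractional bits; both the width and the name `split_fixed` are illustrative.

```python
def split_fixed(value, frac_bits=16):
    """Split a non-negative fixed-point value into (integer part, fraction).

    The integer part serves as the access address; the fraction is the
    interpolation coefficient, still scaled by 2**frac_bits.
    """
    address = value >> frac_bits              # rounding down
    coeff = value & ((1 << frac_bits) - 1)    # low bits hold the fraction
    return address, coeff
```

Dividing the returned coefficient by `2**frac_bits` recovers the fractional value as a real number.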
  • each target pixel in the target image may be processed separately to obtain the pixel value of the source image corresponding to each target pixel, thereby obtaining a complete target image.
  • for each target pixel, the corresponding target row access address, target column access address, target row interpolation coefficient and target column interpolation coefficient can be selected from the stored row access addresses, row interpolation coefficients, column access addresses and column interpolation coefficients according to its pixel coordinates.
  • the corresponding source pixel can then be found in the source image according to the target row access address and the target column access address, and all the source pixels required by the algorithm can be determined from that source pixel, thereby obtaining the pixel value of each source pixel; then, based on the bilinear interpolation algorithm, the pixel value of the target pixel can be calculated from the pixel values of the source pixels, the target row interpolation coefficient and the target column interpolation coefficient.
  • the rounding operation is rounding down, and determining the source pixels required by the corresponding algorithm in the source image according to the target row access address and the target column access address includes: using the target row access address as the row coordinate and the target column access address as the column coordinate to determine the first source pixel; using the target row access address as the row coordinate and the target column access address plus one as the column coordinate to determine the second source pixel; using the target row access address plus one as the row coordinate and the target column access address as the column coordinate to determine the third source pixel; and using the target row access address plus one as the row coordinate and the target column access address plus one as the column coordinate to determine the fourth source pixel.
  • the first source pixel, which is the one closest to the coordinate origin, is determined first from the target row access address and the target column access address; adding one to the target column access address, to the target row access address, or to both then yields the required second, third and fourth source pixels respectively for the subsequent interpolation calculation.
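The four source pixel coordinates can be sketched as below. Clamping to the last valid coordinate is an assumption added here so that the "plus one" neighbours never fall outside the source image; the text does not say how the boundary is handled, and the function and parameter names are hypothetical.

```python
def source_pixels(row_addr, col_addr, max_row, max_col):
    """Return (row, column) coordinates of the first to fourth source pixels.

    max_row and max_col are the largest valid coordinates in the source
    image. Clamping at the border is an assumption, not from the source.
    """
    r0 = min(row_addr, max_row)
    c0 = min(col_addr, max_col)
    r1 = min(row_addr + 1, max_row)
    c1 = min(col_addr + 1, max_col)
    return [
        (r0, c0),  # first: both addresses unchanged
        (r0, c1),  # second: column address plus one
        (r1, c0),  # third: row address plus one
        (r1, c1),  # fourth: both plus one
    ]
```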
  • determining, based on the bilinear interpolation algorithm, the pixel value of the target pixel according to the pixel values of the source pixels, the target row interpolation coefficient and the target column interpolation coefficient includes:
  • (X_dst, Y_dst) represents the pixel coordinates of the target pixel
  • u_x1 represents the target row interpolation coefficient
  • v_x1 represents the target column interpolation coefficient
  • Q_x11 represents the pixel value of the first source pixel
  • Q_x12 represents the pixel value of the second source pixel
  • Q_x21 represents the pixel value of the third source pixel
  • Q_x22 represents the pixel value of the fourth source pixel.
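The interpolation formula itself is missing from the text above. A hedged reconstruction, assuming the standard bilinear weighting and writing P for the target pixel value (a symbol introduced here) and Q_{x11}, Q_{x12}, Q_{x21}, Q_{x22} for the first to fourth source pixel values, would be:

```latex
P(X_{dst}, Y_{dst}) =
    (1 - u_{x1})(1 - v_{x1})\, Q_{x11}
  + (1 - u_{x1})\, v_{x1}\, Q_{x12}
  + u_{x1}\,(1 - v_{x1})\, Q_{x21}
  + u_{x1}\, v_{x1}\, Q_{x22}
```

The pairing of weights with pixels follows the definitions of the four source pixels: the second pixel differs from the first only in the column address, so it carries the column coefficient, and the third differs only in the row address, so it carries the row coefficient.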
  • the interpolation calculation first needs to obtain the quantities in the above formula: the target row interpolation coefficient (whose value is the distance in the row direction between the mapped coordinate point and the first source pixel), the target column interpolation coefficient (whose value is the distance in the column direction between the mapped coordinate point and the first source pixel) and the pixel value of each source pixel; to determine the pixel values of the source pixels, the pixel coordinates of each source pixel are needed.
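The weighting can be sketched as two one-dimensional blends. This assumes the standard bilinear form, with `u` the row interpolation coefficient and `v` the column interpolation coefficient, both as real numbers in [0, 1]; the function and argument names are illustrative.

```python
def bilinear(q11, q12, q21, q22, u, v):
    """Blend the first to fourth source pixel values.

    q11: (row, col)      q12: (row, col + 1)
    q21: (row + 1, col)  q22: (row + 1, col + 1)
    u weights the row direction, v the column direction.
    """
    near = q11 * (1 - u) + q21 * u  # blend along the row address at col
    far = q12 * (1 - u) + q22 * u   # blend along the row address at col + 1
    return near * (1 - v) + far * v
```

When both coefficients are zero the result is the first source pixel value, which matches the description of the coefficients as distances from that pixel.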
  • the traditional calculation method is as follows:
  • X_dst represents the row coordinate of the target pixel
  • X_src represents the row coordinate after the target pixel is mapped back to the source image
  • scale_x represents the deformation coefficient in the row direction
  • Y_dst represents the column coordinate of the target pixel
  • Y_src represents the column coordinate after the target pixel is mapped back to the source image
  • scale_y represents the deformation coefficient in the column direction, so the 4 surrounding adjacent pixels can be determined according to X_src and Y_src.
  • the integer part of the result obtained is the coordinate of the first source pixel, and the fractional part gives the row interpolation coefficient and the column interpolation coefficient.
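The formulas of the traditional mapping did not survive in the text above. Judging from the symbol definitions, they are presumably a plain multiplication of the target coordinate by the deformation coefficient; this is a reconstruction, not a quotation. The second line restates why accumulation can replace that multiplication: the k-th accumulation result (written acc_k, a symbol introduced here) equals k times the coefficient.

```latex
X_{src} = X_{dst} \cdot scale_x, \qquad Y_{src} = Y_{dst} \cdot scale_y

acc_k = \sum_{i=1}^{k} scale_x = k \cdot scale_x
```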
  • the target row access address, target column access address, target row interpolation coefficient and target column interpolation coefficient in the coefficient it also includes: sequentially storing the integer part and the fractional part of each accumulation result of the deformation coefficient in the row direction Into the first lookup table, the integer part and fractional part of each accumulation result of the deformation coefficient in the column direction are sequentially stored in the second lookup table, and the index values of the first lookup table and the second lookup table are respectively Corresponding to the row coordinates and column coordinates of each of the target pixel points; correspondingly, according to the pixel coordinates of the target pixel points, all the row fetch addresses, the column fetch addresses, and the row interpolation coefficients and determining the corresponding target row access address, target column access address, target row interpolation coefficient, and target column interpol
  • the integer part and the fractional part of each accumulation result of the row-direction deformation coefficient can be stored in advance in the first lookup table in sequence, and the integer part and the fractional part of each accumulation result of the column-direction deformation coefficient can be stored in the second lookup table in sequence, where the index values of the first lookup table and the second lookup table can be sequentially increasing integers starting from 0, that is, 0, 1, 2, ...
  • the row coordinate and the column coordinate of the target pixel are used as index values to look up the required access address and interpolation coefficient in the first lookup table and the second lookup table respectively. With the method provided in this embodiment, each time the image scaling ratio is changed, the content of the lookup tables needs to be calculated only once before the interpolation calculation, and can then be looked up and used directly.
  • the second size includes a width size and a height size; before the integer part and fractional part of each accumulation result of the row-direction deformation coefficient are stored in the first lookup table in sequence and those of the column-direction deformation coefficient are stored in the second lookup table in sequence, the method further includes: allocating lookup table space for the first lookup table and the second lookup table respectively, where the lookup table space of the first lookup table has a width of 2 and a depth equal to the width size, and the lookup table space of the second lookup table has a width of 2 and a depth equal to the height size.
  • each index value in a lookup table corresponds to one access address and one interpolation coefficient, and the number of access addresses and interpolation coefficients that need to be calculated in advance can be determined from the second size of the target image and used as the depth of the lookup table, so that the lookup table space is set according to actual needs.
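A lookup table of width 2 and a given depth can be sketched as below. The 16-bit fractional width and the name `build_lookup_table` are assumptions for illustration; following the text literally, entry 0 holds the result of the first accumulation.

```python
def build_lookup_table(scale_fixed, depth, frac_bits=16):
    """Build a table with `depth` rows of [access_address, interp_coeff].

    scale_fixed is the deformation coefficient as a non-negative
    fixed-point integer; depth is the target width or height.
    """
    mask = (1 << frac_bits) - 1
    table = []
    acc = 0
    for _ in range(depth):
        acc += scale_fixed  # next accumulation result
        table.append([acc >> frac_bits, acc & mask])
    return table
```

A target pixel then reads its row entry from one table and its column entry from the other, indexed by its row and column coordinates, so the tables are recomputed only when the scaling ratio changes.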
  • the first size of the source image and the second size of the target image are obtained first, and the row-direction deformation coefficient and the column-direction deformation coefficient are determined according to the first size and the second size; the row-direction deformation coefficient and the column-direction deformation coefficient are then respectively input into the fixed-point accumulator for cumulative calculation, and each accumulation result is rounded so that the access addresses and interpolation coefficients of pixels in the source image are determined from the integer part and fractional part obtained after rounding; the corresponding target access address and target interpolation coefficient are then determined according to the pixel coordinates of the target pixel, the source pixels required by the corresponding algorithm are determined in the source image according to the target access address, and, based on the bilinear interpolation algorithm, the pixel value of the target pixel is determined according to the pixel values of the source pixels and the target interpolation coefficient.
  • in the traditional computing architecture, the CPU has to perform the bilinear interpolation of the image based on its instruction set.
  • the interpolation calculation is divided into coefficient generation, data extraction, multiplication and addition of coefficients and data, and so on; because the amount of data to be processed is large and part of the process involves division operations, it consumes a large amount of system resources and computation time, which affects the overall calculation efficiency.
  • the method of embodiment 3 greatly reduces the multiplication and division operations, thereby achieving highly efficient calculation and relieving the computing pressure on the CPU.
  • the original multiplication and division operations are converted into accumulation operations, which greatly reduces the time spent on multiplication and division during image scaling.
  • Fig. 5 is a schematic structural diagram of an image zooming device based on a bilinear interpolation algorithm provided in Embodiment 2 of the present invention.
  • the device can be realized by hardware and/or software, and generally can be integrated into a computer device to execute the present invention.
  • Deformation coefficient determination module 31 configured to obtain the first size of the source image and the second size of the target image, and determine the row-direction deformation coefficient and the column-direction deformation coefficient according to the first size and the second size;
  • the accumulative calculation module 32 is used to respectively input the deformation coefficient in the row direction and the deformation coefficient in the column direction into a fixed-point accumulator for cumulative calculation;
  • the interpolation parameter determination module 33 is used to perform a rounding operation on each accumulation result to obtain an integer part and a fractional part of each accumulation result, and use the integer part of each accumulation result of the row direction deformation coefficient as The row access address, the decimal part is used as the row interpolation coefficient, the integer part of each accumulation result of the deformation coefficient in the column direction is used as the column access address, and the decimal part is used as the column interpolation coefficient;
  • Target pixel determination module 34, configured to, for each target pixel in the target image, determine the corresponding target row access address, target column access address, target row interpolation coefficient and target column interpolation coefficient from all the row access addresses, the column access addresses, the row interpolation coefficients and the column interpolation coefficients according to the pixel coordinates of the target pixel, determine the source pixels required by the corresponding algorithm in the source image according to the target row access address and the target column access address, and then, based on the bilinear interpolation algorithm, determine the pixel value of the target pixel according to the pixel values of the source pixels, the target row interpolation coefficient and the target column interpolation coefficient.
  • the first size of the source image and the second size of the target image are obtained first, and the row-direction deformation coefficient and the column-direction deformation coefficient are determined according to the first size and the second size; the row-direction deformation coefficient and the column-direction deformation coefficient are then respectively input into the fixed-point accumulator for cumulative calculation, and each accumulation result is rounded so that the access addresses and interpolation coefficients of pixels in the source image are determined from the integer part and fractional part obtained after rounding; the corresponding target access address and target interpolation coefficient are then determined according to the pixel coordinates of the target pixel, the source pixels required by the corresponding algorithm are determined in the source image according to the target access address, and, based on the bilinear interpolation algorithm, the pixel value of the target pixel is determined according to the pixel values of the source pixels and the target interpolation coefficient.
  • the image scaling device based on bilinear interpolation algorithm also includes:
  • the accumulation result storage module is used to, before the corresponding target row access address, target column access address, target row interpolation coefficient and target column interpolation coefficient are determined from all the row access addresses, the column access addresses, the row interpolation coefficients and the column interpolation coefficients according to the pixel coordinates of the target pixel, sequentially store the integer part and the fractional part of each accumulation result of the row-direction deformation coefficient in the first lookup table, and sequentially store the integer part and the fractional part of each accumulation result of the column-direction deformation coefficient in the second lookup table, where the index values of the first lookup table and the second lookup table correspond respectively to the row coordinates and column coordinates of each of the target pixels;
  • the target pixel determination module 34 includes:
  • a target parameter lookup unit configured to use the row coordinate and the column coordinate in the pixel coordinates as index values to look up the target row access address, the target column access address, the target row interpolation coefficient and the target column interpolation coefficient.
  • the second size includes a width size and a height size
  • the image scaling device based on a bilinear interpolation algorithm further includes:
  • the lookup table space allocation module is used to allocate lookup table space for the first lookup table and the second lookup table respectively before the integer part and fractional part of each accumulation result of the row-direction deformation coefficient are sequentially stored in the first lookup table and those of the column-direction deformation coefficient are sequentially stored in the second lookup table, where the lookup table space of the first lookup table has a width of 2 and a depth equal to the width size, and the lookup table space of the second lookup table has a width of 2 and a depth equal to the height size.
  • the rounding operation is rounding down
  • the target pixel determination module 34 includes:
  • a source pixel determination unit configured to use the target row access address as the row coordinate and the target column access address as the column coordinate to determine a first source pixel; use the target row access address as the row coordinate and the target column access address plus one as the column coordinate to determine a second source pixel; use the target row access address plus one as the row coordinate and the target column access address as the column coordinate to determine a third source pixel; and use the target row access address plus one as the row coordinate and the target column access address plus one as the column coordinate to determine a fourth source pixel.
  • the target pixel determination module 34 is specifically used for:
  • (X_dst, Y_dst) represents the pixel coordinates of the target pixel
  • u_x1 represents the target row interpolation coefficient
  • v_x1 represents the target column interpolation coefficient
  • Q_x11 represents the pixel value of the first source pixel
  • Q_x12 represents the pixel value of the second source pixel
  • Q_x21 represents the pixel value of the third source pixel
  • Q_x22 represents the pixel value of the fourth source pixel.
  • the deformation coefficient determining module 31 is specifically used for:
  • scale_x represents the deformation coefficient in the row direction
  • scale_y represents the deformation coefficient in the column direction
  • src_rows represents the width dimension of the first size
  • dst_rows represents the width dimension of the second size
  • src_cols represents the height dimension of the first size
  • dst_cols represents the height dimension of the second size.
  • the cumulative calculation module 32 includes:
  • an initialization unit for initializing the first input terminal and the second input terminal of the first accumulator, and the third input terminal and the fourth input terminal of the second accumulator; the output terminal of the first accumulator is connected to the first input terminal, the row-direction deformation coefficient is input through the second input terminal, the output terminal of the second accumulator is connected to the third input terminal, and the column-direction deformation coefficient is input through the fourth input terminal;
  • an accumulation control unit configured to control the first accumulator to perform a first number of accumulations and the second accumulator to perform a second number of accumulations, wherein the first number is the number of pixels of the target image in the row direction, and the second number is the number of pixels of the target image in the column direction.
  • the image scaling device based on bilinear interpolation algorithm provided by the embodiments of the present invention can execute the image scaling method based on bilinear interpolation algorithm provided by any embodiment of the present invention, and has corresponding functional modules and beneficial effects for executing the method.
  • FIG. 6 is a schematic structural diagram of a computer device provided by Embodiment 5 of the present invention, showing a block diagram of an exemplary computer device suitable for implementing the implementation manner of the present invention.
  • the computer device shown in FIG. 6 is only an example, and should not limit the functions and scope of use of this embodiment of the present invention.
  • the computer equipment includes a processor 41, a memory 42, an input device 43 and an output device 44; the number of processors 41 in the computer equipment can be one or more, and one processor 41 is taken as an example in FIG. 6; the processor 41, the memory 42, the input device 43 and the output device 44 in the computer equipment can be connected through a bus or in other ways, and connection through a bus is taken as an example in FIG. 6.
  • the memory 42 can be used to store software programs, computer-executable programs and modules, such as program instructions/modules corresponding to the image size adjustment method in the embodiment of the present invention.
  • Processor 41 executes various functional applications and data processing of the computer equipment by running the software programs, instructions and modules stored in memory 42, that is, it realizes the above-mentioned image size adjustment method and the above-mentioned image scaling method based on the bilinear interpolation algorithm.
  • the memory 42 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and at least one application required by a function; the data storage area may store data created according to the use of the computer device, and the like.
  • the memory 42 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage devices.
  • the memory 42 may further include memory located remotely relative to the processor 41, and these remote memories may be connected to the computer device through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the input device 43 can be used to obtain image data to be calculated and required image position coefficients, and generate key signal inputs related to user settings and function control of the computer equipment.
  • the output device 44 can be used to transmit calculation results etc. to subsequent modules.
  • the input device 43 can also be used to obtain the first size of the source image and the second size of the target image, and generate key signal input related to user settings and function control of the computer equipment, etc.
  • Embodiment 6 of the present invention also provides a storage medium containing computer-executable instructions, and the computer-executable instructions are used to perform an image size adjustment method when executed by a computer processor, the method comprising:
  • the image position coefficients include a first image position coefficient in the row direction and a second image position coefficient in the column direction;
  • the data flow in the image size adjustment structure is controlled according to the selected image size adjustment algorithm, and the scaled calculation result is obtained through the image size adjustment structure.
  • Embodiment 7 of the present invention also provides a storage medium containing computer-executable instructions, the computer-executable instructions are used to execute an image scaling method based on a bilinear interpolation algorithm when executed by a computer processor, and the method includes:
  • for each target pixel in the target image, determining the corresponding target row access address, target column access address, target row interpolation coefficient and target column interpolation coefficient from all the row access addresses, the column access addresses, the row interpolation coefficients and the column interpolation coefficients according to the pixel coordinates of the target pixel, determining the source pixels required by the corresponding algorithm in the source image according to the target row access address and the target column access address, and then, based on the bilinear interpolation algorithm, determining the pixel value of the target pixel according to the pixel values of the source pixels, the target row interpolation coefficient and the target column interpolation coefficient.
  • the storage medium may be any of various types of memory devices or storage devices.
  • the term "storage medium” is intended to include: installation media such as CD-ROMs, floppy disks, or tape drives; computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc. ; non-volatile memory, such as flash memory, magnetic media (eg hard disk or optical storage); registers or other similar types of memory elements, etc.
  • the storage medium may also include other types of memory or combinations thereof.
  • the storage medium may be located in a computer system in which the program is executed, or may be located in a different second computer system connected to the computer system through a network such as the Internet.
  • the second computer system may provide program instructions to the computer for execution.
  • storage medium may include two or more storage media that may reside in different locations, such as in different computer systems connected by a network.
  • the storage medium may store program instructions (eg embodied as computer programs) executable by one or more processors.
  • the computer-executable instructions are not limited to the method operations described above, and can also perform related operations in the image size adjustment method provided by any embodiment of the present invention.
  • a computer readable signal medium may include a data signal carrying computer readable program code in baseband or as part of a carrier wave. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • the present invention can be realized by means of software and the necessary general-purpose hardware, and of course it can also be realized by hardware alone, but in many cases the former is the better implementation mode.
  • the essence of the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), flash memory (FLASH), hard disk or optical disc, and includes several instructions to make a computer device (which can be a personal computer, server, or network device, etc.) execute the methods described in the various embodiments of the present invention.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

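The present invention discloses an image size adjustment structure and adjustment method based on a data stream architecture, and an image scaling method and device. The image size adjustment structure includes a first multiplication unit, a second multiplication unit, a first data register unit, a second data register unit, a first addition unit and a second addition unit, and the input and output ports of the units are connected according to a set data flow direction. This enables fast computation of image data and relieves the computational load on the CPU.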

Description

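Image size adjustment structure, adjustment method, and image scaling method and device based on a data stream architecture
Technical Field
Embodiments of the present invention relate to the technical field of data processing, and in particular to an image size adjustment structure, adjustment method, and image scaling method and device based on a data stream architecture.
Background
With the rapid development of deep learning, convolutional neural networks have been widely used in machine vision applications such as image recognition and image classification. Artificial intelligence chips based on a data stream architecture are being applied in more and more scenarios because of their extremely high chip utilization. To improve the end-to-end efficiency of a data stream architecture AI chip, pre- and post-processing must be performed on video and image stream data, for example enlarging or shrinking (resizing) an image to a size that matches subsequent modules.
In a traditional computing architecture, the resize function has to be executed by the CPU on the basis of its instruction set. Because resize operates on a large volume of data, it occupies considerable CPU resources, which lowers the end-to-end performance of the whole AI system, costs more computing time, and reduces overall computing efficiency.
Summary
Embodiments of the present invention provide an image size adjustment structure, adjustment method, and image scaling method and device based on a data stream architecture, so as to relieve the computational load on the CPU and achieve highly efficient computation.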
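In a first aspect, an embodiment of the present invention provides an image size adjustment structure based on a data stream architecture. The structure includes a first multiplication unit, a second multiplication unit, a first data register unit, a second data register unit, a first addition unit and a second addition unit, wherein:
A first input of the first multiplication unit receives the image data to be computed, and its second input receives a first image position coefficient. The output of the first multiplication unit is connected to both the input of the first data register unit and a third input of the first addition unit. The output of the first data register unit is connected to a fourth input of the first addition unit. The output of the first addition unit is connected to a fifth input of the second multiplication unit, and a sixth input of the second multiplication unit receives a second image position coefficient. The output of the second multiplication unit is connected to both the input of the second data register unit and a seventh input of the second addition unit. The output of the second data register unit is connected to an eighth input of the second addition unit, and the output of the second addition unit outputs the computation result for the image data. In serial operation, the first data register unit stores the previous result of the first multiplication unit, and the second data register unit stores the previous result of the second multiplication unit.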
可选的,在所述第一加法运算单元和所述第二乘法运算单元之间,还包括第一数据选择单元和第一数据分配单元,在所述第一数据选择单元和所述第一数据分配单元之间,还包括第三数据寄存单元和第三加法运算单元;其中,
所述第一加法运算单元的输出端与所述第一数据选择单元的输入端连接,所述第一数据分配单元的输出端与所述第二乘法运算单元的输入端连接,所述第一数据选择单元的第一输出端直接与所述第一数据分配单元的第九输入端连接,所述第一数据分配单元的第十输入端与所述第三加法运算单元的输出端连接,所述第一数据选择单元的第二输出端分别与所述第三数据寄存单元的输入端和所述第三加法运算单元的第十一输入端连接,所述第三数据寄存单元的输出端与所述第三加法运算单元的第十二输入端连接;
在所述第二加法运算单元之后,还包括第二数据选择单元和第二数据分配单元,在所述第二数据选择单元和所述第二数据分配单元之间,还包括第四数据寄存单元和第四加法运算单元;其中,
所述第二加法运算单元的输出端与所述第二数据选择单元的输入端连接,所述第二数据分配单元的输出端用于输出所述图像数据的计算结果,所述第二数据选择单元的第三输出端直接与所述第二数据分配单元的第十三输入端连接,所述第二数据分配单元的第十四输入端与所述第四加法运算单元的输出端连接,所述第二数据选择单元的第四输出端分别与所述第四数据寄存单元的输入端和所述第四加法运算单元的第十五输入端连接,所述第四数据寄存单元的输出端与所述第四加法运算单元的第十六输入端连接。
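Optionally, if the first data selection unit and the first data distribution unit select the data flow from the first output to the ninth input, and the second data selection unit and the second data distribution unit select the data flow from the third output to the thirteenth input, the image size adjustment structure uses the bilinear interpolation algorithm.
Optionally, if the first data selection unit and the first data distribution unit select the data flow from the second output to the tenth input, and the second data selection unit and the second data distribution unit select the data flow from the fourth output to the fourteenth input, the image size adjustment structure uses the cubic interpolation method.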
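In a second aspect, an embodiment of the present invention further provides an image size adjustment method, applied to the image size adjustment structure based on a data stream architecture provided by any embodiment of the present invention, including:
Obtaining the image data to be computed and the required image position coefficients, the image position coefficients including a first image position coefficient in the row direction and a second image position coefficient in the column direction;
Inputting the image data and the image position coefficients into the corresponding ports of the image size adjustment structure according to a preset timing requirement;
Controlling the data flow in the image size adjustment structure according to the selected image size adjustment algorithm, and obtaining the scaled computation result through the image size adjustment structure.
Optionally, controlling the data flow in the image size adjustment structure according to the selected image size adjustment algorithm, and obtaining the scaled computation result through the image size adjustment structure, includes:
If the image size adjustment algorithm is the bilinear interpolation algorithm, computing with the following formula:
$$D_0 = u_1 \times (v_1 \times Q_0 + v_0 \times Q_1) + u_0 \times (v_1 \times Q_2 + v_0 \times Q_3)$$
where $D_0$ is the pixel data of the target image, $v_0$ and $v_1$ are the first image position coefficients, $u_0$ and $u_1$ are the second image position coefficients, and $Q_0$, $Q_1$, $Q_2$ and $Q_3$ are four pixel data of the source image.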
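Optionally, controlling the data flow in the image size adjustment structure according to the selected image size adjustment algorithm, and obtaining the scaled computation result through the image size adjustment structure, includes:
If the image size adjustment algorithm is the cubic interpolation method, computing with the following formula:
$$
\begin{aligned}
D_0 ={} & u_0 \times (v_0 \times Q_0 + v_1 \times Q_1 + v_2 \times Q_2 + v_3 \times Q_3) \\
& + u_1 \times (v_0 \times Q_4 + v_1 \times Q_5 + v_2 \times Q_6 + v_3 \times Q_7) \\
& + u_2 \times (v_0 \times Q_8 + v_1 \times Q_9 + v_2 \times Q_{10} + v_3 \times Q_{11}) \\
& + u_3 \times (v_0 \times Q_{12} + v_1 \times Q_{13} + v_2 \times Q_{14} + v_3 \times Q_{15})
\end{aligned}
$$
where $D_0$ is the pixel data of the target image, $v_0$, $v_1$, $v_2$ and $v_3$ are the first image position coefficients, $u_0$, $u_1$, $u_2$ and $u_3$ are the second image position coefficients, and $Q_0$ through $Q_{15}$ are 16 pixel data of the source image.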
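Optionally, obtaining the required image position coefficients includes:
Computing the image position coefficients in a preceding module from the preset image scaling ratio and the image size adjustment algorithm.
In a third aspect, an embodiment of the present invention provides an image scaling method based on the bilinear interpolation algorithm, applied to the image size adjustment method, including:
Obtaining a first size of the source image and a second size of the target image, and determining a row-direction deformation coefficient and a column-direction deformation coefficient from the first size and the second size;
Inputting the row-direction deformation coefficient and the column-direction deformation coefficient into a fixed-point accumulator for accumulation;
Rounding each accumulation result to obtain its integer part and fractional part, taking the integer part of each accumulation result of the row-direction deformation coefficient as a row fetch address and the fractional part as a row interpolation coefficient, and taking the integer part of each accumulation result of the column-direction deformation coefficient as a column fetch address and the fractional part as a column interpolation coefficient;
For each target pixel in the target image, determining, from all the row fetch addresses, column fetch addresses, row interpolation coefficients and column interpolation coefficients, the corresponding target row fetch address, target column fetch address, target row interpolation coefficient and target column interpolation coefficient according to the pixel coordinates of the target pixel; determining, from the target row fetch address and the target column fetch address, the source pixels in the source image required by the algorithm; and then determining the pixel value of the target pixel by the bilinear interpolation algorithm from the pixel values of the source pixels, the target row interpolation coefficient and the target column interpolation coefficient.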
可选的,在所述根据所述目标像素点的像素坐标从所有所述行取数地址、 所述列取数地址、所述行插值系数及所述列插值系数中确定对应的目标行取数地址、目标列取数地址、目标行插值系数及目标列插值系数之前,还包括:
将所述行方向形变系数的各个累加结果的整数部分和小数部分依次存入第一查找表中,将所述列方向形变系数的各个累加结果的整数部分和小数部分依次存入第二查找表中,所述第一查找表和所述第二查找表的索引值分别与各个所述目标像素点的行坐标和列坐标对应;
相应的,所述根据所述目标像素点的像素坐标从所有所述行取数地址、所述列取数地址、所述行插值系数及所述列插值系数中确定对应的目标行取数地址、目标列取数地址、目标行插值系数及目标列插值系数,包括:
分别将所述像素坐标中的行坐标和列坐标作为索引值从所述第一查找表和所述第二查找表中查找所述目标行取数地址、所述目标列取数地址、所述目标行插值系数及所述目标列插值系数。
可选的,所述第二尺寸包括宽尺寸和高尺寸;在所述将所述行方向形变系数的各个累加结果的整数部分和小数部分依次存入第一查找表中,将所述列方向形变系数的各个累加结果的整数部分和小数部分依次存入第二查找表中之前,还包括:
分别为所述第一查找表和所述第二查找表分配查找表空间,所述第一查找表的查找表空间宽度为2,深度为所述宽尺寸,所述第二查找表的查找表空间宽度为2,深度为所述高尺寸。
可选的,所述取整操作为向下取整,所述根据所述目标行取数地址和所述目标列取数地址确定在所述源图像中对应的算法所需的源像素点,包括:
将所述目标行取数地址作为行坐标,将所述目标列取数地址作为列坐标确 定第一源像素点;
将所述目标行取数地址作为行坐标,将所述目标列取数地址加一作为列坐标确定第二源像素点;
将所述目标行取数地址加一作为行坐标,将所述目标列取数地址作为列坐标确定第三源像素点;
将所述目标行取数地址加一作为行坐标,将所述目标列取数地址加一作为列坐标确定第四源像素点。
可选的,所述基于双线性插值算法,根据所述源像素点的像素值、所述目标行插值系数及所述目标列插值系数确定所述目标像素点的像素值,包括:
Figure PCTCN2023070762-appb-000001
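where $(X_{dst}, Y_{dst})$ denotes the pixel coordinates of the target pixel,
Figure PCTCN2023070762-appb-000002
denotes the pixel value of the target pixel, $u_{x1}$ the target row interpolation coefficient, $v_{x1}$ the target column interpolation coefficient, $Q_{x11}$ the pixel value of the first source pixel, $Q_{x12}$ the pixel value of the second source pixel, $Q_{x21}$ the pixel value of the third source pixel, and $Q_{x22}$ the pixel value of the fourth source pixel.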
可选的,所述获取源图像的第一尺寸和目标图像的第二尺寸,并根据所述第一尺寸和所述第二尺寸确定行方向形变系数和列方向形变系数,包括:
Figure PCTCN2023070762-appb-000003
其中,scale x表示所述行方向形变系数,scale y表示所述列方向形变系数,src rows表示所述第一尺寸中的宽尺寸,dst rows表示所述第二尺寸中的宽尺寸,src cols表示所述第一尺寸中的高尺寸,dst cols表示所述第二尺寸中的高尺寸。
可选的,所述定点累加器包括第一累加器和第二累加器,所述分别将所述行方向形变系数和所述列方向形变系数输入定点累加器进行累加计算,包括:初始化所述第一累加器的第一输入端和第二输入端,以及所述第二累加器的第三输入端和第四输入端;
所述第一累加器的输出端连接至所述第一输入端,所述行方向形变系数通过所述第二输入端输入,所述第二累加器的输出端连接至所述第三输入端,所述列方向形变系数通过所述第四输入端输入;
控制所述第一累加器进行第一次数的累加计算,以及控制所述第二累加器进行第二次数的累加计算,其中,所述第一次数为所述目标图像在行方向上的像素个数,所述第二次数为所述目标图像在列方向上的像素个数。
第四方面,本发明实施例提供了一种基于双线性插值算法的图像缩放装置,应用于图像尺寸调整方法,包括:形变系数确定模块,用于获取源图像的第一尺寸和目标图像的第二尺寸,并根据所述第一尺寸和所述第二尺寸确定行方向形变系数和列方向形变系数;
累加计算模块,用于分别将所述行方向形变系数和所述列方向形变系数输入定点累加器进行累加计算;
插值参数确定模块,用于分别对每次的累加结果进行取整操作,以得到各个所述累加结果的整数部分和小数部分,并将所述行方向形变系数的各个累加结果的整数部分作为行取数地址,小数部分作为行插值系数,将所述列方向形变系数的各个累加结果的整数部分作为列取数地址,小数部分作为列插值系数;
目标像素确定模块,用于针对所述目标图像中的每个目标像素点,根据所述目标像素点的像素坐标从所有所述行取数地址、所述列取数地址、所述行插 值系数及所述列插值系数中确定对应的目标行取数地址、目标列取数地址、目标行插值系数及目标列插值系数,并根据所述目标行取数地址和所述目标列取数地址确定在所述源图像中对应的算法所需的源像素点,再基于双线性插值算法,根据所述源像素点的像素值、所述目标行插值系数及所述目标列插值系数确定所述目标像素点的像素值。
本发明实施例提供了一种基于数据流架构的图像尺寸调整结构,包括第一乘法运算单元、第二乘法运算单元、第一数据寄存单元、第二数据寄存单元、第一加法运算单元和第二加法运算单元,各个单元的输入输出端口按照设定的数据流向进行连接,通过在数据流架构中构建用于调整图像尺寸的专用单元,并控制待计算的图像数据依次流经不同的计算单元,避免了调用CPU进行基于指令集的计算,实现了图像数据的快速计算,并释放了CPU的计算压力,从而实现了视频及图像流的快速缩放处理,有效的减缓了AI芯片的前后处理效率低下的瓶颈问题。
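Brief Description of the Drawings
Fig. 1 is a schematic structural diagram of the image size adjustment structure based on a data stream architecture provided by Embodiment 1 of the present invention;
Fig. 2 is a schematic structural diagram of another image size adjustment structure based on a data stream architecture provided by Embodiment 1 of the present invention;
Fig. 3 is a flowchart of the image size adjustment method provided by Embodiment 2 of the present invention;
Fig. 4 is a flowchart of the image scaling method based on the bilinear interpolation algorithm provided by Embodiment 3 of the present invention;
Fig. 5 is a schematic structural diagram of the image scaling device based on the bilinear interpolation algorithm provided by Embodiment 4 of the present invention;
Fig. 6 is a schematic structural diagram of the computer device provided by Embodiment 5 of the present invention.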
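Detailed Description
The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention rather than the entire structure.
Before the exemplary embodiments are discussed in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the steps as sequential processing, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps can be rearranged. The processing can be terminated when its operations are completed, but it can also have additional steps not included in the drawings. The processing can correspond to a method, function, procedure, subroutine, subprogram, and so on.
In addition, the terms "first", "second", etc. may be used herein to describe various directions, actions, steps or elements, but these directions, actions, steps or elements are not limited by the terms. The terms are used only to distinguish one direction, action, step or element from another. For example, without departing from the scope of this application, a first input could be called a second input and, similarly, a second input could be called a first input. The first input and the second input are both inputs, but they are not the same input. The terms "first", "second", etc. are not to be understood as indicating or implying relative importance or as implicitly specifying the number of technical features indicated. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality of" means at least two, for example two or three, unless specifically defined otherwise.
Embodiment 1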
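Fig. 1 is a schematic structural diagram of the image size adjustment structure based on a data stream architecture provided by Embodiment 1 of the present invention. This embodiment is applicable to pre- and post-processing of video and image stream data. As shown in Fig. 1, the image size adjustment structure includes a first multiplication unit 11, a second multiplication unit 12, a first data register unit 13, a second data register unit 14, a first addition unit 15 and a second addition unit 16. The first input 111 of the first multiplication unit 11 receives the image data to be computed, and the second input 112 receives the first image position coefficient. The output 113 of the first multiplication unit 11 is connected to both the input 131 of the first data register unit 13 and the third input 151 of the first addition unit 15. The output 132 of the first data register unit 13 is connected to the fourth input 152 of the first addition unit 15. The output 153 of the first addition unit 15 is connected to the fifth input 121 of the second multiplication unit 12, and the sixth input 122 of the second multiplication unit 12 receives the second image position coefficient. The output 123 of the second multiplication unit 12 is connected to both the input 141 of the second data register unit 14 and the seventh input 161 of the second addition unit 16. The output 142 of the second data register unit 14 is connected to the eighth input 162 of the second addition unit 16, and the output 163 of the second addition unit 16 outputs the computation result for the image data. In serial operation, the first data register unit 13 stores the previous result of the first multiplication unit 11, and the second data register unit 14 stores the previous result of the second multiplication unit 12.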
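Specifically, the image data to be computed, the first image position coefficient and the second image position coefficient can be fed, according to an agreed timing requirement, into the first input 111 and the second input 112 of the first multiplication unit 11 and the sixth input 122 of the second multiplication unit 12, respectively. The structure can then apply the bilinear interpolation algorithm to the input image data to obtain the scaled result. Beforehand, a preceding module can compute the first and second image position coefficients for the row and column directions from the preset image scaling ratio. The bilinear interpolation algorithm can be computed with the following formula:
$$D_0 = u_1 \times (v_1 \times Q_0 + v_0 \times Q_1) + u_0 \times (v_1 \times Q_2 + v_0 \times Q_3)$$
where $D_0$ is the computation result, i.e. the pixel data of the target image, $v_0$ and $v_1$ are the first image position coefficients, $u_0$ and $u_1$ are the second image position coefficients, and $Q_0$, $Q_1$, $Q_2$ and $Q_3$ are the four required pixel data of the source image. The first multiplication unit 11 computes $v_1 \times Q_0$, $v_0 \times Q_1$, $v_1 \times Q_2$ and $v_0 \times Q_3$ in turn; the first addition unit 15 then produces $v_1 \times Q_0 + v_0 \times Q_1$ and $v_1 \times Q_2 + v_0 \times Q_3$; the second multiplication unit 12 produces $u_1 \times (v_1 \times Q_0 + v_0 \times Q_1)$ and $u_0 \times (v_1 \times Q_2 + v_0 \times Q_3)$; and finally the second addition unit 16 computes $D_0$ and outputs it. The additions are made possible by the first data register unit 13 and the second data register unit 14, each of which stores the previous result of the unit in front of it. By controlling the image data to be computed so that it enters and flows through the computing units in sequence, the whole image can be computed.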
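The serial flow through the two multiplication units, two registers and two addition units can be sketched in software. The following is only an illustrative model under stated assumptions, not the patented hardware: the function name `bilinear_dataflow`, the tuple layout of the arguments and the use of Python floats are all hypothetical choices, and real timing is reduced to loop iterations.

```python
def bilinear_dataflow(q, v, u):
    """Serially evaluate D0 = u1*(v1*Q0 + v0*Q1) + u0*(v1*Q2 + v0*Q3).

    q: (Q0, Q1, Q2, Q3) source pixels; v: (v0, v1); u: (u0, u1).
    """
    v_seq = (v[1], v[0], v[1], v[0])   # coefficient presented with Q0..Q3
    u_seq = (u[1], u[0])               # coefficient presented with each pair sum
    reg1 = None    # first data register: previous product of multiplier 1
    reg2 = None    # second data register: previous product of multiplier 2
    result = None
    pair = 0
    for i, pixel in enumerate(q):
        prod1 = v_seq[i] * pixel               # first multiplication unit
        if i % 2 == 1:                         # a stored product is available
            pair_sum = reg1 + prod1            # first addition unit
            prod2 = u_seq[pair] * pair_sum     # second multiplication unit
            if pair == 1:
                result = reg2 + prod2          # second addition unit
            reg2 = prod2
            pair += 1
        reg1 = prod1
    return result
```

For example, `bilinear_dataflow((10, 20, 30, 40), (0.25, 0.75), (0.5, 0.5))` follows the four products through both stages and yields the same value as evaluating the formula directly.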
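On the basis of the above technical solution, optionally, as shown in Fig. 2, a first data selection unit 21 and a first data distribution unit 22 are further included between the first addition unit 15 and the second multiplication unit 12, and a third data register unit 23 and a third addition unit 24 are further included between the first data selection unit 21 and the first data distribution unit 22. The output 153 of the first addition unit 15 is connected to the input 211 of the first data selection unit 21, and the output 223 of the first data distribution unit 22 is connected to the input 121 of the second multiplication unit 12. The first output 212 of the first data selection unit 21 is connected directly to the ninth input 221 of the first data distribution unit 22, and the tenth input 222 of the first data distribution unit 22 is connected to the output 243 of the third addition unit 24. The second output 213 of the first data selection unit 21 is connected to both the input 231 of the third data register unit 23 and the eleventh input 241 of the third addition unit 24, and the output 232 of the third data register unit 23 is connected to the twelfth input 242 of the third addition unit 24. A second data selection unit 25 and a second data distribution unit 26 are further included after the second addition unit 16, and a fourth data register unit 27 and a fourth addition unit 28 are further included between the second data selection unit 25 and the second data distribution unit 26. The output 163 of the second addition unit 16 is connected to the input 251 of the second data selection unit 25, and the output 263 of the second data distribution unit 26 outputs the computation result for the image data. The third output 252 of the second data selection unit 25 is connected directly to the thirteenth input 261 of the second data distribution unit 26, and the fourteenth input 262 of the second data distribution unit 26 is connected to the output 283 of the fourth addition unit 28. The fourth output 253 of the second data selection unit 25 is connected to both the input 271 of the fourth data register unit 27 and the fifteenth input 281 of the fourth addition unit 28, and the output 272 of the fourth data register unit 27 is connected to the sixteenth input 282 of the fourth addition unit 28.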
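Further optionally, if the first data selection unit 21 and the first data distribution unit 22 select the data flow from the first output 212 to the ninth input 221, and the second data selection unit 25 and the second data distribution unit 26 select the data flow from the third output 252 to the thirteenth input 261, the image size adjustment structure uses the bilinear interpolation algorithm. Also optionally, if the first data selection unit 21 and the first data distribution unit 22 select the data flow from the second output 213 to the tenth input 222, and the second data selection unit 25 and the second data distribution unit 26 select the data flow from the fourth output 253 to the fourteenth input 262, the image size adjustment structure uses the cubic interpolation method.
Specifically, paired data selection units and data distribution units can cooperate to provide a bypass, so that the image size adjustment structure of this embodiment can apply either the bilinear interpolation algorithm or the cubic interpolation method to the input image data and obtain the corresponding scaled result. When the first data selection unit 21 and the first data distribution unit 22 select the data flow from the first output 212 to the ninth input 221, and the second data selection unit 25 and the second data distribution unit 26 select the data flow from the third output 252 to the thirteenth input 261, the structure applies the bilinear interpolation algorithm; the computation is as described above and is not repeated here. When the first data selection unit 21 and the first data distribution unit 22 select the data flow from the second output 213 to the tenth input 222, and the second data selection unit 25 and the second data distribution unit 26 select the data flow from the fourth output 253 to the fourteenth input 262, the structure applies the cubic interpolation method, using the following formula: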
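$$
\begin{aligned}
D_0 ={} & u_0 \times (v_0 \times Q_0 + v_1 \times Q_1 + v_2 \times Q_2 + v_3 \times Q_3) \\
& + u_1 \times (v_0 \times Q_4 + v_1 \times Q_5 + v_2 \times Q_6 + v_3 \times Q_7) \\
& + u_2 \times (v_0 \times Q_8 + v_1 \times Q_9 + v_2 \times Q_{10} + v_3 \times Q_{11}) \\
& + u_3 \times (v_0 \times Q_{12} + v_1 \times Q_{13} + v_2 \times Q_{14} + v_3 \times Q_{15})
\end{aligned}
$$
where $D_0$ is the computation result, i.e. the pixel data of the target image; $v_0$, $v_1$, $v_2$ and $v_3$ are the first image position coefficients; $u_0$, $u_1$, $u_2$ and $u_3$ are the second image position coefficients; and $Q_0$ through $Q_{15}$ are the 16 required pixel data of the source image. The first multiplication unit 11 serially computes $v_0 \times Q_0$, $v_1 \times Q_1$, $v_2 \times Q_2$, $v_3 \times Q_3$, $v_0 \times Q_4$, $v_1 \times Q_5$, $v_2 \times Q_6$, $v_3 \times Q_7$, $v_0 \times Q_8$, $v_1 \times Q_9$, $v_2 \times Q_{10}$, $v_3 \times Q_{11}$, $v_0 \times Q_{12}$, $v_1 \times Q_{13}$, $v_2 \times Q_{14}$ and $v_3 \times Q_{15}$. The first addition unit 15 then serially adds adjacent products in pairs, giving $v_0 \times Q_0 + v_1 \times Q_1$, $v_2 \times Q_2 + v_3 \times Q_3$, $v_0 \times Q_4 + v_1 \times Q_5$, $v_2 \times Q_6 + v_3 \times Q_7$, $v_0 \times Q_8 + v_1 \times Q_9$, $v_2 \times Q_{10} + v_3 \times Q_{11}$, $v_0 \times Q_{12} + v_1 \times Q_{13}$ and $v_2 \times Q_{14} + v_3 \times Q_{15}$. The third addition unit 24 serially adds adjacent sums in pairs, giving $v_0 \times Q_0 + v_1 \times Q_1 + v_2 \times Q_2 + v_3 \times Q_3$, $v_0 \times Q_4 + v_1 \times Q_5 + v_2 \times Q_6 + v_3 \times Q_7$, $v_0 \times Q_8 + v_1 \times Q_9 + v_2 \times Q_{10} + v_3 \times Q_{11}$ and $v_0 \times Q_{12} + v_1 \times Q_{13} + v_2 \times Q_{14} + v_3 \times Q_{15}$. The second multiplication unit 12 serially multiplies these sums by $u_0$, $u_1$, $u_2$ and $u_3$, giving $u_0 \times (v_0 \times Q_0 + v_1 \times Q_1 + v_2 \times Q_2 + v_3 \times Q_3)$, $u_1 \times (v_0 \times Q_4 + v_1 \times Q_5 + v_2 \times Q_6 + v_3 \times Q_7)$, $u_2 \times (v_0 \times Q_8 + v_1 \times Q_9 + v_2 \times Q_{10} + v_3 \times Q_{11})$ and $u_3 \times (v_0 \times Q_{12} + v_1 \times Q_{13} + v_2 \times Q_{14} + v_3 \times Q_{15})$. The second addition unit 16 serially adds adjacent products in pairs, giving $u_0 \times (v_0 \times Q_0 + v_1 \times Q_1 + v_2 \times Q_2 + v_3 \times Q_3) + u_1 \times (v_0 \times Q_4 + v_1 \times Q_5 + v_2 \times Q_6 + v_3 \times Q_7)$ and $u_2 \times (v_0 \times Q_8 + v_1 \times Q_9 + v_2 \times Q_{10} + v_3 \times Q_{11}) + u_3 \times (v_0 \times Q_{12} + v_1 \times Q_{13} + v_2 \times Q_{14} + v_3 \times Q_{15})$. Finally, the fourth addition unit 28 computes $D_0$ and outputs it. The additions are made possible by the first data register unit 13, the second data register unit 14, the third data register unit 23 and the fourth data register unit 27, each of which stores the previous result of the unit in front of it: the third data register unit 23 stores the previous result of the first addition unit 15 in serial computation, and the fourth data register unit 27 stores the previous result of the second addition unit 16. By controlling the image data to be computed so that it enters and flows through the computing units in sequence, the whole image can be computed.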
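The cubic path can be modelled as a chain of "register plus adder" stages, each of which emits the sum of every adjacent pair in its input stream. The sketch below is a software analogy under assumptions: `pairwise_add` and `cubic_dataflow` are hypothetical names, generators stand in for the serial data stream, and the selection and distribution units are represented only by the fact that the third and fourth adder stages are included.

```python
def pairwise_add(stream):
    """Adder with a one-entry register: emit the sum of each adjacent pair."""
    reg = None
    for value in stream:
        if reg is None:
            reg = value           # register keeps the previous result
        else:
            yield reg + value     # adder combines stored and current value
            reg = None


def cubic_dataflow(q, v, u):
    """Serially evaluate D0 for 16 pixels q, coefficients v[0..3] and u[0..3]."""
    prod1 = (v[i % 4] * pixel for i, pixel in enumerate(q))  # multiplier 1
    pair_sums = pairwise_add(prod1)                          # first adder
    row_sums = pairwise_add(pair_sums)                       # third adder
    prod2 = (u[i] * s for i, s in enumerate(row_sums))       # multiplier 2
    halves = pairwise_add(prod2)                             # second adder
    return next(pairwise_add(halves))                        # fourth adder
```

With all eight coefficients set to 0.25, `cubic_dataflow(range(16), ...)` reduces to the mean of the 16 inputs, which is a quick sanity check on the pairing order. Dropping the third and fourth stages corresponds to the bypass path used for bilinear interpolation.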
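The image size adjustment structure based on a data stream architecture provided by this embodiment includes a first multiplication unit, a second multiplication unit, a first data register unit, a second data register unit, a first addition unit and a second addition unit, with the input and output ports of the units connected according to a set data flow direction. By building a dedicated unit for image size adjustment in the data stream architecture and controlling the image data to be computed so that it flows through the different computing units in sequence, the structure avoids invoking the CPU for instruction-set-based computation, enables fast computation of image data, and relieves the computational load on the CPU. Video and image streams can thus be scaled quickly, and the bottleneck of inefficient pre- and post-processing on AI chips is effectively alleviated.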
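Embodiment 2
Fig. 3 is a flowchart of the image size adjustment method provided by Embodiment 2 of the present invention. This embodiment is applicable to pre- and post-processing of video and image stream data. The method can be applied to the image size adjustment structure based on a data stream architecture provided by any embodiment of the present invention, and has the method flow and beneficial effects corresponding to that structure. As shown in Fig. 3, the method includes the following steps:
S11. Obtain the image data to be computed and the required image position coefficients, the image position coefficients including a first image position coefficient in the row direction and a second image position coefficient in the column direction.
S12. Input the image data and the image position coefficients into the corresponding ports of the image size adjustment structure according to a preset timing requirement.
S13. Control the data flow in the image size adjustment structure according to the selected image size adjustment algorithm, and obtain the scaled computation result through the image size adjustment structure.
Optionally, controlling the data flow in the image size adjustment structure according to the selected image size adjustment algorithm, and obtaining the scaled computation result through the image size adjustment structure, includes:
If the image size adjustment algorithm is the bilinear interpolation algorithm, computing with the following formula:
$$D_0 = u_1 \times (v_1 \times Q_0 + v_0 \times Q_1) + u_0 \times (v_1 \times Q_2 + v_0 \times Q_3)$$
where $D_0$ is the pixel data of the target image, $v_0$ and $v_1$ are the first image position coefficients, $u_0$ and $u_1$ are the second image position coefficients, and $Q_0$, $Q_1$, $Q_2$ and $Q_3$ are four pixel data of the source image.
Also optionally, controlling the data flow in the image size adjustment structure according to the selected image size adjustment algorithm, and obtaining the scaled computation result through the image size adjustment structure, includes:
If the image size adjustment algorithm is the cubic interpolation method, computing with the following formula:
$$
\begin{aligned}
D_0 ={} & u_0 \times (v_0 \times Q_0 + v_1 \times Q_1 + v_2 \times Q_2 + v_3 \times Q_3) \\
& + u_1 \times (v_0 \times Q_4 + v_1 \times Q_5 + v_2 \times Q_6 + v_3 \times Q_7) \\
& + u_2 \times (v_0 \times Q_8 + v_1 \times Q_9 + v_2 \times Q_{10} + v_3 \times Q_{11}) \\
& + u_3 \times (v_0 \times Q_{12} + v_1 \times Q_{13} + v_2 \times Q_{14} + v_3 \times Q_{15})
\end{aligned}
$$
where $D_0$ is the pixel data of the target image, $v_0$, $v_1$, $v_2$ and $v_3$ are the first image position coefficients, $u_0$, $u_1$, $u_2$ and $u_3$ are the second image position coefficients, and $Q_0$ through $Q_{15}$ are 16 pixel data of the source image.
Optionally, obtaining the required image position coefficients includes:
Computing the image position coefficients in a preceding module from the preset image scaling ratio and the image size adjustment algorithm.
Specifically, when pre- or post-processing of video and image stream data is required, a module preceding this structure can first compute the required image position coefficients from the preset image scaling ratio and the selected image size adjustment algorithm. These coefficients, together with the image data to be adjusted, are then fed into the corresponding ports of the image size adjustment structure according to the agreed timing requirement. As described above, the image data can be fed into the first input of the first multiplication unit, and the first and second image position coefficients into the second input of the first multiplication unit and the sixth input of the second multiplication unit, respectively. The data flow in the structure is then controlled according to the selected image size adjustment algorithm, specifically by controlling which input and output ports of the first data selection unit, the first data distribution unit, the second data selection unit and the second data distribution unit are enabled, and the structure performs the computation. The computation can use the bilinear interpolation algorithm or the cubic interpolation method; the details are given above and are not repeated here.
The technical solution provided by this embodiment builds a dedicated unit for image size adjustment in the data stream architecture and controls the image data to be computed so that it flows through the different computing units in sequence. This avoids invoking the CPU for instruction-set-based computation, enables fast computation of image data, and relieves the computational load on the CPU, so that video and image streams can be scaled quickly and the bottleneck of inefficient pre- and post-processing on AI chips is effectively alleviated.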
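Embodiment 3
Fig. 4 is a flowchart of the image scaling method based on the bilinear interpolation algorithm provided by Embodiment 3 of the present invention. This embodiment is applicable to pre- and post-processing of video and image stream data. The method can be executed by the image scaling device based on the bilinear interpolation algorithm provided by an embodiment of the present invention; the device can be implemented in hardware and/or software and can generally be integrated into a computer device. As shown in Fig. 4, the method includes the following steps:
S21. Obtain a first size of the source image and a second size of the target image, and determine a row-direction deformation coefficient and a column-direction deformation coefficient from the first size and the second size.
The source image is the original image, and the target image is the image desired after scaling. The first size is available as soon as the source image is obtained, and when the source image needs to be scaled, the second size of the target image is usually also known, for example as the size that matches subsequent modules. Once the first and second sizes are available, the row-direction and column-direction deformation coefficients can be determined from the change in image size.
Optionally, obtaining the first size of the source image and the second size of the target image, and determining the row-direction deformation coefficient and the column-direction deformation coefficient from the first size and the second size, includes: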
$$\mathrm{scale}_x = \frac{\mathrm{src}_{rows}}{\mathrm{dst}_{rows}}$$
$$\mathrm{scale}_y = \frac{\mathrm{src}_{cols}}{\mathrm{dst}_{cols}}$$
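where $\mathrm{scale}_x$ is the row-direction deformation coefficient, $\mathrm{scale}_y$ is the column-direction deformation coefficient, $\mathrm{src}_{rows}$ is the width dimension of the first size, $\mathrm{dst}_{rows}$ is the width dimension of the second size, $\mathrm{src}_{cols}$ is the height dimension of the first size, and $\mathrm{dst}_{cols}$ is the height dimension of the second size. Specifically, both the first size and the second size can include a width dimension (the size of the image in the row direction) and a height dimension (the size of the image in the column direction). Once the first and second sizes are available, the row-direction deformation coefficient is computed from the ratio of the source image width to the target image width, and the column-direction deformation coefficient from the ratio of the source image height to the target image height.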
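S22. Input the row-direction deformation coefficient and the column-direction deformation coefficient into a fixed-point accumulator for accumulation.
Specifically, once the row-direction and column-direction deformation coefficients have been determined, each can be fed into a fixed-point accumulator to obtain the result of every accumulation step. The accumulation is a clock-level pipelined computation, and applying it in a data stream architecture AI chip can greatly increase the speed of bilinear interpolation.
Optionally, the fixed-point accumulator includes a first accumulator and a second accumulator, and inputting the row-direction and column-direction deformation coefficients into the fixed-point accumulator for accumulation includes: initializing the first and second inputs of the first accumulator and the third and fourth inputs of the second accumulator; connecting the output of the first accumulator to the first input, with the row-direction deformation coefficient entering through the second input, and connecting the output of the second accumulator to the third input, with the column-direction deformation coefficient entering through the fourth input; and controlling the first accumulator to perform a first number of accumulations and the second accumulator to perform a second number of accumulations, where the first number is the number of pixels of the target image in the row direction and the second number is the number of pixels of the target image in the column direction. Specifically, the two deformation coefficients can be accumulated simultaneously in separate fixed-point accumulators to improve efficiency, or one after the other in the same fixed-point accumulator. With separate accumulators, the first accumulator accumulates the row-direction deformation coefficient and the second accumulates the column-direction deformation coefficient. Both accumulators are first initialized. The first result of the first accumulator is then the row-direction deformation coefficient itself; this result is fed back as the input of the second accumulation and added to the row-direction deformation coefficient again, so the second result is twice the row-direction deformation coefficient, and so on. The results of the first accumulator are thus regularly increasing integer multiples of the row-direction deformation coefficient, and likewise the results of the second accumulator are regularly increasing integer multiples of the column-direction deformation coefficient. The number of accumulations of each accumulator can be controlled by a set count, so that the first accumulator performs only the first number of accumulations and the second accumulator only the second number. This limits the amount of computation to what is actually needed and saves the storage required for the results. Because the first number is the number of pixels of the target image in the row direction and the second number is the number of pixels in the column direction, the results satisfy the interpolation needs of all target pixels in the target image.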
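A feedback accumulator of this kind is easy to model with integers. The sketch below is illustrative only: the source does not state a fixed-point word format, so the 16 fractional bits (`FRAC_BITS`) and the helper names `to_fixed` and `accumulate` are assumptions introduced for the example.

```python
FRAC_BITS = 16  # assumed Q16 fraction width; the source gives no bit width


def to_fixed(x):
    """Convert a real-valued deformation coefficient to fixed point."""
    return int(round(x * (1 << FRAC_BITS)))


def accumulate(scale_fixed, count):
    """Feed the accumulator output back to its own input `count` times."""
    acc = 0                  # initialised accumulator
    results = []
    for _ in range(count):
        acc += scale_fixed   # one addition per step, no multiplication
        results.append(acc)  # k-th result equals k * scale
    return results
```

Called as `accumulate(to_fixed(0.5), 4)`, this produces one, two, three and four times the coefficient, mirroring the "regularly increasing integer multiples" described above, and stops after the requested number of steps.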
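S23. Round each accumulation result to obtain its integer part and fractional part, take the integer part of each accumulation result of the row-direction deformation coefficient as a row fetch address and the fractional part as a row interpolation coefficient, and take the integer part of each accumulation result of the column-direction deformation coefficient as a column fetch address and the fractional part as a column interpolation coefficient.
Specifically, each accumulation result can be rounded once it is available. The integer part of a result accumulated from the row-direction deformation coefficient serves as a row fetch address (the row coordinate of a pixel in the source image) and its fractional part as a row interpolation coefficient; the integer part of a result accumulated from the column-direction deformation coefficient serves as a column fetch address (the column coordinate of a pixel in the source image) and its fractional part as a column interpolation coefficient. Once all row fetch addresses, row interpolation coefficients, column fetch addresses and column interpolation coefficients have been determined, they can be stored for later use.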
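For a non-negative fixed-point value, rounding down and extracting the fraction amount to a shift and a mask. The following minimal sketch assumes a 16-bit fraction, which the source does not specify, and uses the hypothetical name `split_fixed`.

```python
def split_fixed(acc, frac_bits=16):
    """Split a non-negative fixed-point value into (fetch address, coefficient)."""
    address = acc >> frac_bits                  # integer part (floor)
    frac = acc & ((1 << frac_bits) - 1)         # fractional bits
    return address, frac / (1 << frac_bits)     # coefficient as a real number
```

For instance, the fixed-point encoding of 2.75 splits into fetch address 2 and interpolation coefficient 0.75. In hardware the coefficient would presumably stay in fixed point; the conversion to a float here is only for readability.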
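S24. For each target pixel in the target image, determine, from all the row fetch addresses, column fetch addresses, row interpolation coefficients and column interpolation coefficients, the corresponding target row fetch address, target column fetch address, target row interpolation coefficient and target column interpolation coefficient according to the pixel coordinates of the target pixel; determine, from the target row fetch address and the target column fetch address, the source pixels in the source image required by the algorithm; and then determine the pixel value of the target pixel by the bilinear interpolation algorithm from the pixel values of the source pixels, the target row interpolation coefficient and the target column interpolation coefficient.
Specifically, each target pixel of the target image can be processed in turn to obtain its pixel value from the source image, yielding the complete target image. For each target pixel, the corresponding target row fetch address, target column fetch address, target row interpolation coefficient and target column interpolation coefficient are selected from the stored values according to the pixel coordinates. The target row fetch address and target column fetch address locate the corresponding source pixel in the source image, from which all source pixels required by the algorithm can be determined and their pixel values obtained. The pixel value of the target pixel is then computed by the bilinear interpolation algorithm from the pixel values of the source pixels, the target row interpolation coefficient and the target column interpolation coefficient.
Optionally, the rounding operation is rounding down, and determining, from the target row fetch address and the target column fetch address, the source pixels in the source image required by the algorithm includes: determining a first source pixel with the target row fetch address as its row coordinate and the target column fetch address as its column coordinate; determining a second source pixel with the target row fetch address as its row coordinate and the target column fetch address plus one as its column coordinate; determining a third source pixel with the target row fetch address plus one as its row coordinate and the target column fetch address as its column coordinate; and determining a fourth source pixel with the target row fetch address plus one as its row coordinate and the target column fetch address plus one as its column coordinate. Specifically, with rounding down, the pixel first located by the target row fetch address and target column fetch address is the first source pixel, the one closest to the coordinate origin. The second, third and fourth source pixels needed for the subsequent interpolation are then obtained by adding one to the target row fetch address, to the target column fetch address, or to both.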
进一步可选的,所述基于双线性插值算法,根据所述源像素点的像素值、 所述目标行插值系数及所述目标列插值系数确定所述目标像素点的像素值,包括:
Figure PCTCN2023070762-appb-000006
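where $(X_{dst}, Y_{dst})$ denotes the pixel coordinates of the target pixel,
Figure PCTCN2023070762-appb-000007
denotes the pixel value of the target pixel, $u_{x1}$ the target row interpolation coefficient, $v_{x1}$ the target column interpolation coefficient, $Q_{x11}$ the pixel value of the first source pixel, $Q_{x12}$ the pixel value of the second source pixel, $Q_{x21}$ the pixel value of the third source pixel, and $Q_{x22}$ the pixel value of the fourth source pixel. Specifically, once the target row interpolation coefficient, the target column interpolation coefficient and the pixel values of the source pixels have been determined, the pixel value of the corresponding target pixel can be computed with the above formula.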
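To explain further: when the bilinear interpolation algorithm is applied, a target pixel of the target image is mapped back to the source image and linear interpolation is performed over its four neighbouring pixels. The interpolation first requires the target row interpolation coefficient in the above formula (the distance in the row direction between the mapped coordinate point and the first source pixel), the target column interpolation coefficient (the distance in the column direction between the mapped coordinate point and the first source pixel) and the pixel values of the source pixels. To determine those pixel values, the pixel coordinates of the source pixels are needed, which are conventionally computed as follows:
$$X_{src} = X_{dst} \times \mathrm{scale}_x$$
$$Y_{src} = Y_{dst} \times \mathrm{scale}_y$$
where $X_{dst}$ is the row coordinate of the target pixel, $X_{src}$ is the row coordinate after the target pixel is mapped back to the source image, $\mathrm{scale}_x$ is the row-direction deformation coefficient, $Y_{dst}$ is the column coordinate of the target pixel, $Y_{src}$ is the column coordinate after the target pixel is mapped back to the source image, and $\mathrm{scale}_y$ is the column-direction deformation coefficient. The four neighbouring pixels can then be determined from $X_{src}$ and $Y_{src}$. In this computation the integer $X_{dst}$ is multiplied by the fractional $\mathrm{scale}_x$, and $\mathrm{scale}_x$ is obtained by dividing $\mathrm{src}_{rows}$ by $\mathrm{dst}_{rows}$, so computing $X_{src}$ involves one multiplication and one division; the same holds for $Y_{src}$. However, $\mathrm{scale}_x$ and $\mathrm{scale}_y$ are fixed fractional values: they can be computed as soon as the sizes of the source and target images are fixed, and they do not change while the rest of the image is computed. Meanwhile, $X_{dst}$ and $Y_{dst}$ are regularly increasing integers 0, 1, 2, and so on. The formulas can therefore be converted into an accumulation process: with repeated accumulation, the integer part of each result gives the coordinate of the first source pixel, and the fractional part gives the row and column interpolation coefficients. With the method of this embodiment, between each change of the image scaling ratio and the completion of the interpolation, the row-direction and column-direction deformation coefficients need to be recomputed only once. All subsequent fetch addresses and interpolation coefficients can be obtained by accumulation, followed by the multiply-add operations of bilinear interpolation, which greatly reduces the number of multiplications and divisions.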
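The equivalence between the per-pixel multiplication and the running sum can be checked with exact rational arithmetic. In this sketch the function names are hypothetical, and one detail is an assumption: the accumulator value is read before each addition, so that the entry for target coordinate 0 is 0, matching $X_{src} = X_{dst} \times \mathrm{scale}_x$ with $X_{dst}$ starting at 0.

```python
from fractions import Fraction


def map_by_multiply(dst_size, src_size):
    """Conventional mapping: one multiplication per target coordinate."""
    scale = Fraction(src_size, dst_size)
    return [x * scale for x in range(dst_size)]


def map_by_accumulate(dst_size, src_size):
    """Same mapping using a single division followed by additions only."""
    scale = Fraction(src_size, dst_size)   # computed once per scaling ratio
    acc = Fraction(0)
    mapped = []
    for _ in range(dst_size):
        mapped.append(acc)                 # value for the current coordinate
        acc += scale                       # accumulate for the next one
    return mapped
```

Both functions return the same list for any pair of sizes; only the second avoids the per-coordinate multiplication, which is the saving the paragraph above describes.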
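On the basis of the above technical solution, optionally, before determining the corresponding target row fetch address, target column fetch address, target row interpolation coefficient and target column interpolation coefficient from all the row fetch addresses, column fetch addresses, row interpolation coefficients and column interpolation coefficients according to the pixel coordinates of the target pixel, the method further includes: storing the integer part and fractional part of each accumulation result of the row-direction deformation coefficient in sequence in a first lookup table, and storing the integer part and fractional part of each accumulation result of the column-direction deformation coefficient in sequence in a second lookup table, where the index values of the first lookup table and the second lookup table correspond to the row coordinates and column coordinates of the target pixels, respectively. Correspondingly, determining the target row fetch address, target column fetch address, target row interpolation coefficient and target column interpolation coefficient according to the pixel coordinates of the target pixel includes: using the row coordinate and column coordinate of the pixel coordinates as index values to look up the target row fetch address, target column fetch address, target row interpolation coefficient and target column interpolation coefficient in the first lookup table and the second lookup table.
Specifically, to make it easy for each target pixel to find its target row fetch address, target column fetch address, target row interpolation coefficient and target column interpolation coefficient, the integer and fractional parts of the accumulation results of the row-direction deformation coefficient can be stored in advance, in sequence, in the first lookup table, and those of the column-direction deformation coefficient in the second lookup table. The index values of both lookup tables can be ordered integers increasing from 0 (0, 1, 2, and so on), corresponding to the row and column coordinates of the target pixels. When a pixel value of the target image is to be computed, the row and column coordinates of the target pixel are used directly as index values to look up the required fetch addresses and interpolation coefficients in the first and second lookup tables. With the method of this embodiment, between each change of the image scaling ratio and the completion of the interpolation, the lookup table contents need to be computed only once and can then be looked up directly. Further optionally, the second size includes a width dimension and a height dimension, and before the integer and fractional parts of the accumulation results are stored in the first and second lookup tables, the method further includes: allocating lookup table space for the first lookup table and the second lookup table, where the lookup table space of the first lookup table has a width of 2 and a depth equal to the width dimension, and the lookup table space of the second lookup table has a width of 2 and a depth equal to the height dimension. That is, each index value in a lookup table corresponds to one fetch address and one interpolation coefficient, and the number of fetch addresses and interpolation coefficients that must be computed in advance is determined from the second size of the target image and used as the depth of the lookup table, so that the lookup table space is sized according to actual need.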
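Putting the pieces together, the lookup-table scheme can be sketched end to end. Several things here are assumptions rather than details from the source: the names `build_lut` and `resize_bilinear`; floats in place of fixed-point values; "row direction" taken as the horizontal (width) axis, in line with the first table's depth being the width dimension; clamping of the "+1" neighbour at the image border; and the textbook bilinear weighting, since the source gives its exact formula only as a figure.

```python
import math


def build_lut(src_size, dst_size):
    """Table of (fetch address, interpolation coefficient), one per target index."""
    scale = src_size / dst_size          # deformation coefficient, computed once
    lut, acc = [], 0.0
    for _ in range(dst_size):            # depth = target size, width = 2
        addr = math.floor(acc)           # round down: fetch address
        lut.append((addr, acc - addr))   # fractional part: coefficient
        acc += scale                     # accumulation replaces multiplication
    return lut


def resize_bilinear(src, dst_h, dst_w):
    """Resize a 2-D list of pixel values using the two lookup tables."""
    src_h, src_w = len(src), len(src[0])
    x_lut = build_lut(src_w, dst_w)      # first lookup table (row direction)
    y_lut = build_lut(src_h, dst_h)      # second lookup table (column direction)
    dst = []
    for y in range(dst_h):
        y0, fy = y_lut[y]
        y1 = min(y0 + 1, src_h - 1)      # assumed border clamp
        row = []
        for x in range(dst_w):
            x0, fx = x_lut[x]
            x1 = min(x0 + 1, src_w - 1)  # assumed border clamp
            top = (1 - fx) * src[y0][x0] + fx * src[y0][x1]
            bottom = (1 - fx) * src[y1][x0] + fx * src[y1][x1]
            row.append((1 - fy) * top + fy * bottom)
        dst.append(row)
    return dst
```

Enlarging `[[0, 10], [20, 30]]` to 4 by 4 shows the table at work: each table has four entries, and every target pixel needs only two lookups plus the multiply-add of the interpolation itself. With float accumulation, non-dyadic scaling ratios can pick up rounding error near integer boundaries, which is one reason a fixed-point accumulator is a natural fit in hardware.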
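The technical solution provided by this embodiment first obtains the first size of the source image and the second size of the target image and determines the row-direction and column-direction deformation coefficients from them. The two deformation coefficients are then fed into fixed-point accumulators for accumulation, and each accumulation result is rounded so that the resulting integer and fractional parts give the fetch addresses and interpolation coefficients of pixels in the source image. For each target pixel in the target image, the corresponding target fetch addresses and target interpolation coefficients are determined from the pixel coordinates of the target pixel, the source pixels required by the algorithm are located in the source image from the target fetch addresses, and the pixel value of the target pixel is determined by the bilinear interpolation algorithm from the pixel values of the source pixels and the target interpolation coefficients. By building a dedicated unit for bilinear interpolation in the data stream architecture and converting the original multiplications and divisions into accumulations, the solution greatly reduces the multiplications and divisions used during image scaling, achieving highly efficient computation and relieving the computational load on the CPU.
In a traditional computing architecture, bilinear interpolation of an image has to be executed by the CPU on the basis of its instruction set. The interpolation is divided into coefficient generation, data fetching, multiply-add of coefficients and data, and so on. Because the volume of data is large and part of the process involves division, system resources and computing time are heavily consumed, which reduces overall computing efficiency. The method of Embodiment 3 greatly reduces the multiplications and divisions, achieving highly efficient computation and relieving the computational load on the CPU. Specifically, a dedicated unit for bilinear interpolation is built in the data stream architecture and the original multiplications and divisions are converted into accumulations, which greatly reduces the multiplications and divisions used during image scaling.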
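Embodiment 4
Fig. 5 is a schematic structural diagram of the image scaling device based on the bilinear interpolation algorithm provided by Embodiment 4 of the present invention. The device can be implemented in hardware and/or software, can generally be integrated into a computer device, and is used to execute the image scaling method based on the bilinear interpolation algorithm provided by any embodiment of the present invention. As shown in Fig. 5, the device includes: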
形变系数确定模块31,用于获取源图像的第一尺寸和目标图像的第二尺寸, 并根据所述第一尺寸和所述第二尺寸确定行方向形变系数和列方向形变系数;
累加计算模块32,用于分别将所述行方向形变系数和所述列方向形变系数输入定点累加器进行累加计算;
插值参数确定模块33,用于分别对每次的累加结果进行取整操作,以得到各个所述累加结果的整数部分和小数部分,并将所述行方向形变系数的各个累加结果的整数部分作为行取数地址,小数部分作为行插值系数,将所述列方向形变系数的各个累加结果的整数部分作为列取数地址,小数部分作为列插值系数;
目标像素确定模块34,用于针对所述目标图像中的每个目标像素点,根据所述目标像素点的像素坐标从所有所述行取数地址、所述列取数地址、所述行插值系数及所述列插值系数中确定对应的目标行取数地址、目标列取数地址、目标行插值系数及目标列插值系数,并根据所述目标行取数地址和所述目标列取数地址确定在所述源图像中对应的算法所需的源像素点,再基于双线性插值算法,根据所述源像素点的像素值、所述目标行插值系数及所述目标列插值系数确定所述目标像素点的像素值。
本发明实施例所提供的技术方案,首先获取源图像的第一尺寸和目标图像的第二尺寸,并根据第一尺寸和第二尺寸确定行方向形变系数和列方向形变系数,然后分别将行方向形变系数和列方向形变系数输入到定点累加器中进行累加计算,并分别对每次的累加结果进行取整操作,以根据取整后得到的整数部分和小数部分确定源图像中像素点的取数地址及插值系数,再针对目标图像中的每个目标像素点,根据目标像素点的像素坐标确定对应的目标取数地址及目标插值系数,并根据目标取数地址确定在源图像中对应的算法所需的源像素点, 从而基于双线性插值算法,根据源像素点的像素值及目标插值系数确定目标像素点的像素值。通过在数据流架构中构建用于双线性插值的专用单元,将原有的乘除法运算转化为累加运算,大幅度减少了在图像缩放过程中所应用到的乘除法运算,从而实现了极高效率的运算,释放了CPU的计算压力。
在上述技术方案的基础上,可选的,该基于双线性插值算法的图像缩放装置,还包括:
累加结果存储模块,用于在所述根据所述目标像素点的像素坐标从所有所述行取数地址、所述列取数地址、所述行插值系数及所述列插值系数中确定对应的目标行取数地址、目标列取数地址、目标行插值系数及目标列插值系数之前,将所述行方向形变系数的各个累加结果的整数部分和小数部分依次存入第一查找表中,将所述列方向形变系数的各个累加结果的整数部分和小数部分依次存入第二查找表中,所述第一查找表和所述第二查找表的索引值分别与各个所述目标像素点的行坐标和列坐标对应;
相应的,目标像素确定模块34,包括:
目标参数查找单元,用于分别将所述像素坐标中的行坐标和列坐标作为索引值从所述第一查找表和所述第二查找表中查找所述目标行取数地址、所述目标列取数地址、所述目标行插值系数及所述目标列插值系数。
在上述技术方案的基础上,可选的,所述第二尺寸包括宽尺寸和高尺寸;该基于双线性插值算法的图像缩放装置,还包括:
查找表空间分配模块,用于在所述将所述行方向形变系数的各个累加结果的整数部分和小数部分依次存入第一查找表中,将所述列方向形变系数的各个累加结果的整数部分和小数部分依次存入第二查找表中之前,分别为所述第一 查找表和所述第二查找表分配查找表空间,所述第一查找表的查找表空间宽度为2,深度为所述宽尺寸,所述第二查找表的查找表空间宽度为2,深度为所述高尺寸。
在上述技术方案的基础上,可选的,所述取整操作为向下取整,目标像素确定模块34,包括:
源像素点确定单元,用于将所述目标行取数地址作为行坐标,将所述目标列取数地址作为列坐标确定第一源像素点;将所述目标行取数地址作为行坐标,将所述目标列取数地址加一作为列坐标确定第二源像素点;将所述目标行取数地址加一作为行坐标,将所述目标列取数地址作为列坐标确定第三源像素点;将所述目标行取数地址加一作为行坐标,将所述目标列取数地址加一作为列坐标确定第四源像素点。
在上述技术方案的基础上,可选的,目标像素确定模块34具体用于:
Figure PCTCN2023070762-appb-000008
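where $(X_{dst}, Y_{dst})$ denotes the pixel coordinates of the target pixel,
Figure PCTCN2023070762-appb-000009
denotes the pixel value of the target pixel, $u_{x1}$ the target row interpolation coefficient, $v_{x1}$ the target column interpolation coefficient, $Q_{x11}$ the pixel value of the first source pixel, $Q_{x12}$ the pixel value of the second source pixel, $Q_{x21}$ the pixel value of the third source pixel, and $Q_{x22}$ the pixel value of the fourth source pixel.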
在上述技术方案的基础上,可选的,形变系数确定模块31具体用于:
Figure PCTCN2023070762-appb-000010
Figure PCTCN2023070762-appb-000011
其中,scale x表示所述行方向形变系数,scale y表示所述列方向形变系数,src rows表示所述第一尺寸中的宽尺寸,dst rows表示所述第二尺寸中的宽尺寸,src cols表示所述第一尺寸中的高尺寸,dst cols表示所述第二尺寸中的高尺寸。
在上述技术方案的基础上,可选的,累加计算模块32,包括:
初始化单元,用于初始化所述第一累加器的第一输入端和第二输入端,以及所述第二累加器的第三输入端和第四输入端;所述第一累加器的输出端连接至所述第一输入端,所述行方向形变系数通过所述第二输入端输入,所述第二累加器的输出端连接至所述第三输入端,所述列方向形变系数通过所述第四输入端输入;
累加控制单元,用于控制所述第一累加器进行第一次数的累加计算,以及控制所述第二累加器进行第二次数的累加计算,其中,所述第一次数为所述目标图像在行方向上的像素个数,所述第二次数为所述目标图像在列方向上的像素个数。
本发明实施例所提供的基于双线性插值算法的图像缩放装置可执行本发明任意实施例所提供的基于双线性插值算法的图像缩放方法,具备执行方法相应的功能模块和有益效果。
值得注意的是,在上述基于双线性插值算法的图像缩放装置的实施例中,所包括的各个单元和模块只是按照功能逻辑进行划分的,但并不局限于上述的划分,只要能够实现相应的功能即可;另外,各功能单元的具体名称也只是为了便于相互区分,并不用于限制本发明的保护范围。
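Embodiment 5
Fig. 6 is a schematic structural diagram of the computer device provided by Embodiment 5 of the present invention, showing a block diagram of an exemplary computer device suitable for implementing embodiments of the present invention. The computer device shown in Fig. 6 is only an example and should not limit the functions or scope of use of the embodiments of the present invention. As shown in Fig. 6, the computer device includes a processor 41, a memory 42, an input device 43 and an output device 44. The computer device can have one or more processors 41; Fig. 6 takes one processor 41 as an example. The processor 41, memory 42, input device 43 and output device 44 of the computer device can be connected by a bus or in other ways; Fig. 6 takes connection by a bus as an example.
As a computer-readable storage medium, the memory 42 can be used to store software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the image size adjustment method in the embodiments of the present invention. By running the software programs, instructions and modules stored in the memory 42, the processor 41 executes the various functional applications and data processing of the computer device, that is, it implements the image size adjustment method described above and the image scaling method based on the bilinear interpolation algorithm described above.
The memory 42 can mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function; the data storage area can store data created through use of the computer device. In addition, the memory 42 can include high-speed random access memory and can also include non-volatile memory, for example at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 42 can further include memory located remotely from the processor 41, and such remote memory can be connected to the computer device through a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 43 can be used to obtain the image data to be computed and the required image position coefficients, and to generate key signal inputs related to user settings and function control of the computer device. The input device 43 can also be used to obtain the first size of the source image and the second size of the target image. The output device 44 can be used to transmit computation results to subsequent modules.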
实施例六
本发明实施例六还提供一种包含计算机可执行指令的存储介质,该计算机可执行指令在由计算机处理器执行时用于执行一种图像尺寸调整方法,该方法包括:
获取待计算的图像数据及所需的图像位置系数,所述图像位置系数包括行方向上的第一图像位置系数和列方向上的第二图像位置系数;
根据预设时序要求,将所述图像数据和所述图像位置系数输入到所述图像尺寸调整结构的对应端口中;
根据所选择的图像尺寸调整算法控制所述图像尺寸调整结构中的数据流向,并通过所述图像尺寸调整结构得到完成放缩后的计算结果。
实施例七
本发明实施例七还提供一种包含计算机可执行指令的存储介质,该计算机可执行指令在由计算机处理器执行时用于执行一种基于双线性插值算法的图像缩放方法,该方法包括:
获取源图像的第一尺寸和目标图像的第二尺寸,并根据所述第一尺寸和所述第二尺寸确定行方向形变系数和列方向形变系数;
分别将所述行方向形变系数和所述列方向形变系数输入定点累加器进行累加计算;
分别对每次的累加结果进行取整操作,以得到各个所述累加结果的整数部分和小数部分,并将所述行方向形变系数的各个累加结果的整数部分作为行取数地址,小数部分作为行插值系数,将所述列方向形变系数的各个累加结果的整数部分作为列取数地址,小数部分作为列插值系数;
针对所述目标图像中的每个目标像素点,根据所述目标像素点的像素坐标从所有所述行取数地址、所述列取数地址、所述行插值系数及所述列插值系数中确定对应的目标行取数地址、目标列取数地址、目标行插值系数及目标列插值系数,并根据所述目标行取数地址和所述目标列取数地址确定在所述源图像中对应的算法所需的源像素点,再基于双线性插值算法,根据所述源像素点的像素值、所述目标行插值系数及所述目标列插值系数确定所述目标像素点的像素值。
存储介质可以是任何的各种类型的存储器设备或存储设备。术语“存储介质”旨在包括:安装介质,例如CD-ROM、软盘或磁带装置;计算机系统存储器或随机存取存储器,诸如DRAM、DDR RAM、SRAM、EDO RAM、兰巴斯(Rambus)RAM等;非易失性存储器,诸如闪存、磁介质(例如硬盘或光存储);寄存器或其它相似类型的存储器元件等。存储介质可以还包括其它类型的存储器或其组合。另外,存储介质可以位于程序在其中被执行的计算机系统中,或者可以位于不同的第二计算机系统中,第二计算机系统通过网络(诸如因特网)连接到计算机系统。第二计算机系统可以提供程序指令给计算机用于执行。术语“存储介质”可以包括可以驻留在不同位置中(例如在通过网络连接的不同计 算机系统中)的两个或更多存储介质。存储介质可以存储可由一个或多个处理器执行的程序指令(例如具体实现为计算机程序)。
当然,本发明实施例所提供的一种包含计算机可执行指令的存储介质,其计算机可执行指令不限于如上所述的方法操作,还可以执行本发明任意实施例所提供的图像尺寸调整方法中的相关操作。
计算机可读的信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读的信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。
计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于无线、电线、光缆、RF等等,或者上述的任意合适的组合。
通过以上关于实施方式的描述,所属领域的技术人员可以清楚地了解到,本发明可借助软件及必需的通用硬件来实现,当然也可以通过硬件实现,但很多情况下前者是更佳的实施方式。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品可以存储在计算机可读存储介质中,如计算机的软盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、闪存(FLASH)、硬盘或光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本发明各个实施例所述的方法。
注意,上述仅为本发明的较佳实施例及所运用技术原理。本领域技术人员会理解,本发明不限于这里所述的特定实施例,对本领域技术人员来说能够进行各种明显的变化、重新调整和替代而不会脱离本发明的保护范围。因此,虽然通过以上实施例对本发明进行了较为详细的说明,但是本发明不仅仅限于以上实施例,在不脱离本发明构思的情况下,还可以包括更多其他等效实施例,而本发明的范围由所附的权利要求范围决定。

Claims (16)

  1. 一种基于数据流架构的图像尺寸调整结构,其特征在于,包括:
    第一乘法运算单元、第二乘法运算单元、第一数据寄存单元、第二数据寄存单元、第一加法运算单元和第二加法运算单元;
    其中,所述第一乘法运算单元的第一输入端用于接收待计算的图像数据,第二输入端用于接收第一图像位置系数,所述第一乘法运算单元的输出端分别与所述第一数据寄存单元的输入端和所述第一加法运算单元的第三输入端连接,所述第一数据寄存单元的输出端与所述第一加法运算单元的第四输入端连接,所述第一加法运算单元的输出端与所述第二乘法运算单元的第五输入端连接,所述第二乘法运算单元的第六输入端用于接收第二图像位置系数,所述第二乘法运算单元的输出端分别与所述第二数据寄存单元的输入端和所述第二加法运算单元的第七输入端连接,所述第二数据寄存单元的输出端与所述第二加法运算单元的第八输入端连接,所述第二加法运算单元的输出端用于输出所述图像数据的计算结果;
    所述第一数据寄存单元用于在串行运算中存储所述第一乘法运算单元上一次的运算结果,所述第二数据寄存单元用于在串行运算中存储所述第二乘法运算单元上一次的运算结果。
  2. 根据权利要求1所述的基于数据流架构的图像尺寸调整结构,其特征在于,在所述第一加法运算单元和所述第二乘法运算单元之间,还包括第一数据选择单元和第一数据分配单元,在所述第一数据选择单元和所述第一数据分配单元之间,还包括第三数据寄存单元和第三加法运算单元;
    其中,所述第一加法运算单元的输出端与所述第一数据选择单元的输入端连接,所述第一数据分配单元的输出端与所述第二乘法运算单元的输入端连接, 所述第一数据选择单元的第一输出端直接与所述第一数据分配单元的第九输入端连接,所述第一数据分配单元的第十输入端与所述第三加法运算单元的输出端连接,所述第一数据选择单元的第二输出端分别与所述第三数据寄存单元的输入端和所述第三加法运算单元的第十一输入端连接,所述第三数据寄存单元的输出端与所述第三加法运算单元的第十二输入端连接;
    在所述第二加法运算单元之后,还包括第二数据选择单元和第二数据分配单元,在所述第二数据选择单元和所述第二数据分配单元之间,还包括第四数据寄存单元和第四加法运算单元;
    其中,所述第二加法运算单元的输出端与所述第二数据选择单元的输入端连接,所述第二数据分配单元的输出端用于输出所述图像数据的计算结果,所述第二数据选择单元的第三输出端直接与所述第二数据分配单元的第十三输入端连接,所述第二数据分配单元的第十四输入端与所述第四加法运算单元的输出端连接,所述第二数据选择单元的第四输出端分别与所述第四数据寄存单元的输入端和所述第四加法运算单元的第十五输入端连接,所述第四数据寄存单元的输出端与所述第四加法运算单元的第十六输入端连接。
  3. 根据权利要求2所述的基于数据流架构的图像尺寸调整结构,其特征在于,若所述第一数据选择单元和所述第一数据分配单元选择使用所述第一输出端到所述第九输入端的数据流向,且所述第二数据选择单元和所述第二数据分配单元选择使用所述第三输出端到所述第十三输入端的数据流向,则所述图像尺寸调整结构采用双线性插值算法。
  4. 根据权利要求2所述的基于数据流架构的图像尺寸调整结构,其特征在于,若所述第一数据选择单元和所述第一数据分配单元选择使用所述第二输出 端到所述第十输入端的数据流向,且所述第二数据选择单元和所述第二数据分配单元选择使用所述第四输出端到所述第十四输入端的数据流向,则所述图像尺寸调整结构采用三次插值方法。
  5. 一种图像尺寸调整方法,应用于如权利要求2所述的基于数据流架构的图像尺寸调整结构,其特征在于,包括:
    获取待计算的图像数据及所需的图像位置系数,所述图像位置系数包括行方向上的第一图像位置系数和列方向上的第二图像位置系数;
    根据预设时序要求,将所述图像数据和所述图像位置系数输入到所述图像尺寸调整结构的对应端口中;
    根据所选择的图像尺寸调整算法控制所述图像尺寸调整结构中的数据流向,并通过所述图像尺寸调整结构得到完成放缩后的计算结果。
  6. 根据权利要求5所述的图像尺寸调整方法,其特征在于,所述根据所选择的图像尺寸调整算法控制所述图像尺寸调整结构中的数据流向,并通过所述图像尺寸调整结构得到完成放缩后的计算结果,包括:若所述图像尺寸调整算法为双线性插值算法,则使用如下公式进行计算:
    $$D_0 = u_1 \times (v_1 \times Q_0 + v_0 \times Q_1) + u_0 \times (v_1 \times Q_2 + v_0 \times Q_3)$$
    其中,D 0表示目标图像的像素数据,v 0和v 1表示所述第一图像位置系数,u 0和u 1表示所述第二图像位置系数,Q 0、Q 1、Q 2和Q 3表示源图像的4个像素数据。
  7. 根据权利要求5所述的图像尺寸调整方法,其特征在于,所述根据所选择的图像尺寸调整算法控制所述图像尺寸调整结构中的数据流向,并通过所述图像尺寸调整结构得到完成放缩后的计算结果,包括:若所述图像尺寸调整算法为三次插值方法,则使用如下公式进行计算
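    $$
    \begin{aligned}
    D_0 ={} & u_0 \times (v_0 \times Q_0 + v_1 \times Q_1 + v_2 \times Q_2 + v_3 \times Q_3) \\
    & + u_1 \times (v_0 \times Q_4 + v_1 \times Q_5 + v_2 \times Q_6 + v_3 \times Q_7) \\
    & + u_2 \times (v_0 \times Q_8 + v_1 \times Q_9 + v_2 \times Q_{10} + v_3 \times Q_{11}) \\
    & + u_3 \times (v_0 \times Q_{12} + v_1 \times Q_{13} + v_2 \times Q_{14} + v_3 \times Q_{15})
    \end{aligned}
    $$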
    其中,D 0表示目标图像的像素数据,v 0、v 1、v 2和v 3表示所述第一图像位置系数,u 0、u 1、u 2和u 3表示所述第二图像位置系数,Q 0、Q 1、Q 2、Q 3、Q 4、Q 5、Q 6、Q 7、Q 8、Q 9、Q 10、Q 11、Q 12、Q 13、Q 14和Q 15表示源图像的16个像素数据。
  8. 根据权利要求5所述的图像尺寸调整方法,其特征在于,所述获取所需的图像位置系数,包括:通过前级模块根据预设图像放缩比例及所述图像尺寸调整算法计算所述图像位置系数。
  9. 一种基于双线性插值算法的图像缩放方法,应用于如权利要求6所述的图像尺寸调整方法,其特征在于,包括:
    获取源图像的第一尺寸和目标图像的第二尺寸,并根据所述第一尺寸和所述第二尺寸确定行方向形变系数和列方向形变系数;
    分别将所述行方向形变系数和所述列方向形变系数输入定点累加器进行累加计算;
    分别对每次的累加结果进行取整操作,以得到各个所述累加结果的整数部分和小数部分,并将所述行方向形变系数的各个累加结果的整数部分作为行取数地址,小数部分作为行插值系数,将所述列方向形变系数的各个累加结果的整数部分作为列取数地址,小数部分作为列插值系数;
    针对所述目标图像中的每个目标像素点,根据所述目标像素点的像素坐标 从所有所述行取数地址、所述列取数地址、所述行插值系数及所述列插值系数中确定对应的目标行取数地址、目标列取数地址、目标行插值系数及目标列插值系数,并根据所述目标行取数地址和所述目标列取数地址确定在所述源图像中对应的算法所需的源像素点,再基于双线性插值算法,根据所述源像素点的像素值、所述目标行插值系数及所述目标列插值系数确定所述目标像素点的像素值。
  10. 根据权利要求9所述的基于双线性插值算法的图像缩放方法,其特征在于,在所述根据所述目标像素点的像素坐标从所有所述行取数地址、所述列取数地址、所述行插值系数及所述列插值系数中确定对应的目标行取数地址、目标列取数地址、目标行插值系数及目标列插值系数之前,还包括:
    将所述行方向形变系数的各个累加结果的整数部分和小数部分依次存入第一查找表中,将所述列方向形变系数的各个累加结果的整数部分和小数部分依次存入第二查找表中,所述第一查找表和所述第二查找表的索引值分别与各个所述目标像素点的行坐标和列坐标对应;
    相应的,所述根据所述目标像素点的像素坐标从所有所述行取数地址、所述列取数地址、所述行插值系数及所述列插值系数中确定对应的目标行取数地址、目标列取数地址、目标行插值系数及目标列插值系数,包括:
    分别将所述像素坐标中的行坐标和列坐标作为索引值从所述第一查找表和所述第二查找表中查找所述目标行取数地址、所述目标列取数地址、所述目标行插值系数及所述目标列插值系数。
  11. 根据权利要求10所述的基于双线性插值算法的图像缩放方法,其特征在于,所述第二尺寸包括宽尺寸和高尺寸;在所述将所述行方向形变系数的各 个累加结果的整数部分和小数部分依次存入第一查找表中,将所述列方向形变系数的各个累加结果的整数部分和小数部分依次存入第二查找表中之前,还包括:
    分别为所述第一查找表和所述第二查找表分配查找表空间,所述第一查找表的查找表空间宽度为2,深度为所述宽尺寸,所述第二查找表的查找表空间宽度为2,深度为所述高尺寸。
  12. 根据权利要求9所述的基于双线性插值算法的图像缩放方法,其特征在于,所述取整操作为向下取整,所述根据所述目标行取数地址和所述目标列取数地址确定在所述源图像中对应的算法所需的源像素点,包括:
    将所述目标行取数地址作为行坐标,将所述目标列取数地址作为列坐标确定第一源像素点;
    将所述目标行取数地址作为行坐标,将所述目标列取数地址加一作为列坐标确定第二源像素点;
    将所述目标行取数地址加一作为行坐标,将所述目标列取数地址作为列坐标确定第三源像素点;
    将所述目标行取数地址加一作为行坐标,将所述目标列取数地址加一作为列坐标确定第四源像素点。
  13. 根据权利要求12所述的基于双线性插值算法的图像缩放方法,其特征在于,所述基于双线性插值算法,根据所述源像素点的像素值、所述目标行插值系数及所述目标列插值系数确定所述目标像素点的像素值,包括:
    Figure PCTCN2023070762-appb-100001
    其中,(X dst,Y dst)表示所述目标像素点的像素坐标,
    Figure PCTCN2023070762-appb-100002
    表示所述目标像素点的像素值,u x1表示所述目标行插值系数,v x1表示所述目标列插值系数,Q x11表示所述第一源像素点的像素值,Q x12表示所述第二源像素点的像素值,Q x21表示所述第三源像素点的像素值,Q x22表示所述第四源像素点的像素值。
  14. 根据权利要求9所述的基于双线性插值算法的图像缩放方法,其特征在于,所述获取源图像的第一尺寸和目标图像的第二尺寸,并根据所述第一尺寸和所述第二尺寸确定行方向形变系数和列方向形变系数,包括:
    Figure PCTCN2023070762-appb-100003
    其中,scale x表示所述行方向形变系数,scale y表示所述列方向形变系数,src rows表示所述第一尺寸中的宽尺寸,dst rows表示所述第二尺寸中的宽尺寸,src cols表示所述第一尺寸中的高尺寸,dst cols表示所述第二尺寸中的高尺寸。
  15. 根据权利要求9所述的基于双线性插值算法的图像缩放方法,其特征在于,所述定点累加器包括第一累加器和第二累加器,所述分别将所述行方向形变系数和所述列方向形变系数输入定点累加器进行累加计算,包括:初始化所述第一累加器的第一输入端和第二输入端,以及所述第二累加器的第三输入端和第四输入端;
    所述第一累加器的输出端连接至所述第一输入端,所述行方向形变系数通过所述第二输入端输入,所述第二累加器的输出端连接至所述第三输入端,所述列方向形变系数通过所述第四输入端输入;
    控制所述第一累加器进行第一次数的累加计算,以及控制所述第二累加器进行第二次数的累加计算,其中,所述第一次数为所述目标图像在行方向上的 像素个数,所述第二次数为所述目标图像在列方向上的像素个数。
  16. 一种基于双线性插值算法的图像缩放装置,应用于如权利要求6所述的图像尺寸调整方法,其特征在于,包括:形变系数确定模块,用于获取源图像的第一尺寸和目标图像的第二尺寸,并根据所述第一尺寸和所述第二尺寸确定行方向形变系数和列方向形变系数;
    累加计算模块,用于分别将所述行方向形变系数和所述列方向形变系数输入定点累加器进行累加计算;
    插值参数确定模块,用于分别对每次的累加结果进行取整操作,以得到各个所述累加结果的整数部分和小数部分,并将所述行方向形变系数的各个累加结果的整数部分作为行取数地址,小数部分作为行插值系数,将所述列方向形变系数的各个累加结果的整数部分作为列取数地址,小数部分作为列插值系数;
    目标像素确定模块,用于针对所述目标图像中的每个目标像素点,根据所述目标像素点的像素坐标从所有所述行取数地址、所述列取数地址、所述行插值系数及所述列插值系数中确定对应的目标行取数地址、目标列取数地址、目标行插值系数及目标列插值系数,并根据所述目标行取数地址和所述目标列取数地址确定在所述源图像中对应的算法所需的源像素点,再基于双线性插值算法,根据所述源像素点的像素值、所述目标行插值系数及所述目标列插值系数确定所述目标像素点的像素值。
PCT/CN2023/070762 2022-01-06 2023-01-05 基于数据流架构的图像尺寸调整结构、调整方法及图像缩放方法和装置 WO2023131252A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/301,985 US20230252600A1 (en) 2022-01-06 2023-04-17 Image size adjustment structure, adjustment method, and image scaling method and device based on streaming architecture

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210007701.2 2022-01-06
CN202210007701.2A CN114022366B (zh) 2022-01-06 2022-01-06 基于数据流架构的图像尺寸调整装置、调整方法及设备
CN202210057376.0 2022-01-19
CN202210057376.0A CN114092336B (zh) 2022-01-19 2022-01-19 基于双线性插值算法的图像缩放方法、装置、设备及介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/301,985 Continuation-In-Part US20230252600A1 (en) 2022-01-06 2023-04-17 Image size adjustment structure, adjustment method, and image scaling method and device based on streaming architecture

Publications (1)

Publication Number Publication Date
WO2023131252A1 true WO2023131252A1 (zh) 2023-07-13

Family

ID=87073242

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/070762 WO2023131252A1 (zh) 2022-01-06 2023-01-05 基于数据流架构的图像尺寸调整结构、调整方法及图像缩放方法和装置

Country Status (2)

Country Link
US (1) US20230252600A1 (zh)
WO (1) WO2023131252A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104935831A (zh) * 2015-06-12 2015-09-23 中国科学院自动化研究所 并行多相位图像插值装置和方法
US20170345137A1 (en) * 2016-05-27 2017-11-30 Canon Kabushiki Kaisha Image processing apparatus and method for controlling the same
CN109784489A (zh) * 2019-01-16 2019-05-21 北京大学软件与微电子学院 基于fpga的卷积神经网络ip核
CN114022366A (zh) * 2022-01-06 2022-02-08 深圳鲲云信息科技有限公司 基于数据流架构的图像尺寸调整结构、调整方法及设备
CN114092336A (zh) * 2022-01-19 2022-02-25 深圳鲲云信息科技有限公司 基于双线性插值算法的图像缩放方法、装置、设备及介质

Also Published As

Publication number Publication date
US20230252600A1 (en) 2023-08-10

Similar Documents

Publication Publication Date Title
JP7256914B2 (ja) ベクトル縮小プロセッサ
CN108805266B (zh) 一种可重构cnn高并发卷积加速器
CN108416422B (zh) 一种基于fpga的卷积神经网络实现方法及装置
KR102258414B1 (ko) 처리 장치 및 처리 방법
CN114092336B (zh) 基于双线性插值算法的图像缩放方法、装置、设备及介质
WO2019184657A1 (zh) 图像识别方法、装置、电子设备及存储介质
CN107909537B (zh) 一种基于卷积神经网络的图像处理方法及移动终端
CN110188869B (zh) 一种基于卷积神经网络算法的集成电路加速计算的方法及系统
CN117933327A (zh) 处理装置、处理方法、芯片及电子装置
WO2022226721A1 (zh) 一种矩阵乘法器及矩阵乘法器的控制方法
CN107680028B (zh) 用于缩放图像的处理器和方法
CN110147252A (zh) 一种卷积神经网络的并行计算方法及装置
CN111984189A (zh) 神经网络计算装置和数据读取、数据存储方法及相关设备
CN107808394B (zh) 一种基于卷积神经网络的图像处理方法及移动终端
CN112905530A (zh) 片上架构、池化计算加速器阵列、单元以及控制方法
CN115310037A (zh) 矩阵乘法计算单元、加速单元、计算系统和相关方法
WO2023131252A1 (zh) 基于数据流架构的图像尺寸调整结构、调整方法及图像缩放方法和装置
CN114022366B (zh) 基于数据流架构的图像尺寸调整装置、调整方法及设备
CN115860080B (zh) 计算核、加速器、计算方法、装置、设备、介质及系统
CN114758209B (zh) 卷积结果获取方法、装置、计算机设备及存储介质
WO2023030061A1 (zh) 卷积运算电路及方法、神经网络加速器和电子设备
CN114912596A (zh) 面向稀疏卷积神经网络的多chiplet系统及其方法
CN107871162B (zh) 一种基于卷积神经网络的图像处理方法及移动终端
WO2021179175A1 (zh) 数据处理的方法、装置及计算机存储介质
CN117785480B (zh) 处理器、归约计算方法及电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23737114

Country of ref document: EP

Kind code of ref document: A1