CN112927124A - Data processing method, device, equipment and storage medium - Google Patents

Publication number
CN112927124A
Authority
CN
China
Prior art keywords
sampling
sampling result
result
data
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110352221.5A
Other languages
Chinese (zh)
Inventor
周军
周亮
常亮
何翔
赵能
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Chengdu Sensetime Technology Co Ltd
Original Assignee
University of Electronic Science and Technology of China
Chengdu Sensetime Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China, Chengdu Sensetime Technology Co Ltd filed Critical University of Electronic Science and Technology of China
Priority to CN202110352221.5A priority Critical patent/CN112927124A/en
Publication of CN112927124A publication Critical patent/CN112927124A/en
Priority to PCT/CN2021/115555 priority patent/WO2022205763A1/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing
    • G06T1/20: Processor architectures; Processor configuration, e.g. pipelining
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38: Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836: Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851: Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38: Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3885: Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
    • G06F9/3887: Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a data processing method, a device, equipment and a storage medium, wherein the data processing method comprises the following steps: sampling data to be processed according to the step length of convolution operation to obtain at least one first sampling result, wherein the step length is larger than 1; sampling a convolution kernel according to the step length of the convolution operation to obtain at least one second sampling result, wherein the at least one first sampling result is in one-to-one correspondence with the at least one second sampling result; correspondingly inputting the at least one first sampling result and the at least one second sampling result into a processing array, so that the processing array outputs a processing result.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method, apparatus, device, and storage medium.
Background
With the development of artificial intelligence technology, images can be processed automatically in many respects, which reduces labor cost and improves efficiency and accuracy. For example, a processing array may be used to perform convolution on image data. However, the utilization rate of the processing array is often low during convolution, which increases energy consumption and reduces efficiency.
Disclosure of Invention
The invention provides a data processing method, a data processing device, data processing equipment and a storage medium, which are used for solving the defects in the related art.
According to a first aspect of the embodiments of the present invention, there is provided a data processing method, including:
sampling data to be processed according to the step length of convolution operation to obtain at least one first sampling result, wherein the step length is larger than 1;
sampling a convolution kernel according to the step length of the convolution operation to obtain at least one second sampling result, wherein the at least one first sampling result is in one-to-one correspondence with the at least one second sampling result;
correspondingly inputting the at least one first sampling result and the at least one second sampling result into a processing array, so that the processing array outputs a processing result.
In combination with any embodiment provided by the present disclosure, the sampling the data to be processed according to the step length of the convolution operation to obtain at least one first sampling result includes:
performing row sampling on the data to be processed according to the step length to obtain at least one first row sampling result, wherein the union of the at least one first row sampling result is the data to be processed;
performing column sampling on the data to be processed according to the step length to obtain at least one first column sampling result, wherein the union of the at least one first column sampling result is the data to be processed;
and respectively determining the intersection of each first row sampling result and each first column sampling result as a first sampling result.
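The row sampling, column sampling and intersection steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the data is modelled as a list of lists, and the function names `sample_indices` and `sample_grid` are hypothetical and do not appear in the patent.

```python
def sample_indices(length, step):
    """Group indices 0..length-1 by starting offset: one group per offset."""
    return [list(range(start, length, step)) for start in range(min(step, length))]


def sample_grid(data, step):
    """Split 2-D data into sampling results keyed by (row_start, col_start).

    Each result is the intersection of one row sampling result and one
    column sampling result.
    """
    row_groups = sample_indices(len(data), step)
    col_groups = sample_indices(len(data[0]), step)
    results = {}
    for r0, rows in enumerate(row_groups):
        for c0, cols in enumerate(col_groups):
            results[(r0, c0)] = [[data[r][c] for c in cols] for r in rows]
    return results


# Hypothetical 5x5 input whose entry at row r, column c is 10*r + c.
data = [[10 * r + c for c in range(5)] for r in range(5)]
first_results = sample_grid(data, step=2)
for key, block in sorted(first_results.items()):
    print(key, block)
```

With a step length of 2 this yields four sampling results that do not overlap and together cover every entry of the input. The same function can be applied to a convolution kernel to obtain the second sampling results.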
In combination with any one of the embodiments provided by the present disclosure, the sampling the convolution kernel according to the step size of the convolution operation to obtain at least one second sampling result includes:
performing row sampling on the convolution kernel according to the step length to obtain at least one second row sampling result, wherein the union of the at least one second row sampling result is the convolution kernel;
performing column sampling on the convolution kernel according to the step length to obtain at least one second column sampling result, wherein the union of the at least one second column sampling result is the convolution kernel;
and respectively determining the intersection of each second row sampling result and each second column sampling result as a second sampling result.
In combination with any one of the embodiments provided by the present disclosure, the correspondingly inputting the at least one first sampling result and the at least one second sampling result into a processing array, so that the processing array outputs a processing result, includes:
for each first sampling result, inputting the first sampling result into the processing array, and inputting the second sampling result corresponding to the first sampling result into the processing array; and
controlling the processing array to determine a corresponding sub-processing result according to the first sampling result and the corresponding second sampling result;
and controlling the processing array to output a processing result according to the sub-processing result corresponding to each first sampling result.
In combination with any one of the embodiments provided by the present disclosure, the inputting, for each first sampling result, the first sampling result to the processing array includes:
for each first sampling result, inputting a plurality of numerical values of the first sampling result into a plurality of cells of the processing array such that the relative positions of the plurality of numerical values in the plurality of cells are the same as the relative positions of the plurality of numerical values in the first sampling result.
In connection with any embodiment provided by the present disclosure, the processing array includes an active array, at least one overflow row and at least one overflow column distributed around the active array, wherein the active array includes a plurality of cells for storing and processing data, the overflow row and the overflow column include a plurality of cells for storing data;
the inputting the plurality of values of the first sampling result into the plurality of cells of the processing array includes:
inputting a plurality of numerical values of the first sampling result into a plurality of cells of the processing array, so that the numerical value in the first row and first column of the first sampling result is input into the cell in the first row and first column of the cells for storing and processing data.
In combination with any one of the embodiments provided by the present disclosure, the controlling the processing array to determine a corresponding sub-processing result according to the first sampling result and the corresponding second sampling result includes:
for each weight value in the corresponding second sampling result, controlling the processing array to determine a partial sum by using the weight value and the numerical values corresponding to the weight value in the first sampling result;
controlling the processing array to determine a partial result according to the partial sums respectively corresponding to the weight values in the corresponding second sampling result;
and controlling the processing array to determine a sub-processing result corresponding to the first sampling result according to at least one partial result.
In combination with any embodiment provided by the present disclosure, the controlling, for each weight value in the corresponding second sampling result, the processing array to determine a partial sum by using the weight value and the numerical values corresponding to the weight value in the first sampling result includes:
for the first weight value in the corresponding second sampling result, controlling the processing array to determine a partial sum by using the first weight value and the numerical values of the first sampling result in the cells corresponding to its initial position in the processing array.
In combination with any embodiment provided by the present disclosure, the controlling, for each weight value in the corresponding second sampling result, the processing array to determine a partial sum by using the weight value and the numerical values corresponding to the weight value in the first sampling result includes:
for each non-first weight value in the corresponding second sampling result, determining a movement mode of the first sampling result according to the positional relationship, in the first sampling result, between a first numerical value corresponding to the non-first weight value and a second numerical value corresponding to the weight value preceding the non-first weight value, and controlling the processing array to move the second numerical value to a corresponding cell in the determined movement mode;
and controlling the processing array to determine a partial sum by using the non-first weight value and the numerical values in the corresponding cells after the movement.
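The move-then-accumulate procedure above can be illustrated with a small software model. The following is a sketch under several assumptions that are not taken from the patent: the weights are visited in a serpentine (row by row, alternating direction) order so that consecutive weights are adjacent; a cyclic shift of the grid stands in for the overflow rows and columns; and the names `shift`, `snake_order` and `sub_result` are hypothetical.

```python
def shift(grid, direction):
    """Shift a 2-D list by one cell; wrap-around stands in for overflow storage."""
    if direction == "left":
        return [row[1:] + row[:1] for row in grid]
    if direction == "right":
        return [row[-1:] + row[:-1] for row in grid]
    if direction == "up":
        return grid[1:] + grid[:1]
    raise ValueError(direction)


def snake_order(k_rows, k_cols):
    """Visit weight positions row by row, reversing direction on odd rows."""
    order = []
    for u in range(k_rows):
        cols = range(k_cols) if u % 2 == 0 else range(k_cols - 1, -1, -1)
        order.extend((u, v) for v in cols)
    return order


def sub_result(xs, ws):
    """Stride-1 result for one (first sampling result, second sampling result) pair."""
    out_h = len(xs) - len(ws) + 1
    out_w = len(xs[0]) - len(ws[0]) + 1
    cells = [row[:] for row in xs]           # data held in the array cells
    acc = [[0] * len(xs[0]) for _ in xs]     # one accumulator per cell
    prev = None
    for u, v in snake_order(len(ws), len(ws[0])):
        if prev is not None:
            pu, pv = prev
            # Pick the movement from the relative position of the two weights.
            if u > pu:
                cells = shift(cells, "up")
            elif v > pv:
                cells = shift(cells, "left")
            else:
                cells = shift(cells, "right")
        for i in range(len(cells)):
            for j in range(len(cells[0])):
                acc[i][j] += ws[u][v] * cells[i][j]   # partial sum for this weight
        prev = (u, v)
    # Only the top-left out_h x out_w accumulators hold valid outputs.
    return [row[:out_w] for row in acc[:out_h]]


xs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ws = [[1, 0], [0, 1]]
print(sub_result(xs, ws))
```

In this model every cell multiplies and accumulates on every weight, which is the sense in which a step size of 1 keeps the whole array busy; a hardware array would of course perform the per-cell work in parallel rather than in nested loops.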
In connection with any embodiment provided by the present disclosure, further comprising:
determining the number of rows and columns of the data to be processed according to the processing array, the convolution kernel and the step length;
determining the number of overlapped rows and the number of overlapped columns according to the convolution kernel and the step length;
and sampling the data of the image to be processed according to the number of the rows and the columns of the data to be processed, the number of the overlapped rows and the number of the overlapped columns to obtain a plurality of data to be processed.
In combination with any one of the embodiments provided by the present disclosure, the data to be processed is single-channel data or one channel of multi-channel data, and the convolution kernel is a single-channel convolution kernel or one channel of a multi-channel convolution kernel.
According to a second aspect of embodiments of the present invention, there is provided a data processing apparatus, the apparatus comprising:
the controller is used for sampling data to be processed according to the step length of convolution operation to obtain at least one first sampling result, wherein the step length is larger than 1; sampling a convolution kernel according to the step length of the convolution operation to obtain at least one second sampling result, wherein the at least one first sampling result is in one-to-one correspondence with the at least one second sampling result; correspondingly inputting the at least one first sampling result and the at least one second sampling result into the processing array;
the processing array is configured to process the at least one first sampling result and the at least one second sampling result, and output a processing result.
In combination with any embodiment provided by the present disclosure, the controller is configured to perform row sampling on the data to be processed according to the step size to obtain at least one first row sampling result, where the union of the at least one first row sampling result is the data to be processed; perform column sampling on the data to be processed according to the step size to obtain at least one first column sampling result, where the union of the at least one first column sampling result is the data to be processed; and determine the intersection of each first row sampling result and each first column sampling result as a first sampling result.
In combination with any embodiment provided by the present disclosure, the controller is configured to perform row sampling on the convolution kernel according to the step size to obtain at least one second row sampling result, where the union of the at least one second row sampling result is the convolution kernel; perform column sampling on the convolution kernel according to the step size to obtain at least one second column sampling result, where the union of the at least one second column sampling result is the convolution kernel; and determine the intersection of each second row sampling result and each second column sampling result as a second sampling result.
In combination with any one of the embodiments provided by the present disclosure, the controller is configured to, for each first sampling result, input the first sampling result into the processing array, and input the second sampling result corresponding to the first sampling result into the processing array; and
the processing array is configured to determine a corresponding sub-processing result according to the first sampling result and the corresponding second sampling result, and to output the processing result according to the sub-processing result corresponding to each first sampling result.
In combination with any one of the embodiments provided by the present disclosure, the controller is configured to, for each first sampling result, input a plurality of numerical values of the first sampling result into a plurality of cells of the processing array, so that a relative position of the plurality of numerical values in the plurality of cells is the same as a relative position of the plurality of numerical values in the first sampling result.
In connection with any embodiment provided by the present disclosure, the processing array includes an active array, at least one overflow row and at least one overflow column distributed around the active array, wherein the active array includes a plurality of cells for storing and processing data, the overflow row and the overflow column include a plurality of cells for storing data;
the controller is configured to input a plurality of numerical values of the first sampling result into a plurality of cells of the processing array, so that the numerical value in the first row and first column of the first sampling result is input into the cell in the first row and first column of the cells for storing and processing data.
In combination with any embodiment provided by the present disclosure, the processing array is configured to, for each weight value in the corresponding second sampling result, determine a partial sum by using the weight value and the numerical values corresponding to the weight value in the first sampling result;
determine a partial result according to the partial sums respectively corresponding to the weight values in the corresponding second sampling result; and
determine a sub-processing result corresponding to the first sampling result according to at least one partial result.
In combination with any one of the embodiments provided by the present disclosure, the processing array is configured to, for the first weight value in the corresponding second sampling result, determine a partial sum by using the first weight value and the numerical values of the first sampling result in the cells corresponding to its initial position in the processing array.
In combination with any embodiment provided by the present disclosure, the controller is configured to, for each non-first weight value in the corresponding second sampling result, determine a movement mode of the first sampling result according to the positional relationship, in the first sampling result, between a first numerical value corresponding to the non-first weight value and a second numerical value corresponding to the weight value preceding the non-first weight value;
the processing array is configured to move the second numerical value to a corresponding cell in the determined movement mode, and to determine a partial sum by using the non-first weight value and the numerical values in the corresponding cells after the movement.
In combination with any one of the embodiments provided by the present disclosure, the controller is further configured to determine the number of rows and the number of columns of the data to be processed according to the processing array, the convolution kernel, and the step size; determining the number of overlapped rows and the number of overlapped columns according to the convolution kernel and the step length; and sampling the data of the image to be processed according to the number of the rows and the columns of the data to be processed, the number of the overlapped rows and the number of the overlapped columns to obtain a plurality of data to be processed.
In combination with any one of the embodiments provided by the present disclosure, the data to be processed is single-channel data or one channel of multi-channel data, and the convolution kernel is a single-channel convolution kernel or one channel of a multi-channel convolution kernel.
According to a third aspect of the embodiments of the present invention, there is provided an electronic device, the device including a memory, a processor, and the apparatus of the second aspect of the embodiments of the present invention.
According to a fourth aspect of embodiments of the present invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of the first aspect.
According to the above embodiments, the data to be processed and the convolution kernel are synchronously sampled to obtain at least one first sampling result and at least one second sampling result in one-to-one correspondence, so that each pair of corresponding first and second sampling results can be input into the processing array in turn to obtain the processing result. Because both the data to be processed and the convolution kernel are sampled based on the step size of the convolution operation, each first sampling result matches its corresponding second sampling result, that is, convolving the first sampling result with the second sampling result uses a step size of 1. Every cell can therefore be used once the data is input into the processing array, which improves the utilization rate of the processing array, avoids wasted energy and improves processing efficiency.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow chart illustrating a data processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of data to be processed according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of at least one first sampling result according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a convolution kernel as shown in an embodiment of the present invention;
FIG. 5 is a schematic diagram of at least one second sampling result according to an embodiment of the present invention;
FIG. 6 is a flow chart illustrating deriving processing results according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a processing array according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a cell in a processing array according to an embodiment of the present invention;
FIG. 9 is a flow chart illustrating deriving sub-process results according to an embodiment of the present invention;
FIG. 10 is a schematic diagram showing a first sampling result shifted to the left on a processing array in accordance with an embodiment of the present invention;
FIG. 11 is a schematic diagram showing the first sample result moving up the processing array in accordance with an embodiment of the present invention;
FIG. 12 is a schematic diagram showing the result of a first sample moving to the right on the processing array in accordance with an embodiment of the present invention;
FIG. 13 is a schematic diagram illustrating a sampling manner of data to be processed according to an embodiment of the present invention;
FIG. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present invention. The word "if" as used herein may be interpreted as "at the time of …", "when …" or "in response to a determination", depending on the context.
In a first aspect, at least one embodiment of the present invention provides a data processing method. Referring to FIG. 1, which illustrates the flow of the method, the method includes steps S101 to S103.
The method may be performed by an electronic device such as a terminal device or a server. The terminal device may be a User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like, and the method may be implemented by a processor calling computer-readable instructions stored in a memory. The server may be a local server, a cloud server, or the like.
In step S101, data to be processed is sampled according to a step size of convolution operation to obtain at least one first sampling result, where the step size is greater than 1.
The sampling may be downsampling, that is, part of the data is selected from the data to be processed to form a first sampling result. One downsampling pass yields one first sampling result, and several passes yield several first sampling results. Different first sampling results share no data, and together all the first sampling results make up the complete data to be processed. In other words, the data to be processed may be taken as a single first sampling result, or it may be split into a plurality of first sampling results.
The step size of the convolution operation is the distance by which the convolution kernel moves over the data to be processed; in the convolution operation, one calculation is performed each time the convolution kernel moves by one step. Similarly, sampling according to the step size means selecting data from the data to be processed at intervals of the step size, and the selected data form a first sampling result. Each sampling pass starts from data that has not been selected by any other pass, so the first sampling results obtained from different starting positions are different and do not overlap. For example, when the data to be processed is divided into 17 sub-data and the step size is 2, sampling with a step size of 2 is first performed with the 1st sub-data as the starting point, which yields a first sampling result composed of the 1st, 3rd, 5th, 7th, 9th, 11th, 13th, 15th and 17th sub-data. Sampling with a step size of 2 is then performed with the 2nd sub-data as the starting point, which yields a first sampling result composed of the 2nd, 4th, 6th, 8th, 10th, 12th, 14th and 16th sub-data. At this point all of the data to be processed has been selected, so sampling ends with these two first sampling results.
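The 17 sub-data example can be reproduced with Python slicing. This is a minimal sketch; the function name `sample_1d` is hypothetical, and the sub-data are represented simply by their 1-based positions.

```python
def sample_1d(items, step):
    """Return one sampling result per starting offset 0..step-1."""
    return [items[start::step] for start in range(min(step, len(items)))]


# Sub-data labelled 1..17, as in the example above.
sub_data = list(range(1, 18))
first_results = sample_1d(sub_data, step=2)
print(first_results[0])  # 1st, 3rd, ..., 17th sub-data
print(first_results[1])  # 2nd, 4th, ..., 16th sub-data
```

The two lists are disjoint and their union is the full set of 17 sub-data, which matches the no-overlap and full-coverage properties described above.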
In step S102, a convolution kernel is sampled according to a step size of the convolution operation to obtain at least one second sampling result, where the at least one first sampling result and the at least one second sampling result are in one-to-one correspondence.
The sampling mode can be downsampling, that is, a part of weight values are selected from the convolution kernel to form a second sampling result, one second sampling result can be obtained through downsampling once, and a plurality of second sampling results can be obtained through downsampling for a plurality of times. There is no overlap between different second sampling results, and all second sampling results may form a complete convolution kernel, that is, the convolution kernel may be regarded as one second sampling result, or the convolution kernel may be split into a plurality of second sampling results.
Sampling according to the step size means selecting weight values from the convolution kernel at intervals of the step size, and the selected weight values form a second sampling result. Each sampling pass starts from a weight value that has not been selected by any other pass, so the second sampling results obtained from different starting positions are different and do not overlap. For example, when the convolution kernel is divided into 9 weight values and the step size is 2, sampling with a step size of 2 is first performed with the 1st weight value as the starting point, which yields a second sampling result composed of the 1st, 3rd, 5th, 7th and 9th weight values. Sampling with a step size of 2 is then performed with the 2nd weight value as the starting point, which yields a second sampling result composed of the 2nd, 4th, 6th and 8th weight values. At this point all of the convolution kernel has been selected, so sampling ends with these two second sampling results.
Since both the data to be processed and the convolution kernel are sampled according to the step size, the number of first sampling results equals the number of second sampling results. In the convolution operation, one calculation is performed each time the convolution kernel moves one step, so that throughout the convolution operation the subdata of the data to be processed and the weight values of the convolution kernel have a correspondence, and hence the first sampling results and the second sampling results also have a correspondence, namely the at least one first sampling result and the at least one second sampling result are in one-to-one correspondence. A first sampling result and a second sampling result that correspond to each other consist of subdata in the data to be processed and the weight values that are calculated with that subdata in the convolution process. Specifically, the matching may be based on the sampling start position: if the relative position of the start point of a first sampling result in the data to be processed is the same as the relative position of the start point of a second sampling result in the convolution kernel, the first sampling result and the second sampling result are determined to correspond to each other. For example, in the above-mentioned examples of the data to be processed and the convolution kernel, the first sampling result starting from the 1st subdata corresponds to the second sampling result starting from the 1st weight value, that is, the first sampling result composed of the 1st, 3rd, 5th, 7th, 9th, 11th, 13th, 15th and 17th subdata corresponds to the second sampling result composed of the 1st, 3rd, 5th, 7th and 9th weight values; the first sampling result starting from the 2nd subdata corresponds to the second sampling result starting from the 2nd weight value, that is, the first sampling result composed of the 2nd, 4th, 6th, 8th, 10th, 12th, 14th, 16th and 18th subdata corresponds to the second sampling result composed of the 2nd, 4th, 6th and 8th weight values.
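The start-position matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sizes (18 subdata, 9 weight values) are inferred from the indices listed in the example, and the helper name `sample_1d` is hypothetical, not taken from the disclosure.

```python
def sample_1d(items, stride):
    """Split a sequence into `stride` interleaved sub-sequences, one per start offset."""
    return [items[start::stride] for start in range(stride)]


stride = 2
data_idx = list(range(1, 19))    # assumed: 18 subdata, numbered from 1
kernel_idx = list(range(1, 10))  # assumed: 9 weight values, numbered from 1

first_results = sample_1d(data_idx, stride)
second_results = sample_1d(kernel_idx, stride)

# Sampling results that share the same start offset are paired with each other.
pairs = list(zip(first_results, second_results))
for first, second in pairs:
    print(first, "<->", second)
```

Pairing by list position works here because both lists are ordered by start offset, which is the same criterion as matching by relative start position.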
In step S103, the at least one first sampling result and the at least one second sampling result are correspondingly input to a processing array, so that the processing array outputs a processing result.
The first pair of mutually corresponding first and second sampling results is input into the processing array, then the second pair of mutually corresponding first and second sampling results is input into the processing array, and so on until the last pair has been input, thereby controlling the processing array to output a processing result, where the processing result is the result obtained after the data to be processed is convolved by the convolution kernel.
It should be noted that the data to be processed is single-channel data or one channel of multi-channel data, and the convolution kernel is a single-channel convolution kernel or one channel of a multi-channel convolution kernel. That is to say, when the convolution kernel and/or the data to be processed have multiple channels, the channels are convolved correspondingly: the method is applied within each channel to the convolution of a single channel of data to be processed by a single convolution kernel channel, and the result of the multi-channel convolution is obtained by combining these processes.
According to the embodiment, the data to be processed and the convolution kernel are synchronously sampled to obtain at least one first sampling result and at least one second sampling result, and the first sampling result and the second sampling result are in one-to-one correspondence, so that the corresponding first sampling result and the second sampling result can be sequentially input to the processing array to obtain the processing result. Because the sampling of the data to be processed and the convolution kernel is carried out based on the step length of convolution operation, the corresponding first sampling result and the second sampling result are matched with each other, namely the step length of the convolution operation of the second sampling result to the first sampling result is 1, and then each unit can be utilized after the data is input into the processing array, so that the utilization rate of the processing array is improved, the energy consumption waste is avoided, and the processing efficiency is improved.
In particular, a processing array of a conventional convolutional neural network accelerator is generally a two-dimensional connection architecture, and in an operation mode of Single Instruction Multiple Data (SIMD), a Single Instruction controls all units to perform the same operation (e.g., shift, access, MAC, etc.). When the step size of the convolution operation is larger than 1, the result of the calculation of a part of the units of the processing array is not needed, which greatly reduces the utilization rate of the processing array. For example, when stride is 2, the utilization rate of the processing array in SIMD mode is only 1/4, and when stride is 3, the utilization rate of the processing array in SIMD mode is only 1/9. By the processing method provided by the embodiment, each operation of the processing array is converted into the convolution operation of the second sampling result with the step length of 1 relative to the first sampling result, so that the utilization rate of the processing array reaches one hundred percent.
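The 1/4 and 1/9 figures can be reproduced with a simple count. The sketch below assumes, as a simplification, that a SIMD array computes a result at every stride-1 position of a grid and that only the positions lying on the stride grid are actually needed; the function name `useful_fraction` and the 12 × 12 grid size are illustrative choices, not part of the disclosure.

```python
from fractions import Fraction


def useful_fraction(rows, cols, stride):
    """Fraction of stride-1 output positions that a stride-`stride` convolution keeps."""
    useful = sum(
        1
        for r in range(rows)
        for c in range(cols)
        if r % stride == 0 and c % stride == 0
    )
    return Fraction(useful, rows * cols)


print(useful_fraction(12, 12, 2))  # 1/4
print(useful_fraction(12, 12, 3))  # 1/9
```

For grid sizes that are multiples of the step size S, the fraction is exactly 1/S².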
In some embodiments of the present disclosure, the data to be processed may be sampled according to the step size of the convolution operation in the following manner to obtain at least one first sampling result: first, row sampling is performed on the data to be processed according to the step size to obtain at least one first row sampling result, wherein the union of the at least one first row sampling result is the data to be processed; next, column sampling is performed on the data to be processed according to the step size to obtain at least one first column sampling result, wherein the union of the at least one first column sampling result is the data to be processed; and finally, the intersection of each first row sampling result with each first column sampling result is determined as a first sampling result.
During the convolution operation, the convolution kernel moves over the data to be processed in two directions, namely the row direction and the column direction, so the correspondence between the weight values in the convolution kernel and the subdata in the data to be processed has two dimensions, rows and columns. After sampling by rows and by columns, the sampling results are combined pairwise (that is, each first row sampling result is combined with each first column sampling result) to obtain a plurality of first sampling results, so that the first sampling results and the second sampling results can correspond to each other in both the row dimension and the column dimension.
In addition, both the row sampling and the column sampling are performed by the sampling method described in step S101. Since the sampling is performed in two dimensions, if the step size is S, the number of first sampling results is S².
In one example, the data to be processed is a 17 × 17 block as shown in fig. 2, and the step size of the convolution operation is 2, so that four first sampling results 301, 302, 303 and 304 as shown in fig. 3 can be obtained by row sampling, column sampling and finally taking intersections, where the first sampling result 301 is the intersection of the 1st, 3rd, 5th, 7th, 9th, 11th, 13th, 15th and 17th rows with the 1st, 3rd, 5th, 7th, 9th, 11th, 13th, 15th and 17th columns (9 × 9), the first sampling result 302 is the intersection of the 1st, 3rd, 5th, 7th, 9th, 11th, 13th, 15th and 17th rows with the 2nd, 4th, 6th, 8th, 10th, 12th, 14th and 16th columns (9 × 8), the first sampling result 303 is the intersection of the 2nd, 4th, 6th, 8th, 10th, 12th, 14th and 16th rows with the 1st, 3rd, 5th, 7th, 9th, 11th, 13th, 15th and 17th columns (8 × 9), and the first sampling result 304 is the intersection of the 2nd, 4th, 6th, 8th, 10th, 12th, 14th and 16th rows with the 2nd, 4th, 6th, 8th, 10th, 12th, 14th and 16th columns (8 × 8).
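The row sampling, column sampling and intersection can be sketched with Python slicing. This is a minimal illustration, assuming the data is held as a list of rows; the helper name `sample_2d` and the choice of storing each entry's own 1-based (row, column) position are hypothetical conveniences for checking the result.

```python
def sample_2d(grid, stride):
    """Return one sub-grid per (row offset, column offset) pair.

    Taking rows grid[row_start::stride] is the row sampling, taking columns
    row[col_start::stride] is the column sampling, and doing both yields
    their intersection.
    """
    results = {}
    for row_start in range(stride):
        for col_start in range(stride):
            results[(row_start, col_start)] = [
                row[col_start::stride] for row in grid[row_start::stride]
            ]
    return results


# Each entry records its own 1-based (row, column) position in the 17 x 17 block.
data = [[(r, c) for c in range(1, 18)] for r in range(1, 18)]
first_results = sample_2d(data, 2)

for offsets, block in first_results.items():
    print(offsets, len(block), "x", len(block[0]), "starts at", block[0][0])
```

The offsets (0, 0), (0, 1), (1, 0) and (1, 1) play the roles of the first sampling results 301, 302, 303 and 304 respectively.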
In some embodiments of the present disclosure, the convolution kernel may be sampled according to the step size of the convolution operation in the following manner to obtain at least one second sampling result: first, row sampling is performed on the convolution kernel according to the step size to obtain at least one second row sampling result, wherein the union of the at least one second row sampling result is the convolution kernel; next, column sampling is performed on the convolution kernel according to the step size to obtain at least one second column sampling result, wherein the union of the at least one second column sampling result is the convolution kernel; and finally, the intersection of each second row sampling result with each second column sampling result is determined as a second sampling result.
During the convolution operation, the convolution kernel moves over the data to be processed in two directions, namely the row direction and the column direction, so the correspondence between the weight values in the convolution kernel and the subdata in the data to be processed has two dimensions, rows and columns. After sampling by rows and by columns, the sampling results are combined pairwise (that is, each second row sampling result is combined with each second column sampling result) to obtain a plurality of second sampling results, so that the first sampling results and the second sampling results can correspond to each other in both the row dimension and the column dimension.
In addition, both the row sampling and the column sampling are performed in the sampling manner described in step S102. Since the sampling is performed in two dimensions, if the step size is S, the number of second sampling results is S².
In one example, the convolution kernel is a 3 × 3 convolution kernel as shown in fig. 4, and the step size of the convolution operation is 2, so that four second sampling results 501, 502, 503 and 504 as shown in fig. 5 can be obtained by the above-mentioned row sampling, column sampling and finally taking intersections, where the second sampling result 501 is the intersection of the 1st and 3rd rows with the 1st and 3rd columns (i.e. the four weight values A, C, G and I in the figure), the second sampling result 502 is the intersection of the 1st and 3rd rows with the 2nd column (i.e. the two weight values B and H in the figure), the second sampling result 503 is the intersection of the 2nd row with the 1st and 3rd columns (i.e. the two weight values D and F in the figure), and the second sampling result 504 is the intersection of the 2nd row with the 2nd column (i.e. the weight value E in the figure).
The four first sampling results shown in fig. 3 and the four second sampling results shown in fig. 5 are all obtained by sampling with step size 2, and can therefore be put in one-to-one correspondence; specifically, the first sampling result 301 corresponds to the second sampling result 501, the first sampling result 302 corresponds to the second sampling result 502, the first sampling result 303 corresponds to the second sampling result 503, and the first sampling result 304 corresponds to the second sampling result 504.
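The reason this pairing works can be checked numerically: summing the step-size-1 convolutions of each corresponding pair reproduces the original convolution with step size S. The sketch below is a plain-Python check, not the hardware procedure of the disclosure; it assumes no padding, treats convolution as the multiply-accumulate without kernel flipping used in the examples, and all function names and test values are hypothetical.

```python
def conv2d(data, kernel, stride=1):
    """Valid (no padding) multiply-accumulate of `kernel` over `data`."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(data) - kh) // stride + 1
    out_w = (len(data[0]) - kw) // stride + 1
    return [
        [
            sum(
                kernel[u][v] * data[i * stride + u][j * stride + v]
                for u in range(kh)
                for v in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def sample_2d(grid, stride, row_start, col_start):
    """Intersection of the row sampling and column sampling with the given offsets."""
    return [row[col_start::stride] for row in grid[row_start::stride]]


def strided_conv_by_sampling(data, kernel, stride):
    """Sum of stride-1 convolutions over all corresponding sampling pairs."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(data) - kh) // stride + 1
    out_w = (len(data[0]) - kw) // stride + 1
    total = [[0] * out_w for _ in range(out_h)]
    for rs in range(stride):
        for cs in range(stride):
            sub_kernel = sample_2d(kernel, stride, rs, cs)
            if not sub_kernel or not sub_kernel[0]:
                continue  # this offset holds no weight values
            sub_data = sample_2d(data, stride, rs, cs)
            part = conv2d(sub_data, sub_kernel, stride=1)
            # Keep only the top-left out_h x out_w region of each sub-result.
            for i in range(out_h):
                for j in range(out_w):
                    total[i][j] += part[i][j]
    return total


# Arbitrary integer test data (17 x 17) and a 3 x 3 kernel standing in for A..I.
data = [[(r * 17 + c) % 7 - 3 for c in range(17)] for r in range(17)]
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

direct = conv2d(data, kernel, stride=2)
by_sampling = strided_conv_by_sampling(data, kernel, 2)
print(by_sampling == direct)
```

For the 17 × 17 data and 3 × 3 kernel, every sub-convolution already has size 8 × 8, so the cropping step only matters for other input sizes.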
In some embodiments of the present disclosure, the at least one first sampling result and the at least one second sampling result may be correspondingly input to the processing array in the manner shown in fig. 6, so that the processing array outputs the processing result; this includes steps S601 to S603.
In step S601, for each first sampling result, the first sampling result is input to the processing array, and a second sampling result corresponding to the first sampling result is input to the processing array.
In step S602, the processing array is controlled to determine a corresponding sub-processing result according to the first sampling result and the corresponding second sampling result.
In step S603, the processing array is controlled to output a processing result according to the sub-processing result corresponding to each first sampling result.
Steps S601 and S602 are both repeated (N times, where N is the number of first sampling results), that is, step S601 and step S602 are performed for each first sampling result and its corresponding second sampling result. Specifically, the 1st first sampling result is input into the processing array, the 1st second sampling result is input into the processing array, and the processing array is controlled to obtain the 1st sub-processing result from these inputs; then the 2nd first sampling result is input into the processing array, the 2nd second sampling result is input into the processing array, and the processing array is controlled to obtain the 2nd sub-processing result from these inputs; and so on until the last sub-processing result (i.e., the N-th sub-processing result) is obtained. For example, when step S601 and step S602 are executed for the four first sampling results shown in fig. 3 and the four second sampling results shown in fig. 5, the first sampling result 301 may be input to the processing array, and then the second sampling result 501 may be input to the processing array, so as to control the processing array to obtain the first sub-processing result according to the first sampling result 301 and the second sampling result 501; then the first sampling result 302 and the second sampling result 502 are input to the processing array, so as to control the processing array to obtain the second sub-processing result according to the first sampling result 302 and the second sampling result 502; then the first sampling result 303 and the second sampling result 503 are input to the processing array, so as to control the processing array to obtain the third sub-processing result according to the first sampling result 303 and the second sampling result 503; finally, the first sampling result 304 and the second sampling result 504 are input to the processing array, so as to control the processing array to obtain the fourth sub-processing result according to the first sampling result 304 and the second sampling result 504.
Wherein, step S601 may be performed as follows: for each first sampling result, inputting a plurality of numerical values of the first sampling result into a plurality of cells of the processing array such that the relative positions of the plurality of numerical values in the plurality of cells are the same as the relative positions of the plurality of numerical values in the first sampling result. The first sampling result comprises a plurality of rows and a plurality of columns of numerical values, and the processing array comprises a plurality of rows and a plurality of columns of cells, and each cell is used for storing one numerical value. The arrangement of the numerical values in the first sampling result is completely consistent with the arrangement of the numerical values in the processing array, in a descriptive sense, the processing array is a layer of units with multiple rows and multiple columns, the first sampling result is a layer of numerical values with multiple rows and multiple columns, the units with multiple rows and multiple columns are parallel to the numerical values with multiple rows and multiple columns and are in one-to-one correspondence, and when the first sampling result is input, the numerical values with multiple rows and multiple columns are integrally mapped into the units with multiple rows and multiple columns.
Referring to fig. 7, the processing array may include an active array and at least one overflow row and at least one overflow column distributed around the active array, wherein the active array includes a plurality of units for storing and processing data (i.e., the circular unit PEs in fig. 7), and the overflow rows and overflow columns include a plurality of units for storing data (i.e., the hexagonal unit PEs in fig. 7). The number of rows of the first sampling result is less than, equal to, or greater by 1 than the number of rows of units for storing and processing data, and the number of columns of the first sampling result is less than, equal to, or greater by 1 than the number of columns of units for storing and processing data; this relationship is determined when the data to be processed is obtained (as will be described in detail below) and when the first sampling result is obtained (as described in detail above). For example, the four first sampling results shown in fig. 3 are input into the processing array shown in fig. 7 (which includes 10 × 10 units: the central 8 × 8 units are units for storing and processing data, two rows of units for storing data are provided on the upper and lower sides of the central 8 × 8 units, one on each side, and two columns of units for storing data are provided on the left and right sides, one on each side); the numbers of rows and columns of the four first sampling results are 8 or 9, so the above relationship is satisfied.
The connection relationship between the units for storing and processing data and the adjacent units is shown in fig. 8, and it can be seen from the figure that the units for storing and processing data have an internal register R0, an arithmetic unit ALU and an associated data loading and storing circuit module M, and each unit for storing and processing data has a shift register file and a Static Random-Access Memory (SRAM) connected therein, and the shift register file has a plurality of shift registers R1, R2, R3, R4, etc. The unit for storing data has the same structure as the unit for storing and processing data, but does not have the arithmetic unit ALU. Adjacent cells are connected by a shift register file, and in the processing array, each cell is connected with adjacent cells in all directions (namely, up, down, left and right).
Based on the structure of the processing array, when the first sampling result is input, its values may be input into the units of the processing array so that the value in the first row and first column of the first sampling result is input into the unit for storing and processing data in the first row and first column, that is, the first value is input into the first unit for storing and processing data. Since the first sampling result is stored and moved in the processing array uniformly as a whole (i.e., in the Single Instruction Multiple Data (SIMD) operation mode), positioning the first value positions the entire first sampling result relative to the processing array. The number of rows of the first sampling result may be less than or equal to the number of rows of units for storing and processing data, or greater than it by 1, so that at most one row of values is stored in the units for storing data; in this case, the extra row of values does not need to be convolved by the second sampling result, which guarantees the convolution operation between the first sampling result and the second sampling result while avoiding wasted energy and reduced efficiency. Similarly, the number of columns of the first sampling result may be less than or equal to the number of columns of units for storing and processing data, or greater than it by 1, so that at most one column of values is stored in the units for storing data; in this case, the extra column of values does not need to be convolved by the second sampling result, which likewise guarantees the convolution operation between the first sampling result and the second sampling result while avoiding wasted energy and reduced efficiency.
Step S602 may be performed as shown in fig. 9, and includes steps S901 to S903.
In step S901, for each weight value in the corresponding second sampling result, the processing array is controlled to determine a partial sum using the weight value and the numerical value in the first sampling result that corresponds to the weight value.
In step S902, the processing array is controlled to determine a partial result according to the partial sums corresponding to the weight values in the corresponding second sampling result.
In step S903, the processing array is controlled to determine a sub-processing result corresponding to the first sampling result according to at least one partial result.
Step S901 is a repeated step (repeated M times, where M is the number of weight values in the corresponding second sampling result), that is, step S901 is performed for each weight value of the second sampling result, so that the partial sum corresponding to each weight value (i.e., the 1st to M-th partial sums) can be obtained in sequence. A partial sum may be determined by multiplying the weight value by the corresponding numerical value.
Step S901 may be executed as follows: for the first weight value in the corresponding second sampling result, the processing array is controlled to determine a partial sum using the first weight value and the numerical values of the first sampling result held in the corresponding units at its initial position in the processing array; for each non-first weight value in the corresponding second sampling result, a moving mode of the first sampling result is determined according to the positional relationship, in the first sampling result, between a first numerical value corresponding to the non-first weight value and a second numerical value corresponding to the weight value preceding the non-first weight value, the processing array is controlled to move the first sampling result in the determined moving mode, and the processing array is controlled to determine a partial sum using the numerical value in the corresponding unit after the move and the non-first weight value. For example, if the j-th weight value is to the right of the (j-1)-th weight value, the first sampling result is shifted one unit to the left relative to the processing array. Since the first sampling result is shifted uniformly as a whole (i.e. in the Single Instruction Multiple Data (SIMD) operation mode), each unit transmits its stored value to the adjacent unit in the shifting direction; for example, if the first sampling result is shifted one unit to the left relative to the processing array, each unit transmits its stored value to its left adjacent unit.
In step S902, when a partial result is obtained from the partial sums, the partial sums may be summed to obtain the partial result. In execution, a unit stores the first partial sum it obtains; thereafter, each time the unit obtains a partial sum, it adds it to the stored partial sum and stores the result of the summation as the new partial sum.
In step S903, the partial results obtained by the units for storing and processing data may be arranged according to the positional relationship of the units, so as to obtain a sub-processing result with multiple rows and multiple columns.
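Steps S901 to S903 can be sketched as a small software simulation of the shift-and-accumulate behaviour. This is a simplified model under stated assumptions, not the circuit of the disclosure: the array is a 10 × 10 grid with an 8 × 8 active region and a one-unit border of storage-only units, empty units hold 0, the weight values are visited in an explicitly supplied order, and the names `shift` and `run_pair` as well as the numeric test values are hypothetical.

```python
def shift(grid, d_row, d_col):
    """SIMD move: every unit takes the value held by the unit (d_row, d_col) away."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            grid[r + d_row][c + d_col]
            if 0 <= r + d_row < rows and 0 <= c + d_col < cols
            else 0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def run_pair(sub_data, sub_kernel, order, active=8, border=1):
    """Accumulate partial sums for one (first, second) sampling-result pair."""
    size = active + 2 * border
    grid = [[0] * size for _ in range(size)]
    # Load: the first value goes to the first unit for storing and processing data;
    # an extra row or column lands in the border (storage-only) units.
    for r, row in enumerate(sub_data):
        for c, value in enumerate(row):
            grid[border + r][border + c] = value
    acc = [[0] * active for _ in range(active)]
    prev = (0, 0)
    for pos in order:
        # A weight one step to the right moves the data one unit to the left, etc.
        grid = shift(grid, pos[0] - prev[0], pos[1] - prev[1])
        weight = sub_kernel[pos[0]][pos[1]]
        # Only the active units multiply and accumulate.
        for r in range(active):
            for c in range(active):
                acc[r][c] += weight * grid[border + r][border + c]
        prev = pos
    return acc


# Stand-in for the pair 301 / 501: odd rows and columns of a 17 x 17 block,
# and a 2 x 2 sub-kernel [[A, C], [G, I]] with illustrative numeric values.
data = [[r * 17 + c for c in range(17)] for r in range(17)]
sub_data = [row[0::2] for row in data[0::2]]
sub_kernel = [[1, 3], [7, 9]]
order = [(0, 0), (0, 1), (1, 1), (1, 0)]  # visit A, C, I, G

result = run_pair(sub_data, sub_kernel, order)
print(len(result), "x", len(result[0]))
```

Visiting the weight values in the order A, C, I, G means each step needs a move of only one unit (left, up, then right), which matches the single-step neighbour connections between units described above.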
The following describes the process of solving the sub-processing result in further detail by taking the four first sampling results shown in fig. 3 and the four second sampling results shown in fig. 5 as examples.
First sampling result 301 and second sampling result 501: first, A is determined to be the first weight value, and the first sampling result keeps the initial position it had when input into the processing array. That is, (1, 1) is stored in the unit for storing and processing data in the first row and first column, (1, 3) in the unit in the first row and second column, (3, 1) in the unit in the second row and first column, and (3, 3) in the unit in the second row and second column; the 9th row of values (i.e. the last row of values) is stored in the first row of units for storing data on the lower side of the 8 × 8 array formed by the units for storing and processing data, and the 9th column of values (i.e. the last column of values) is stored in the first column of units for storing data on the right side of that 8 × 8 array. Each unit for storing and processing data then multiplies its stored value by A to obtain a partial sum and stores it: the unit in the first row and first column obtains the partial sum A(1, 1), the unit in the first row and second column obtains A(1, 3), the unit in the second row and first column obtains A(3, 1), and the unit in the second row and second column obtains A(3, 3), where A(1, 1) denotes the product of the weight value A and the value (1, 1); the other units for storing and processing data obtain their partial sums in the same way and are not described one by one. It should be noted that the units for storing data perform no operation and therefore obtain no partial sum. This is because convolving the data shown in fig. 2 with the convolution kernel shown in fig. 4 at step size 2 yields an 8 × 8 data array; that is, in the convolution process the last row of the first sampling result 301 is multiplied only by the weight values G and I of the second sampling result 501 and does not need to be multiplied by A and C, and the last column of the first sampling result 301 is multiplied only by the weight values C and I of the second sampling result 501 and does not need to be multiplied by A and G.
Then, C is a non-first weight value. Since C is on the right side of A, referring to fig. 10, the first sampling result as a whole is shifted one unit to the left relative to the processing array, i.e. R1 in the shift register file of each unit sends its stored data to R1 in the shift register file of the unit on its left. After the shift, (1, 3) is stored in the unit for storing and processing data in the first row and first column, (1, 5) in the unit in the first row and second column, (3, 3) in the unit in the second row and first column, and (3, 5) in the unit in the second row and second column, and the first column of values is stored in the first column of units for storing data on the left side of the 8 × 8 array formed by the units for storing and processing data. Each unit for storing and processing data then multiplies its stored value by C to obtain a partial sum and adds it to the originally stored partial sum: the unit in the first row and first column obtains C(1, 3) and stores the latest partial sum A(1, 1) + C(1, 3); the unit in the first row and second column obtains C(1, 5) and stores the latest partial sum A(1, 3) + C(1, 5); the unit in the second row and first column obtains C(3, 3) and stores the latest partial sum A(3, 1) + C(3, 3); the unit in the second row and second column obtains C(3, 5) and stores the latest partial sum A(3, 3) + C(3, 5); the other units for storing and processing data are not described one by one. Again, the units for storing data perform no operation and therefore obtain no partial sum, because in the convolution process the first column of the first sampling result 301 is multiplied only by the weight values A and G of the second sampling result 501 and does not need to be multiplied by C and I.
Then, I is a non-first weight value. Since I is on the lower side of C, referring to fig. 11, the first sampling result as a whole is shifted up one unit relative to the processing array, i.e. R1 in the shift register file of each unit sends its stored data to R1 in the shift register file of the unit on its upper side. After the shift, (3, 3) is stored in the unit for storing and processing data in the first row and first column, (3, 5) in the unit in the first row and second column, (5, 3) in the unit in the second row and first column, and (5, 5) in the unit in the second row and second column, and the first row of values is stored in the first row of units for storing data on the upper side of the 8 × 8 array formed by the units for storing and processing data. Each unit for storing and processing data then multiplies its stored value by I to obtain a partial sum and adds it to the originally stored partial sum: the unit in the first row and first column obtains I(3, 3) and stores the latest partial sum A(1, 1) + C(1, 3) + I(3, 3); the unit in the first row and second column obtains I(3, 5) and stores A(1, 3) + C(1, 5) + I(3, 5); the unit in the second row and first column obtains I(5, 3) and stores A(3, 1) + C(3, 3) + I(5, 3); the unit in the second row and second column obtains I(5, 5) and stores A(3, 3) + C(3, 5) + I(5, 5); the other units for storing and processing data are not described one by one. The units for storing data perform no operation and therefore obtain no partial sum, because in the convolution process the first row of the first sampling result 301 is multiplied only by the weight values A and C of the second sampling result 501 and does not need to be multiplied by G and I.
Finally, G is a non-first weight value. Since G is on the left side of I, referring to fig. 12, the first sampling result as a whole is shifted one unit to the right relative to the processing array, i.e. R1 in the shift register file of each unit sends its stored data to R1 in the shift register file of the unit on its right. After the shift, (3, 1) is stored in the unit for storing and processing data in the first row and first column, (3, 3) in the unit in the first row and second column, (5, 1) in the unit in the second row and first column, and (5, 3) in the unit in the second row and second column; the last column of values is stored in the first column of units for storing data on the right side of the 8 × 8 array formed by the units for storing and processing data, and the first row of values remains in the first row of units for storing data on the upper side. Each unit for storing and processing data then multiplies its stored value by G to obtain a partial sum and adds it to the originally stored partial sum to obtain its partial result: the unit in the first row and first column obtains G(3, 1) and the partial result A(1, 1) + C(1, 3) + I(3, 3) + G(3, 1); the unit in the first row and second column obtains G(3, 3) and the partial result A(1, 3) + C(1, 5) + I(3, 5) + G(3, 3); the unit in the second row and first column obtains G(5, 1) and the partial result A(3, 1) + C(3, 3) + I(5, 3) + G(5, 1); the unit in the second row and second column obtains G(5, 3) and the partial result A(3, 3) + C(3, 5) + I(5, 5) + G(5, 3); the other units for storing and processing data are not described one by one. As before, the units for storing data perform no operation and therefore obtain no partial sum, because convolving the data shown in fig. 2 with the convolution kernel shown in fig. 4 at step size 2 yields an 8 × 8 data array. Finally, the partial results of all units for storing and processing data are arranged as needed to obtain the sub-processing result corresponding to the first sampling result 301.
First sampling result 302 and second sampling result 502: first, B is determined as the first weight value, and the first sampling result keeps the initial position it had when input into the processing array; that is, (1, 2) is stored in the cell for storing and processing data in the first row and first column, (1, 4) in the first row and second column, (3, 2) in the second row and first column, and (3, 4) in the second row and second column, while the 9th row of values (i.e. the last row of values) is stored in the first row of cells for storing data on the lower side of the 8 × 8 array composed of the cells for storing and processing data. Each cell for storing and processing data then multiplies its stored data by B to obtain a partial sum and stores it: the cell in the first row and first column obtains B (1, 2), the cell in the first row and second column obtains B (1, 4), the cell in the second row and first column obtains B (3, 2), and the cell in the second row and second column obtains B (3, 4); the other cells for storing and processing data are not described one by one. Note that the cells that only store data perform no operation and therefore obtain no partial sum, because convolving the data shown in fig. 2 with the convolution kernel shown in fig. 4 at a step size of 2 yields an 8 × 8 data array; that is, in the convolution process the last row of the first sampling result 302 is multiplied only by the weight value H of the second sampling result 502 and does not need to be multiplied by B. Then, for H, which is a non-first weight value, since H is on the lower side of B, the first sampling result as a whole is shifted one unit upward relative to the processing array, i.e. R1 in the shift register file of each cell sends its stored data to R1 in the shift register file of the cell above it. After the shift, (3, 2) is stored in the cell for storing and processing data in the first row and first column, (3, 4) in the first row and second column, (5, 2) in the second row and first column, and (5, 4) in the second row and second column, and the first row of values is stored in the first row of cells for storing data on the upper side of the 8 × 8 array composed of the cells for storing and processing data. Each cell for storing and processing data then multiplies its stored data by H to obtain a partial sum and adds it to the previously stored partial sum: the cell in the first row and first column obtains H (3, 2) and adds it to the stored B (1, 2), giving the partial result B (1, 2) + H (3, 2); the cell in the first row and second column obtains H (3, 4), giving B (1, 4) + H (3, 4); the cell in the second row and first column obtains H (5, 2), giving B (3, 2) + H (5, 2); and the cell in the second row and second column obtains H (5, 4), giving B (3, 4) + H (5, 4). The other cells for storing and processing data are not described one by one. Again, the cells that only store data perform no operation and obtain no partial sum; that is, in the convolution process the first row of the first sampling result 302 is multiplied only by the weight value B of the second sampling result 502 and is not multiplied by H. Finally, the partial results of all cells for storing and processing data are arranged as needed to obtain the sub-processing result corresponding to the first sampling result 302.
First sampling result 303 and second sampling result 503: first, D is determined as the first weight value, and the first sampling result keeps the initial position it had when input into the processing array; that is, (2, 1) is stored in the cell for storing and processing data in the first row and first column, (2, 3) in the first row and second column, (4, 1) in the second row and first column, and (4, 3) in the second row and second column, while the 9th column of values (i.e. the last column of values) is stored in the first column of cells for storing data on the right side of the 8 × 8 array composed of the cells for storing and processing data. Each cell for storing and processing data then multiplies its stored data by D to obtain a partial sum and stores it: the cell in the first row and first column obtains D (2, 1), the cell in the first row and second column obtains D (2, 3), the cell in the second row and first column obtains D (4, 1), and the cell in the second row and second column obtains D (4, 3); the other cells for storing and processing data are not described one by one. Note that the cells that only store data perform no operation and therefore obtain no partial sum, because convolving the data shown in fig. 2 with the convolution kernel shown in fig. 4 at a step size of 2 yields an 8 × 8 data array; that is, in the convolution process the last column of the first sampling result 303 is multiplied only by the weight value F of the second sampling result 503 and does not need to be multiplied by D. Then, for F, which is a non-first weight value, since F is on the right side of D, the first sampling result as a whole is shifted one unit to the left relative to the processing array, i.e. R1 in the shift register file of each cell sends its stored data to R1 in the shift register file of the cell on its left. After the shift, (2, 3) is stored in the cell for storing and processing data in the first row and first column, (2, 5) in the first row and second column, (4, 3) in the second row and first column, and (4, 5) in the second row and second column, and the first column of values is stored in the first column of cells for storing data on the left side of the 8 × 8 array composed of the cells for storing and processing data. Each cell for storing and processing data then multiplies its stored data by F to obtain a partial sum and adds it to the previously stored partial sum: the cell in the first row and first column obtains F (2, 3) and adds it to the stored D (2, 1), giving the partial result D (2, 1) + F (2, 3); the cell in the first row and second column obtains F (2, 5), giving D (2, 3) + F (2, 5); the cell in the second row and first column obtains F (4, 3), giving D (4, 1) + F (4, 3); and the cell in the second row and second column obtains F (4, 5), giving D (4, 3) + F (4, 5). The other cells for storing and processing data are not described one by one. Again, the cells that only store data perform no operation and obtain no partial sum; that is, in the convolution process the first column of the first sampling result 303 is multiplied only by the weight value D of the second sampling result 503 and is not multiplied by F. Finally, the partial results of all cells for storing and processing data are arranged as needed to obtain the sub-processing result corresponding to the first sampling result 303.
First sampling result 304 and second sampling result 504: first, E is determined as the first weight value, and the first sampling result keeps the initial position it had when input into the processing array; that is, (2, 2) is stored in the cell for storing and processing data in the first row and first column, (2, 4) in the first row and second column, (4, 2) in the second row and first column, and (4, 4) in the second row and second column. Each cell for storing and processing data then multiplies its stored data by E to obtain a partial sum and stores it: the cell in the first row and first column obtains the partial result E (2, 2), the cell in the first row and second column obtains E (2, 4), the cell in the second row and first column obtains E (4, 2), and the cell in the second row and second column obtains E (4, 4); the other cells for storing and processing data are not described one by one. Finally, the partial results of all cells for storing and processing data are arranged as needed to obtain the sub-processing result corresponding to the first sampling result 304.
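The shift-and-accumulate procedure walked through above for the four pairs of sampling results can be sketched in software as follows. This is a minimal NumPy sketch under stated assumptions, not the hardware of the disclosure: the names `shift` and `sub_result`, the modelling of the R1 registers as one (a + 2) × (a + 2) plane with a single overflow row or column on each side, and the zero fill of vacated cells are all introduced here for illustration only.

```python
import numpy as np

def shift(plane, up=0, left=0):
    """Move every stored value `up` rows up and `left` columns left
    (negative values move down / right). Vacated cells are zero-filled."""
    out = np.roll(plane, shift=(-up, -left), axis=(0, 1))
    if up > 0:
        out[-up:, :] = 0
    elif up < 0:
        out[:-up, :] = 0
    if left > 0:
        out[:, -left:] = 0
    elif left < 0:
        out[:, :-left] = 0
    return out

def sub_result(sample, weights, a=8):
    """Accumulate one sub-processing result on an a x a active array.

    sample  : one first sampling result (at most (a + 1) x (a + 1) values)
    weights : list of ((row, col), value) for one second sampling result,
              in processing order; (row, col) is the weight's position
              inside the sampled kernel, the first entry being (0, 0)
    """
    plane = np.zeros((a + 2, a + 2))          # active array plus overflow border
    rows, cols = sample.shape
    plane[1:1 + rows, 1:1 + cols] = sample    # initial position
    acc = np.zeros((a, a))                    # stored partial sums
    pos = (0, 0)
    for (r, c), w in weights:
        # A weight lower / further right than the previous one moves the data up / left.
        plane = shift(plane, up=r - pos[0], left=c - pos[1])
        pos = (r, c)
        acc += w * plane[1:a + 1, 1:a + 1]    # only the active cells compute
    return acc

# Example: first sampling result 301 (odd rows, odd columns of 17 x 17 data)
# with weights A, C, I, G processed in the order used in the text.
rng = np.random.default_rng(0)
data = rng.integers(0, 10, (17, 17)).astype(float)
A, C, G, I = 1.0, 2.0, 3.0, 4.0
part_301 = sub_result(data[0::2, 0::2],
                      [((0, 0), A), ((0, 1), C), ((1, 1), I), ((1, 0), G)])
```

Tracking the plane explicitly, rather than slicing the sample directly, mirrors the role of the overflow rows and columns: values pushed out of the active array by one shift are still available when a later shift brings them back.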
Step S603 may be performed as follows: the plurality of sub-processing results are summed to obtain the processing result. Since each sub-processing result consists of partial results arranged in rows and columns, and all sub-processing results have the same number of rows and the same number of columns (equal to the number of rows and columns of the cells for storing and processing data), the partial results at corresponding positions are added and the sum at each position is taken as the processing result; that is, the partial results obtained by each cell for storing and processing data are summed to obtain the value of that cell, and the values of all cells constitute the processing result. For example, in the example of figs. 3 and 5, the cell for storing and processing data in the first row and first column yields four partial results, which are added to obtain the value at the corresponding position of the processing result, i.e. A (1, 1) + C (1, 3) + I (3, 3) + G (3, 1) + B (1, 2) + H (3, 2) + D (2, 1) + F (2, 3) + E (2, 2); the cell in the first row and second column yields four partial results, which are added to obtain A (1, 3) + C (1, 5) + I (3, 5) + G (3, 3) + B (1, 4) + H (3, 4) + D (2, 3) + F (2, 5) + E (2, 4); the cell in the second row and first column yields four partial results, which are added to obtain A (3, 1) + C (3, 3) + I (5, 3) + G (5, 1) + B (3, 2) + H (5, 2) + D (4, 1) + F (4, 3) + E (4, 2); and the cell in the second row and second column yields four partial results, which are added to obtain A (3, 3) + C (3, 5) + I (5, 5) + G (5, 3) + B (3, 4) + H (5, 4) + D (4, 3) + F (4, 5) + E (4, 4).
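The summation in step S603 can be checked numerically. The sketch below, which assumes square data and a square kernel and uses the hypothetical function names `direct_conv` and `decomposed_conv`, samples the data and the kernel with the same step size, accumulates one sub-processing result per pair of sampling results, and compares the sum with a direct strided convolution (computed, as in the sums above, without flipping the kernel).

```python
import numpy as np

def direct_conv(data, kernel, stride):
    """Reference: slide the kernel over the data with the given step size."""
    k = kernel.shape[0]
    n = (data.shape[0] - k) // stride + 1
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            window = data[i * stride:i * stride + k, j * stride:j * stride + k]
            out[i, j] = np.sum(window * kernel)
    return out

def decomposed_conv(data, kernel, stride):
    """Sum of sub-processing results, one per (first, second) sampling pair."""
    k = kernel.shape[0]
    n = (data.shape[0] - k) // stride + 1
    out = np.zeros((n, n))
    for r in range(stride):
        for c in range(stride):
            sample = data[r::stride, c::stride]      # first sampling result
            sub_k = kernel[r::stride, c::stride]     # matching second sampling result
            # Within a pair the convolution has step size 1: each weight is
            # multiplied by the sample shifted by that weight's position.
            for dr in range(sub_k.shape[0]):
                for dc in range(sub_k.shape[1]):
                    out += sub_k[dr, dc] * sample[dr:dr + n, dc:dc + n]
    return out

rng = np.random.default_rng(1)
data = rng.standard_normal((17, 17))     # 17 x 17 data, as in the example
kernel = rng.standard_normal((3, 3))     # weights A..I
same = np.allclose(direct_conv(data, kernel, 2), decomposed_conv(data, kernel, 2))
```

For the 3 × 3 kernel and step size 2 of the example, the four pairs contribute 4, 2, 2 and 1 products per output value, which together make up the nine terms listed for each cell.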
In some embodiments of the present disclosure, the data to be processed is obtained according to data of an image, and in order to match a first sampling result after sampling the data to be processed with the processing array, the data to be processed may be obtained by: firstly, determining the number of rows and the number of columns of the data to be processed according to the processing array, the convolution kernel and the step length; then, determining the number of overlapped rows and the number of overlapped columns according to the convolution kernel and the step length; and finally, sampling the data of the image to be processed according to the number of the rows and the columns of the data to be processed, the number of the overlapped rows and the number of the overlapped columns to obtain a plurality of data to be processed.
The number of overlapping rows and the number of overlapping columns may also be equal, and this number P may be determined in the following manner:
P = K - S,
where K is the number of rows of the convolution kernel (its number of rows and number of columns being equal) and S is the step size of the convolution operation. Further, K is greater than or equal to S. This is consistent with the example above, in which K = 3 and S = 2 give P = 1.
The number of rows and the number of columns of the data to be processed may be equal, and this number L may be determined in the following manner:
L = S × a + P, where S is the step size of the convolution operation, a is the number of rows (equal to the number of columns) of the cells for storing and processing data, and P is the number of overlapping rows and columns.
As shown in fig. 13, when sampling the data of the image to be processed, an L × L sampling frame is first placed at the top left corner of the data of the image, and the data in the sampling frame is taken as the first data to be processed. The sampling frame is then moved to the right by L - P and the data in it is taken as the second data to be processed; this is repeated until the sampling frame can no longer be moved to the right by L - P. The sampling frame is then returned to the leftmost position and moved down by L - P, and the sampling process of the first row is repeated on the new row. After the second row is finished, the sampling frame continues to be moved down in the same way, each new row being sampled like the first row, until the sampling frame can no longer be moved down by L - P.
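The sampling-frame walk described above can be sketched as a short routine that lists the top left corner of every frame position. The function name `tile_origins` and its parameters are hypothetical, and the overlap P = K - S used here is an assumption inferred from the worked example (a = 8, K = 3, S = 2 and 17 × 17 data); positions are 0-indexed.

```python
def tile_origins(height, width, a, K, S):
    """Return (L, P, origins) for an L x L sampling frame moved in steps of L - P.

    a : rows (= columns) of the cells for storing and processing data
    K : rows (= columns) of the convolution kernel
    S : step size of the convolution operation
    """
    P = K - S                 # assumed overlap between neighbouring frames
    L = S * a + P             # side length of one block of data to be processed
    step = L - P              # distance the frame moves each time
    # The frame stops as soon as another move would leave the image.
    row_starts = range(0, height - L + 1, step)
    col_starts = range(0, width - L + 1, step)
    origins = [(r, c) for r in row_starts for c in col_starts]
    return L, P, origins

print(tile_origins(33, 33, a=8, K=3, S=2))
# -> (17, 1, [(0, 0), (0, 16), (16, 0), (16, 16)])
```

With these values each frame yields an 8 × 8 output block, and neighbouring frames share exactly one row or column, so the blocks tile the output of the full convolution without gaps or duplicates.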
According to a second aspect of embodiments of the present invention, there is provided a data processing apparatus, the apparatus comprising:
the controller is used for sampling data to be processed according to the step length of convolution operation to obtain at least one first sampling result, wherein the step length is larger than 1; sampling a convolution kernel according to the step length of the convolution operation to obtain at least one second sampling result, wherein the at least one first sampling result is in one-to-one correspondence with the at least one second sampling result; correspondingly inputting the at least one first sampling result and the at least one second sampling result into the processing array;
the processing array is configured to process the at least one first sampling result and the at least one second sampling result, and output a processing result.
In some embodiments of the present disclosure, the controller is configured to perform row sampling on the data to be processed according to the step size to obtain at least one first row sampling result, where the union of the at least one first row sampling result is the data to be processed; perform column sampling on the data to be processed according to the step size to obtain at least one first column sampling result, where the union of the at least one first column sampling result is the data to be processed; and determine the intersection of each first row sampling result with each first column sampling result as a first sampling result.
In some embodiments of the present disclosure, the controller is configured to perform row sampling on the convolution kernel according to the step size to obtain at least one second row sampling result, where the union of the at least one second row sampling result is the convolution kernel; perform column sampling on the convolution kernel according to the step size to obtain at least one second column sampling result, where the union of the at least one second column sampling result is the convolution kernel; and determine the intersection of each second row sampling result with each second column sampling result as a second sampling result.
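The row sampling, column sampling and intersection described in the two preceding paragraphs can be illustrated with a short NumPy sketch. The function name `row_col_samples` is hypothetical; the same routine is applied to the data to be processed and to the convolution kernel, so that the sampling result keyed (r, c) of one corresponds to the sampling result keyed (r, c) of the other.

```python
import numpy as np

def row_col_samples(x, stride):
    """Return {(r, c): intersection of row sample r and column sample c}."""
    n_rows, n_cols = x.shape
    # Row sample r keeps rows r, r + stride, ...; together they cover every row.
    row_sets = [list(range(r, n_rows, stride)) for r in range(min(stride, n_rows))]
    # Column sample c keeps columns c, c + stride, ...; together they cover every column.
    col_sets = [list(range(c, n_cols, stride)) for c in range(min(stride, n_cols))]
    return {(r, c): x[np.ix_(rows, cols)]
            for r, rows in enumerate(row_sets)
            for c, cols in enumerate(col_sets)}

kernel = np.array([["A", "B", "C"],
                   ["D", "E", "F"],
                   ["G", "H", "I"]])
second = row_col_samples(kernel, stride=2)
# second[(0, 0)] holds A, C, G, I; second[(0, 1)] holds B, H;
# second[(1, 0)] holds D, F;       second[(1, 1)] holds E.
```

Each intersection equals the strided slice `x[r::stride, c::stride]`; the explicit index sets are kept only to mirror the wording of the text.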
In some embodiments of the present disclosure, the controller is configured to, for each first sampling result, input the first sampling result to the processing array, and input the second sampling result corresponding to the first sampling result to the processing array; and
the processing array is configured to determine a corresponding sub-processing result according to the first sampling result and the corresponding second sampling result, and to output the processing result according to the sub-processing result corresponding to each first sampling result.
In some embodiments of the disclosure, the controller is configured to, for each first sampling result, input a plurality of numerical values of the first sampling result into a plurality of cells of the processing array such that a relative position of the plurality of numerical values in the plurality of cells is the same as a relative position of the plurality of numerical values in the first sampling result.
In some embodiments of the present disclosure, the processing array comprises an active array, at least one overflow row and at least one overflow column distributed around the active array, wherein the active array comprises a plurality of cells for storing and processing data, the overflow row and the overflow column comprise a plurality of cells for storing data;
the controller is configured to input a plurality of numerical values of the first sampling result into a plurality of cells of the processing array, so that a numerical value in a first row and a first column of the first sampling result is input into a first row and a first column of the cell for storing and processing data.
In some embodiments of the present disclosure, the processing array is configured to, for each weight value in the corresponding second sampling result, determine a partial sum using the weight value and the numerical value in the first sampling result that corresponds to the weight value;
determine a partial result according to the partial sums respectively corresponding to the weight values in the corresponding second sampling result; and
determine a sub-processing result corresponding to the first sampling result according to at least one partial result.
In some embodiments of the present disclosure, the processing array is configured to, for the first weight value in the corresponding second sampling result, determine a partial sum using the first weight value and the numerical value of the first sampling result held in the corresponding cell of the processing array at the initial position.
In some embodiments of the present disclosure, the controller is configured to, for each non-first weight value in the corresponding second sampling result, determine a moving manner of the first sampling result according to the positional relationship, in the first sampling result, between a first numerical value corresponding to the non-first weight value and a second numerical value corresponding to the weight value preceding the non-first weight value;
the processing array is configured to move the second numerical value to a corresponding cell in the determined moving manner, and to determine a partial sum using the numerical value in the corresponding cell after the move and the non-first weight value.
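How a moving manner might be derived from the positional relationship can be sketched as below. The helper `move_mode` is hypothetical; it assumes that positions are given as (row, column) indices of the weight values inside the second sampling result, which mirror the relative positions of the corresponding numerical values in the first sampling result, and it follows the directions used in the worked example (a weight to the left of the previous one moves the data right, a weight below moves it up, a weight to the right moves it left).

```python
def move_mode(prev_pos, cur_pos):
    """Return the shifts that bring the data for the current weight into place.

    prev_pos, cur_pos : (row, col) of the previous and the current weight value
    Result            : list of (direction, number_of_units) tuples
    """
    d_row = cur_pos[0] - prev_pos[0]
    d_col = cur_pos[1] - prev_pos[1]
    moves = []
    if d_row:
        # Current weight is lower: the data moves up; higher: the data moves down.
        moves.append(("up" if d_row > 0 else "down", abs(d_row)))
    if d_col:
        # Current weight is further right: the data moves left; further left: right.
        moves.append(("left" if d_col > 0 else "right", abs(d_col)))
    return moves

# Second sampling result 501 is [[A, C], [G, I]], processed in the order A, C, I, G.
order = [(0, 0), (0, 1), (1, 1), (1, 0)]
plan = [move_mode(p, c) for p, c in zip(order, order[1:])]
```

Choosing a processing order in which consecutive weights are adjacent, as in A, C, I, G, keeps every move to a single unit, which matches an array with one overflow row or column on each side.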
In some embodiments of the present disclosure, the controller is further configured to determine, according to the processing array, the convolution kernel, and the step size, a number of rows and a number of columns of the data to be processed; determining the number of overlapped rows and the number of overlapped columns according to the convolution kernel and the step length; and sampling the data of the image to be processed according to the number of the rows and the columns of the data to be processed, the number of the overlapped rows and the number of the overlapped columns to obtain a plurality of data to be processed.
In some embodiments of the present disclosure, the data to be processed is single-channel data or one channel of multi-channel data, and the convolution kernel is a single-channel convolution kernel or one channel of a multi-channel convolution kernel.
With regard to the apparatus in the above-mentioned embodiments, the specific manner in which each module performs the operation has been described in detail in the first aspect with respect to the embodiment of the method, and will not be elaborated here.
The data processing device provided by the embodiment of the disclosure may include a chip, an AI chip, and the like.
In a third aspect, at least one embodiment of the present invention provides an electronic device. Referring to fig. 14, which shows the structure of the electronic device, the electronic device includes a memory, a processor, and the data processing apparatus provided in the embodiments of the present disclosure. The memory is used for storing computer instructions executable on the processor, and the processor is used for processing data based on the method of the first aspect when executing the computer instructions.
In a fourth aspect, at least one embodiment of the invention provides a computer readable storage medium having a computer program stored thereon, which when executed by a processor, performs the method of the first aspect.
In the present invention, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The term "plurality" means two or more unless expressly limited otherwise.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (24)

1. A method of data processing, the method comprising:
sampling data to be processed according to the step length of convolution operation to obtain at least one first sampling result, wherein the step length is larger than 1;
sampling a convolution kernel according to the step length of the convolution operation to obtain at least one second sampling result, wherein the at least one first sampling result is in one-to-one correspondence with the at least one second sampling result;
correspondingly inputting the at least one first sampling result and the at least one second sampling result into a processing array, so that the processing array outputs a processing result.
2. The data processing method according to claim 1, wherein sampling the data to be processed according to the step size of the convolution operation to obtain at least one first sampling result comprises:
performing row sampling on the data to be processed according to the step length to obtain at least one first row sampling result, wherein the union of the at least one first row sampling result is the data to be processed;
performing column sampling on the data to be processed according to the step length to obtain at least one first column sampling result, wherein the union of the at least one first column sampling result is the data to be processed;
and respectively determining the intersection of each first row sampling result and each first column sampling result as a first sampling result.
3. The data processing method of claim 1, wherein sampling the convolution kernel according to the step size of the convolution operation to obtain at least one second sampling result comprises:
performing row sampling on the convolution kernel according to the step length to obtain at least one second row sampling result, wherein the union of the at least one second row sampling result is the convolution kernel;
performing column sampling on the convolution kernel according to the step length to obtain at least one second column sampling result, wherein the union of the at least one second column sampling result is the convolution kernel;
and respectively determining the intersection of each second row sampling result and each second column sampling result as a second sampling result.
4. The data processing method of claim 1, wherein the correspondingly inputting the at least one first sampling result and the at least one second sampling result into a processing array, so that the processing array outputs a processing result, comprises:
for each first sampling result, inputting the first sampling result into the processing array, and inputting the second sampling result corresponding to the first sampling result into the processing array; and
controlling the processing array to determine a corresponding sub-processing result according to the first sampling result and the corresponding second sampling result;
and controlling the processing array to output a processing result according to the sub-processing result corresponding to each first sampling result.
5. The data processing method of claim 4, wherein the inputting of each first sampling result to the processing array comprises:
for each first sampling result, inputting a plurality of numerical values of the first sampling result into a plurality of cells of the processing array such that the relative positions of the plurality of numerical values in the plurality of cells are the same as the relative positions of the plurality of numerical values in the first sampling result.
6. The data processing method of claim 5, wherein the processing array comprises an active array, at least one overflow row and at least one overflow column distributed around the active array, wherein the active array comprises a plurality of cells for storing and processing data, and wherein the overflow row and the overflow column comprise a plurality of cells for storing data;
the inputting the plurality of values of the first sampling result into the plurality of cells of the processing array includes:
and inputting a plurality of numerical values of the first sampling result into a plurality of units of the processing array, so that the numerical value of the first row and the first column in the first sampling result is input into the first row and the first column of the unit for storing and processing data.
7. The data processing method according to any of claims 4 to 6, wherein said controlling said processing array to determine a corresponding sub-processing result based on the first sampling result and the corresponding second sampling result comprises:
for each weight value in the corresponding second sampling result, controlling the processing array to determine a partial sum using the weight value and the numerical value in the first sampling result that corresponds to the weight value;
controlling the processing array to determine a partial result according to the partial sums respectively corresponding to the weight values in the corresponding second sampling result;
and controlling the processing array to determine a sub-processing result corresponding to the first sampling result according to at least one partial result.
8. The data processing method of claim 7, wherein the controlling, for each weight value in the corresponding second sampling result, the processing array to determine a partial sum using the weight value and the numerical value in the first sampling result that corresponds to the weight value comprises:
for the first weight value in the corresponding second sampling result, controlling the processing array to determine a partial sum using the first weight value and the numerical value of the first sampling result held in the corresponding cell of the processing array at the initial position.
9. The data processing method according to claim 7 or 8, wherein the controlling, for each weight value in the corresponding second sampling result, the processing array to determine a partial sum using the weight value and the numerical value in the first sampling result that corresponds to the weight value comprises:
for each non-first weight value in the corresponding second sampling result, determining a moving manner of the first sampling result according to the positional relationship, in the first sampling result, between a first numerical value corresponding to the non-first weight value and a second numerical value corresponding to the weight value preceding the non-first weight value, and controlling the processing array to move the second numerical value to a corresponding cell in the determined moving manner;
and controlling the processing array to determine a partial sum using the numerical value in the corresponding cell after the move and the non-first weight value.
10. The data processing method according to any one of claims 1 to 9, further comprising:
determining the number of rows and columns of the data to be processed according to the processing array, the convolution kernel and the step length;
determining the number of overlapped rows and the number of overlapped columns according to the convolution kernel and the step length;
and sampling the data of the image to be processed according to the number of the rows and the columns of the data to be processed, the number of the overlapped rows and the number of the overlapped columns to obtain a plurality of data to be processed.
11. The data processing method according to any one of claims 1 to 10, wherein the data to be processed is one channel of single-channel data or multi-channel data, and the convolution kernel is one channel of a single-channel convolution kernel or a multi-channel convolution kernel.
12. A data processing apparatus, characterized in that the apparatus comprises:
a controller, configured to: sample data to be processed according to the step length of a convolution operation to obtain at least one first sampling result, wherein the step length is greater than 1; sample a convolution kernel according to the step length of the convolution operation to obtain at least one second sampling result, wherein the at least one first sampling result is in one-to-one correspondence with the at least one second sampling result; and correspondingly input the at least one first sampling result and the at least one second sampling result into a processing array; and
the processing array, configured to process the at least one first sampling result and the at least one second sampling result and to output a processing result.
13. The data processing apparatus according to claim 12, wherein the controller is configured to: perform row sampling on the data to be processed according to the step length to obtain at least one first row sampling result, wherein the union of the at least one first row sampling result is the data to be processed; perform column sampling on the data to be processed according to the step length to obtain at least one first column sampling result, wherein the union of the at least one first column sampling result is the data to be processed; and determine the intersection of each first row sampling result with each first column sampling result as a first sampling result.
14. The data processing apparatus of claim 12, wherein the controller is configured to: perform row sampling on the convolution kernel according to the step length to obtain at least one second row sampling result, wherein the union of the at least one second row sampling result is the convolution kernel; perform column sampling on the convolution kernel according to the step length to obtain at least one second column sampling result, wherein the union of the at least one second column sampling result is the convolution kernel; and determine the intersection of each second row sampling result with each second column sampling result as a second sampling result.
15. The data processing apparatus of claim 12, wherein the controller is configured to, for each first sampling result, input the first sampling result to the processing array and input the second sampling result corresponding to the first sampling result to the processing array; and
the processing array is configured to determine a corresponding sub-processing result according to the first sampling result and the corresponding second sampling result, and to output the processing result according to the sub-processing result corresponding to each first sampling result.
16. The data processing apparatus of claim 15, wherein the controller is configured to input, for each first sampling result, a plurality of values of the first sampling result into a plurality of cells of the processing array, such that a relative position of the plurality of values in the plurality of cells is the same as a relative position of the plurality of values in the first sampling result.
17. The data processing apparatus of claim 16, wherein the processing array comprises an active array comprising a plurality of cells for storing and processing data, at least one overflow row and at least one overflow column distributed around the active array, the overflow row and the overflow column comprising a plurality of cells for storing data;
the controller is configured to input the plurality of numerical values of the first sampling result into the plurality of cells of the processing array, so that the numerical value in the first row and first column of the first sampling result is input into the first-row, first-column cell among the cells for storing and processing data.
18. The data processing apparatus according to any of claims 15 to 17, wherein the processing array is configured to: determine, for each weight value in the corresponding second sampling result, a partial sum using the weight value and the value in the first sampling result corresponding to the weight value;
determine a partial result according to the partial sums respectively corresponding to the weight values in the corresponding second sampling result; and
determine a sub-processing result corresponding to the first sampling result according to at least one partial result.
19. The data processing apparatus of claim 18, wherein the processing array is configured to determine, for the first weight value in the corresponding second sampling result, a partial sum using the first weight value and the numerical values of the first sampling result in the corresponding units at the initial position of the processing array.
20. The data processing apparatus according to claim 18 or 19, wherein the controller is configured to determine, for each non-first weight value in the corresponding second sampling result, a moving mode of the first sampling result according to the positional relationship between a first numerical value, corresponding to the non-first weight value in the first sampling result, and a second numerical value, corresponding to the weight value preceding the non-first weight value in the first sampling result; and
the processing array is configured to move the second numerical value to a corresponding unit in the determined moving mode, and to determine a partial sum using the numerical value in the corresponding unit after the move and the non-first weight value.
21. The data processing apparatus according to any of claims 12 to 20, wherein the controller is further configured to determine a number of rows and a number of columns of the data to be processed according to the processing array, the convolution kernel, and the step size; determining the number of overlapped rows and the number of overlapped columns according to the convolution kernel and the step length; and sampling the data of the image to be processed according to the number of the rows and the columns of the data to be processed, the number of the overlapped rows and the number of the overlapped columns to obtain a plurality of data to be processed.
22. The data processing apparatus according to any one of claims 12 to 21, wherein the data to be processed is one channel of single-channel data or multi-channel data, and the convolution kernel is one channel of a single-channel convolution kernel or a multi-channel convolution kernel.
23. An electronic device, characterized in that the device comprises a memory, a processor, and an apparatus according to any of claims 12 to 22.
24. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1 to 11.
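The sampling and partial-sum steps recited in claims 12 to 20 can be illustrated with a small software sketch. It is not the claimed hardware: the function names (`sample_by_stride`, `direct_strided_conv`, `sampled_conv`) are hypothetical, plain Python lists stand in for the processing array, the "move" of the first sampling result is modeled as an index offset rather than physical data movement, and an unpadded cross-correlation convention is assumed. Under those assumptions, the sketch checks that summing the stride-1 results of each matched pair of sampling results reproduces a direct convolution with step length greater than 1.

```python
def sample_by_stride(matrix, stride):
    """Split a 2D list into stride*stride sampling results.

    Entry (r, c) holds rows r, r+stride, ... and columns c, c+stride, ...,
    i.e. the intersection of one row sampling and one column sampling.
    """
    return {
        (r, c): [row[c::stride] for row in matrix[r::stride]]
        for r in range(stride)
        for c in range(stride)
    }


def output_shape(data, kernel, stride):
    """Output size of an unpadded convolution with the given step length."""
    out_h = (len(data) - len(kernel)) // stride + 1
    out_w = (len(data[0]) - len(kernel[0])) // stride + 1
    return out_h, out_w


def direct_strided_conv(data, kernel, stride):
    """Reference result: slide the full kernel with the given step length."""
    out_h, out_w = output_shape(data, kernel, stride)
    return [
        [
            sum(
                kernel[i][j] * data[y * stride + i][x * stride + j]
                for i in range(len(kernel))
                for j in range(len(kernel[0]))
            )
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def sampled_conv(data, kernel, stride):
    """Same result, computed from matched pairs of sampling results."""
    out_h, out_w = output_shape(data, kernel, stride)
    first_results = sample_by_stride(data, stride)     # sampled data
    second_results = sample_by_stride(kernel, stride)  # sampled kernel
    out = [[0] * out_w for _ in range(out_h)]
    for phase, sub_kernel in second_results.items():
        sub_data = first_results[phase]  # one-to-one correspondence
        # One partial sum per weight value; (a, b) is the offset by which
        # the sampled data is shifted relative to its initial position.
        for a, kernel_row in enumerate(sub_kernel):
            for b, weight in enumerate(kernel_row):
                for y in range(out_h):
                    for x in range(out_w):
                        out[y][x] += weight * sub_data[y + a][x + b]
    return out


if __name__ == "__main__":
    data = [[(3 * r + 5 * c) % 11 for c in range(7)] for r in range(7)]
    kernel = [[1, -2, 3], [0, 4, -1], [2, 1, -3]]
    assert sampled_conv(data, kernel, 2) == direct_strided_conv(data, kernel, 2)
```

Within each pair, every weight is applied with unit steps over the sampled data, which is why a shift by one cell between consecutive weights suffices in this model.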
CN202110352221.5A 2021-03-31 2021-03-31 Data processing method, device, equipment and storage medium Pending CN112927124A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110352221.5A CN112927124A (en) 2021-03-31 2021-03-31 Data processing method, device, equipment and storage medium
PCT/CN2021/115555 WO2022205763A1 (en) 2021-03-31 2021-08-31 Data processing method and apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110352221.5A CN112927124A (en) 2021-03-31 2021-03-31 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112927124A true CN112927124A (en) 2021-06-08

Family

ID=76173597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110352221.5A Pending CN112927124A (en) 2021-03-31 2021-03-31 Data processing method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112927124A (en)
WO (1) WO2022205763A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022205763A1 (en) * 2021-03-31 2022-10-06 成都商汤科技有限公司 Data processing method and apparatus, device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110533164A (en) * 2019-08-05 2019-12-03 西安交通大学 A kind of Winograd convolution method for splitting towards convolutional neural networks accelerator
CN111428189A (en) * 2020-04-01 2020-07-17 南京大学 Data preprocessing method and device for deconvolution operation
CN112395092A (en) * 2020-11-30 2021-02-23 清华大学 Data processing method and artificial intelligence processor

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10621725B2 (en) * 2017-04-12 2020-04-14 Here Global B.V. Small object detection from a large image
CN109885407B (en) * 2019-03-05 2021-09-21 上海商汤智能科技有限公司 Data processing method and device, electronic equipment and storage medium
CN111597029B (en) * 2020-05-20 2024-03-22 上海商汤智能科技有限公司 Data processing method and device, electronic equipment and storage medium
CN112927124A (en) * 2021-03-31 2021-06-08 成都商汤科技有限公司 Data processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
WO2022205763A1 (en) 2022-10-06

Similar Documents

Publication Publication Date Title
EP3349153B1 (en) Convolutional neural network (cnn) processing method and apparatus
US10909447B2 (en) Transposing neural network matrices in hardware
CN108615072B (en) Performing average pooling in hardware
TWI627593B (en) Rotating data for neural network computations
US11816559B2 (en) Dilated convolution using systolic array
CN114127742A (en) System and method for cross-channel, shift-based information mixing for a mixed-rank-like networking neural network
KR20190126887A (en) Alternative loop limit
US20180137414A1 (en) Convolution operation device and convolution operation method
JP2018026027A (en) Calculation processor and control method of calculation processor
JP2013205973A (en) Matrix arithmetic device
WO2020190466A1 (en) Spatially sparse convolutional neural networks for inking applications
US11915118B2 (en) Method and apparatus for processing computation of zero value in processing of layers in neural network
CN112967172A (en) Data processing device, method, computer equipment and storage medium
CN114041114A (en) Counter-based multiply-accumulate circuit for neural networks
CN111311599A (en) Image processing method, image processing device, electronic equipment and storage medium
US11164032B2 (en) Method of performing data processing operation
CN112927124A (en) Data processing method, device, equipment and storage medium
CN113994347A (en) System and method for asymmetric scale factor support for negative and positive values
CN114008664A (en) Optimization of deconvolution
US11610128B2 (en) Neural network training under memory restraint
CN113989169A (en) Expansion convolution accelerated calculation method and device
WO2023115814A1 (en) Fpga hardware architecture, data processing method therefor and storage medium
US20190164035A1 (en) Device for reorganizable neural network computing
JP7387017B2 (en) Address generation method and unit, deep learning processor, chip, electronic equipment and computer program
CN116051345A (en) Image data processing method, device, computer equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code Ref country code: HK; Ref legal event code: DE; Ref document number: 40044247; Country of ref document: HK