CN114968506A - Data processing device and method, computer equipment and storage medium - Google Patents

Info

Publication number
CN114968506A
Authority
CN
China
Prior art keywords
data
data processing
processing
processed
processors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110221178.9A
Other languages
Chinese (zh)
Inventor
江嘉文
何翔
顾茹雅
徐宁仪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Power Tensors Intelligent Technology Co Ltd
Original Assignee
Shanghai Power Tensors Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Power Tensors Intelligent Technology Co Ltd
Priority to CN202110221178.9A
Publication of CN114968506A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure provides a data processing apparatus, a method, a computer device, and a storage medium, wherein the data processing apparatus includes: a controller, and a plurality of data processors; the controller is used for issuing a processing instruction for indicating a data processing task to the corresponding data processor; the data processors are used for acquiring data to be processed from the corresponding internal buffers based on the received processing instructions and executing corresponding data processing tasks based on the data to be processed; the data processing device can improve the reuse rate of data, reduce the times of accessing from an external memory and improve the calculation capacity utilization rate.

Description

Data processing device, method, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a data processing apparatus, a data processing method, a computer device, and a storage medium.
Background
When data is processed, for example, by filtering, dividing, enhancing, and feature extracting, it is generally implemented by programming on a general-purpose Processor such as a Central Processing Unit (CPU) or a Digital Signal Processor (DSP).
When data is processed on such a general-purpose processor, the utilization of its computing power is low.
Disclosure of Invention
The embodiment of the disclosure at least provides a data processing device, a data processing method, computer equipment and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a data processing apparatus, including: a controller, and a plurality of data processors; the controller is used for issuing a processing instruction for indicating a data processing task to the corresponding data processor; the data processors are used for acquiring data to be processed from the corresponding internal buffers based on the received processing instructions and executing corresponding data processing tasks based on the data to be processed.
Therefore, the reuse rate of data can be improved, the times of accessing from the external memory are reduced, and the calculation capacity utilization rate is improved.
In an optional implementation manner, the controller is configured to issue a processing instruction corresponding to a corresponding data processing cycle to a corresponding data processor according to a data processing task of each data processing cycle; and the data processors are used for acquiring the data to be processed corresponding to the data processing period from the corresponding internal buffer and processing the data based on the processing instruction received in the data processing period in each data processing period.
Therefore, the controller sends the corresponding processing instructions to the corresponding data processors according to the data processing tasks corresponding to different data processing periods, so that the plurality of data processors can cooperatively complete the processing of the original data, the data processing efficiency is improved, the utilization rate of the data processors is improved to the maximum extent in one data processing period, or the time consumed by the data processors when different functions are switched is reduced, and the data processing mode is more flexible.
In an optional embodiment, the data to be processed differs according to the data processing instruction, and includes at least one of: external data, intermediate data processed by the data processor in the previous data processing period, and intermediate data processed by other data processors in the previous data processing period.
In an alternative embodiment, the data processing tasks include parallel data processing tasks; the controller is used for sending a corresponding processing instruction to a corresponding data processor in each data processing period aiming at a plurality of data processing periods of the parallel data processing task; for the same data processing cycle, the processing instructions received by the data processors are the same, and the processed data to be processed are different; the data processors are used for acquiring the data to be processed corresponding to the data processing period from the corresponding internal buffer in each data processing period except the head and tail data processing periods based on the processing instruction received in the data processing period; and storing the processing result of the processed data to be processed into the corresponding internal buffer as the data to be processed in the subsequent data processing period.
In this way, because each data processor processes its data to be processed according to the processing instruction of each data processing cycle, a data processor that is allocated a full workload can concentrate its computing power and process the data at its highest efficiency, which reduces the waste of computing resources.
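The parallel mode described above can be illustrated with a minimal Python sketch. It is not the disclosed hardware: the function name `run_parallel`, the list-based buffers, and the `scale` and `offset` operations are hypothetical stand-ins. In each data processing cycle every processor receives the same instruction, applies it to its own data, and writes the result back to its internal buffer as the input of the next cycle.

```python
from typing import Callable, List

# Hypothetical stand-in for a processing instruction: a function over one buffer.
Instruction = Callable[[List[int]], List[int]]


def run_parallel(tiles: List[List[int]], instructions: List[Instruction]) -> List[List[int]]:
    """Simulate parallel cycles: same instruction, different data per processor."""
    # One internal buffer per data processor, preloaded with that processor's data.
    buffers = [list(tile) for tile in tiles]
    for instruction in instructions:  # one data processing cycle per instruction
        # Every processor applies the same instruction to its own buffer, and the
        # result replaces the buffer contents as input for the following cycle.
        buffers = [instruction(buf) for buf in buffers]
    return buffers


def scale(buf: List[int]) -> List[int]:
    return [2 * x for x in buf]


def offset(buf: List[int]) -> List[int]:
    return [x + 1 for x in buf]


result = run_parallel([[1, 2], [3, 4]], [scale, offset])
print(result)  # [[3, 5], [7, 9]]
```

Here the two cycles correspond to two adjacent processing steps, and no intermediate result leaves the buffers between cycles.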
In an alternative embodiment, the plurality of data processors comprises a programmable data processor; the programmable data processor is used for providing different processing functions in different data processing periods in cooperation with the processing instructions corresponding to the data processing periods.
In an optional implementation manner, the sub-data processing tasks executed by the processing instructions corresponding to the adjacent data processing cycles correspond to adjacent processing steps in the parallel data processing task.
Therefore, the order of the processing instructions can be determined according to the adjacent processing steps in the data processing task, and possible confusion caused by a plurality of data processors when the images to be processed are processed is avoided.
In an alternative embodiment, the data processing tasks include serial data processing tasks; the controller is used for sending a corresponding processing instruction to a corresponding data processor in each data processing period aiming at a plurality of data processing periods of the serial data processing task; for the same data processing cycle, in a plurality of groups of data processors included in the plurality of data processors, the processing instructions received by the same group of data processors are the same, and the processing instructions received by different groups of data processors are different; the group of data processors comprises at least one data processor; each group of data processors in the plurality of data processors is used for acquiring data to be processed corresponding to the data processing period from the corresponding first internal buffer based on the processing instruction received in the data processing period; and storing the processing result of the data to be processed after processing into a corresponding second internal buffer as the data to be processed of another group of data processor in the subsequent data processing period.
Thus, the switching of the data processor on the corresponding processing function can be reduced, so that the switching time can be reduced in the data processing process, and the processing efficiency can be improved.
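The serial mode can be sketched in the same hedged way as a software pipeline. The function `run_serial` and its list-based buffers are hypothetical; each entry of `stages` stands for one group of data processors with a fixed function. In every cycle each group reads from its first buffer and writes its result to the buffer that the next group reads in the following cycle, so no group ever switches function.

```python
from typing import Callable, List, Optional, Tuple


def run_serial(tiles: List[int], stages: List[Callable[[int], int]]) -> Tuple[List[int], int]:
    """Simulate serial (pipelined) cycles; returns the outputs and the cycle count."""
    pending = list(tiles)
    # inputs[g] models the first internal buffer of group g (None means empty).
    inputs: List[Optional[int]] = [None] * len(stages)
    outputs: List[int] = []
    cycles = 0
    while pending or any(item is not None for item in inputs):
        if pending:
            inputs[0] = pending.pop(0)  # load external data for the first group
        next_inputs: List[Optional[int]] = [None] * len(stages)
        for g, stage in enumerate(stages):
            if inputs[g] is None:
                continue  # this group has nothing to process in this cycle
            result = stage(inputs[g])
            if g + 1 < len(stages):
                next_inputs[g + 1] = result  # second buffer, read by the next group
            else:
                outputs.append(result)  # the last group produces the final result
        inputs = next_inputs
        cycles += 1
    return outputs, cycles


outputs, cycles = run_serial([1, 2, 3], [lambda x: 2 * x, lambda x: x + 1])
print(outputs, cycles)  # [3, 5, 7] 4
```

With three data items and two groups the sketch needs 3 + 2 - 1 = 4 cycles, because the groups only all work at once after the pipeline has filled.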
In an alternative embodiment, the plurality of data processors comprises a programmable data processor; and the programmable data processors in different groups are used for providing different processing functions in the same data processing cycle in cooperation with the processing instructions in the corresponding groups.
In this way, the execution program of the data processor can be changed after the instruction is converted, so that the different processing instructions can be flexibly received. Meanwhile, as the data processor can provide more processing functions, the problem that more data processors are needed when a special data processor is used is solved, and the using number of the data processors in the data processing device is reduced so as to reduce the cost.
In an optional implementation manner, the sub data processing tasks executed by the processing instructions corresponding to different data processors with adjacent processing logics in the same data processing cycle correspond to adjacent processing steps in the serial data processing task.
In an alternative embodiment, the internal buffer includes a line buffer unit; the data processing apparatus further includes: a data conversion circuit; the line cache unit is used for storing original data to be processed which are input in a line scanning sequence; and the data conversion circuit is used for performing line-block conversion on the original data to be processed to obtain the data to be processed which is provided for the data processor to process.
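As an illustration only, line-block conversion can be thought of as regrouping rows that arrived in line-scan order into rectangular blocks. The sketch below assumes, hypothetically, that the block height equals the number of buffered lines and that the function name `lines_to_blocks` and its list-of-lists layout are free choices; the disclosure does not specify the circuit at this level.

```python
from typing import List


def lines_to_blocks(lines: List[List[int]], block_w: int) -> List[List[List[int]]]:
    """Regroup buffered scan lines into blocks that are block_w columns wide."""
    width = len(lines[0])
    # Each block takes the same column range from every buffered line.
    return [
        [row[left:left + block_w] for row in lines]
        for left in range(0, width, block_w)
    ]


# Two buffered lines of a 4-pixel-wide image, in line-scan order.
line_buffer = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
]
blocks = lines_to_blocks(line_buffer, block_w=2)
print(blocks)  # [[[0, 1], [4, 5]], [[2, 3], [6, 7]]]
```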
In an optional implementation manner, the controller is further configured to perform segmentation processing on the image to be processed according to a data processing window with a preset size, so as to obtain the data to be processed.
When the data to be processed is obtained, the data with larger volume can be divided into a plurality of data to be processed with smaller volume, so that the data processing task of the data with larger volume is divided into a plurality of small data processing tasks, and the data processing device is more suitable for being deployed in the embedded equipment.
In an alternative embodiment, the data processing task includes: and performing a filtering processing task and/or at least one subtask on the data to be processed.
Therefore, the large data processing task can be divided into a plurality of smaller data processing tasks, so that the tasks issued to the data processor are simpler, the data processing requirements on the data processor are reduced, errors possibly generated by more complicated processing on a large amount of data are avoided, the performance requirements on the data processor can be reduced, and the cost is further reduced.
In an alternative embodiment, the internal buffer is connected to the plurality of data processors via a bus; each data processor corresponds to at least one internal buffer.
In an optional embodiment, the controller is further configured to determine a task to be processed for each data processing cycle; and dividing the tasks to be processed of the data processing period into subtasks for processing by the plurality of data processors, and generating corresponding processing instructions.
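A minimal sketch of this planning step follows. The `Instruction` fields, the operation name `"filter"`, and the one-subtask-per-processor rule are assumptions made for illustration; the disclosure does not define an instruction encoding.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Instruction:
    cycle: int         # data processing cycle the instruction belongs to
    processor_id: int  # data processor that should execute it
    op: str            # operation name (hypothetical encoding)
    buffer_id: int     # internal buffer holding the data to be processed


def plan_cycle(cycle: int, op: str, buffer_ids: List[int], num_processors: int) -> List[Instruction]:
    """Divide one cycle's task into subtasks, one per processor, and emit instructions."""
    if len(buffer_ids) > num_processors:
        raise ValueError("more subtasks than data processors in one cycle")
    return [Instruction(cycle, pid, op, buf) for pid, buf in enumerate(buffer_ids)]


plan = plan_cycle(cycle=0, op="filter", buffer_ids=[4, 5, 6], num_processors=4)
for instruction in plan:
    print(instruction)
```

In this sketch a processor that receives no instruction simply stays idle for the cycle; a real controller could instead rebalance the subtasks.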
In a second aspect, an embodiment of the present disclosure further provides a data processing method, which is applied to a data processing apparatus, where the data processing apparatus includes: a controller, and a plurality of data processors; the data processing method comprises the following steps: the controller transmits a processing instruction for indicating a data processing task to a corresponding data processor; and the data processors acquire data to be processed from the corresponding internal buffers based on the received processing instructions and execute corresponding data processing tasks based on the data to be processed.
In an optional implementation manner, the controller issues a processing instruction for instructing a data processing task to a corresponding data processor, and includes: the controller sends the corresponding processing instruction of the corresponding data processing period to the corresponding data processor according to the data processing task of each data processing period; the data processors acquire data to be processed from corresponding internal buffers based on the received processing instructions, and execute corresponding data processing tasks based on the data to be processed, and the data processing method comprises the following steps: and the data processors acquire the data to be processed corresponding to the data processing period from the corresponding internal buffer and process the data based on the processing instructions received in the data processing period respectively.
In an optional embodiment, the data to be processed differs according to the data processing instruction, and includes at least one of: external data, intermediate data processed by the data processor in the previous data processing period, and intermediate data processed by other data processors in the previous data processing period.
In an alternative embodiment, the data processing tasks include parallel data processing tasks; the controller issues the corresponding processing instruction of the corresponding data processing period to the corresponding data processor according to the data processing task of each data processing period, and the method comprises the following steps: for a plurality of data processing periods of the parallel data processing task, issuing a corresponding processing instruction to a corresponding data processor in each data processing period; for the same data processing cycle, the processing instructions received by the data processors are the same, and the processed data to be processed are different; the data processors acquire data to be processed from corresponding internal buffers based on the received processing instructions, and the data processing method comprises the following steps: the data processors acquire the data to be processed corresponding to the data processing period from the corresponding internal buffer in each data processing period except the head and tail data processing periods based on the processing instructions received in the data processing period; and storing the processing result of the processed data to be processed into the corresponding internal buffer as the data to be processed in the subsequent data processing period.
In an alternative embodiment, the plurality of data processors comprises a programmable data processor; the data processing method further comprises: the programmable data processor provides different processing functions in different data processing cycles in cooperation with the processing instructions corresponding to the data processing cycles.
In an optional implementation manner, the sub-data processing tasks executed by the processing instructions corresponding to the adjacent data processing cycles correspond to adjacent processing steps in the parallel data processing task.
In an alternative embodiment, the data processing tasks include serial data processing tasks; the controller issues the corresponding processing instruction of the corresponding data processing period to the corresponding data processor according to the data processing task of each data processing period, and the method comprises the following steps: aiming at a plurality of data processing periods of the serial data processing task, in each data processing period, issuing a corresponding processing instruction to a corresponding data processor; for the same data processing cycle, in a plurality of groups of data processors included in the plurality of data processors, the processing instructions received by the same group of data processors are the same, and the processing instructions received by different groups of data processors are different; the group of data processors comprises at least one data processor; the data processors acquire data to be processed from corresponding internal buffers based on the received processing instructions, and the data processing method comprises the following steps: each group of data processors in the plurality of data processors acquires data to be processed corresponding to the data processing period from the corresponding first internal buffer based on the processing instruction received in the data processing period; and storing the processing result of the data to be processed after processing into a corresponding second internal buffer as the data to be processed of another group of data processor in the subsequent data processing period.
In an alternative embodiment, the plurality of data processors comprises a programmable data processor; the data processing method further comprises: different sets of programmable data processors cooperate with corresponding sets of processing instructions to provide different processing functions during the same data processing cycle.
In an optional implementation manner, the sub data processing tasks executed by the processing instructions corresponding to different data processors with adjacent processing logics in the same data processing cycle correspond to adjacent processing steps in the serial data processing task.
In an alternative embodiment, the internal buffer includes a line buffer unit; the data processing apparatus further includes: a data conversion circuit; the data processing method further comprises: the line cache unit stores original data to be processed which are input in a line scanning sequence; and the data conversion circuit performs line-block conversion on the original data to be processed to obtain the data to be processed which is provided for the data processor to process.
In an optional embodiment, the method further comprises: and the controller performs segmentation processing on the image to be processed according to a data processing window with a preset size to obtain the data to be processed.
In an alternative embodiment, the data processing task includes: and performing a filtering processing task and/or at least one subtask on the image to be processed.
In an alternative embodiment, the internal buffer is connected to the plurality of data processors via a bus; each data processor corresponds to at least one internal buffer.
In an optional embodiment, the method further comprises: the controller determines a task to be processed in each data processing cycle; and dividing the tasks to be processed of the data processing period into subtasks for processing by the plurality of data processors, and generating corresponding processing instructions.
In a third aspect, an embodiment of the present disclosure further provides a computer device, including: an instruction memory and the data processing apparatus according to the first aspect or any one of the possible implementation manners of the first aspect.
In a fourth aspect, the disclosed embodiments also provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed to perform the steps in the second aspect or any one of the possible implementation manners of the second aspect.
For the description of the effects of the data processing method, the computer device, and the computer-readable storage medium, reference is made to the description of the data processing apparatus, which is not repeated herein.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for use in the embodiments will be briefly described below, and the drawings herein incorporated in and forming a part of the specification illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It is appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope, for those skilled in the art will be able to derive additional related drawings therefrom without the benefit of the inventive faculty.
FIG. 1 shows a schematic diagram of a data processing apparatus provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating a method for storing original data into a line cache unit through line scanning according to an embodiment of the present disclosure;
FIG. 3 is a diagram illustrating data processing according to parallel processing tasks according to an embodiment of the present disclosure;
FIG. 4 is a diagram illustrating data processing according to serial processing tasks according to an embodiment of the disclosure;
FIG. 5 shows a flowchart of a data processing method provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of embodiments of the present disclosure, as generally described and illustrated herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure is not intended to limit the scope of the disclosure, as claimed, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making any creative effort, shall fall within the protection scope of the disclosure.
Research has found that image data is generally processed in one of two ways: (1) programming on a general-purpose processor such as a CPU or DSP; or (2) mapping the algorithms used for data processing directly into a hardware circuit and processing the image data with the resulting dedicated integrated chip.
In method (1), because a general-purpose processor contains many basic operation units, processing image data requires combined control of multiple basic operation units, which results in low computational efficiency and high power consumption. Moreover, the processing architecture of a general-purpose processor is poorly suited to algorithms that operate on local regions of image data, such as image filtering algorithms: data access efficiency is low, many basic operation units sit idle for long periods, and the utilization of computing power is low.
In the method (2), since the dedicated chip obtained by integration has high specificity and cannot be applied to other data processing tasks, different dedicated chips need to be integrated for different data processing tasks, which results in high cost.
Based on the above research, the present disclosure provides a data processing apparatus, which uses a controller to issue a processing instruction for instructing a data processing task, and controls a plurality of data processors, so that the plurality of data processors obtain data to be processed from corresponding internal buffers based on the received processing instruction, and execute the corresponding data processing task, thereby improving the data reuse rate, reducing the number of times of accessing from an external memory, and improving the computation utilization rate.
In addition, because the data processor is programmable, the data processor can adapt to operators and instructions corresponding to different data processing tasks, namely can adapt to different data processing tasks more flexibly, and therefore equipment cost can be reduced.
The drawbacks described above were identified by the inventors through practice and careful study. The process of discovering these problems, and the solutions that the present disclosure proposes for them, should therefore both be regarded as contributions made by the inventors in the course of the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
To facilitate understanding of the present embodiment, a detailed description is first given of a data processing apparatus provided in an embodiment of the present disclosure.
Referring to FIG. 1, a schematic diagram of a data processing apparatus provided in an embodiment of the present disclosure is shown. The apparatus includes a controller 10 and a plurality of data processors 20 (shown in FIG. 1 as data processor 1 to data processor n), wherein:
the controller 10 is used for sending processing instructions for indicating data processing tasks to the corresponding data processors 20;
and a plurality of data processors 20, configured to obtain data to be processed from the corresponding internal buffers based on the received processing instructions, and execute corresponding data processing tasks based on the data to be processed.
The data processing apparatus provided by the present disclosure includes a controller 10 and a plurality of data processors 20. After the data to be processed has been transmitted to the corresponding data processors 20, the controller 10 may issue corresponding processing instructions to different data processors 20 according to the data processing task, so that each data processor 20 processes its own data to be processed according to the processing instruction it receives. Because the controller 10 can flexibly schedule each data processor 20 based on the data processing task, the computing power of the data processors 20 is fully used and its utilization rate is improved.
In a specific implementation, FIG. 1 shows an example in which the plurality of data processors 20 are interconnected; the data processors 20 may be connected in any manner. The connection may be physical or logical. For example, the plurality of data processors 20 may form an M × N array of data processors 20, where M and N are positive integers. The specific connection mode of the data processors 20 may be determined according to the actual situation and is not described further here.
In the embodiment of the present disclosure, the data to be processed includes, for example: image data, audio data, character data, and the like.
Taking the data to be processed as the image data as an example, the controller 10 may perform segmentation processing on the image to be processed according to a data processing window with a preset size, for example, to obtain the data to be processed. Therefore, the data processing device can divide the data with larger volume into a plurality of data to be processed with smaller volume, so as to divide the data processing task of the data with larger volume into a plurality of small data processing tasks, and is more suitable for being deployed in the embedded device.
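The segmentation step can be sketched as follows, assuming non-overlapping windows and a list-of-rows image; the function name `split_into_windows` is hypothetical. A filtering task would normally also need border pixels from neighbouring windows, which this sketch leaves out.

```python
from typing import List, Tuple

Window = Tuple[Tuple[int, int], List[List[int]]]


def split_into_windows(image: List[List[int]], win_h: int, win_w: int) -> List[Window]:
    """Cut an image into windows of a preset size, keeping each window's origin."""
    height, width = len(image), len(image[0])
    windows: List[Window] = []
    for top in range(0, height, win_h):
        for left in range(0, width, win_w):
            # Windows on the right or bottom edge may be smaller than the preset size.
            window = [row[left:left + win_w] for row in image[top:top + win_h]]
            windows.append(((top, left), window))
    return windows


# A 4 x 4 image whose pixel value encodes its position (row * 4 + column).
image = [[row * 4 + col for col in range(4)] for row in range(4)]
windows = split_into_windows(image, win_h=2, win_w=2)
print(len(windows))  # 4
print(windows[0])    # ((0, 0), [[0, 1], [4, 5]])
print(windows[-1])   # ((2, 2), [[10, 11], [14, 15]])
```

Keeping the origin with each window lets the processed pieces be written back to the correct position in the output image.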
When issuing the processing instructions of a data processing task, the controller 10 may, for example, issue to the data processors 20 in each data processing cycle the processing instructions that correspond to that cycle, so that the plurality of data processors 20 cooperatively complete the processing of the original data in a serial or parallel manner. This improves the efficiency of data processing. It also either maximizes the utilization rate of the data processors 20 within one data processing cycle or reduces the time the data processors 20 spend switching between functions, and it makes the data processing more flexible.
In addition, the data processing device can process the data to be processed in a serial or parallel mode, so when the total data volume of the data to be processed is large, the corresponding data processing task can be completed by the data processing device.
Taking the data to be processed as the image data as an example, the size of the image to be processed is determined by the size that can be processed by the data processor 20.
Illustratively, when the sizes of the image data that can be processed by the plurality of data processors 20 in the data processing apparatus are consistent, that size (in pixels, expressed as M × N, where M is the number of horizontal pixel points and N is the number of vertical pixel points) may be, for example, 64 × 64 or 32 × 32, where the horizontal and vertical pixel counts are equal, or 64 × 32, where they differ.
In the case where the sizes of the image data that can be processed by different data processors 20 among the plurality of data processors 20 in the data processing apparatus are not uniform, the controller 10 may schedule the different data processors 20, allocate different data processing tasks to the different data processors 20, or cause the different data processors 20 to execute different links in the same processing task.
The embodiment of the present disclosure will explain the data processing apparatus in detail by taking as an example the case where the sizes of image data that can be processed by a plurality of data processors 20 in the data processing apparatus are identical.
When the plurality of data processors 20 in the data processing device process data to be processed whose size equals the size that the data processors 20 can handle, the data stays within the processing capacity of the data processors 20, and each data processor 20 can process the image to be processed at its maximum computing power, which improves the computational efficiency of the data processors 20.
In a specific implementation, the data to be processed can be obtained by performing corresponding data processing on the raw data.
Specifically, the raw data may include, for example, a captured image. The size of the image is not limited and may be, for example, 256 × 256 or 1080 × 1080. Objects such as people, animals, plants, and scenes may be presented in the image, and display attributes such as color and brightness may be determined directly by the pixel values of the corresponding pixel points, which is not limited herein.
The raw data may include a plurality of images, and the plurality of images may be the same size or different sizes. When data processing is performed on a plurality of images by a data processing device, for example, the plurality of images may be sequentially processed in order or randomly, or the plurality of images may be stitched and the data obtained by the stitching may be processed. Specifically, the determination may be performed according to actual situations, and details are not described herein.
After the raw data is acquired, the raw data can also be stored in the form of raw data to be processed by using an internal buffer in the data processing device.
In a specific implementation, the original data may be stored in the internal buffer by Direct Memory Access (DMA) transfer. The internal buffer includes at least one line buffer unit, and each line buffer unit can store data of a certain size. The original data is line-scanned, and the original data to be processed obtained by the scan is stored in the line buffer units.
Specifically, the storage size of the line buffer unit may be determined, for example, according to the size of the image data that can be processed by the data processor 20. For example, when the data processor 20 can process image data of size 32 × 32, the vertical pixel count 32 may be set as the storage size of the line buffer unit. Storing the raw data into line buffer units of this size acts as a preliminary processing step that first converts the longitudinal size of the raw data to the longitudinal size that the data processor 20 can process.
Referring to fig. 2, a schematic diagram of storing original data to a line cache unit through line scanning according to an embodiment of the present disclosure is shown. In fig. 2, the original data 21 has a size of 128 × 128 and can be divided into 16 sub-images of 32 × 32, denoted A to P respectively. That is, sub-image A, for example, has a size of 32 × 32.
When the original data 21 is transferred to the line cache units through DMA, the original data 21 may, for example, be segmented with 32 longitudinal pixels as the line scanning size to obtain four pieces of original data to be processed, L0, L1, L2, and L3, which contain the sub-images A to D, E to H, I to L, and M to P, respectively. The original to-be-processed data L0 may then be stored into the line cache unit 0 (22 in fig. 2), L1 into the line cache unit 1 (23 in fig. 2), L2 into the line cache unit 2 (24 in fig. 2), and L3 into the line cache unit 3 (25 in fig. 2), in order.
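The line scanning of fig. 2 can be illustrated with a minimal software sketch. It only models the data layout: the function name `line_scan` and the list-of-rows image representation are assumptions made for illustration, and the DMA transfer itself is not modeled.

```python
def line_scan(image, strip_height):
    """Split a 2-D image (a list of pixel rows) into horizontal strips.

    Each strip holds `strip_height` consecutive rows and stands in for the
    contents of one line cache unit. The image height is assumed to be a
    multiple of `strip_height`.
    """
    if len(image) % strip_height != 0:
        raise ValueError("image height must be a multiple of strip_height")
    return [image[top:top + strip_height]
            for top in range(0, len(image), strip_height)]


# 128 x 128 placeholder image; each pixel records its (row, column) position.
original = [[(r, c) for c in range(128)] for r in range(128)]

# line_caches[0..3] play the role of line cache units 0..3 holding L0..L3.
line_caches = line_scan(original, 32)
```

Each strip keeps the full 128-pixel width, so only the longitudinal size has been reduced to what a data processor can handle at this stage.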
In another possible implementation, when few resources can be deployed on the device, only one line cache unit may be provided. The original data is then line-scanned in portions. Taking fig. 2 as an example, only the sub-images A to D are line-scanned at first to obtain the original data to be processed L0, which is stored in the line cache unit. Once L0 has been processed or fetched and the line cache unit no longer contains original data to be processed, the next portion of the original data is line-scanned and the result is stored in the line cache unit.
At this time, the original to-be-processed data obtained by line scanning the original data is correspondingly stored in different line cache units.
After the original data to be processed is obtained, the original data to be processed may be subjected to line-block conversion by using a data conversion circuit, so as to obtain external data that can be processed by the data processor 20.
Specifically, the data conversion circuit may cut the original data to be processed stored in the line buffer unit. Taking the original to-be-processed data L0 in fig. 2 as an example, when the cutting standard of the data conversion circuit is set to 32 pixels in the horizontal direction, L0 can be cut into the sub-image A, the sub-image B, the sub-image C, and the sub-image D. Since each sub-image obtained by cutting has the same size as the data that the data processor 20 can process, it can be used as external data transmitted to the data processor 20 for data processing.
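As a rough software analogue of the line-block conversion, the sketch below cuts one 32-row strip into 32-column blocks. The function name `line_to_blocks` is hypothetical, and the real data conversion circuit is hardware whose internals the text does not specify.

```python
def line_to_blocks(strip, block_width):
    """Cut one strip (a list of pixel rows) into blocks of `block_width` columns."""
    width = len(strip[0])
    if width % block_width != 0:
        raise ValueError("strip width must be a multiple of block_width")
    return [[row[left:left + block_width] for row in strip]
            for left in range(0, width, block_width)]


# Stand-in for L0: 32 rows by 128 columns; each pixel records its column index.
strip_l0 = [[c for c in range(128)] for _ in range(32)]

# Label the four resulting blocks as the sub-images A to D.
blocks = dict(zip("ABCD", line_to_blocks(strip_l0, 32)))
```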
In addition, when external data is obtained, the to-be-processed image can be segmented by directly utilizing the data processing window with the preset size, and to-be-processed data is obtained.
In determining the preset size, the size of the data that can be processed by the data processor 20 may be set as the preset size. In this way, when the data processing window performs the segmentation processing on the image to be processed, the data to be processed, which can be directly processed by the data processor 20, can be directly obtained.
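The direct segmentation with a data processing window can be sketched in one step as follows. This is a minimal illustration that assumes the image dimensions are multiples of the window size; `split_by_window` and the row-major labeling A to P are illustrative choices matching fig. 2, not part of the disclosure.

```python
import string


def split_by_window(image, win_h, win_w):
    """Return the win_h x win_w tiles of `image` in row-major order."""
    tiles = []
    for top in range(0, len(image), win_h):
        for left in range(0, len(image[0]), win_w):
            tiles.append([row[left:left + win_w]
                          for row in image[top:top + win_h]])
    return tiles


# 128 x 128 placeholder image; each pixel records its (row, column) position.
image = [[(r, c) for c in range(128)] for r in range(128)]

# zip stops at the shorter input, so the 16 tiles receive the labels A..P.
tiles = dict(zip(string.ascii_uppercase, split_by_window(image, 32, 32)))
```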
In a possible implementation manner, when the data processing tasks are different, the data to be processed is different according to the data processing instruction, and includes external data, intermediate data obtained by the data processor in a previous data processing cycle, and intermediate data obtained by other data processors in a previous data processing cycle.
In particular, the division into data processing cycles may be determined, for example, according to the different processing tasks and/or at least one sub-task applied to the data to be processed. For example, a convolution operation on the data to be processed includes two steps, weighting processing and convolution summing processing, so the time required for the weighting processing may be taken as one data processing cycle, and the time required for the convolution summing of the weighted result may be taken as another data processing cycle. Alternatively, the data processing cycle may be determined according to the data amount of the data to be processed: when the data amount is large, the time in which part of the data is processed may be regarded as one data processing cycle. The specific manner of determining the data processing cycle may be chosen according to actual conditions and is not described herein again.
When processing data to be processed, the data processors 20 operate under processing instructions issued by the controller 10. Specifically, the controller 10 may issue the processing instruction corresponding to each data processing cycle to the corresponding data processor 20 according to the data processing task of that data processing cycle. Accordingly, the data processors 20 may obtain the data to be processed corresponding to the data processing cycle from the corresponding internal buffers and process the data. The internal buffers may be connected to a plurality of data processors 20 through a bus, for example, and at least one internal buffer is provided for each data processor 20.
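The two-cycle split of a convolution can be sketched for a single window position. The 3 × 3 window, the kernel values, and the function names are assumptions chosen only to make the two steps concrete.

```python
def weighting(window, kernel):
    """First cycle: multiply each pixel by its kernel weight, element-wise."""
    return [[p * w for p, w in zip(pixel_row, kernel_row)]
            for pixel_row, kernel_row in zip(window, kernel)]


def convolution_sum(weighted):
    """Second cycle: add up the weighted values to get one output value."""
    return sum(sum(row) for row in weighted)


window = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
kernel = [[0, 1, 0],
          [1, -4, 1],
          [0, 1, 0]]

intermediate = weighting(window, kernel)   # intermediate data after cycle 1
result = convolution_sum(intermediate)     # output after cycle 2
```

The intermediate data produced by the first step is exactly what would be held in an internal buffer between the two data processing cycles.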
When the data processing tasks are different, the data to be processed are different according to the data processing instructions, and include external data, intermediate data obtained by the data processor 20 in the previous data processing cycle, and intermediate data obtained by other data processors 20 in the previous data processing cycle.
In different control cycles, the controller 10 is further configured to determine a task to be processed in each data processing cycle, divide the task to be processed in the data processing cycle into sub-tasks for processing by the plurality of data processors 20, and generate corresponding processing instructions.
For example, the data processing task may include a filtering processing task and/or at least one sub-task thereof for the image to be processed.
Specifically, the data processing tasks include parallel processing tasks and serial processing tasks. Depending on the data processing task, the controller 10 and the plurality of data processors 20 process data differently, as described in (A) and (B) below:
(A) The data processing tasks include parallel processing tasks.
Specifically, the controller 10 issues a corresponding processing instruction to the corresponding data processor 20 in each data processing cycle for a plurality of data processing cycles of the parallel data processing task; for the same data processing cycle, the processing instructions received by the data processors 20 are the same, and the data to be processed is different.
Correspondingly, in each data processing cycle except the head and tail data processing cycles, the plurality of data processors 20 acquire the data to be processed corresponding to the data processing cycle from the corresponding internal buffer based on the processing instruction respectively received in the data processing cycle; and storing the processing result after the data to be processed is processed into the corresponding internal buffer as the data to be processed in the subsequent data processing period.
Specifically, after determining the data processing task, the controller 10 may determine the corresponding processing instructions for the plurality of data processing cycles. For example, when the data processing task is determined to include three tasks, namely weighting processing, convolution summing processing, and averaging processing, and there are four data processors 20, it may be determined that the sub data processing tasks executed under the processing instructions of adjacent data processing cycles correspond to adjacent processing steps in the parallel processing task.
Here, data processing on the sub-image A, the sub-image B, the sub-image C, and the sub-image D is taken as an example.
For example, in the first data processing cycle, the controller 10 may determine the data processing instruction issued to the data processor 0, which includes a task instruction for performing weighting processing on the sub-image A in the data to be processed; the data processing instruction issued to the data processor 1, which includes a task instruction for performing weighting processing on the sub-image B in the data to be processed; the data processing instruction issued to the data processor 2, which includes a task instruction for performing weighting processing on the sub-image C in the data to be processed; and the data processing instruction issued to the data processor 3, which includes a task instruction for performing weighting processing on the sub-image D in the data to be processed.
Here, the first data processing cycle is the head data processing cycle.
In the first data processing cycle, the data processors 20 respectively receive external data, store the external data in the internal buffers corresponding to the data processors 20, and process the external data stored in the internal buffers according to the received processing instructions, so as to obtain processed results.
For example, after the data processors perform weighting processing on the sub-image A, the sub-image B, the sub-image C, and the sub-image D, the corresponding result sub-image a, sub-image b, sub-image c, and sub-image d are obtained.
The result sub-image a, sub-image b, sub-image c, and sub-image d are then stored into the corresponding internal buffers to be used as data to be processed in the second data processing cycle.
In the second data processing period, the controller 10 determines data processing instructions issued to the data processor 0, including a task instruction for performing convolution summation on the sub-image a in the data to be processed; the data processing instruction issued to the data processor 1 comprises a task instruction for performing convolution summation on the sub-image b in the data to be processed; the data processing instructions sent to the data processor 2 comprise task instructions for performing convolution summation on the sub-images c in the data to be processed; and data processing instructions issued to the data processor 3, including task instructions for performing convolution summation on the sub-image d in the data to be processed.
Here, the second data processing period is a middle data processing period, and the task corresponding to the data processing instruction issued by the controller 10 is different from the task corresponding to the first processing period, so that the plurality of data processors 20 provide the processing function of convolution summation in the second data processing period in cooperation with the corresponding processing instruction.
For example, after the data processor performs convolution summation processing on the sub-image a, the sub-image b, the sub-image c, and the sub-image d, the corresponding result sub-image a ', the sub-image b', the sub-image c ', and the sub-image d' can be obtained.
And then, storing the processed result sub-image a ', the sub-image b', the sub-image c 'and the sub-image d' obtained after processing into corresponding internal buffers to serve as data to be processed in a third data processing period.
In the third data processing period, the controller 10 determines data processing instructions issued to the data processor 0, including a task instruction for performing an averaging process on the sub-image a' in the data to be processed; the data processing instruction issued to the data processor 1 comprises a task instruction for performing mean value processing on the sub-image b' in the data to be processed; the data processing instruction issued to the data processor 2 comprises a task instruction for performing mean value processing on the sub-image c' in the data to be processed; and data processing instructions issued to the data processor 3, including task instructions for performing averaging processing on the sub-image d' in the data to be processed.
Here, the third data processing cycle is the last data processing cycle, and the task corresponding to the data processing instruction issued by the controller 10 is different from the task corresponding to the second data processing cycle, so that the plurality of data processors 20 provide the processing function of the averaging processing in cooperation with the corresponding processing instruction in the third data processing cycle.
For example, after the data processor 20 performs the averaging process on the sub-image a ', the sub-image b', the sub-image c ', and the sub-image d', the corresponding result sub-image a ", the sub-image b", the sub-image c ", and the sub-image d" can be obtained.
Then, storing the sub-image a ", the sub-image b", the sub-image c "and the sub-image d" of the processing result obtained after the processing into corresponding internal buffers to wait for output; or directly output. After the output, the obtained result sub-image a ", the sub-image b", the sub-image c "and the sub-image d" are spliced again, so that the processing result of the original data can be obtained.
Here, since each data processor 20 processes its data to be processed according to the processing instruction of each data processing cycle, when all of the data processors 20 are allocated work, each data processor in use can concentrate its computing power and process data at the highest efficiency, which reduces the waste of computing resources.
(B) The data processing tasks include serial data processing tasks.
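The parallel schedule walked through above can be simulated in a few lines. The sketch uses string stand-ins for the real image operations (A becomes a, then a', then a"); the function names and the `run_parallel` helper are hypothetical and only model which instruction and which buffer contents are in play in each cycle.

```python
# String-based stand-ins for the three processing steps (assumed for illustration).
def weighting(x):
    return x.lower()             # "A" -> "a"


def convolution_sum(x):
    return x + "'"               # "a" -> "a'"


def averaging(x):
    return x.rstrip("'") + '"'   # "a'" -> 'a"'


STAGES = [weighting, convolution_sum, averaging]


def run_parallel(external_data, stages):
    """One stage per data processing cycle; every processor runs the same stage."""
    # buffers[i] models the internal buffer of data processor i.
    buffers = list(external_data)
    history = []
    for stage in stages:                              # one iteration = one cycle
        buffers = [stage(item) for item in buffers]   # same instruction, different data
        history.append(list(buffers))
    return history


history = run_parallel(["A", "B", "C", "D"], STAGES)
```

After the last cycle, `history[-1]` holds the four results that would be spliced back together into the processing result of the original data.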
Specifically, the controller 10 issues a corresponding processing instruction to the corresponding data processor 20 in each data processing cycle for a plurality of data processing cycles of the serial data processing task; for the same data processing cycle, among the multiple groups of data processors 20 included in the multiple data processors, the processing instructions received by the same group of data processors 20 are the same, and the processing instructions received by different groups of data processors 20 are different; a set of data processors 20 comprises at least one data processor.
Correspondingly, each group of data processors 20 in the plurality of data processors obtains the data to be processed corresponding to the data processing cycle from the corresponding first internal buffer based on the processing instruction respectively received in the data processing cycle; and storing the processed result of the data to be processed in the corresponding second internal buffer as the data to be processed in the subsequent data processing period of another group of data processor.
Specifically, after determining the data processing task, the controller 10 may determine the corresponding processing instructions for the plurality of data processing cycles. For example, when the data processing task is determined to include three tasks, namely weighting processing, convolution summing processing, and averaging processing, and there are three data processors 20, it may be determined that, within the same data processing cycle, the sub data processing tasks indicated by the processing instructions of data processors that are adjacent in processing logic correspond to adjacent processing steps in the serial data processing task.
Here, the different data processors 20 that are logically adjacent may be different data processors 20 that are adjacently located, or may be data processors 20 that are not adjacently located.
For example, when the data amount of the data to be processed is small, different data processors 20 may each be assigned a different data processing function, which reduces the number of devices required. When the data amount of the data to be processed is large, a plurality of groups of data processors 20 may be provided, with the data processors 20 in each group executing the same data processing task, so as to improve the efficiency of data processing.
Illustratively, a first group of data processors 20 may be provided, corresponding to the processing functions of the weighting process, and the first group of data processors 20 includes m data processors; a second group of data processors 20 is provided, corresponding to the processing function of the convolution and summation processing, and the second group of data processors 20 comprises n data processors; the third group of data processors 20 is provided corresponding to the processing function of the averaging process, and the third group of data processors 20 includes o data processors. The values of m, n, and o may be the same or different, and may be determined specifically according to the actual situation, and are not described herein again.
Here, since any one of the data processors 20 in the first group of data processors 20 corresponds to an adjacent processing step to any one of the data processors 20 in the second group of data processors 20, respectively, there may be transmission of data even if they are not adjacent in position but are adjacent in processing logic.
Similarly, since any data processor in the second group of data processors 20 corresponds to an adjacent processing step with any data processor 20 in the third group of data processors 20, there may be data transmission even if they are not adjacent in position but are adjacent in processing logic.
In addition, for any data processor 20 in the first group of data processors 20 and any data processor 20 in the third group of data processors 20, since corresponding processing steps are not adjacent, there is no adjacent relation in processing logic, and even if they are adjacent in position, there is no transmission of data.
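The rule that data moves only between logically adjacent groups can be written as a small predicate. The group sizes (m = 2, n = 1, o = 2), the processor numbering, and the choice to treat transfer as directed from the earlier step to the later step are all assumptions made for this sketch.

```python
STEPS = ["weighting", "convolution_sum", "averaging"]

# Hypothetical assignment: processors 0-1 form the first group, processor 2
# the second group, and processors 3-4 the third group.
group_of = {0: "weighting", 1: "weighting",
            2: "convolution_sum",
            3: "averaging", 4: "averaging"}


def may_transfer(group_of, src, dst):
    """True if dst's processing step directly follows src's processing step.

    Physical position is deliberately ignored: only adjacency in processing
    logic decides whether data is passed from src to dst.
    """
    return STEPS.index(group_of[dst]) - STEPS.index(group_of[src]) == 1


first_to_second = may_transfer(group_of, 0, 2)   # adjacent steps
first_to_third = may_transfer(group_of, 0, 3)    # steps not adjacent
```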
Here, data processing on the sub-image A, the sub-image B, the sub-image C, and the sub-image D is taken as an example. The data processors 20 and their processing tasks are as follows: the data processor 0 performs weighting processing, the data processor 1 performs convolution summing processing, and the data processor 2 performs averaging processing.
For example, in the first data processing cycle, the controller 10 may determine the data processing instruction to be issued to the data processor 0, which includes a task instruction for performing weighting processing on the sub-image A in the external data.
Here, the first data processing cycle is the head data processing cycle. The data processor 0 receives the sub-image A as external data and processes it to obtain the result sub-image a. The sub-image a is then stored in the internal buffer corresponding to the data processor 0.
For the data processor 0 in the first data processing cycle, the internal buffer in which the result of data processing is stored is regarded as the second internal buffer. That is, the second internal buffer denotes the internal buffer that holds the result data obtained after processing.
In the second data processing cycle, the controller 10 may determine the data processing instruction issued to the data processor 1, including performing convolution summation processing on the sub-image a in the internal buffer corresponding to the data processor 0.
For the data processor 1 in the second data processing cycle, the internal buffer in the data processor 0 from which the sub-image a is acquired is regarded as the first internal buffer. That is, the first internal buffer denotes the internal buffer from which the data to be processed is read; it holds the result data that the corresponding data processor generated in the previous step. The first internal buffer of one step and the second internal buffer of the preceding step can therefore indicate the same internal buffer.
After the data processor 1 performs convolution summing processing on the sub-image a, the processing result sub-image a' is obtained and stored in the internal buffer corresponding to the data processor 1. At this time, the internal buffer of the data processor 1 that stores the result is the second internal buffer.
Meanwhile, in the second data processing cycle, the controller 10 may determine the data processing instruction to be issued to the data processor 0, which includes a task instruction for performing weighting processing on the sub-image B in the external data. The processed sub-image b is stored into the internal buffer corresponding to the data processor 0.
Here, the data processing procedure of the data processor 0 for the sub-image B is similar to the data processing procedure of the data processor 0 for the sub-image A in the first data processing cycle, and is not described herein again.
In the third data processing cycle, the controller 10 determines a data processing instruction issued to the data processor 2, including performing averaging processing on the sub-image a' in the internal buffer corresponding to the data processor 1.
After the data processor 2 performs the averaging process on the sub-image a', the processing result sub-image a ″ can be obtained and stored in the internal buffer corresponding to the data processor 2.
The controller may further determine a data processing instruction issued to the data processor 1, including performing convolution summation processing on the sub-image b in the internal buffer corresponding to the data processor 0, and storing the processed sub-image b' in the internal buffer corresponding to the data processor 1.
Meanwhile, in the third data processing cycle, the controller 10 may determine the data processing instruction issued to the data processor 0, which includes a task instruction for performing weighting processing on the sub-image C in the external data, and the processed sub-image c is stored in the internal buffer corresponding to the data processor 0.
Here, the data processing procedure of the data processor 0 for the sub-image C is similar to the data processing procedure of the data processor 0 for the sub-image A in the first data processing cycle, and is not described herein again.
……
This continues until the sub-image A, the sub-image B, the sub-image C, and the sub-image D have all been processed, yielding the sub-images a", b", c", and d", which are stored to wait for output or are output directly. After output, the result sub-images a", b", c", and d" are spliced again to obtain the processing result of the original data.
With this serial processing, switching of the data processor 20 to the corresponding processing function can be reduced, and thus the switching time can be reduced during data processing, thereby improving the processing efficiency.
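The serial schedule behaves like a three-stage pipeline, which the following sketch simulates. As before, the stage functions are string stand-ins and `run_serial` is a hypothetical helper; `out_buffers[k]` models the second internal buffer of data processor k, which serves as the first internal buffer of data processor k + 1 in the next cycle.

```python
# String-based stand-ins for the three processing steps (assumed for illustration).
def weighting(x):
    return x.lower()             # "A" -> "a"


def convolution_sum(x):
    return x + "'"               # "a" -> "a'"


def averaging(x):
    return x.rstrip("'") + '"'   # "a'" -> 'a"'


def run_serial(external_data, stages):
    """Simulate a pipeline in which data processor k always runs stages[k]."""
    n = len(stages)
    out_buffers = [None] * n          # result buffer of each data processor
    pending = list(external_data)     # external data not yet fetched
    finished, schedule = [], []
    while pending or any(b is not None for b in out_buffers):
        cycle = {}
        # Visit the last stage first so that each processor reads the value
        # its predecessor wrote in the previous cycle, not in this one.
        for k in reversed(range(n)):
            if k == 0:
                item = pending.pop(0) if pending else None
            else:
                item, out_buffers[k - 1] = out_buffers[k - 1], None
            if item is None:
                continue              # this processor is idle in this cycle
            result = stages[k](item)
            cycle[k] = (item, result)
            if k == n - 1:
                finished.append(result)
            else:
                out_buffers[k] = result
        schedule.append(cycle)
    return finished, schedule


finished, schedule = run_serial(["A", "B", "C", "D"],
                                [weighting, convolution_sum, averaging])
```

In this model, four sub-images through three stages take 4 + 3 - 1 = 6 cycles, and each data processor keeps the same function throughout, which is where the saving in switching time comes from.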
In another embodiment of the present disclosure, a specific example of a data processing apparatus when processing raw data is also provided.
When the data processing device processes data, firstly, the original data is subjected to line scanning, and the original data to be processed obtained after line scanning is stored in a line cache unit. Then, the data conversion circuit performs line block conversion processing on the original data to be processed in the line cache unit according to the size of the data which can be processed by the data processor in the data processing device, so as to obtain the data to be processed.
When the data processing device performs filtering processing on the data to be processed, the steps involved differ with the type of filtering. Here, the filtering is taken to include two steps: determining a weighted gray value and determining a Gaussian blur value, which correspond to the to-be-processed task 1 and the to-be-processed task 2 of the controller, respectively.
Specifically, the controller may control the data processor to perform filtering processing on the data to be processed by issuing a parallel data processing task or a serial data processing task to the data processor.
First, after the original data is line-scanned, the original data to be processed is stored in a line cache unit in the form [ ABC ].
(a1) The controller issues a parallel data processing task to the data processor.
Referring to fig. 3, a schematic diagram of data processing according to parallel processing tasks according to an embodiment of the present disclosure is shown.
In the first data processing period, the controller determines, according to the to-be-processed task 1, the subtasks for the to-be-processed data A, B, and C, namely the subtask 1-0, the subtask 1-1, and the subtask 1-2, and issues them to the data processor 1, the data processor 2, and the data processor 3, respectively. The subtask 1-0 issued by the controller to the data processor 1 includes a data processing task that controls the data processor 1 to determine a weighted gray value for the data A to be processed.
In the first data processing period, taking the data processor 1 as an example, the data processor 1 receives, according to the subtask 1-0, the data to be processed A that the data conversion circuit obtained from the original data to be processed [ ABC ], and stores it in the corresponding internal buffer. Once the data A is in the internal buffer of the data processor 1, the data processor 1 determines the weighted gray value of the data A according to the subtask 1-0, obtaining the intermediate data a of the first data processing period. The data processors 2 and 3 process the to-be-processed data B and C under the to-be-processed task 1 in the same way as the data processor 1 processes the data A, which is not described again.
At this time, the data processor 1, the data processor 2, and the data processor 3 may obtain the corresponding intermediate data a, b, and c, respectively, and store them in the corresponding internal buffers, respectively.
In the second data processing period, the controller determines, according to the to-be-processed task 2, the subtasks for the intermediate data a, b, and c generated in the first data processing period, namely the subtask 2-0, the subtask 2-1, and the subtask 2-2, and issues them to the data processor 1, the data processor 2, and the data processor 3, respectively. The subtask 2-0 issued by the controller to the data processor 1 includes a data processing task that controls the data processor 1 to determine the Gaussian blur value of the intermediate data a.
In the second data processing period, taking the data processor 1 as an example, the data processor 1 processes the intermediate data a generated in the first data processing period and stored in the internal buffer of the data processor 1 according to the subtask 2-0, that is, performs data processing for determining the gaussian blur value on the intermediate data a according to the task for determining the gaussian blur value in the subtask 2-0, and obtains the intermediate data a' in the second data processing period. The data processing processes of the rest of the data processors 2 and 3 on the intermediate data b and c are similar to the data processing process of the data processor 1 on the intermediate data a, and are not repeated.
At this time, for the to-be-processed task issued by the controller, the data processor 1, the data processor 2, and the data processor 3 have already executed the to-be-processed data, that is, the second data processing cycle is the last data processing cycle, and data a ', b ', and c ' processed by the data processor 1, the data processor 2, and the data processor 3 are output.
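The Gaussian blur step of the second data processing cycle can be illustrated in the same spirit. The disclosure gives neither the kernel nor the border handling; the sketch below assumes a common 3x3 kernel whose weights sum to 16 and clamps coordinates at the block border. The name `gaussian_blur` is hypothetical.

```python
# Assumed 3x3 Gaussian kernel (weights sum to 16); not specified in the source.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]


def gaussian_blur(gray):
    """Blur a 2-D block of gray values, clamping coordinates at the border."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    acc += KERNEL[dy + 1][dx + 1] * gray[yy][xx]
            out[y][x] = acc / 16
    return out


flat = gaussian_blur([[8, 8], [8, 8]])
impulse = gaussian_blur([[0, 0, 0], [0, 16, 0], [0, 0, 0]])
```

A constant block is left unchanged by this kernel, and a single bright pixel is spread over its neighbours with the centre keeping a quarter of its value.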
The data processing procedures for [ DEF ] and [ GHI ] are similar to the data processing procedure for [ ABC ] and are not described in detail herein. After the data processing for all the original data is completed, the process is ended.
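The parallel case above can be summarised with a minimal scheduling sketch. It assumes one internal buffer per data processor and models each task as a plain function; `run_parallel`, `gray`, and `blur` are hypothetical names, and the string operations merely stand in for the weighted gray value and Gaussian blur tasks.

```python
def run_parallel(blocks, steps):
    """Apply each step to every block, one step per data processing cycle.

    In a given cycle all processors receive the same instruction but
    work on different data held in their own internal buffers.
    """
    buffers = list(blocks)  # one internal buffer per data processor
    trace = []
    for cycle, step in enumerate(steps, start=1):
        buffers = [step(data) for data in buffers]
        trace.append((cycle, list(buffers)))
    return buffers, trace


# Placeholder steps standing in for the two tasks of the example.
gray = lambda x: x.lower()  # task 1: A -> a
blur = lambda x: x + "'"    # task 2: a -> a'

outputs, trace = run_parallel(["A", "B", "C"], [gray, blur])
```

After the first cycle the buffers hold a, b, c; after the second, and last, cycle they hold a', b', c', which matches the walkthrough for [ABC].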
(a2) The controller issues a serial data processing task to the data processor.
Referring to fig. 4, a schematic diagram of data processing according to serial processing tasks according to an embodiment of the disclosure is shown.
In the first data processing cycle, the controller determines, according to to-be-processed task 1, subtask 1-0 corresponding to the to-be-processed data A and issues it to data processor 1; subtask 1-0 includes a data processing task that controls data processor 1 to determine a weighted gray value for the to-be-processed data A.
In the first data processing cycle, data processor 1 receives, according to subtask 1-0, the to-be-processed data A that the data conversion circuit obtained by processing the original to-be-processed data [ABC], and stores it in the corresponding internal buffer. Once the internal buffer of data processor 1 holds the to-be-processed data A, data processor 1 performs the weighted-gray-value processing specified in subtask 1-0 on that data, obtaining intermediate data a for the first data processing cycle.
In the second data processing cycle, the controller determines, according to to-be-processed task 2, subtask 2-0 corresponding to the intermediate data a obtained by data processor 1 in the previous data processing cycle and issues it to data processor 2; subtask 2-0 includes a data processing task that controls data processor 2 to determine the Gaussian blur value of the intermediate data a.
According to subtask 2-0, data processor 2 receives the intermediate data a produced by data processor 1 in the first data processing cycle and stores it in the internal buffer corresponding to data processor 2. Once that internal buffer holds the intermediate data a, data processor 2 performs the Gaussian-blur-value processing specified in subtask 2-0 on it, obtaining data a'. At this point, for the to-be-processed data A, the second data processing cycle is the last data processing cycle.
Meanwhile, in the second data processing cycle, the controller determines, according to to-be-processed task 1, subtask 1-1 corresponding to the to-be-processed data B and issues it to data processor 1; subtask 1-1 includes a data processing task that controls data processor 1 to determine a weighted gray value for the to-be-processed data B.
That is, in the second data processing cycle, data processor 1 receives, according to subtask 1-1, the to-be-processed data B that the data conversion circuit obtained by processing the original to-be-processed data [ABC], and stores it in the corresponding internal buffer. Once the internal buffer of data processor 1 holds the to-be-processed data B, data processor 1 performs the weighted-gray-value processing specified in subtask 1-1 on that data, obtaining intermediate data b for the second data processing cycle.
Processing continues in this way until the last data processing cycle corresponding to the last item of to-be-processed data is reached, whereupon the data a', b', and c' processed by data processor 1 and data processor 2 are output.
The data processing procedures for [ DEF ] and [ GHI ] are similar to the data processing procedure for [ ABC ] and are not described in detail herein. After the data processing of all the original data is completed, the process is ended.
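The serial case behaves like a two-stage pipeline, which the following sketch makes explicit. It assumes one processor per stage and that a stage's result becomes available to the next stage in the following cycle; `run_serial` and the placeholder steps are hypothetical.

```python
def run_serial(items, stages):
    """Pipeline the items through the stages: stage s handles item (cycle - s)."""
    n_items, n_stages = len(items), len(stages)
    latches = [None] * n_stages  # result each stage produced in the previous cycle
    outputs, schedule = [], []
    for cycle in range(n_items + n_stages - 1):
        new_latches = [None] * n_stages
        busy = []
        for s, stage in enumerate(stages):
            idx = cycle - s
            if 0 <= idx < n_items:
                # Stage 0 reads external data; later stages read the
                # intermediate data produced by the preceding stage.
                src = items[idx] if s == 0 else latches[s - 1]
                new_latches[s] = stage(src)
                busy.append((s + 1, src))
        latches = new_latches
        if latches[-1] is not None:
            outputs.append(latches[-1])
        schedule.append((cycle + 1, busy))
    return outputs, schedule


gray = lambda x: x.lower()  # placeholder for the weighted gray value task
blur = lambda x: x + "'"    # placeholder for the Gaussian blur task

outputs, schedule = run_serial(["A", "B", "C"], [gray, blur])
```

In the second cycle of this sketch, processor 1 works on B while processor 2 works on the intermediate data a, as in the description; three items through two stages take four cycles in total.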
It will be understood by those skilled in the art that, in the method described above, the order in which the steps are written does not imply a strict order of execution or impose any limitation on the implementation; the specific order of execution of the steps should be determined by their functions and possible inherent logic.
Based on the same inventive concept, the embodiments of the present disclosure further provide a data processing method corresponding to the data processing apparatus. Because the principle by which the method solves the problem is similar to that of the data processing apparatus in the embodiments of the present disclosure, the implementation of the method may refer to the implementation of the apparatus, and repeated parts are not described again.
Referring to fig. 5, a flowchart of a data processing method provided by an embodiment of the present disclosure is shown, where the data processing method is applied to a data processing apparatus; the data processing method comprises the following steps:
S501: the controller issues a processing instruction for indicating a data processing task to the corresponding data processor;
S502: the plurality of data processors acquire the data to be processed from the corresponding internal buffers based on the received processing instructions, and execute the corresponding data processing tasks based on the data to be processed.
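The two steps can be sketched as a small object model. This is only an illustration under simplifying assumptions: each internal buffer is a Python list, a processing instruction is a callable, and the class names `Controller` and `DataProcessor` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataProcessor:
    buffer: list = field(default_factory=list)  # internal buffer

    def execute(self, instruction):
        # Second step: fetch the data to be processed from the internal
        # buffer and run the task named by the instruction.
        data = self.buffer.pop(0)
        return instruction(data)


@dataclass
class Controller:
    processors: list

    def issue(self, instructions):
        # First step: issue one processing instruction to each
        # corresponding data processor.
        return [p.execute(ins) for p, ins in zip(self.processors, instructions)]


processors = [DataProcessor([1]), DataProcessor([2])]
results = Controller(processors).issue([lambda x: x + 1, lambda x: x * 10])
```

Each processor consumes only what is in its own buffer, so the controller needs to know which instruction goes to which processor but not where the data lives.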
In an optional implementation, the controller issuing a processing instruction for indicating a data processing task to the corresponding data processor includes: the controller issues, according to the data processing task of each data processing cycle, the processing instruction for that data processing cycle to the corresponding data processor. The plurality of data processors acquiring the data to be processed from the corresponding internal buffers based on the received processing instructions, and executing the corresponding data processing tasks based on the data to be processed, includes: in each data processing cycle, the plurality of data processors acquire the data to be processed corresponding to that data processing cycle from the corresponding internal buffers and process it based on the processing instructions received in that data processing cycle.
In an optional embodiment, the data to be processed differs according to the data processing instruction, and includes at least one of: external data, intermediate data processed by the data processor in the previous data processing period, and intermediate data processed by other data processors in the previous data processing period.
In an alternative embodiment, the data processing tasks include parallel data processing tasks. The controller issuing, according to the data processing task of each data processing cycle, the processing instruction for that data processing cycle to the corresponding data processor includes: for the plurality of data processing cycles of the parallel data processing task, issuing a corresponding processing instruction to the corresponding data processors in each data processing cycle, where, within the same data processing cycle, the processing instructions received by the data processors are the same and the data to be processed are different. The plurality of data processors acquiring the data to be processed from the corresponding internal buffers based on the received processing instructions includes: in each data processing cycle other than the first and last data processing cycles, the plurality of data processors acquire, based on the processing instructions received in that data processing cycle, the data to be processed corresponding to that data processing cycle from the corresponding internal buffers, and store the processing result of the data to be processed in the corresponding internal buffers as the data to be processed in a subsequent data processing cycle.
In an alternative embodiment, the plurality of data processors comprises programmable data processors; the data processing method further comprises: the programmable data processor provides different processing functions in different data processing cycles in cooperation with the processing instructions corresponding to the data processing cycles.
In an optional implementation manner, the sub-data processing tasks executed by the processing instructions corresponding to the adjacent data processing cycles correspond to adjacent processing steps in the parallel data processing task.
In an alternative embodiment, the data processing tasks include serial data processing tasks. The controller issuing, according to the data processing task of each data processing cycle, the processing instruction for that data processing cycle to the corresponding data processor includes: for the plurality of data processing cycles of the serial data processing task, issuing a corresponding processing instruction to the corresponding data processors in each data processing cycle, where, within the same data processing cycle and among the plurality of groups of data processors included in the plurality of data processors, the processing instructions received by data processors of the same group are the same and the processing instructions received by data processors of different groups are different; each group of data processors includes at least one data processor. The plurality of data processors acquiring the data to be processed from the corresponding internal buffers based on the received processing instructions includes: each group of data processors acquires, based on the processing instruction received in a data processing cycle, the data to be processed corresponding to that data processing cycle from the corresponding first internal buffer, and stores the processing result of the data to be processed in the corresponding second internal buffer as the data to be processed of another group of data processors in a subsequent data processing cycle.
In an alternative embodiment, the plurality of data processors comprises a programmable data processor; the data processing method further comprises: different groups of programmable data processors cooperate with corresponding groups of processing instructions to provide different processing functions in the same data processing cycle.
In an optional implementation manner, the sub data processing tasks executed by the processing instructions corresponding to different data processors with adjacent processing logics in the same data processing cycle correspond to adjacent processing steps in the serial data processing task.
In an alternative embodiment, the internal buffer includes a line buffer unit, and the data processing apparatus further includes a data conversion circuit. The data processing method further comprises: the line buffer unit stores the original data to be processed, which is input in line-scan order; and the data conversion circuit performs line-to-block conversion on the original data to be processed to obtain the data to be processed that is provided to the data processors.
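A minimal software sketch of line-to-block conversion follows. It assumes that lines arrive one at a time in scan order, that a block row is emitted as soon as enough lines are buffered, and that the image height is a multiple of the block height; `lines_to_blocks` is a hypothetical name, and the actual circuit may behave differently.

```python
def lines_to_blocks(lines, block_h, block_w):
    """Buffer raster-scan lines and yield block_h x block_w blocks."""
    line_buffer = []
    for line in lines:  # lines arrive in line-scan order
        line_buffer.append(line)
        if len(line_buffer) == block_h:  # enough rows for one row of blocks
            for x in range(0, len(line), block_w):
                yield [row[x:x + block_w] for row in line_buffer]
            line_buffer = []


# Two scan lines of four pixels become two 2x2 blocks.
blocks = list(lines_to_blocks([[1, 2, 3, 4], [5, 6, 7, 8]], 2, 2))
```

Each yielded block corresponds to one unit of data to be processed, in the way that [ABC] yields A, B, and C in the examples above.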
In an optional embodiment, the method further comprises: the controller segments the image to be processed according to a data processing window of a preset size to obtain the data to be processed.
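Segmentation by a fixed-size window can be sketched as below. The sketch assumes non-overlapping windows and lets windows at the right and bottom edges be smaller than the preset size; the disclosure does not say how edges are handled, and `split_into_windows` is a hypothetical name.

```python
def split_into_windows(image, win_h, win_w):
    """Cut a 2-D image into non-overlapping windows keyed by (top, left)."""
    h, w = len(image), len(image[0])
    windows = {}
    for top in range(0, h, win_h):
        for left in range(0, w, win_w):
            windows[(top, left)] = [
                row[left:left + win_w] for row in image[top:top + win_h]
            ]
    return windows


image = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]
windows = split_into_windows(image, 2, 2)
```

Keying each window by its top-left corner keeps enough information to reassemble the processed windows into an output image.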
In an alternative embodiment, the data processing task includes: and performing a filtering processing task and/or at least one subtask on the image to be processed.
In an alternative embodiment, the internal buffer is connected to the plurality of data processors via a bus; each data processor corresponds to at least one internal buffer.
In an optional embodiment, the method further comprises: the controller determines a task to be processed in each data processing cycle; and dividing the tasks to be processed of the data processing period into subtasks for processing by the plurality of data processors, and generating corresponding processing instructions.
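How a controller might divide the task of one cycle into subtasks can be sketched as follows. The subtask labels follow the "task-index" pattern used in the examples (1-0, 1-1, 1-2); the dictionary layout and the name `make_instructions` are assumptions made for illustration only.

```python
def make_instructions(task_id, operation, n_processors):
    """Split one to-be-processed task into one subtask per data processor."""
    return [
        {
            "subtask": f"{task_id}-{i}",  # e.g. "1-0" goes to data processor 1
            "processor": i + 1,
            "operation": operation,
        }
        for i in range(n_processors)
    ]


# Task 1 of the example: weighted gray value on three data processors.
instructions = make_instructions(1, "weighted_gray", 3)
```

For a parallel task every entry carries the same operation; a serial task could instead be modelled by calling the function once per group with a different operation.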
The description of the data processing device may refer to the related description of the above device embodiments, and will not be described in detail here.
An embodiment of the present disclosure further provides a computer device, including: an instruction memory and a data processing apparatus as in any one of the embodiments of the present disclosure.
The data processing device provided by the embodiment of the disclosure may include a chip, an AI chip, and the like. The computer device provided by the embodiment of the present disclosure may include an intelligent terminal such as a mobile phone, or may also be other devices, servers, and the like that may be used for data processing, and is not limited herein.
The embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the data processing method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure also provide a computer program product, where the computer program product carries a program code, and instructions included in the program code may be used to execute the steps of the data processing method in the foregoing method embodiments, which may be referred to specifically in the foregoing method embodiments, and are not described herein again.
The computer program product may be implemented by hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are merely specific embodiments of the present disclosure, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope of the present disclosure, any person skilled in the art may still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present disclosure, and shall be covered by its protection scope. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (30)

1. A data processing apparatus, comprising: a controller, and a plurality of data processors;
the controller is used for issuing a processing instruction for indicating a data processing task to the corresponding data processor;
the data processors are used for acquiring data to be processed from the corresponding internal buffers based on the received processing instructions and executing corresponding data processing tasks based on the data to be processed.
2. The data processing apparatus according to claim 1, wherein the controller is configured to issue a processing instruction corresponding to each data processing cycle to the corresponding data processor according to the data processing task of each data processing cycle;
and the data processors are used for acquiring the data to be processed corresponding to the data processing period from the corresponding internal buffer and processing the data based on the processing instruction received in the data processing period in each data processing period.
3. The data processing apparatus according to claim 1 or 2, wherein the data to be processed is different according to the data processing instruction, and comprises at least one of: external data, intermediate data processed by the data processor in the previous data processing period, and intermediate data processed by other data processors in the previous data processing period.
4. A data processing apparatus as claimed in any one of claims 1 to 3, wherein the data processing tasks comprise parallel data processing tasks;
the controller is used for sending a corresponding processing instruction to a corresponding data processor in each data processing period aiming at a plurality of data processing periods of the parallel data processing task; for the same data processing cycle, the processing instructions received by the data processors are the same, and the processed data to be processed are different;
the data processors are used for acquiring the data to be processed corresponding to the data processing period from the corresponding internal buffer in each data processing period except the head and tail data processing periods based on the processing instruction received in the data processing period; and storing the processing result of the processed data to be processed into the corresponding internal buffer as the data to be processed in the subsequent data processing period.
5. The data processing apparatus of claim 4, wherein the plurality of data processors comprise programmable data processors;
the programmable data processor is used for providing different processing functions in different data processing periods by matching with the processing instructions of the corresponding data processing periods.
6. The data processing apparatus according to claim 4 or 5, wherein the sub data processing tasks executed by the processing instructions corresponding to adjacent data processing cycles correspond to adjacent processing steps in the parallel data processing task.
7. A data processing apparatus according to any of claims 1 to 3, wherein the data processing tasks comprise serial data processing tasks;
the controller is used for sending a corresponding processing instruction to a corresponding data processor in each data processing period aiming at a plurality of data processing periods of the serial data processing task; for the same data processing cycle, in a plurality of groups of data processors included in the plurality of data processors, the processing instructions received by the same group of data processors are the same, and the processing instructions received by different groups of data processors are different; the group of data processors comprises at least one data processor;
each group of data processors in the plurality of data processors is used for acquiring data to be processed corresponding to the data processing period from the corresponding first internal buffer based on the processing instruction received in the data processing period; and storing the processing result after the data to be processed is processed into a corresponding second internal buffer as the data to be processed of another group of data processors in the subsequent data processing period.
8. The data processing apparatus of claim 7, wherein the plurality of data processors comprise programmable data processors;
and the programmable data processors in different groups are used for providing different processing functions in the same data processing cycle in cooperation with the processing instructions in the corresponding groups.
9. The data processing apparatus according to claim 7 or 8, wherein the sub data processing tasks executed by the processing instructions corresponding to different data processors with adjacent processing logic in the same data processing cycle correspond to adjacent processing steps in the serial data processing task.
10. A data processing apparatus according to any of claims 1 to 9, wherein the internal buffer comprises a line buffer unit; the data processing apparatus further includes: a data conversion circuit;
the line cache unit is used for storing original data to be processed which are input in a line scanning sequence;
and the data conversion circuit is used for performing line-block conversion on the original data to be processed to obtain the data to be processed which is provided for the data processor to process.
11. The data processing apparatus according to any one of claims 1 to 10,
the controller is further configured to perform segmentation processing on the image to be processed according to a data processing window with a preset size, so as to obtain the data to be processed.
12. A data processing apparatus according to any one of claims 1 to 10, wherein the data processing tasks include: and performing a filtering processing task and/or at least one subtask on the data to be processed.
13. A data processing apparatus according to any of claims 1 to 10, wherein said internal buffer is connected to said plurality of data processors via a bus; each data processor corresponds to at least one internal buffer.
14. The data processing apparatus according to any of claims 1 to 10, wherein the controller is further configured to determine a task to be processed for each data processing cycle; dividing the task to be processed of the data processing cycle into sub-tasks for processing by the plurality of data processors, and generating corresponding processing instructions.
15. A data processing method applied to a data processing apparatus, the data processing apparatus comprising: a controller, and a plurality of data processors; the data processing method comprises the following steps:
the controller transmits a processing instruction for indicating a data processing task to a corresponding data processor;
the data processors acquire data to be processed from corresponding internal buffers based on the received processing instructions, and execute corresponding data processing tasks based on the data to be processed.
16. The data processing method of claim 15, wherein the controller issues processing instructions for instructing data processing tasks to the corresponding data processors, and the method comprises:
the controller sends the corresponding processing instruction of the corresponding data processing period to the corresponding data processor according to the data processing task of each data processing period;
the data processors acquire data to be processed from corresponding internal buffers based on the received processing instructions, and execute corresponding data processing tasks based on the data to be processed, and the data processing tasks include:
and the data processors acquire the data to be processed corresponding to the data processing period from the corresponding internal buffer and process the data based on the processing instructions received in the data processing period respectively.
17. The data processing method according to claim 15 or 16, wherein the data to be processed is different according to the data processing instruction, and comprises at least one of the following: external data, intermediate data processed by the data processor in the previous data processing period, and intermediate data processed by other data processors in the previous data processing period.
18. A data processing method according to any of claims 15 to 17, wherein the data processing tasks comprise parallel data processing tasks;
the controller issues the corresponding processing instruction of the corresponding data processing period to the corresponding data processor according to the data processing task of each data processing period, and the method comprises the following steps: for a plurality of data processing periods of the parallel data processing task, issuing a corresponding processing instruction to a corresponding data processor in each data processing period; for the same data processing cycle, the processing instructions received by the data processors are the same, and the processed data to be processed are different;
the data processors acquire data to be processed from corresponding internal buffers based on the received processing instructions, and the data processing method comprises the following steps:
the data processors acquire the data to be processed corresponding to the data processing period from the corresponding internal buffer in each data processing period except the head and tail data processing periods based on the processing instructions received in the data processing period; and storing the processing result of the processed data to be processed into the corresponding internal buffer as the data to be processed in the subsequent data processing period.
19. The data processing method of claim 18, wherein the plurality of data processors comprise programmable data processors;
the data processing method further comprises: the programmable data processor provides different processing functions in different data processing cycles in cooperation with the processing instructions corresponding to the data processing cycles.
20. The data processing method according to claim 18 or 19, wherein the sub-data processing tasks executed by the processing instructions corresponding to the adjacent data processing cycles correspond to adjacent processing steps in the parallel data processing task.
21. A data processing method according to any of claims 15 to 17, wherein the data processing tasks comprise serial data processing tasks;
the controller issues the corresponding processing instruction of the corresponding data processing period to the corresponding data processor according to the data processing task of each data processing period, and the method comprises the following steps: aiming at a plurality of data processing periods of the serial data processing task, in each data processing period, issuing a corresponding processing instruction to a corresponding data processor; for the same data processing cycle, in a plurality of groups of data processors included in the plurality of data processors, the processing instructions received by the same group of data processors are the same, and the processing instructions received by different groups of data processors are different; the group of data processors comprises at least one data processor;
the data processors acquire data to be processed from corresponding internal buffers based on the received processing instructions, and the data processing method comprises the following steps:
each group of data processors in the plurality of data processors acquires data to be processed corresponding to the data processing period from the corresponding first internal buffer based on the processing instruction received in the data processing period; and storing the processing result of the data to be processed after processing into a corresponding second internal buffer as the data to be processed of another group of data processor in the subsequent data processing period.
22. The data processing method of claim 21, wherein the plurality of data processors comprise programmable data processors;
the data processing method further comprises: different groups of programmable data processors cooperate with corresponding groups of processing instructions to provide different processing functions in the same data processing cycle.
23. The data processing method according to claim 21 or 22, wherein the sub data processing tasks executed by the processing instructions corresponding to different data processors with adjacent processing logic in the same data processing cycle correspond to adjacent processing steps in the serial data processing task.
24. A data processing method according to any of claims 15 to 23, wherein the internal buffer comprises a line buffer unit; the data processing apparatus further includes: a data conversion circuit;
the data processing method further comprises: the line cache unit stores original data to be processed which are input in a line scanning sequence;
and the data conversion circuit performs line-block conversion on the original data to be processed to obtain the data to be processed which is provided for the data processor to process.
25. The data processing method according to any one of claims 15 to 24, further comprising:
the controller segments the image to be processed according to a data processing window of a preset size to obtain the data to be processed.
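One way to picture the window-based segmentation of claim 25 is to enumerate the window positions that cover an image. The sketch below is an assumption-laden illustration: the name `split_into_windows`, the non-overlapping layout, and the clipping of windows at the right and bottom edges are choices made here, not details given in the claim.

```python
from typing import List, Tuple


def split_into_windows(height: int, width: int, win_h: int, win_w: int) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, h, w) for each non-overlapping window covering the image.

    Windows on the right and bottom edges are clipped to the image bounds.
    """
    if win_h <= 0 or win_w <= 0:
        raise ValueError("window dimensions must be positive")
    windows: List[Tuple[int, int, int, int]] = []
    for top in range(0, height, win_h):
        for left in range(0, width, win_w):
            h = min(win_h, height - top)
            w = min(win_w, width - left)
            windows.append((top, left, h, w))
    return windows
```

Each tuple identifies one piece of data to be processed; how the pieces are assigned to data processors is left open here.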
26. A data processing method according to any one of claims 15 to 24, wherein the data processing task comprises: a filtering task and/or at least one subtask performed on the image to be processed.
27. A data processing method according to any of claims 15 to 24, wherein said internal buffer is connected to said plurality of data processors via a bus; each data processor corresponds to at least one internal buffer.
28. A data processing method according to any one of claims 16 to 24, characterized in that the method further comprises: the controller determines the task to be processed in each data processing cycle, divides the task to be processed in that data processing cycle into subtasks to be processed by the plurality of data processors, and generates the corresponding processing instructions.
29. A computer device, comprising: an instruction memory and a data processing apparatus as claimed in any one of claims 1 to 14.
30. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by the controller and by the programmable data processor, performs the steps of the data processing method according to any one of claims 15 to 28.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110221178.9A CN114968506A (en) 2021-02-26 2021-02-26 Data processing device and method, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114968506A true CN114968506A (en) 2022-08-30

Family

ID=82973828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110221178.9A Pending CN114968506A (en) 2021-02-26 2021-02-26 Data processing device and method, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114968506A (en)

Similar Documents

Publication Publication Date Title
US8434085B2 (en) Scalable scheduling of tasks in heterogeneous systems
CN112035238A (en) Task scheduling processing method and device, cluster system and readable storage medium
US11468145B1 (en) Storage of input values within core of neural network inference circuit
KR101609079B1 (en) Instruction culling in graphics processing unit
CN112181657B (en) Video processing method, device, electronic equipment and storage medium
CN111523652B (en) Processor, data processing method thereof and image pickup device
EP3855362A1 (en) Convolution processing method, apparatus, and storage medium of convolutional neural network
CN115880132B (en) Graphics processor, matrix multiplication task processing method, device and storage medium
CN111343288B (en) Job scheduling method and system and computing device
CN114723033B (en) Data processing method, data processing device, AI chip, electronic device and storage medium
US20190228574A1 (en) Identifying primitives in input index stream
CN113032116B (en) Training method of task time prediction model, task scheduling method and related devices
CN109598250A (en) Feature extracting method, device, electronic equipment and computer-readable medium
CN106682258B (en) Multi-operand addition optimization method and system in high-level comprehensive tool
WO2022095714A1 (en) Image rendering processing method and apparatus, storage medium, and electronic device
CN109509139A (en) Vertex data processing method, device and equipment
EP3859660A1 (en) Data processing method and sensor device for performing the same
CN1955933A (en) Data processing apparatus and method
CN114968506A (en) Data processing device and method, computer equipment and storage medium
CN112468414B (en) Cloud computing multi-level scheduling method, system and storage medium
US11809902B2 (en) Fine-grained conditional dispatching
WO2014105550A1 (en) Configurable ring network
CN114201727A (en) Data processing method, processor, artificial intelligence chip and electronic equipment
CN113902088A (en) Method, device and system for searching neural network structure
US11403783B2 (en) Techniques to dynamically gate encoded image components for artificial intelligence tasks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination