CN113362219A - Image data processing method and device - Google Patents


Info

Publication number
CN113362219A
CN113362219A · CN113362219B (application CN202110748762.XA)
Authority
CN
China
Prior art keywords
processor
data
region
areas
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110748762.XA
Other languages
Chinese (zh)
Other versions
CN113362219B (en)
Inventor
刘晓伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Spreadtrum Communications Tianjin Co Ltd
Original Assignee
Spreadtrum Communications Tianjin Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Spreadtrum Communications Tianjin Co Ltd filed Critical Spreadtrum Communications Tianjin Co Ltd
Priority to CN202110748762.XA priority Critical patent/CN113362219B/en
Publication of CN113362219A publication Critical patent/CN113362219A/en
Application granted granted Critical
Publication of CN113362219B publication Critical patent/CN113362219B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/20Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/60Memory management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to the field of data processing, and in particular to an image data processing method and apparatus. The method includes: dividing first image data into a plurality of data areas according to the data arrangement position of the first image data, wherein the data areas comprise processor areas in one-to-one correspondence with the processors, and a boundary area lies between every two adjacent processor areas; and allocating the processor areas and boundary areas of the first image data to the respective processors, so that each processor acquires and executes the data of its corresponding areas. In the embodiment of the invention, because a boundary area is divided out between every two adjacent processor areas, cache conflicts and cache inconsistency do not occur when each processor executes its allocated processor area.

Description

Image data processing method and device
[ technical field ]
The present invention relates to the field of data processing, and in particular, to a method and an apparatus for processing image data.
[ background of the invention ]
In the field of image data processing, heterogeneous processing is often used to increase processing speed. In heterogeneous processing, the data to be processed are divided into multiple parts and distributed among multiple processors, which complete the processing work together. To increase processing speed further, the data to be processed are loaded from the storage device into memory in advance. However, a processor that reads the data directly from memory reads inefficiently; a common approach is for each processor first to load the data from memory into its own cache space and then process them there. In a typical heterogeneous run, each processor reads from memory the cache data corresponding to its allocated task and loads them into its own cache space for processing. However, when any one processor modifies its cached copy, the other processors are unaware of the change, so the data diverge when the caches are written back to memory, causing a data-consistency problem.
[ summary of the invention ]
To solve the above problem, embodiments of the present invention provide an image data processing method and apparatus that change how data are distributed to the processors, thereby avoiding data inconsistency and similar problems.
In a first aspect, an embodiment of the present invention provides an image data processing method, including:
dividing first image data into a plurality of data areas according to the data arrangement position of the first image data, wherein the plurality of data areas comprise processor areas in one-to-one correspondence with the processors, and a boundary area lies between every two adjacent processor areas;
and allocating the processor areas and the boundary areas of the first image data to the respective processors, so that each processor acquires and executes the data of its corresponding areas.
In the embodiment of the invention, the first image data is divided, according to its data arrangement position, into processor areas in one-to-one correspondence with the processors and boundary areas between adjacent processor areas; by dividing out the boundary areas, data inconsistency, cache conflicts, and similar problems do not occur when two adjacent processor areas are processed.
In one possible implementation, dividing the first image data into a plurality of data areas according to the data arrangement position of the first image data includes:
determining, in the first image data, coarse segmentation areas in one-to-one correspondence with the processors according to the number of processors, their computing power, and the data arrangement position of the first image data;
and expanding from the boundary position of two adjacent coarse segmentation areas into the interior of each of the two areas to obtain the boundary area between them;
wherein, within each coarse segmentation area, the area other than the boundary area is the processor area.
In a possible implementation, expanding from the boundary position of two adjacent coarse segmentation areas into the interior of each of the two areas to obtain the boundary area between them includes:
extending, from the boundary position of the two adjacent coarse segmentation areas into the interior of each, a number of necessary rows and at least one redundant row, wherein the number of necessary rows is determined by the number of adjacent data items the processor must access when executing each data item in the processor area and the boundary area.
In one possible implementation, allocating the boundary areas of the first image data to the respective processors includes:
allocating two adjacent processor areas to the two processors corresponding to them one to one;
and allocating the boundary area between the two adjacent processor areas to whichever of the two corresponding processors has the greater computing power.
In one possible implementation, allocating the processor areas and the boundary areas of the first image data to the respective processors, so that each processor acquires and executes the data of its corresponding areas, includes:
each processor loading its allocated processor area and boundary area from memory into its own cache and executing them.
In one possible implementation, each processor loading its allocated processor area and boundary area from memory into its own cache and executing them includes:
if a processor is allocated both a processor area and a boundary area, executing the processor area first and the boundary area second.
In one possible implementation, if a processor is allocated only a processor area, execution starts from the last row of the processor area and proceeds until the first row of the processor area has been executed;
wherein, when the last row of the processor area is executed, the rows of data required to process that row are loaded from the processor area and from the boundary area adjacent to the last row;
and the last row of the processor area is executed based on those rows of data.
In a second aspect, an embodiment of the present invention provides an image data processing apparatus, including:
a processing module, configured to divide the first image data into a plurality of data areas according to the data arrangement position of the first image data, wherein the data areas comprise processor areas in one-to-one correspondence with the processors, and a boundary area lies between every two adjacent processor areas;
and an allocation module, configured to allocate the processor areas and the boundary areas of the first image data to the respective processors, so that each processor acquires and executes the data of its corresponding areas.
In one possible implementation, the apparatus includes:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, and the processor, by invoking the program instructions, is able to perform the method of the first aspect.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, which stores computer instructions, and the computer instructions cause the computer to execute the method according to the first aspect.
It should be understood that the second and third aspects of the embodiments of the present invention are consistent with the technical solution of the first aspect; the beneficial effects achieved by these aspects and their possible implementations are similar and are not described again.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a diagram of a processor architecture according to an embodiment of the present invention;
FIG. 2 is a flowchart of an image data processing method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a partition according to an embodiment of the present invention;
FIG. 4 is a flow chart of another image data processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another partition according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another partition according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an image data processing apparatus according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of another image data processing apparatus according to an embodiment of the present invention.
[ detailed description ]
For better understanding of the technical solutions in the present specification, the following detailed description of the embodiments of the present invention is provided with reference to the accompanying drawings.
It should be understood that the described embodiments are only a few embodiments of the present specification, and not all embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step are within the scope of the present invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the specification. As used in the examples of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
In the embodiment of the invention, the first image data is divided into the processor areas corresponding to the processors one by one and the boundary areas between the two adjacent processor areas according to the data arrangement positions of the first image data, and the situations of data inconsistency, cache conflict and the like can not occur when the two adjacent processor areas are processed by dividing the boundary areas.
Fig. 1 is an architecture diagram of a processor system according to an embodiment of the present invention. As shown in fig. 1, a plurality of processors are connected to a shared memory, and each processor has a corresponding cache region. During execution, each processor loads data from the memory into its own cache region and executes according to the data there. The plurality of processors may be implemented as a plurality of processing cores, one of which serves as the main processing core and executes the image data processing method provided by the embodiment of the present invention. Alternatively, the plurality of processors may be implemented as a plurality of independent, interconnected processors, for example a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a Digital Signal Processor (DSP), or other common processors connected to one another; one processor can then be selected as the main processor to execute the image data processing method provided by the embodiment of the present invention.
Fig. 2 is a flowchart of an image data processing method according to an embodiment of the present invention, and as shown in fig. 2, the method includes:
step 201, dividing the first image data into a plurality of data areas according to the data arrangement position of the first image data, wherein the plurality of data areas comprise processor areas corresponding to the processors one by one, and a boundary area is arranged between two adjacent processor areas. Fig. 3 is a data area distribution diagram according to an embodiment of the present invention, as shown in fig. 3, when the number of processors is three (processor 1, processor 2, and processor 3), the first image data is divided into three processor areas (first processor area, second processor area, and third processor area) and a boundary area (first boundary area and second boundary area) between two adjacent processor areas according to the data arrangement position of the first image data, and data in each area is arranged in rows. In fig. 3, the processor corresponding to the first processor area is the first processor, the processor corresponding to the second processor area is the second processor, and the processor corresponding to the third processor area is the third processor. The size of the boundary region may be a fixed value, and when each processor region and the boundary region are divided, the number of the processor regions to be divided may be determined according to the length and width of the first image data and the number of processors, and thus the number of the boundary regions may be determined. Since the length and width of the first image data are known and the number and size of the boundary regions are known, the total size of the processor regions can be obtained, and then the processor regions corresponding to the respective processors are divided from the total processor region according to the number of the processors. 
For example, if the size of the first image data is 60 × 110, the size of each boundary area is 60 × 10, and there are 3 processors, then the total size of the boundary areas is 60 × 20; subtracting the total boundary area from the first image data leaves a 60 × 90 area, i.e., the total size of the processor areas. The total processor area is then divided evenly into 3 processor areas according to the number of processors, giving the partition result for the first image data.
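The arithmetic in this example can be sketched in a few lines of Python; the function name and the fixed boundary-area height are illustrative assumptions, not part of the patent.

```python
def partition_rows(total_rows, boundary_height, num_processors):
    """Split an image of total_rows rows into equal processor areas
    separated by fixed-height boundary areas (illustrative sketch)."""
    num_boundaries = num_processors - 1
    processor_rows_total = total_rows - num_boundaries * boundary_height
    rows_per_processor = processor_rows_total // num_processors
    return rows_per_processor, num_boundaries

# The example above: a 60 x 110 image, 10-row boundary areas, 3 processors.
rows, boundaries = partition_rows(110, 10, 3)
print(rows, boundaries)  # -> 30 2, i.e. three 60 x 30 processor areas
```

The same helper reproduces the partition for any image height and processor count, as long as the heights divide evenly.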
In some embodiments, the computing power of the processors may differ; in that case, each processor's computing power can be added to the partition conditions when dividing the processor areas. As shown in fig. 4, the method then includes the following steps:
in step S2011, rough segmentation regions corresponding to the respective processors one to one are determined in the first image data according to the number of processors, the computational power, and the data arrangement position of the first image data. For example, the number of processors is two, respectively a first processor and a second processor. The ratio of the computational power of the first processor to the second processor is 3 to 5. The ratio of the area of the coarsely partitioned area corresponding to the first processor to the coarsely partitioned area corresponding to the second processor should be 3 to 5. The obtained rough segmentation result is shown in fig. 5, and in the first image data, the first processor corresponds to the first rough segmentation region in fig. 5, and the second processor corresponds to the second rough segmentation region in fig. 5.
In step S2012, the boundary positions of two adjacent coarse segmentation areas are expanded into the interiors of the two areas to obtain the boundary area between them. Within each coarse segmentation area, the area other than the boundary area is the processor area; the boundary area in fig. 5 is obtained by expanding into the interiors of the two adjacent coarse segmentation areas. Specifically, a number of necessary rows and at least one redundant row may be extended from the boundary position into the interior of each of the two adjacent coarse segmentation areas, where the number of necessary rows is determined by how many adjacent data items the processor must access when executing each data item in the processor area and the boundary area. In image processing, processing the current pixel requires loading its n neighboring pixels (above, below, left, and right) as auxiliary data. If, when a processor executes its allocated processor area, it had to load pixel data belonging to the adjacent processor area, access conflicts and cache inconsistency would result; determining the number of necessary rows from the number of neighboring pixels the processor must access avoids this. For example, if the processor must access four neighboring pixels in each direction, then processing a pixel requires loading the four rows above and the four rows below it from memory, i.e., 9 rows of pixel data in total (including the pixel's own row), into the processor's cache region. The boundary area in fig. 5 is thus composed of the four necessary rows required by the processor area above it, the four necessary rows required by the processor area below it, and one redundant row.
When caching, the minimum unit of each load is a cache line, typically 64 bytes: each time, the processor loads one cache line of pixel data from memory into its cache region. A row of pixel data may, however, be smaller than 64 bytes, in which case loading one cache line brings in more than one row of pixels. When the processor executes the last row of the first processor area in fig. 5, the pixel data cached from the boundary area would then exceed 4 rows, causing a cache conflict between the first processor (processing the first processor area) and the second processor (processing the second processor area); adding a redundant row avoids this conflict.
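The boundary-area height implied by this discussion can be sketched as follows. The handling of rows shorter than a cache line is an illustrative assumption, since the patent only states that at least one redundant row is added.

```python
import math

CACHE_LINE_BYTES = 64

def boundary_height(necessary_per_side, row_bytes):
    """Rows in a boundary area: the necessary rows required by each of
    the two adjacent processor areas, plus redundant rows so a cache-line
    load from one side cannot spill into rows owned by the other side."""
    necessary = 2 * necessary_per_side
    if row_bytes >= CACHE_LINE_BYTES:
        redundant = 1  # one redundant row suffices
    else:
        # one 64-byte load covers several short rows; absorb the overshoot
        redundant = math.ceil(CACHE_LINE_BYTES / row_bytes) - 1
    return necessary + redundant

# The example above: 4 necessary rows per side, rows one cache line wide or more.
print(boundary_height(4, 64))  # -> 9
```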
Step 202: allocate the processor areas and the boundary areas of the first image data to the respective processors, so that each processor acquires and executes the data of its corresponding areas. When allocating, two adjacent processor areas are allocated to the two processors corresponding to them one to one. The boundary area between two adjacent processor areas may be allocated at random to either of the two corresponding processors. Preferably, to improve computational efficiency, the boundary area is allocated to whichever of the two processors has the greater computing power; this avoids the situation in which a high-power processor finishes its allocated processor area and then waits for the other processor to finish. As shown in fig. 5, the ratio of the computing power of the first processor to that of the second is 3 to 5; the first processor area in fig. 5 is therefore allocated to the first processor and the second processor area to the second processor, while the boundary area in fig. 5 is allocated to the second processor because its computing power is greater than that of the first processor. Each processor then loads its allocated processor area and boundary area from memory into its own cache and executes them.
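The allocation rule of step 202 can be sketched as below; the region and boundary labels are illustrative placeholders, and ties going to the earlier processor is an assumption the patent leaves open.

```python
def allocate(powers):
    """Give each processor its own area; give each boundary area to the
    adjacent processor with the greater computing power."""
    allocation = {i: ["processor_area_%d" % i] for i in range(len(powers))}
    for b in range(len(powers) - 1):  # boundary b lies between processors b and b+1
        winner = b if powers[b] >= powers[b + 1] else b + 1
        allocation[winner].append("boundary_area_%d" % b)
    return allocation

# The fig. 5 example: computing-power ratio 3:5, so processor 1 gets the boundary.
print(allocate([3, 5]))
# -> {0: ['processor_area_0'], 1: ['processor_area_1', 'boundary_area_0']}
```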
In some embodiments, when executing its allocated areas, a processor that has been allocated only a processor area executes from the last row of the processor area until it completes the first row. When the last row of the processor area is executed, the rows of data required to process it are loaded from the processor area and from the boundary area adjacent to the last row, and the last row is executed based on those rows of data.
If a processor is allocated both a processor area and a boundary area, it executes the processor area first and the boundary area second.
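The two execution orders just described can be sketched as follows, with integer row indices standing in for the processor area and "b"-prefixed labels for the boundary area (both illustrative):

```python
def execution_order(num_rows, boundary_height=0):
    """Row order for one processor: with a boundary area allocated,
    run the processor area top-down and the boundary area second;
    with no boundary area, run the processor area from last row to first."""
    if boundary_height > 0:
        return list(range(num_rows)) + ["b%d" % i for i in range(boundary_height)]
    return list(range(num_rows - 1, -1, -1))

print(execution_order(3, boundary_height=2))  # -> [0, 1, 2, 'b0', 'b1']
print(execution_order(3))                     # -> [2, 1, 0]
```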
In some embodiments, if there are two processors in total, the one performing the embodiment of the present invention uses a cache mechanism (e.g., a CPU) and the other does not (e.g., a DSP); in that case the boundary area between the two processor areas may be allocated to the main processor that uses the cache mechanism, rather than deciding the allocation from the ratio of the two processors' computing power.
In one specific example, the method runs on a heterogeneous processing system composed of a CPU and a DSP, where the CPU is the main processor and its computing power is greater than the DSP's. According to the number of processors, their computing power, and the data arrangement position of the image data, the first image data is divided into two coarse segmentation areas; the areas are then expanded from their junction into their interiors to obtain the boundary area, and the parts of the two coarse segmentation areas other than the boundary area are determined as the processor areas corresponding to the CPU and the DSP respectively. As shown in fig. 6, the V area is the DSP processing area, the C area is the CPU processing area, and the boundary area between them is executed by the CPU. The V area contains n pixels (V0-Vn) and the C area contains m pixels (C0-Cm). During execution, the DSP starts from the last row: it fetches the pixel data required for processing the V area from memory by Direct Memory Access (DMA). Specifically, the DMA engine transfers the pixel data from memory into Dynamic Random Access Memory (DRAM) accessible to the DSP, and the DSP accelerates processing by accessing that DRAM directly. In this way the DSP bypasses the conventional data-caching mechanism, ensuring data consistency and cache consistency. The DSP then executes the pixels of the V area from right to left and from bottom to top; as shown in fig. 6, starting from V0, it executes them in the order V0, V1, V2, and so on.
The CPU processes the pixel data in the C area first and then the pixel data in the boundary area, executing them from left to right and from top to bottom. As shown in fig. 6, starting from C0, the CPU executes the pixels of the C area in the order C0, C1, C2, and so on.
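The two traversal directions in the CPU/DSP example can be sketched as visit orders over a small grid (the row/column indices are illustrative):

```python
def visit_order(rows, cols, role):
    """Pixel visit order: the DSP walks its area right-to-left and
    bottom-to-top; the CPU walks left-to-right and top-to-bottom."""
    if role == "dsp":
        return [(r, c) for r in range(rows - 1, -1, -1)
                       for c in range(cols - 1, -1, -1)]
    return [(r, c) for r in range(rows) for c in range(cols)]

print(visit_order(2, 2, "dsp"))  # -> [(1, 1), (1, 0), (0, 1), (0, 0)]
print(visit_order(2, 2, "cpu"))  # -> [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Because the two walks start at opposite corners of the image, the processors converge toward the boundary area from different sides, which is what lets the CPU reach the boundary rows only after the DSP has moved past them.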
Thus, through this division into heterogeneous regions, with the CPU (which uses a cache mechanism) as the main processor and the DSP (which does not) as the slave processor, data consistency and cache consistency are ensured throughout processing. Because the areas of the V and C regions are determined by the ratio of the computing power of the CPU and the DSP, by the time the CPU finishes the C area and moves on to the boundary area, the DSP has already processed the part of the V area near its top; and because the DSP fetches its pixel data from memory by DMA without caching it, no cache inconsistency or cache conflict arises when the CPU caches pixel data that the DSP has processed.
Corresponding to the image data processing method above, an embodiment of the present invention provides an image data processing apparatus, whose structure is shown schematically in fig. 7. The apparatus includes a processing module 701 and an allocation module 702.
The processing module 701 is configured to divide the first image data into a plurality of data areas according to the data arrangement position of the first image data, wherein the data areas comprise processor areas in one-to-one correspondence with the processors, and a boundary area lies between every two adjacent processor areas.
The allocation module 702 is configured to allocate the processor areas and the boundary areas of the first image data to the respective processors, so that each processor acquires and executes the data of its corresponding areas.
The embodiment shown in fig. 7 provides an image data processing apparatus for executing the technical solutions of the method embodiments shown in fig. 1 to fig. 6 in this specification, and the implementation principles and technical effects thereof may further refer to the related descriptions in the method embodiments.
Fig. 8 is a schematic structural diagram of another image data processing apparatus according to an embodiment of the present invention, as shown in fig. 8, the image data processing apparatus may include at least one processor; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, and the processor calls the program instructions to execute the image data processing method provided by the embodiments shown in fig. 1 to 6 in the present specification.
As shown in fig. 8, the image data processing apparatus is represented in the form of a general-purpose computing device. The components of the image data processing apparatus may include, but are not limited to: one or more processors 810, a communication interface 820, and a memory 830, a communication bus 840 that connects the various system components, including the memory 830, the communication interface 820, and the processing unit 810.
Communication bus 840 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. These architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The image data processing apparatus typically includes a variety of computer system readable media. Such media can be any available media that can be accessed by the image data processing apparatus and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 830 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) and/or cache memory. The image data processing apparatus may further include other removable/non-removable, volatile/nonvolatile computer system storage media. Memory 830 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of this specification.
A program/utility having a set (at least one) of program modules may be stored in memory 830. Such program modules include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules generally perform the functions and/or methodologies of the embodiments described herein.
The processor 810 executes various functional applications and data processing by executing programs stored in the memory 830, for example, implementing the image data processing method provided by the embodiments shown in fig. 1 to 6 in this specification.
An embodiment of the present specification provides a computer-readable storage medium storing computer instructions that cause a computer to execute the image data processing method provided by the embodiments shown in fig. 1 to 6 of this specification.
The computer-readable storage medium described above may take any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
In the description of this specification, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," "some examples," and the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of this specification. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, the various embodiments or examples described in this specification, and the features of different embodiments or examples, can be combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of this specification, "a plurality" means at least two, e.g., two, three, etc., unless explicitly defined otherwise.
Any process or method description in a flow chart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. Alternate implementations are included within the scope of the preferred embodiments of this specification, in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of this specification.
The word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining" or "in response to detecting", depending on the context. Similarly, the phrases "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined" or "in response to determining" or "when (a stated condition or event) is detected" or "in response to detecting (a stated condition or event)", depending on the context.
It should be noted that the apparatuses referred to in the embodiments of the present disclosure may include, but are not limited to, a personal computer (PC), a personal digital assistant (PDA), a wireless handheld apparatus, a tablet computer, a mobile phone, an MP3 player, an MP4 player, and the like.
In the several embodiments provided in this specification, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division of the units is only one logical division; in actual implementation there may be other divisions, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.
In addition, the functional units in the embodiments of this specification may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in the form of hardware, or in the form of hardware plus a software functional unit.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description is only a preferred embodiment of the present disclosure, and should not be taken as limiting the present disclosure, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (10)

1. An image data processing method characterized by comprising:
dividing first image data into a plurality of data areas according to a data arrangement position of the first image data, wherein the plurality of data areas comprise processor areas in one-to-one correspondence with processors, and a boundary area is arranged between two adjacent processor areas;
and allocating the processor areas and the boundary areas of the first image data to the respective processors, so that each processor acquires the data of its corresponding area and executes it.
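The patent contains no source code; as a purely illustrative sketch (the row-based layout and all names such as `num_procs` and `halo` are assumptions, not taken from the claims), the division in claim 1 — one processor area per processor, with a boundary area between adjacent ones — could look like this:

```python
# Hypothetical sketch of claim 1's division: split an image's rows into
# one processor area per processor, with a boundary area of 2*halo rows
# straddling each internal cut line.

def divide_rows(total_rows, num_procs, halo):
    base = total_rows // num_procs
    # Equal-size coarse cuts; a weighted split is also possible.
    cuts = [i * base for i in range(num_procs)] + [total_rows]
    proc_areas, boundary_areas = [], []
    for i in range(num_procs):
        start, end = cuts[i], cuts[i + 1]
        # Shrink interior edges so each boundary area sits between
        # two adjacent processor areas.
        lo = start + (halo if i > 0 else 0)
        hi = end - (halo if i < num_procs - 1 else 0)
        proc_areas.append((lo, hi))
        if i < num_procs - 1:
            boundary_areas.append((cuts[i + 1] - halo, cuts[i + 1] + halo))
    return proc_areas, boundary_areas
```

For two processors and a two-row halo on a 100-row image, this yields processor areas (0, 48) and (52, 100) with a single boundary area (48, 52).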
2. The method according to claim 1, wherein dividing the first image data into a plurality of data areas according to the data arrangement position of the first image data comprises:
determining, in the first image data, coarse segmentation areas in one-to-one correspondence with the processors according to the number and the computing power of the processors and the data arrangement position of the first image data;
expanding from the boundary position of two adjacent coarse segmentation areas into the inner areas of the two adjacent coarse segmentation areas respectively, to obtain the boundary area between the two adjacent coarse segmentation areas;
wherein, in each coarse segmentation area, the area other than the boundary area is the processor area.
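Claim 2 sizes each coarse segmentation area by processor count and computing power. A hedged sketch — the per-processor `powers` weights are an assumed representation, not something the patent specifies — might compute the cut rows proportionally:

```python
# Assumed sketch of claim 2's coarse segmentation: each coarse area's
# row count is proportional to its processor's computing power.

def coarse_cuts(total_rows, powers):
    total_power = sum(powers)
    cuts, acc = [0], 0
    for p in powers[:-1]:
        acc += p
        # Place each internal cut at the cumulative power fraction.
        cuts.append(round(total_rows * acc / total_power))
    cuts.append(total_rows)
    return cuts
```

A processor three times as powerful then receives three quarters of a 100-row image: `coarse_cuts(100, [1, 3])` gives cuts at rows 0, 25, and 100.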
3. The method according to claim 2, wherein the expanding from the boundary position of two adjacent coarse segmentation areas into the inner areas of the two adjacent coarse segmentation areas respectively, to obtain the boundary area between the two adjacent coarse segmentation areas, comprises:
expanding a plurality of necessary lines and at least one redundant line from the boundary position of two adjacent coarse segmentation areas into the inner areas of the two adjacent coarse segmentation areas respectively, wherein the number of necessary lines is determined by the number of adjacent data items that the processor needs to access when executing each data item in the processor area and the boundary area.
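One plausible reading of claim 3 — an assumption, since the patent does not fix the numbers — is that the "necessary lines" come from the neighborhood a filter needs (a k×k kernel needs k//2 rows on each side), plus at least one redundant line:

```python
# Assumed interpretation of claim 3's boundary width: necessary rows from
# the filter neighborhood, plus at least one redundant row.

def boundary_height(kernel_size, redundant=1):
    necessary = kernel_size // 2  # neighbor rows needed per side of a pixel
    return necessary + redundant

def boundary_area(cut_row, kernel_size, redundant=1):
    """Expand from an internal cut line into both adjacent coarse areas."""
    h = boundary_height(kernel_size, redundant)
    return (cut_row - h, cut_row + h)
```

Under this reading, a 3×3 kernel with one redundant row expands two rows into each adjacent coarse area.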
4. The method of claim 2, wherein allocating the boundary areas of the first image data to the respective processors comprises:
allocating two adjacent processor areas respectively to the two processors in one-to-one correspondence with the two processor areas;
and allocating the boundary area between two adjacent processor areas to the one of the two corresponding processors with the greater computing power.
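Claim 4's allocation rule can be sketched as follows; the per-processor power scores and the region labels are hypothetical, not part of the claim text:

```python
# Sketch of claim 4: each processor area goes to its own processor, and
# each boundary area goes to the more powerful of its two neighbors.
# powers[i] is an assumed compute-power score for processor i.

def assign_areas(num_procs, powers):
    assignment = {}  # area label -> processor index
    for i in range(num_procs):
        assignment[("proc", i)] = i
    for i in range(num_procs - 1):
        # Boundary i sits between processor areas i and i+1.
        assignment[("boundary", i)] = i if powers[i] >= powers[i + 1] else i + 1
    return assignment
```

With two processors of powers 1 and 2, the single boundary area lands on processor 1, the stronger of the pair.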
5. The method of claim 4, wherein allocating the processor areas and the boundary areas of the first image data to the respective processors, so that each processor acquires and executes the data of its corresponding area, comprises:
loading, by each processor, its allocated processor area and boundary area from memory into its own cache, and executing them.
6. The method of claim 5, wherein each processor loads its allocated processor area and boundary area from memory into its own cache and executes them, wherein:
if a processor is allocated both a processor area and a boundary area, the processor area is executed first and the boundary area is executed second.
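A minimal sketch of claim 6's ordering — the (kind, area) tuples and the callback are an assumed representation:

```python
# Sketch of claim 6: a processor holding both a processor area and a
# boundary area runs the processor area first, the boundary area second.

def run_assigned(areas, execute):
    """areas: list of (kind, area) tuples; kind is "proc" or "boundary"."""
    ordered = sorted(areas, key=lambda ka: 0 if ka[0] == "proc" else 1)
    for kind, area in ordered:
        execute(kind, area)
```

Regardless of the order the areas were assigned in, every "proc" entry is executed before any "boundary" entry.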
7. The method of claim 4, wherein if a processor is allocated only a processor area, execution starts from the last row of the processor area and proceeds until the first row of the processor area is completed;
wherein, when the last row of the processor area is executed, the plurality of rows of data required for processing the last row are loaded from the processor area and the boundary area adjacent to the last row;
and the last row of the processor area is executed based on the plurality of rows of data.
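Claim 7's bottom-up execution can be illustrated as below; the row-averaging filter merely stands in for whatever per-row operation the method actually performs, and `radius` (neighbor rows needed per side) is an assumed parameter:

```python
# Sketch of claim 7: a processor allocated only a processor area works
# from its last row back to its first. Rows beyond the area's end lie in
# the adjacent boundary area (owned by another processor) and are loaded
# from there.

def process_area(image, area, radius):
    """image: list of per-row values; area: (start, end) row range."""
    start, end = area
    out = {}
    for r in range(end - 1, start - 1, -1):  # last row first
        lo, hi = max(0, r - radius), min(len(image), r + radius + 1)
        needed = image[lo:hi]          # includes boundary-area rows if any
        out[r] = sum(needed) / len(needed)  # stand-in for a real filter
    return out
```

For a six-row image with area (1, 4) and radius 1, row 3 is processed first and draws on row 4, which lies outside the processor area.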
8. An image data processing apparatus characterized by comprising:
a processing module, configured to divide first image data into a plurality of data areas according to a data arrangement position of the first image data, wherein the plurality of data areas comprise processor areas in one-to-one correspondence with processors, and a boundary area is arranged between two adjacent processor areas;
and an allocation module, configured to allocate the processor areas and the boundary areas of the first image data to the respective processors, so that each processor acquires and executes the data of its corresponding area.
9. The apparatus of claim 8, comprising:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1 to 7.
10. A computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 7.
CN202110748762.XA 2021-07-02 2021-07-02 Image data processing method and device Active CN113362219B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110748762.XA CN113362219B (en) 2021-07-02 2021-07-02 Image data processing method and device


Publications (2)

Publication Number Publication Date
CN113362219A true CN113362219A (en) 2021-09-07
CN113362219B CN113362219B (en) 2023-08-11

Family

ID=77537914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110748762.XA Active CN113362219B (en) 2021-07-02 2021-07-02 Image data processing method and device

Country Status (1)

Country Link
CN (1) CN113362219B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106651748A (en) * 2015-10-30 2017-05-10 华为技术有限公司 Image processing method and apparatus
CN106951322A (en) * 2017-02-28 2017-07-14 中国科学院深圳先进技术研究院 The image collaboration processing routine acquisition methods and system of a kind of CPU/GPU isomerous environments
CN107945098A (en) * 2017-11-24 2018-04-20 腾讯科技(深圳)有限公司 Image processing method, device, computer equipment and storage medium
CN107977922A (en) * 2016-10-25 2018-05-01 杭州海康威视数字技术股份有限公司 A kind of image analysis method, apparatus and system
CN111984417A (en) * 2020-08-26 2020-11-24 展讯通信(天津)有限公司 Image processing method and device for mobile terminal, storage medium and terminal




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant