CN113722053A - Data access control circuit, method, electronic device, and computer-readable storage medium - Google Patents

Publication number
CN113722053A
Authority
CN
China
Prior art keywords
task
access control
data access
processing core
control circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010448553.9A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Simm Computing Technology Co ltd
Original Assignee
Beijing Simm Computing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Simm Computing Technology Co ltd filed Critical Beijing Simm Computing Technology Co ltd
Priority to CN202010448553.9A priority Critical patent/CN113722053A/en
Publication of CN113722053A publication Critical patent/CN113722053A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Embodiments of the present disclosure provide a data access control circuit, a data access control method, an electronic device, and a computer-readable storage medium. The data access control circuit comprises: a connection circuit comprising a plurality of first interfaces and a plurality of second interfaces, wherein the plurality of first interfaces correspond one-to-one to a plurality of processing cores and the plurality of second interfaces correspond one-to-one to a plurality of storage blocks; and a control circuit configured to control, based on a received task update instruction, the connection relationship between the first interfaces and the second interfaces in the connection circuit so as to connect a processing core to its corresponding storage block, wherein the task update instruction is sent by the processing core. Because the data access control circuit controls the connection circuit through the task update instruction, the processing cores are connected to the storage blocks in one-to-one correspondence, which solves the prior-art technical problem that using a shared memory wastes either storage space or processing-core computing power.

Description

Data access control circuit, method, electronic device, and computer-readable storage medium
Technical Field
The present disclosure relates to the field of processors, and in particular, to a data access control circuit, a data access control method, an electronic device, and a computer-readable storage medium.
Background
With the development of science and technology, human society is rapidly entering the intelligent era. The defining characteristics of this era are that people obtain more and more kinds of data, the volume of data obtained keeps growing, and the demand for processing speed keeps rising. Chips are the cornerstone of task assignment and fundamentally determine people's ability to process data. In terms of application fields, chips follow two main routes. One is the general-purpose route, for example CPUs, which offer great flexibility but lower effective computing power when processing domain-specific algorithms. The other is the special-purpose route, for example TPUs, which deliver higher effective computing power in certain specific fields but have poor or even no processing capability in more flexible, more general fields. Because data in the intelligent era is both diverse and enormous in volume, chips are required to be extremely flexible, so that they can process algorithms that span different fields and change rapidly, and to have extremely high processing capacity, so that they can quickly process very large and sharply growing volumes of data.
In neural network computing, multi-core or many-core chips are often used. Making the many processing cores compute efficiently is the key to the performance of the whole chip. The computing power that each processing core can deliver depends on several factors, such as how efficiently the core can obtain the tasks to be executed (including the tasks' parameters). For example, in an image recognition scenario, a chip has N processing cores, all of which process images independently and in parallel, and the programs and parameters executed by the cores are the same; to maximize the computing power of the chip and minimize latency and power consumption, all processing cores share the same program and parameters.
The following scheme is generally used in the prior art to make the processing cores execute the above processing in parallel:
Fig. 1 shows a prior-art shared-storage scheme. In this scheme, each processing core in the chip independently reads the task it needs from the SM (shared memory); after reading their tasks, the processing cores execute them independently, without depending on one another. However, this scheme has the following disadvantages: 1. if the processing cores are to execute the task independently and in parallel, the task must be stored in the SM in multiple copies, that is, each processing core needs its own identical copy of the task, which wastes SM storage space; 2. if only one copy of the task exists in the SM, the processing cores must read the SM serially to obtain the task, so their computing power cannot be fully used, computing power is wasted, and the performance of the chip is reduced.
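The two drawbacks can be made concrete with a back-of-the-envelope cost model. The sketch below is illustrative only: the function names, and the assumptions that every copy has the same size and every serial read takes the same time, are not from the source.

```python
def replicated_storage_bytes(num_cores: int, task_bytes: int) -> int:
    """Drawback 1: the SM holds one identical copy of the task per core."""
    return num_cores * task_bytes


def serialized_read_time(num_cores: int, read_time_per_core: float) -> float:
    """Drawback 2: a single copy is read by the cores one after another,
    so the last core waits for all earlier reads to finish."""
    return num_cores * read_time_per_core
```

Under these assumptions both costs grow linearly with the number of cores: replication multiplies the storage footprint by N, while a single shared copy multiplies the time until the last core can start by N.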
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In order to solve the technical problem in the prior art that storage space or processing-core computing power is wasted when a shared memory is used, the embodiments of the present disclosure provide the following technical solutions:
in a first aspect, an embodiment of the present disclosure provides a data access control circuit for use in a chip including a plurality of processing cores, where the data access control circuit includes:
a connection circuit comprising a plurality of first interfaces and a plurality of second interfaces;
the first interfaces correspond to the processing cores one by one; the plurality of second interfaces correspond to the plurality of storage blocks one by one;
and the control circuit is used for controlling the connection relation of the first interface and the second interface in the connection circuit based on the received task update instruction so as to connect the processing core and the corresponding storage block, wherein the task update instruction is sent by the processing core.
Further, the control circuit is further configured to:
and when the control circuit is initialized, controlling the connection relation of the first interface and the second interface according to the preset configuration.
Further, the connection circuit includes:
a switch array disposed between the plurality of first interfaces and the plurality of second interfaces such that the first interfaces are connected to or disconnected from the respective second interfaces.
In a second aspect, an embodiment of the present disclosure provides a data access control method, including:
the processing core reads a first task to be executed from the first storage block;
executing the first task;
and sending a task updating instruction, wherein the task updating instruction is used for indicating the processing core to be connected with a second storage block, and the second storage block is a storage block in which a second task expected to be executed by the processing core in the next synchronization cycle is stored.
Further, the task update instruction includes:
an identification of the processing core and an identification of the second memory block.
Further, after the sending of the task update instruction, the method further includes:
a synchronization request signal is transmitted.
Further, before the reading the task to be executed from the first storage block, the method further includes:
receiving a synchronization signal;
the reading of the first task to be executed from the first storage block includes:
and reading the first task to be executed from the first storage block according to the synchronous signal.
Further, after receiving the synchronization signal, the method further includes:
and the processing core determines the first task or the second task to be executed according to the number of the synchronous signals.
In a third aspect, an embodiment of the present disclosure provides a chip, including:
the data access control circuit of any one of the implementations of the first aspect;
a plurality of processing cores;
a memory comprising a plurality of storage blocks; and
a synchronization signal generation module.
Further, the synchronization signal generation module is configured to:
and sending a synchronization signal after receiving the synchronization request signals sent by the plurality of processing cores.
In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including: a memory for storing computer-readable instructions; and one or more processors configured to execute the computer-readable instructions such that, when the instructions are executed, the processors implement any of the data access control methods of the second aspect.
In a fifth aspect, an embodiment of the present disclosure provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute any of the data access control methods of the second aspect.
In a sixth aspect, an embodiment of the present disclosure provides a computer program product comprising computer instructions which, when executed by a computing device, cause the computing device to perform any of the data access control methods of the second aspect.
In a seventh aspect, an embodiment of the present disclosure provides a computing device, including one or more chips described in the third aspect.
Embodiments of the present disclosure provide a data access control circuit, a data access control method, an electronic device, and a computer-readable storage medium. The data access control circuit comprises: a connection circuit comprising a plurality of first interfaces and a plurality of second interfaces, wherein the plurality of first interfaces correspond one-to-one to a plurality of processing cores and the plurality of second interfaces correspond one-to-one to a plurality of storage blocks; and a control circuit configured to control, based on a received task update instruction, the connection relationship between the first interfaces and the second interfaces in the connection circuit so as to connect a processing core to its corresponding storage block, wherein the task update instruction is sent by the processing core. Because the data access control circuit controls the connection circuit through the task update instruction, the processing cores are connected to the storage blocks in one-to-one correspondence, which solves the prior-art technical problem that using a shared memory wastes either storage space or processing-core computing power.
The foregoing is only a summary of the technical solutions of the present disclosure, given to promote a clear understanding of its technical means. The present disclosure may be embodied in other specific forms without departing from its spirit or essential attributes.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
Fig. 1 is a schematic diagram of the prior art;
Fig. 2 is a schematic structural diagram of a data access control circuit provided in an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of an operation process of a data access control circuit according to an embodiment of the present disclosure;
Fig. 4 is a flowchart of a data access control method provided by an embodiment of the present disclosure;
Fig. 5a is a diagram of a neural network architecture in an embodiment of the present disclosure;
Fig. 5b is a diagram illustrating the structure of a chip for performing neural network tasks in an embodiment of the present disclosure;
Fig. 5c is a timing diagram of a chip executing tasks according to an embodiment of the present disclosure;
Fig. 5d shows the connection state of the connection circuit when the first synchronization signal is received in an embodiment of the present disclosure;
Fig. 5e shows the connection state of the connection circuit when the second synchronization signal is received in an embodiment of the present disclosure;
Fig. 5f shows the connection state of the connection circuit when the third synchronization signal is received in an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that the modifiers "a", "an", and "the" in this disclosure are intended to be illustrative rather than limiting, and those skilled in the art will understand that they should be read as "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Fig. 2 is a schematic diagram of a data access control circuit provided in an embodiment of the present disclosure. The data access control circuit 200 provided in this embodiment is used in a chip including a plurality of processing cores, where the data access control circuit 200 includes:
a connection circuit 201, the connection circuit 201 comprising a plurality of first interfaces 203 and a plurality of second interfaces 204; the plurality of first interfaces are in one-to-one correspondence with the plurality of processing cores, and the plurality of second interfaces are in one-to-one correspondence with the plurality of storage blocks. As shown in Fig. 2, each of the processing cores C1, C2, ..., CN is connected to one first interface, and each of the storage blocks M1, M2, ..., MK is connected to one second interface.
A control circuit 202, configured to control the connection relationship between the first interfaces and the second interfaces in the connection circuit based on a received task update instruction, so as to connect a processing core to its corresponding storage block, where the task update instruction is sent by the processing core. A processing core sends a task update instruction Refresh to the control circuit 202; after receiving it, the control circuit 202 parses and executes the instruction and sends a control signal C_F to the connection circuit 201 to control the connection relationship between the first interfaces and the second interfaces, so that the processing core is connected to the corresponding storage block. Optionally, the processing cores and the storage blocks are connected in one-to-one correspondence, that is, each storage block is connected to only one processing core in each synchronization cycle, so that the processing cores can read the tasks stored in the storage blocks in parallel.
Optionally, the processing cores and the control circuit 202 are connected to a synchronization signal generation module SG, which sends a synchronization signal Sync to the processing cores and the control circuit so that they can operate in step with each other. Specifically, each time a processing core receives a synchronization signal from the SG, it reads the task to be executed in the current synchronization cycle from the storage block connected to it for that cycle; after each synchronization signal from the SG, the control circuit receives the task update instructions sent by the processing cores.
Optionally, in an initialization stage the control circuit 202 controls the connection relationship between the first interfaces and the second interfaces according to a preset configuration. The preset configuration is specified in an initialization program; for example, it specifies in advance which processing cores need to execute tasks in the first synchronization cycle and which storage blocks store those tasks. In the initialization stage, the first interfaces and second interfaces in the connection circuit are connected according to the preset configuration, so that when the first synchronization signal arrives, each processing core that has a task to execute in the first synchronization cycle can read it from the corresponding storage block.
Optionally, the connection circuit 201 includes a switch array arranged between the plurality of first interfaces and the plurality of second interfaces, so that each first interface can be connected to or disconnected from the corresponding second interface. The switch array comprises a plurality of switches, and each first interface is connected to the second interfaces through a switch element. Illustratively, if there are N first interfaces and K second interfaces, N switches are provided, each with K+1 states: K of the states each indicate connection to one of the K second interfaces, and the remaining state indicates disconnection from every second interface.
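The switch array can be modeled in software as a list of N selectors. The following is a minimal sketch, not the patented circuit: the class name `SwitchArray`, its methods, and the use of `None` for the disconnected state are assumptions made for illustration, and the one-to-one check reflects the optional one-core-per-block rule described earlier.

```python
from typing import List, Optional

DISCONNECTED = None  # the extra (K+1)-th state: connected to no second interface


class SwitchArray:
    """N switches, one per first interface; each selects one of K second
    interfaces or the disconnected state."""

    def __init__(self, n_first: int, k_second: int) -> None:
        self.n_first = n_first
        self.k_second = k_second
        # Every switch starts in the disconnected state.
        self.state: List[Optional[int]] = [DISCONNECTED] * n_first

    def states_per_switch(self) -> int:
        return self.k_second + 1

    def connect(self, first: int, second: int) -> None:
        if not 0 <= first < self.n_first:
            raise IndexError("unknown first interface")
        if not 0 <= second < self.k_second:
            raise IndexError("unknown second interface")
        # Optional one-to-one rule: a storage block serves one core at a time.
        if second in self.state and self.state[first] != second:
            raise ValueError("second interface already in use")
        self.state[first] = second

    def disconnect(self, first: int) -> None:
        self.state[first] = DISCONNECTED
```

With N = 3 and K = 4, for instance, each of the three switches has five states, and an attempt to attach a second core to a block that is already in use is rejected.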
Fig. 3 is a schematic diagram of the operation of the data access control circuit in an embodiment of the present disclosure. As shown in Fig. 3, this example includes a synchronization signal generation module connected to each processing core and to the data access control circuit. Each processing core executes different tasks in the time periods determined by the synchronization signal, where a task may be a complete task or a part of a complete task, such as the computation of one layer in a neural network task. The tasks are stored in different storage blocks of the SM; the tasks stored in the storage blocks differ from one another, and each storage block can be accessed independently. Each processing core C1, C2, ..., CN can read the task in a given storage block during a given time period.
For example, suppose that in a certain synchronization cycle all processing cores have tasks to execute. On receiving the synchronization signal sent by the SG, C1, C2, ..., CN each execute a task read instruction and read their tasks from the corresponding storage blocks through the first and second interfaces; a task includes, for example, the program to be executed and/or the parameters of that program. After reading its task, each processing core executes it and, once execution is complete, sends a task update instruction to the data access control circuit. The data access control circuit parses and executes the task update instruction to control the connection circuit, connecting each processing core to the storage block that stores the task it is to execute in the next cycle. As a result, when the next synchronization signal is received, each processing core can directly obtain the task for that synchronization cycle from the corresponding storage block.
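The read, execute, and update sequence of one synchronization cycle can be sketched as a small simulation. This is an illustrative model only: the function `run_cycle`, the dictionary representation of the connection circuit, and the choice to apply all rewiring at the end of the cycle are assumptions, since the source does not specify when the switches actually change state.

```python
from typing import Dict, List


def run_cycle(blocks: Dict[str, str],
              connection: Dict[str, str],
              requested: Dict[str, str]) -> List[str]:
    """Simulate one synchronization cycle.

    blocks:     storage block id -> task stored in it
    connection: core id -> block id currently connected (updated in place)
    requested:  core id -> block id named in that core's task update instruction
    """
    # 1. Sync arrives: every connected core reads the task from its own block.
    fetched = {core: blocks[block] for core, block in connection.items()}
    # 2. Each core executes its task (represented here by a log entry).
    log = [f"{core} ran {task}" for core, task in sorted(fetched.items())]
    # 3. Cores send task update instructions; the control circuit rewires
    #    the connection circuit for the next cycle.
    if len(set(requested.values())) != len(requested):
        raise ValueError("two cores requested the same storage block")
    connection.clear()
    connection.update(requested)
    return log
```

A core that sends no task update instruction simply has no connection in the next cycle, which matches the case of a core with nothing to execute.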
The above embodiments disclose a data access control circuit comprising a connection circuit and a control circuit. The connection circuit comprises a plurality of first interfaces and a plurality of second interfaces; the plurality of first interfaces correspond one-to-one to the plurality of processing cores, and the plurality of second interfaces correspond one-to-one to the plurality of storage blocks. The control circuit is configured to control, based on a received task update instruction, the connection relationship between the first interfaces and the second interfaces in the connection circuit so as to connect a processing core to its corresponding storage block, wherein the task update instruction is sent by the processing core. Because the data access control circuit controls the connection circuit through the task update instruction, the processing cores are connected to the storage blocks in one-to-one correspondence, which solves the prior-art technical problem that using a shared memory wastes either storage space or processing-core computing power.
Fig. 4 is a flowchart of a data access control method according to an embodiment of the present disclosure. As shown in fig. 4, the method includes the steps of:
step S401, a processing core reads a first task to be executed from a first storage block;
for example, in a certain time period a processing core has a task to execute; when that time period begins, the processing core reads the first task to be executed from the first storage block corresponding to it. The connection between the processing core and the first storage block was set up in the previous time period, so that when the time period begins the processing core can read the first task from the first storage block directly.
Step S402, executing the first task;
in this step, the processing core executes the first task. For example, the first task is the computation of a certain layer in a neural network task; the processing core obtains the input data required by that layer and processes it according to the first task to obtain output data. In the present disclosure, the input data is a part of the complete input data and depends on the processing core executing the task: each processing core that executes the task processes different input data. The input data is obtained from an external memory; the external memory address of the input data to be processed by each processing core is set in the initialization stage of the processing core, and the processing core fetches the input data from that address when it executes the task. It is understood that the input data may be acquired in any known manner, which is not described further here.
Step S403, sending a task update instruction, where the task update instruction is used to instruct the processing core to connect to a second storage block, and the second storage block is a storage block in which a second task expected to be executed by the processing core in a next synchronization cycle is stored.
After the first task is executed, the processing core sends a task update instruction Refresh to the data access control circuit. Illustratively, the task update instruction includes an identifier of a processing core that sends the task update instruction and an identifier of the second storage block, and the data access control circuit determines, according to the identifier of the processing core and the identifier of the second storage block, that the processing core needs to read a task to be executed from the second storage block in a next synchronization cycle, so as to connect the corresponding first interface and the second interface.
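How the two identifiers in a task update instruction could be decoded into a connection can be sketched as follows. This is a hedged illustration: the `Refresh` dataclass, the `ControlCircuit` class, and the returned index pair standing in for the control signal C_F are hypothetical constructs, not an encoding given in the source.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Refresh:
    core_id: str   # identifier of the processing core that sent the instruction
    block_id: str  # identifier of the second storage block


class ControlCircuit:
    def __init__(self, core_ids: List[str], block_ids: List[str]) -> None:
        # One first interface per core and one second interface per block.
        self.first_if = {c: i for i, c in enumerate(core_ids)}
        self.second_if = {b: j for j, b in enumerate(block_ids)}
        self.links: Dict[int, int] = {}  # first interface -> second interface

    def handle(self, instr: Refresh) -> Tuple[int, int]:
        """Decode the identifiers and record the requested connection.

        The returned pair stands in for the control signal C_F."""
        first = self.first_if[instr.core_id]
        second = self.second_if[instr.block_id]
        self.links[first] = second
        return first, second
```

For example, with cores C1 and C2 and blocks M1 to M3, a `Refresh("C2", "M3")` instruction resolves to first interface 1 and second interface 2.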
Optionally, each processing core can determine, through a pre-configured program, whether it has a task to execute in the next synchronization cycle and, if so, which storage block stores that task. Illustratively, in an initialization stage, the identifiers of the storage blocks holding each task that the processing core will execute are arranged in execution order and stored in the processing core; after finishing the current task, the processing core directly obtains the identifier of the storage block of the next task and sends it out in a task update instruction.
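The ordered list of storage block identifiers kept inside a core can be sketched as a simple cursor over a schedule. The class `CoreSchedule` and the convention that `None` marks a cycle without a task are assumptions introduced for this example.

```python
from typing import List, Optional


class CoreSchedule:
    """Block identifiers for one core, stored in execution order at
    initialization. A None entry marks a cycle with no task (assumption)."""

    def __init__(self, block_ids: List[Optional[str]]) -> None:
        self.block_ids = block_ids
        self.cycle = 0  # index of the current synchronization cycle

    def next_block(self) -> Optional[str]:
        """Advance to the next cycle and return the block id to place in the
        task update instruction, or None if there is nothing to request."""
        self.cycle += 1
        if self.cycle < len(self.block_ids):
            return self.block_ids[self.cycle]
        return None
```

A core whose schedule is `["M1", "M2", None, "M3"]` would request M2 after its first task, request nothing after its second, and request M3 during the idle third cycle.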
Optionally, in each synchronization cycle, the processing cores can be classified into the following types: a first processing core, which has a task to execute in the current synchronization cycle and also in the next synchronization cycle; a second processing core, which has a task to execute in the current synchronization cycle but none in the next; a third processing core, which has no task to execute in the current synchronization cycle but has one in the next; and a fourth processing core, which has no task to execute in either the current or the next synchronization cycle.
A first processing core sends a task update instruction after executing its task in the current synchronization cycle. A second processing core does not send a task update instruction after executing its task in the current synchronization cycle. A third processing core sends a task update instruction directly in the current synchronization cycle. A fourth processing core may either stay idle or perform other work during the current synchronization cycle.
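The four core types and their behavior reduce to a small truth table over two booleans. The sketch below encodes it; the enum `CoreKind` and the helper names are hypothetical.

```python
from enum import Enum


class CoreKind(Enum):
    FIRST = 1   # task in the current cycle and in the next cycle
    SECOND = 2  # task in the current cycle, none in the next
    THIRD = 3   # no task in the current cycle, task in the next
    FOURTH = 4  # no task in either cycle


def classify(has_task_now: bool, has_task_next: bool) -> CoreKind:
    if has_task_now:
        return CoreKind.FIRST if has_task_next else CoreKind.SECOND
    return CoreKind.THIRD if has_task_next else CoreKind.FOURTH


def sends_task_update(kind: CoreKind) -> bool:
    # Only cores with a task in the next cycle need a new connection.
    return kind in (CoreKind.FIRST, CoreKind.THIRD)


def executes_before_update(kind: CoreKind) -> bool:
    # A first-type core runs its current task before sending the update;
    # a third-type core has nothing to run and sends it directly.
    return kind is CoreKind.FIRST
```

Seen this way, whether a core sends a task update instruction depends only on whether it has a task in the next cycle, and the current cycle only determines when the instruction is sent.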
Optionally, after sending the task update instruction, the processing core also sends a synchronization request signal. In this optional embodiment, the synchronization signal generation module sends synchronization signals to every processing core and to the data access control circuit. Through pre-configuration in the initialization stage, the synchronization signal generation module knows which processing cores are to execute tasks in each synchronization cycle; in each subsequent synchronization cycle, once it has received a synchronization request signal from every processing core that is to execute a task in that cycle, it generates a new synchronization signal and sends it to all processing cores and to the data access control circuit.
In this optional embodiment, the processing cores that send the synchronization request signal are the first and second processing cores, that is, the processing cores that have a task to execute in the current synchronization cycle. By sending the synchronization request signal, a processing core indicates that it has finished the task it was to execute in the current synchronization cycle. When the synchronization signal generation module has received the synchronization request signal from every processing core that is to execute a task in the current cycle, all of those tasks have been completed, and the module then sends a new synchronization signal.
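The synchronization signal generation module behaves like a barrier over the cores that are active in the current cycle. A minimal sketch follows, assuming the per-cycle set of active cores is supplied at initialization; the class `SyncSignalGenerator` and its `request` method are illustrative names, not part of the source.

```python
from typing import List, Set


class SyncSignalGenerator:
    def __init__(self, active_cores_per_cycle: List[Set[str]]) -> None:
        # Pre-configured at initialization: which cores run a task in each cycle.
        self.plan = active_cores_per_cycle
        self.cycle = 0
        self.requests: Set[str] = set()

    def request(self, core_id: str) -> bool:
        """Record a synchronization request signal from one core.

        Returns True when every active core of the current cycle has
        requested, meaning a new synchronization signal is issued."""
        self.requests.add(core_id)
        if self.cycle < len(self.plan) and self.requests >= self.plan[self.cycle]:
            self.cycle += 1
            self.requests = set()
            return True
        return False
```

In this model the new synchronization signal is issued as soon as the last active core reports completion, so the length of each cycle is set by the slowest core of that cycle.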
Optionally, before the step S401, the method further includes: receiving a synchronization signal; the reading of the first task to be executed from the first storage block includes: and reading the first task to be executed from the first storage block according to the synchronous signal. In this alternative embodiment, the operations of the respective processing cores are synchronized using a synchronization signal, and each time the synchronization signal is received, the processing core executes a read instruction to read a task to be executed in a synchronization cycle determined by the synchronization signal from a memory block connected thereto.
Optionally, after receiving the synchronization signal, the method further includes: and the processing core determines the first task or the second task to be executed according to the number of the synchronous signals.
Optionally, when the storage block stores the parameters of the task, the processing core, after receiving the synchronization signal, determines the first task to be executed from the number of synchronization signals received. In this case, the complete task is stored in the storage space of each processing core, and the first task to be executed in the current synchronization cycle can be determined from a preset configuration. The processing core obtains the first task directly from its own storage space, reads the parameters for executing the first task from the storage block, and then executes the first task. After the first task is executed, the processing core determines the second task from the number of synchronization signals received, so as to send the task update instruction.
Optionally, when the storage block stores both the tasks and the parameters, the processing core, after receiving the synchronization signal, obtains the first task and its parameters directly from the storage block and then executes the first task. In this case, after the first task is executed, the second task still needs to be determined from the number of synchronization signals received, so as to send the task update instruction.
Optionally, when the storage block stores the task, the processing core, after receiving the synchronization signal, determines the first task to be executed from the number of synchronization signals received. In this case, the parameters of the complete task are stored in the storage space of each processing core, and the parameters of the first task to be executed in the current synchronization cycle can be determined from a preset configuration. The processing core obtains the parameters of the first task directly from its own storage space, reads the first task from the storage block, and then executes the first task. After the first task is executed, the processing core determines the second task from the number of synchronization signals received and sends the task update instruction.
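As a minimal sketch of how a processing core might derive the first and second tasks from the number of synchronization signals, the following assumes the preset configuration is a per-core list whose n-th entry names the storage block used in the n-th synchronization cycle. The list layout, the function name `tasks_for_cycle`, and the use of `None` for an idle cycle are assumptions made for illustration and are not specified by the disclosure.

```python
def tasks_for_cycle(config, sync_count):
    """Return (current_block, next_block) for the cycle opened by the
    sync_count-th synchronization signal (sync_count starts at 1).

    config is a hypothetical preset configuration for one core: entry n-1 is
    the storage block holding the task for the n-th cycle, or None if idle.
    """
    current = config[sync_count - 1] if 1 <= sync_count <= len(config) else None
    # The block for the following cycle is what the task update instruction names.
    nxt = config[sync_count] if 0 <= sync_count < len(config) else None
    return current, nxt


# Example configurations (assumed values for a two-core, two-block chip).
C1_CONFIG = ["M1", "M2", "M1"]
C2_CONFIG = [None, "M1", "M2"]
```

For instance, `tasks_for_cycle(C2_CONFIG, 1)` yields `(None, "M1")`: the core is idle in the first cycle but already knows which storage block to request for the second.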
Although the steps in the above method embodiments are described in the above order, it should be clear to those skilled in the art that the steps in the embodiments of the present disclosure are not necessarily performed in that order, and may also be performed in other orders, such as in reverse, in parallel, or interleaved. Furthermore, those skilled in the art may add other steps on the basis of the above steps. These obvious modifications or equivalents should also be included in the protection scope of the present disclosure and are not described herein again.
The embodiment of the disclosure also provides a chip, which comprises the data access control circuit in the embodiment; a plurality of processing cores; a memory comprising a plurality of memory blocks; and a synchronization signal generation module. The overall structure of the chip refers to the structure shown in fig. 3. The synchronization signal generation module is used for sending synchronization signals after receiving the synchronization request signals sent by the plurality of processing cores. The working process of each part in the chip may refer to the description in the above embodiments of the data access control circuit and the data access control method, and is not described herein again.
The operation of the data access control circuit in the embodiment of the present disclosure is described below using a practical application scenario. Fig. 5a shows the task to be executed by the processing cores in the application scenario: a neural network comprising two layers, Layer1 and Layer2, where the program and parameters of each layer of the neural network are stored in a storage block in the shared memory SM. The structure of the chip that performs the neural network computation task is shown in fig. 5b. The chip comprises two processing cores C1 and C2, a connection circuit comprising a control circuit and switches 1 and 2, a shared memory SM comprising two storage blocks M1 and M2, and a synchronization signal generation module SG connected to the two processing cores and the data access control circuit. The storage block M1 stores the task and parameters corresponding to Layer1, and the storage block M2 stores the task and parameters corresponding to Layer2. Each of the processing cores C1 and C2 can perform the complete neural network task, that is, it executes the tasks corresponding to Layer1 and Layer2 in a time-shared manner and accesses the corresponding storage blocks M1 and M2 in a time-shared manner. Optionally, each time period is a synchronization cycle delimited by the synchronization signal Sync generated by the synchronization signal generation module SG.
Fig. 5c is a task execution timing diagram for the application scenario. As shown in fig. 5c, the first synchronization cycle t1 begins when the first synchronization signal is received. In t1, the processing core C1 reads the storage block M1, which stores the task corresponding to Layer1, to obtain the first task and the parameters corresponding to Layer1; C1 then obtains the corresponding input data and executes the first task. It can be understood that the input data of C1 in t1 is a part of the input data of the neural network. After the first task is executed, C1 sends a task update instruction to the control circuit F_ctrl in the data access control circuit according to the number of synchronization signals received so far. The task update instruction includes the ID of the processing core C1 and the identifier M2 of the storage block holding the second task, which C1 is to execute in the next synchronization cycle; the second task is the task corresponding to Layer2. C1 then enters a state of waiting for the synchronization signal. After receiving the task update instruction, the control circuit parses and executes it to connect the processing core C1 to the storage block M2.
In t1, C2 does not execute a task. However, from the number of synchronization signals received, C2 determines that it is to execute the first task in the next synchronization cycle. Therefore, in t1, C2 directly sends a task update instruction to the control circuit in the data access control circuit, and then enters a state of waiting for the synchronization signal. After receiving the task update instruction, the control circuit parses and executes it to connect the processing core C2 to the storage block M1.
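The handling of task update instructions by the control circuit can be illustrated with a small software model. The sketch below makes one assumption that the text does not state explicitly: requested connections are latched and only take effect when the next synchronization signal arrives, so that a core still reading its current storage block is not disturbed. The names `TaskUpdate`, `ControlCircuit`, `receive`, and `on_sync` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskUpdate:
    """Task update instruction: ID of the core and ID of the target storage block."""
    core_id: str
    block_id: str


class ControlCircuit:
    """Hypothetical model of the control circuit driving the switch array."""

    def __init__(self, initial):
        self.connected = dict(initial)  # core ID -> storage block ID (preset configuration)
        self.pending = {}               # connections requested for the next cycle

    def receive(self, instr):
        # Parse the instruction and latch the requested connection (assumed deferral).
        self.pending[instr.core_id] = instr.block_id

    def on_sync(self):
        """Apply latched connections when a synchronization signal arrives."""
        new_state = {**self.connected, **self.pending}
        blocks = list(new_state.values())
        if len(blocks) != len(set(blocks)):
            raise ValueError("two cores routed to the same storage block")
        self.connected = new_state
        self.pending = {}
        return dict(self.connected)
```

Starting from `{"C1": "M1"}`, receiving `TaskUpdate("C1", "M2")` and `TaskUpdate("C2", "M1")` during t1 and then calling `on_sync()` gives the connection state C1 to M2 and C2 to M1 at the start of t2.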
Fig. 5d shows the connection state of the first interfaces and the second interfaces in the data access control circuit when the first synchronization signal is received; this connection state allows the processing core C1 to read the first task and parameters directly from M1 in t1.
Fig. 5e shows the connection state of the first interfaces and the second interfaces in the data access control circuit when the second synchronization signal is received. In t1, the data access control circuit connects C1 to M2 according to the task update instruction from C1, and connects C2 to M1 according to the task update instruction from C2. After the second synchronization signal is received, in the synchronization cycle t2, C1 reads the second task and the parameters corresponding to Layer2 from M2, and C2 reads the first task and the parameters corresponding to Layer1 from M1; C1 and C2 then obtain their respective input data and execute their respective tasks. It can be understood that, in t2, the input data of the second task of C1 is the output data produced when C1 executed the first task, and the input data of the first task of C2 is a part of the raw input data of the neural network.
After C1 completes the second task, it sends a task update instruction to the control circuit, causing the control circuit to connect C1 to M1; after C2 completes the first task, it sends a task update instruction to the control circuit, causing the control circuit to connect C2 to M2. Fig. 5f shows the connection state of the first interfaces and the second interfaces in the data access control circuit when the third synchronization signal is received. The subsequent steps are similar to those above: the control circuit keeps changing the connection state of the connection circuit according to the task update instructions sent by C1 and/or C2, so that C1 and/or C2 reads the task to be executed from the corresponding storage block in the next synchronization cycle, until the complete task has been executed.
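The time-shared execution in this scenario can be sketched as a toy simulation. The layer functions, parameter values, input values, and the per-core `SCHEDULE` lists below are invented purely for illustration; only the pattern (one copy of each layer held in M1 and M2, with C1 and C2 visiting the blocks in different cycles) follows the description above.

```python
def layer1(x, w):
    return x * w   # toy stand-in for the Layer1 program


def layer2(x, b):
    return x + b   # toy stand-in for the Layer2 program


# Each storage block holds one (task, parameter) pair; values are assumed.
BLOCKS = {"M1": (layer1, 2), "M2": (layer2, 3)}

# Block each core is connected to in cycles t1, t2, t3 (None = no task).
SCHEDULE = {"C1": ["M1", "M2"], "C2": [None, "M1", "M2"]}


def run_time_shared(blocks, schedule, inputs):
    """Run every core through its schedule, one synchronization cycle at a time."""
    data = dict(inputs)
    log = []
    cycles = max(len(plan) for plan in schedule.values())
    for t in range(cycles):
        active = {core: plan[t] for core, plan in schedule.items()
                  if t < len(plan) and plan[t] is not None}
        # One-to-one switching: a storage block serves at most one core per cycle.
        if len(set(active.values())) != len(active):
            raise ValueError(f"storage block shared by two cores in cycle t{t + 1}")
        for core in sorted(active):
            task, param = blocks[active[core]]
            data[core] = task(data[core], param)
            log.append((t + 1, core, active[core], data[core]))
    return data, log
```

Running `run_time_shared(BLOCKS, SCHEDULE, {"C1": 1, "C2": 10})` has C1 use M1 in t1 and M2 in t2, while C2 uses M1 in t2 and M2 in t3, so no block is accessed by two cores in the same cycle.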
In the above embodiment, only one copy of each part of the complete task, such as the calculation program of each layer in the two-layer neural network, needs to be stored in the memory, which reduces the area of the shared memory and therefore the area of the chip. All the processing cores can run in parallel, so the computing power of each processing core can be fully used. A unified data access control circuit controls the access of the processing cores to the memory, and no control circuit needs to be designed for each processing core, which saves chip area and reduces the difficulty of circuit design.
An embodiment of the present disclosure provides an electronic device, including: a memory for storing computer-readable instructions; and one or more processors configured to execute the computer-readable instructions, such that, when the instructions are executed, the processors implement any of the data access control methods in the embodiments.
The present disclosure also provides a non-transitory computer-readable storage medium, which stores computer instructions for causing a computer to execute the data access control method in any one of the foregoing embodiments.
The embodiment of the present disclosure further provides a computer program product comprising computer instructions which, when executed by a computing device, may perform the data access control method of any of the preceding embodiments.
The embodiment of the present disclosure further provides a computing device, which includes the chip in any one of the embodiments.
The flowchart and block diagrams in the figures of the present disclosure illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

Claims (10)

1. A data access control circuit for use in a chip comprising a plurality of processing cores, wherein the data access control circuit comprises:
a connection circuit comprising a plurality of first interfaces and a plurality of second interfaces;
the first interfaces correspond to the processing cores one by one; the plurality of second interfaces correspond to the plurality of storage blocks one by one;
and the control circuit is used for controlling the connection relation of the first interface and the second interface based on a received task update instruction so as to connect the processing core and the corresponding storage block, wherein the task update instruction is sent by the processing core.
2. The data access control circuit of claim 1, wherein the control circuit is further to:
and when the control circuit is initialized, controlling the connection relation of the first interface and the second interface according to the preset configuration.
3. The data access control circuit of claim 1, wherein the connection circuit comprises:
the switch array is arranged between the plurality of first interfaces and the plurality of second interfaces and used for controlling the connection or disconnection of the first interfaces and the corresponding second interfaces.
4. A data access control method, comprising:
the processing core reads a first task to be executed from the first storage block;
executing the first task;
and sending a task updating instruction, wherein the task updating instruction is used for indicating the processing core to be connected with a second storage block, and the second storage block is a storage block in which a second task expected to be executed by the processing core in the next synchronization cycle is stored.
5. The data access control method of claim 4, wherein the task update instruction comprises:
an identification of the processing core and an identification of the second memory block.
6. The data access control method of claim 4, further comprising, after the sending a task update instruction:
a synchronization request signal is transmitted.
7. The data access control method of claim 4, wherein prior to said reading the task to be performed from the first memory block, further comprising:
receiving a synchronization signal;
the reading of the first task to be executed from the first storage block includes:
and reading the first task to be executed from the first storage block according to the synchronous signal.
8. The data access control method of claim 7, wherein after receiving the synchronization signal, further comprising:
and the processing core determines the first task or the second task to be executed according to the number of the synchronous signals.
9. A chip, comprising:
data access control circuitry as claimed in any one of claims 1 to 3;
a plurality of processing cores;
a memory comprising a plurality of memory blocks; and
a synchronization signal generation module.
10. The chip of claim 9, wherein the synchronization signal generation module is to:
and sending a synchronization signal after receiving the synchronization request signal sent by all the processing cores executing the tasks in the plurality of processing cores.
CN202010448553.9A 2020-05-25 2020-05-25 Data access control circuit, method, electronic device, and computer-readable storage medium Pending CN113722053A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010448553.9A CN113722053A (en) 2020-05-25 2020-05-25 Data access control circuit, method, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010448553.9A CN113722053A (en) 2020-05-25 2020-05-25 Data access control circuit, method, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN113722053A true CN113722053A (en) 2021-11-30

Family

ID=78671580

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010448553.9A Pending CN113722053A (en) 2020-05-25 2020-05-25 Data access control circuit, method, electronic device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN113722053A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116319610A (en) * 2023-05-23 2023-06-23 南京芯驰半导体科技有限公司 Data transmission method, device, electronic equipment and storage medium
CN116319610B (en) * 2023-05-23 2023-08-29 南京芯驰半导体科技有限公司 Data transmission method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
EP0318221A2 (en) Controlling responding by users of an intercommunications bus
CN111190842B (en) Direct memory access, processor, electronic device, and data transfer method
KR102471141B1 (en) Programmable logic circuits, associated control devices and methods for controlling electrical facilities, in particular nuclear facilities
US20190044883A1 (en) NETWORK COMMUNICATION PRIORITIZATION BASED on AWARENESS of CRITICAL PATH of a JOB
CN112363972B (en) Electronic device and method for supporting communication among multiple CPUs
CN103703427B (en) Treating apparatus and the method for synchronous the first processing unit and the second processing unit
EP2759927B1 (en) Apparatus and method for sharing function logic between functional units, and reconfigurable processor thereof
CN113722053A (en) Data access control circuit, method, electronic device, and computer-readable storage medium
US20230067432A1 (en) Task allocation method, apparatus, electronic device, and computer-readable storage medium
CN110119375B (en) Control method for linking multiple scalar cores into single-core vector processing array
CN112925739B (en) Communication method applied to many-core chip, many-core chip and storage medium
CN115865701A (en) Node control method, device and system based on daisy chain network
CN114546926A (en) Core cluster synchronization, control method, data processing method, core, device, and medium
CN113556242A (en) Method and equipment for performing inter-node communication based on multi-processing nodes
CN109948785B (en) High-efficiency neural network circuit system and method
CN111767999A (en) Data processing method and device and related products
CN111209230A (en) Data processing device, method and related product
EP4120093A1 (en) Task allocation method and apparatus, and electronic device and computer-readable storage medium
CN115280272A (en) Data access circuit and method
CN113688090A (en) Data transmission method, processor system, readable storage medium and electronic device
CN111915014A (en) Artificial intelligence instruction processing method and device, board card, mainboard and electronic equipment
CN114281559A (en) Multi-core processor, synchronization method for multi-core processor and corresponding product
CN116339944A (en) Task processing method, chip, multi-chip module, electronic device and storage medium
EP1193605A2 (en) Apparatus and method for the transfer of signal groups between digital signal processors in a digital signal processing unit
JP2002222161A (en) Semiconductor device, and method of transferring data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination