CN112434800A - Control device and brain-like computing system - Google Patents

Info

Publication number
CN112434800A
Authority
CN
China
Prior art keywords
task
module
trigger
subtask
functional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011313163.7A
Other languages
Chinese (zh)
Other versions
CN112434800B (en)
Inventor
裴京
施路平
王冠睿
马骋
徐海峥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202011313163.7A (patent CN112434800B)
Priority to PCT/CN2020/137469 (patent WO2022104991A1)
Publication of CN112434800A
Application granted
Publication of CN112434800B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04 Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/06 Clock generators producing several clock signals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Neurology (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Power Sources (AREA)
  • Microcomputers (AREA)

Abstract

The present disclosure relates to a control device and a brain-like computing system. The device comprises: a first trigger module; a second trigger module; a multiplexer electrically connected to the first trigger module and the second trigger module; and a control module electrically connected to the multiplexer and configured to control the multiplexer to transmit any one of one or more first trigger signals and any one of one or more second trigger signals to one or more functional cores in a processor, so that the one or more functional cores execute subtasks of a task according to the received first trigger signal and the received second trigger signal. With this device, embodiments of the present disclosure can realize the division of independent tasks, speed up execution, reduce operation time, improve chip performance and reduce power consumption.

Description

Control device and brain-like computing system
Technical Field
The present disclosure relates to the field of artificial intelligence technology, and in particular, to a control device and a brain-like computing system.
Background
The explosive development of big data, information networks and intelligent mobile devices generates massive unstructured information, accompanied by a rapid increase in the demand for energy-efficient processing of such information. Traditional von Neumann architecture chips work in a synchronous, serial and centralized mode with bus communication. Their density has increased according to Moore's law, their feature size is expected to reach its physical limit within the next 10 to 15 years, and their development is bound to be fundamentally limited.
The many-core neuromorphic chip architecture differs from the traditional computer processing mode and, through distributed storage and parallel cooperative processing of information, has great advantages in processing some non-formalized problems. However, the traditional trigger mechanism of the many-core neuromorphic chip architecture is severely limited and cannot perform independent task division.
Disclosure of Invention
In view of this, the present disclosure proposes a control device, the device comprising:
a first trigger module configured to generate one or more first trigger signals, wherein each first trigger signal corresponds to a respective task;
a second trigger module electrically connected to the first trigger module and configured to generate one or more second trigger signals according to the first trigger signal, wherein the second trigger signals correspond to subtasks of the task;
a multiplexer electrically connected to the first trigger module and the second trigger module;
and a control module electrically connected to the multiplexer and configured to control the multiplexer to transmit any one of the one or more first trigger signals and any one of the one or more second trigger signals to one or more functional cores in a processor, so that the one or more functional cores execute the subtasks of the task according to the received first trigger signal and the received second trigger signal.
In a possible implementation manner, the apparatus further includes a first timing module electrically connected to the first triggering module, where the first timing module includes a first timing clock, and the first timing module is configured to start the first timing clock when receiving the first triggering signal, so as to time an execution cycle of the current task;
the first triggering module is further configured to determine, when the first timing clock reaches a first threshold, all subtasks of the current task have finished executing, and the current task is not the last task, that the condition for triggering the execution period of the next task is met, and to generate the first triggering signal corresponding to the execution period of the next task.
In a possible implementation manner, the first triggering module is further configured to generate a forced end signal when the first timing clock reaches a third threshold, where the forced end signal is used to forcibly end execution of each current sub-task in the current task, so as to forcibly end execution of the current task;
the control module is further configured to transmit the forced end signal to each functional core of the current task by using the multiplexer.
In a possible implementation manner, the apparatus further includes a second timing module electrically connected to the second triggering module, where the second timing module includes one or more second timing clocks, and the second timing module is configured to start the one or more second timing clocks when receiving the second triggering signal, so as to time an execution cycle of each sub-task of the current task,
the second triggering module is further configured to determine, when the second timing clock reaches a second threshold and the functional cores corresponding to the second triggering signals have all finished executing their current subtasks, that the condition for triggering the execution period of the next subtask of each current subtask in the current task is met, and to generate the second triggering signals corresponding to each next subtask.
In a possible implementation manner, the control module is further configured to receive an operation end signal output by each functional core corresponding to the second trigger signal, and generate a subtask end signal when each functional core outputs the operation end signal, so as to determine that all functional cores corresponding to the second trigger signal end execution of the current subtask;
the device further comprises:
and the first storage module is electrically connected to the control module and used for storing the subtask ending signal.
In a possible implementation manner, the control module is further configured to generate a task end signal to determine that all subtasks of the current task corresponding to the first trigger signal are completely finished executing, when the subtask end signals of all subtasks of the current task are stored in the first storage module;
the device further comprises:
and the second storage module is electrically connected to the control module and used for storing the task ending signal.
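The two levels of end signals described above can be sketched as follows. This is a minimal illustration, not the patented circuit: the function names, the dictionary standing in for the first storage module, and the core names are all assumptions made for the example.

```python
def subtask_end(operation_end: dict) -> bool:
    """Subtask end signal: every functional core of the second trigger
    signal has output its operation end signal."""
    return all(operation_end.values())


def task_end(first_storage: dict, subtask_ids: list) -> bool:
    """Task end signal: a subtask end signal is stored for every subtask
    of the current task (first_storage is a hypothetical stand-in for the
    first storage module)."""
    return all(first_storage.get(sub, False) for sub in subtask_ids)


# Hypothetical run: subtask 0 has finished, subtask 1 has not yet.
first_storage = {
    0: subtask_end({"C00": True, "C01": True}),
    1: subtask_end({"C02": False}),
}
before = task_end(first_storage, [0, 1])
first_storage[1] = subtask_end({"C02": True})
after = task_end(first_storage, [0, 1])
```

In this sketch the task end signal only becomes true once the last missing subtask end signal has been stored, which mirrors the order of the two storage modules in the text.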
In a possible implementation manner, the control module is further configured to allocate a function core to each task and a subtask of each task, number the function cores in the processor, number a first function core set allocated to each task, and number a second function core set corresponding to each subtask of each task.
In one possible implementation, the apparatus includes a plurality of phase group registers, a first selection register, a functional core register, a second selection register, wherein,
the phase group register is configured as a two-dimensional register with a size of s × n bits, where the first dimension represents the number of the current task, the second dimension represents the subtask numbers included in the current task, and s and n are positive integers;
the first selection register is configured as a two-dimensional register with a size of m × y bits, where the first dimension represents the number of the current functional core, the second dimension represents the number of the task to which the current functional core belongs, m and y are positive integers, and y = log2(s);
the functional core register is configured as a two-dimensional register with a size of n × m bits, where the first dimension represents a subtask number and the second dimension represents the functional cores included in the subtask;
the second selection register is configured as a two-dimensional register with a size of m × x bits, where the first dimension represents the number of the functional core, the second dimension represents the subtask number of the current functional core, and x = log2(n).
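To make the register dimensions concrete, the sketch below computes the four sizes for given s (tasks), n (subtasks) and m (functional cores). The function name and the example values are hypothetical. The text only states y = log2(s) and x = log2(n); rounding the width up when s or n is not a power of two, and using at least one bit, are assumptions of this sketch.

```python
def register_layout(s: int, n: int, m: int) -> dict:
    """Return (rows, bits per row) for each register described in the text.

    Assumption: field widths are rounded up to whole bits and are at
    least 1; the source only gives y = log2(s) and x = log2(n).
    """
    y = max(1, (s - 1).bit_length())  # bits needed to encode a task number
    x = max(1, (n - 1).bit_length())  # bits needed to encode a subtask number
    return {
        "phase_group": (s, n),      # task number -> subtasks of that task
        "first_select": (m, y),     # core number -> task number of the core
        "functional_core": (n, m),  # subtask number -> member cores
        "second_select": (m, x),    # core number -> subtask number of the core
    }


# Hypothetical configuration: 4 tasks, 8 subtasks, 25 functional cores.
layout = register_layout(s=4, n=8, m=25)
total_bits = sum(rows * width for rows, width in layout.values())
```

With these example values the two selection registers need 2 and 3 bits per core respectively, which is why they stay small compared with the one-bit-per-member phase group and functional core registers.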
In one possible implementation, the control module is further configured to,
releasing a functional core in the processor corresponding to the first trigger signal under the condition that a preset condition is met, wherein the preset condition comprises:
the current task is the last task and the current task has finished executing; or
execution of the current task is forcibly ended.
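The release condition above is a simple predicate over three flags. The sketch below states it directly; the function and parameter names are hypothetical.

```python
def should_release(is_last_task: bool, task_finished: bool, forced_end: bool) -> bool:
    """True when the functional cores bound to a first trigger signal may
    be released: the last task has finished, or the task was forcibly ended."""
    return (is_last_task and task_finished) or forced_end


# Hypothetical checks of the two branches of the preset condition.
release_after_last = should_release(True, True, False)
release_after_forced = should_release(False, False, True)
keep_for_next_task = should_release(False, True, False)
```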
According to another aspect of the present disclosure, there is provided a brain-like computing system, the system comprising the control device.
With this device, embodiments of the present disclosure can generate one or more first trigger signals according to the number of tasks to be executed, generate one or more second trigger signals, and control the corresponding functional cores to execute each subtask of the current task according to the first trigger signal and the one or more second trigger signals. This supports parallel or mixed operation of multiple asynchronous tasks. The current task is also divided into subtasks so that the functional cores that perform similar work within the task execute simultaneously. The two-level trigger mechanism therefore realizes the division of independent tasks, speeds up execution, reduces operation time and improves chip performance. Because the trigger signals control only the corresponding functional cores, the functional cores that are not selected can remain in a dormant state, which reduces power consumption. A task may be a network or an application task, such as a task for performing neural network operations (for example, a VGG network or a ResNet50 network) or a task for running application software.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a schematic diagram of a control device according to an embodiment of the present disclosure.
FIG. 2 illustrates a block diagram of processor functional cores according to an embodiment of the present disclosure.
Fig. 3 shows a schematic diagram of a trigger timing of a control device according to an embodiment of the present disclosure.
FIG. 4 shows a trigger timing diagram of a control device according to an embodiment of the present disclosure.
FIG. 5 shows a schematic diagram of a control device according to an embodiment of the present disclosure.
Fig. 6a, 6b show schematic diagrams of a control device according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The use of "first," "second," and similar terms in this disclosure is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. Also, the use of the terms "a," "an," or "the" and similar referents do not denote a limitation of quantity, but rather denote the presence of at least one. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
With the continuous development of the technical field of neural networks, massive unstructured information is generated, and the demand for high-energy-efficiency processing of the information is increased sharply. The many-core neuromorphic chip architecture is different from a traditional computer processing mode, and has great advantages in processing some non-formalized problems through distributed storage and parallel cooperative processing of information. However, the trigger mechanism in the conventional chip cannot perform independent network task division when multiple networks are executed, and has great limitation.
In order to further improve performance, the embodiments of the present disclosure provide a control device that controls the functional cores through two levels of triggering. The first-level trigger divides different networks or applications, and the second-level trigger divides similar computation tasks within a network or application and allocates them to the functional cores for operation. The performance of the chip is thereby effectively improved, and the control device has a high application value.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating a control device according to an embodiment of the disclosure.
As shown in fig. 1, the apparatus includes:
a first triggering module 10 for generating one or more first triggering signals, wherein each first triggering signal corresponds to each task;
the second triggering module 20, electrically connected to the first triggering module 10, is configured to generate one or more second triggering signals according to the first triggering signal, where the second triggering signals correspond to subtasks of the task;
a multiplexer 30 electrically connected to the first trigger module 10 and the second trigger module 20;
a control module 40, electrically connected to the multiplexer 30, configured to control the multiplexer 30 to transmit any one of the one or more first trigger signals and any one of the one or more second trigger signals to one or more functional cores in the processor 1, so that the one or more functional cores execute subtasks of a task according to the received first trigger signal and the received second trigger signal.
With this device, embodiments of the present disclosure can generate one or more first trigger signals according to the number of tasks to be executed, generate one or more second trigger signals, and control the corresponding functional cores to execute each subtask of the current task according to the first trigger signal and the one or more second trigger signals. This supports parallel or mixed operation of multiple asynchronous tasks. The current task is also divided into subtasks so that the functional cores that perform similar work within the task execute simultaneously. The two-level trigger mechanism therefore realizes the division of independent tasks, speeds up execution, reduces operation time and improves chip performance. Because the trigger signals control only the corresponding functional cores, the functional cores that are not selected can remain in a dormant state, which reduces power consumption. A task may be a network or an application task, such as a task for performing neural network operations (for example, a VGG network or a ResNet50 network) or a task for running application software.
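The relationship between the four modules can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `ControlDevice`, its method names and the task-to-core assignment are hypothetical, and real hardware would route signals rather than return Python sets.

```python
from dataclasses import dataclass, field


@dataclass
class ControlDevice:
    """Toy model of two-level triggering (all names are hypothetical)."""
    # task number -> subtask number -> set of functional core names
    assignment: dict = field(default_factory=dict)

    def first_trigger(self, task: int) -> list:
        """A first trigger signal for `task` leads to one second trigger
        signal per subtask of that task."""
        return [(task, sub) for sub in sorted(self.assignment[task])]

    def route(self, task: int, sub: int) -> set:
        """Multiplexer role: select the cores that receive this pair of
        first and second trigger signals."""
        return self.assignment[task][sub]

    def dormant(self, all_cores: set, task: int) -> set:
        """Cores not selected by any subtask of `task` can stay dormant."""
        active = set().union(*self.assignment[task].values())
        return all_cores - active


# Hypothetical assignment: task 0 has two subtasks, task 1 has one.
dev = ControlDevice({0: {0: {"C00", "C01"}, 1: {"C02"}}, 1: {0: {"C10"}}})
triggers = dev.first_trigger(0)
woken = [sorted(dev.route(task, sub)) for task, sub in triggers]
idle = dev.dormant({"C00", "C01", "C02", "C10"}, 0)
```

Here only the cores assigned to task 0 receive trigger signals, and the core assigned to task 1 remains idle, which corresponds to the power-saving behaviour described above.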
The control apparatus of the embodiment of the present disclosure may be used in a terminal or a server. The terminal, also referred to as a User Equipment (UE), a Mobile Station (MS), a Mobile Terminal (MT), and the like, is a device that provides voice and/or data connectivity to a user, for example, a handheld device or a vehicle-mounted device having a wireless connection function. Currently, some examples of terminals are: a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a Mobile Internet Device (MID), a wearable device, a Virtual Reality (VR) device, an Augmented Reality (AR) device, a wireless terminal in industrial control, a wireless terminal in self driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a wireless terminal in the Internet of Vehicles, and the like.
The modules of the embodiments of the present disclosure may be implemented by dedicated hardware circuits or by general hardware circuits. For example, the control module may include a controller capable of executing instructions, and may be implemented in any suitable manner, for example, by a microprocessor, a Central Processing Unit (CPU), or the control logic portion of a memory controller, including but not limited to the following types of chips: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. Within the processor 101, the executable instructions may be executed by hardware circuits such as logic gates, switches, Application Specific Integrated Circuits (ASICs), programmable logic controllers, and embedded microcontrollers.
Referring to fig. 2, fig. 2 is a block diagram of functional cores of a processor according to an embodiment of the disclosure.
In one possible implementation, the processor to which the apparatus of the embodiment of the present disclosure may be applied may be a many-core neuromorphic chip, as shown in fig. 2, and the processor may include a plurality of functional cores, and the functional cores of the processor may be grouped according to different types of tasks.
In one example, as shown in fig. 2, step_grp (beat timing group) and Phase_grp (phase timing group) each represent a set of functional cores. step_grp0, step_grp1 and step_grp2 represent the sets of functional cores corresponding to the respective first trigger signals; Phase_grp0, Phase_grp1, Phase_grp2, Phase_grp3, Phase_grp4 and Phase_grp5 represent the sets of functional cores corresponding to the respective second trigger signals; and C00 to C44 represent different functional cores. For example, Phase_grp0 corresponds to the set of functional cores C00, C01, C10 and C11, and the relationships between Phase_grp1, Phase_grp2, Phase_grp3, Phase_grp4, Phase_grp5 and their functional cores are shown in fig. 2 in the same way. step_grp0 corresponds to the three sets of functional cores Phase_grp0, Phase_grp1 and Phase_grp2, and the relationships between step_grp1, step_grp2 and Phase_grp3, Phase_grp4, Phase_grp5 are likewise shown in fig. 2. As shown in fig. 2, a Phase_grp may be included in one or more step_grps, and a functional core may be included in one or more Phase_grps. A functional core that belongs to more than one set, for example to sets of two different networks, can be used for data interaction across the networks and can also have other functions (for example, C13 is contained in Phase_grp1 and is also contained in Phase_grp3).
In one example, the same step_grp may be used to execute the same network task, each Phase_grp under the same step_grp may be used to execute the same or similar tasks under the same network task, and the functional cores under the same Phase_grp may be used to execute primitive operations.
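The grouping in fig. 2 can be written down as nested sets. In the sketch below, the members of Phase_grp0 and Phase_grp1 and the composition of step_grp0 come from the description; the members of Phase_grp2, the members of Phase_grp3 other than C13, and the assignment of Phase_grp3 to step_grp1 are hypothetical placeholders, since the text defers those details to the figure.

```python
from collections import Counter

PHASE_GROUPS = {
    "Phase_grp0": {"C00", "C01", "C10", "C11"},                # from the text
    "Phase_grp1": {"C02", "C03", "C12", "C13", "C22", "C23"},  # from the text
    "Phase_grp2": {"C04", "C14"},   # hypothetical members
    "Phase_grp3": {"C13", "C20"},   # C13 from the text, C20 hypothetical
}
STEP_GROUPS = {
    "step_grp0": ["Phase_grp0", "Phase_grp1", "Phase_grp2"],  # from the text
    "step_grp1": ["Phase_grp3"],                              # hypothetical
}


def shared_cores(phase_groups: dict) -> set:
    """Functional cores that belong to more than one phase timing group."""
    counts = Counter(core for cores in phase_groups.values() for core in cores)
    return {core for core, k in counts.items() if k > 1}


def cores_of_step(step: str) -> set:
    """All functional cores reachable through a beat timing group."""
    return set().union(*(PHASE_GROUPS[p] for p in STEP_GROUPS[step]))


multiplexed = shared_cores(PHASE_GROUPS)
step0_cores = cores_of_step("step_grp0")
```

A core such as C13 that appears in two phase timing groups is exactly the kind of shared core the description mentions for data interaction between groups.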
In one possible implementation manner, the control apparatus may be applied to a system on chip including a plurality of functional cores (or referred to as processor cores), where each first trigger signal corresponds to a part of the functional cores (i.e., a set of functional cores) in the system on chip, the first trigger signal may trigger the functional cores to operate and execute sub-tasks by generating second trigger signals, and each second trigger signal may correspond to one or more of the part of the functional cores. The corresponding relation between the first trigger signal, the second trigger signal and the functional core can be manually or automatically set according to requirements. The correspondence may be released after a complete task is performed (e.g., a neural network operation is completed) for subsequent reconfiguration.
In a possible implementation manner, the correspondence between the first trigger signal, the second trigger signal and the functional cores may be set again before a complete task is started. Specifically, if the required number of idle functional cores is available, the setting may be completed and execution of the complete task may start. If there are not enough idle functional cores, the device may wait until other tasks finish executing and release their functional cores; once the number of idle functional cores meets the requirement, the setting may be completed and execution of the complete task may start. Because the same functional core can correspond to different second trigger signals or first trigger signals, a multiplexing signal can be set, after the setting is finished, for each functional core that can be multiplexed. Such a functional core can then take part in operations as an idle functional core when other tasks are set, so that the functional cores are fully utilized and the performance and utilization of the system are improved.
In a possible implementation manner, the execution period of the current task may include the execution periods of one or more subtasks, that is, multiple periods of the second trigger signal may occur between two first trigger signals. Taking a neural network operation task as an example, for a neural network with a small amount of computation, one execution cycle of the first trigger signal may complete one operation of the entire neural network (the current task). Each cycle of the second trigger signal may then complete one link of the neural network operation, for example the operation of one network layer. The functional cores corresponding to each second trigger signal may perform similar operations (subtasks), for example addition or multiplication, in each link, and each functional core may be configured to perform a corresponding primitive operation. For example, still taking fig. 2 as an example, the four functional cores C00, C01, C10 and C11 corresponding to Phase_grp0 may each perform addition, the six functional cores C02, C03, C12, C13, C22 and C23 corresponding to Phase_grp1 may each perform multiplication, and so on, which is not limited by the present disclosure. For a neural network with a large amount of computation, one execution cycle of the first trigger signal may complete one link in one operation of the entire neural network, for example the operation of one network layer (the current task); that is, the current task may itself be a subtask of a higher-level task (the operation task of the entire neural network). In a possible implementation manner, after one link of the neural network operation task is completed, the correspondence between the first trigger signal, the second trigger signal and the functional cores may continue to be used, and the functional cores are released after the whole neural network task is completed.
In a possible implementation manner, the first trigger module may generate the first trigger signal according to an external input signal, or may generate the first trigger signal by itself. For example, the apparatus may further include an external trigger module 90 (as shown in fig. 5). For the initial first trigger signal, when a new task needs to be performed, the external trigger module 90 may generate and input an activation signal so that the first trigger module generates the first trigger signal, or the external trigger module 90 may directly generate the first trigger signal. During task execution, the first trigger module may generate a new first trigger signal when it determines that the task has not yet ended. This is described in detail below.
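The wait-until-enough-cores-are-idle behaviour can be sketched with a small pool. The class `CorePool` and its methods are hypothetical names for illustration; the multiplexing of shared cores mentioned above is deliberately left out to keep the example short.

```python
class CorePool:
    """Toy allocator: a task is configured only when enough cores are idle."""

    def __init__(self, cores):
        self.idle = set(cores)
        self.owner = {}  # core name -> task number currently using it

    def try_allocate(self, task: int, needed: int):
        """Return the allocated cores, or None if the task has to wait
        until other tasks release their functional cores."""
        if len(self.idle) < needed:
            return None
        chosen = set(sorted(self.idle)[:needed])
        self.idle -= chosen
        for core in chosen:
            self.owner[core] = task
        return chosen

    def release(self, task: int) -> set:
        """Free every core of a finished (or forcibly ended) task."""
        freed = {core for core, t in self.owner.items() if t == task}
        for core in freed:
            del self.owner[core]
        self.idle |= freed
        return freed


# Hypothetical scenario with four cores and two tasks.
pool = CorePool(["C00", "C01", "C10", "C11"])
first = pool.try_allocate(task=0, needed=3)
blocked = pool.try_allocate(task=1, needed=2)   # only one core is idle
pool.release(task=0)
second = pool.try_allocate(task=1, needed=2)    # succeeds after release
```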
Referring to fig. 3, fig. 3 is a schematic diagram illustrating a trigger timing of a control device according to an embodiment of the disclosure.
In fig. 3, the signals have the following meanings:
clk represents the reference clock, and each time clk is set to 1 marks one reference clock count;
step group0 and step group1 represent the functional core sets corresponding to the first trigger signals, which may be referred to as beat timing groups. Different first trigger signals may be used to trigger their respective functional core sets to perform different tasks; for example, one first trigger signal is used to perform the operation task of one neural network, and another first trigger signal is used to perform the operation task of another neural network, the operation task of an application software, and so on;
step_ck0 represents the first trigger signal corresponding to step group0, which may be referred to as the beat trigger signal of step group0, and step_ck0 being set to 1 indicates that a beat trigger signal is received;
step_ck1 represents the beat trigger signal corresponding to step group1, and step_ck1 being set to 1 indicates that a beat trigger signal is received;
p_grp0_ck and p_grp1_ck represent the two second trigger signals corresponding to step group0. The second trigger signals may be referred to as phase trigger signals, the functional core set corresponding to a second trigger signal may be referred to as a phase timing group, and a phase trigger signal being set to 1 indicates that a phase trigger signal is received;
p_grp2_ck, p_grp3_ck and p_grp4_ck represent the three phase trigger signals corresponding to step group1, and a phase trigger signal being set to 1 indicates that a phase trigger signal is received;
s_grp0_finish represents the signal that the functional core sets of step group0 have all finished executing, that is, that all subtasks of the current task (the task corresponding to step group0) have finished executing. It may be referred to as the beat end signal of step group0, and is set to 1 when all phase timing groups within step group0 have finished executing all of their subtasks, and to 0 otherwise;
s_grp1_finish represents the signal that the functional core sets of step group1 have all finished executing, which may be referred to as the beat end signal of step group1. It is set to 1 when all phase timing groups within step group1 have finished executing all of their subtasks, and to 0 otherwise.
As shown in fig. 3, in one possible implementation, for step group0, when step_ck0 is received, p_grp0_ck and p_grp1_ck are triggered, so that all phase timing groups in step group0 are triggered at the same time and a new phase timing group work cycle (which may be called an execution cycle) begins; the work cycles of different phase timing groups may be the same or different. A beat timing clock (which may be called a first timing clock) may be started after the beat trigger signal is received. When the count of the beat timing clock equals the beat end clock number, it may be checked whether all phase timing groups in the beat timing group have finished working, that is, whether s_grp0_finish has been set to 1. If so, either the next work cycle of the beat timing group is started (step_ck0 is then set to 1 automatically), or the work ends and all functional cores in the beat timing group are released. Otherwise, the next work cycle of the beat timing group is started, or the work is ended, only after all the work has finished. If the work has still not finished when the forced end clock number is reached, a forced end signal is generated, the execution of all phase timing groups under the beat timing group is forcibly ended, and the associated first timing clock and each second timing clock are reset and cleared. The same applies to step_ck1.
It should be noted that the number of work cycle clocks of the phase timing groups under the same beat timing group may be the same or different. Phase timing groups belonging to the same beat timing group are synchronous when they receive the phase trigger signal and trigger their respective first subtasks; the automatic triggering of subsequent subtasks after the first subtask has finished may be asynchronous. Phase timing groups belonging to different beat timing groups may be triggered asynchronously, and different beat timing groups may also be asynchronous with one another.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating a trigger timing of a control device according to an embodiment of the disclosure.
In one example, as shown in fig. 4, clk denotes the reference clock, and one pulse of clk denotes one reference clock count. s_ck denotes a first trigger signal, which may be referred to as a beat trigger signal, and s_ck being set to 1 indicates that a beat trigger signal is received. phase group0 and phase group1 denote the functional core sets corresponding to different second trigger signals; a functional core set corresponding to a second trigger signal may be referred to as a phase timing group. p_grp0_ck denotes the second trigger signal corresponding to phase group0, which may be referred to as a phase trigger signal, and p_grp0_ck being set to 1 indicates that the corresponding phase trigger signal is received. Likewise, p_grp1_ck denotes the phase trigger signal corresponding to phase group1, and p_grp1_ck being set to 1 indicates that the corresponding phase trigger signal is received. core0, core1 and core2 denote the three functional cores under phase group0, and core3 and core4 denote the two functional cores under phase group1. p_grp0_finish denotes the signal indicating that all functional cores of phase group0 have finished executing, and may be referred to as the phase end signal of phase group0; it is set to 1 if all functional cores in phase group0 have finished executing their operations and to 0 otherwise. p_grp1_finish denotes the signal indicating that all functional cores of phase group1 have finished executing, and may be referred to as the phase end signal of phase group1; it is set to 1 if all functional cores in phase group1 have finished executing their operations and to 0 otherwise.
The beat end signal of a beat timing group can be obtained by ANDing the phase end signals of the phase timing groups under it; that is, the AND of p_grp0_finish and p_grp1_finish can be taken as s_grp0_finish in fig. 4. p0, p1, p2 and p3 respectively indicate that the corresponding functional core is in the corresponding operation execution state.
In one possible implementation, as shown in fig. 4, for phase group0, when s_ck is received, p_grp0_ck is triggered synchronously, all functional cores in phase group0 are triggered simultaneously, and core0, core1 and core2 are controlled to start executing operations; the time taken by different functional cores to execute their operations may be the same or different. A phase timing clock (which may be referred to as a second timing clock) may be started when the functional cores start operating. When the count of the phase timing clock equals the phase end clock number, it may be checked whether all functional cores in the phase timing group have finished the current subtask. If so, the next phase timing group work cycle (which may be called an execution cycle) is triggered; otherwise, the next phase timing group work cycle is started after all operations of the current subtask have finished. When the functional cores in the phase timing group have finished executing all subtasks of the current task, p_grp0_finish is set to 1. When they have not, execution continues until either all subtasks have finished, after which the next task may be started, or a forced end signal of the beat timing group is received and all current subtasks are forcibly ended. The same applies to phase group1.
It should be noted that the number of clocks for executing operations by each functional core in the same phase time sequence group may be the same or different; functional cores belonging to the same phase timing group are synchronously triggered, and functional cores belonging to different phase timing groups can be asynchronously triggered.
The trigger timing of the control device has been described above by way of example; possible implementations of the control device are described below by way of example.
Referring to fig. 5, fig. 5 is a schematic diagram of a control device according to an embodiment of the disclosure.
In a possible implementation manner, as shown in fig. 5, the apparatus may further include a first timing module 50 electrically connected to the first triggering module 10, where the first timing module 50 may include a first timing clock, and the first timing module 50 is configured to start the first timing clock when receiving the first triggering signal, so as to time an execution cycle of the current task;
in a possible implementation manner, the first triggering module 10 may be further configured to, when the first timing clock reaches a first threshold, all subtasks of the current task have finished executing, and the current task is not the last task, determine that the condition for triggering the execution cycle of the next task is satisfied and generate a first trigger signal corresponding to the execution cycle of the next task.
In the embodiment of the disclosure, the current task has finished executing when all of its subtasks have finished executing. By checking whether all subtasks of the current task have finished and triggering the execution cycle of the next task once they have, idle time of the functional cores can be reduced as far as possible, each functional core is used to the greatest extent, and execution efficiency is improved.
In one example, the first threshold may be set in advance, and different first trigger signals may correspond to different first thresholds. For example, the first threshold (a number of clocks) may be input directly from outside, or a parameter related to the first threshold may be input, in which case the control module may perform a corresponding calculation on the obtained parameter to determine the first threshold.
In one possible implementation, as shown in fig. 5, the apparatus may include a register module 70.
In one example, the register module 70 may include a first threshold register to receive a first threshold value of an external input.
In an example, the first threshold may be a variable value that changes according to different tasks, and is received each time the execution cycle of the current task is triggered, or may be a preset fixed value, and only needs to be received once, and may be reused subsequently.
In one example, in the case that a condition for triggering an execution period of the next task is satisfied, the execution period of the next task may be triggered, that is, the next task starts to be executed, and the next task may be executed in the same manner as the current task.
In one example, the first timing clock may count once per reference clock cycle and, at each count, determine whether the first threshold has been reached.
By setting the first threshold, the embodiment of the disclosure can implement advance control and deployment of execution periods of different network applications, thereby implementing asynchronous independent operation of different tasks and accelerating operation speed.
In one example, the first timing clock may restart counting from 0 each time a new first trigger signal is generated.
By comparing the timing duration of the first timing clock with the first threshold, whether the condition for triggering the execution cycle of the next task is met or not is judged, and the control on the execution cycle of the current task can be realized, so that different tasks can be better scheduled, the running time is reduced, and the execution efficiency is improved.
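The trigger condition described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the function name and its parameters are hypothetical and do not appear in the disclosure, and "reaches the first threshold" is modeled as a greater-than-or-equal comparison so that a task that finishes late can still trigger the next cycle once it is done.

```python
def should_trigger_next_task(first_timer: int,
                             first_threshold: int,
                             all_subtasks_done: bool,
                             is_last_task: bool) -> bool:
    """Return True when a new first (beat) trigger signal may be generated.

    Hypothetical helper: the three conditions mirror the text
    (timer reached the first threshold, every subtask of the current
    task finished, and the current task is not the last one).
    """
    return (first_timer >= first_threshold
            and all_subtasks_done
            and not is_last_task)
```

If the timer has reached the threshold but some subtask is still running, the function keeps returning False until the subtasks finish, which matches the behaviour of waiting for the work to complete before starting the next beat cycle.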
In a possible implementation manner, the first triggering module 10 may be further configured to, when the first timing clock reaches a third threshold, generate a forced end signal, where the forced end signal is used to forcibly end execution of each current sub-task in the current task, so as to forcibly end execution of the current task;
in a possible implementation manner, the control module 40 may be further configured to transmit the forced end signal to each functional core of the current task by using the multiplexer 30, so that each functional core ends the operation.
In an example, the third threshold may be greater than the first threshold, may be a variable value that changes according to different tasks, and is received each time the execution cycle of the current task is triggered, or may be a preset fixed value, and only needs to be received once, and may be reused subsequently.
In one example, the register module 70 may include a third threshold register to store a third threshold, which may be read or written to obtain the third threshold or set the third threshold.
In one example, if the current task has still not finished executing when the third threshold is reached, a program error may have occurred in the current task and caused a deadlock. For example, suppose that for a certain stage of a neural network task the estimated number of clocks required to execute that stage is 500. If the first timing clock reaches 1000 clocks and the stage has still not finished, it is likely that a program error has occurred and the task is stuck in an infinite loop that cannot end (an abnormal state), so it must be forcibly ended. In this case, the third threshold may be set to 1000 clocks, and the task is forcibly ended if it has not finished when the third threshold is reached. If the current task is a subtask of a higher-level task, the entire higher-level task may be forcibly ended. After the execution of the entire task has been forcibly ended, the functional core set corresponding to the first trigger signal may be released, so that functional core resources are not occupied unnecessarily.
A forced end mechanism bounds the maximum number of clocks for which the current task can execute. This avoids infinite loops in which a task cannot end because of a program error or the like, prevents large amounts of functional core resources from being wasted, reduces power consumption, and improves running efficiency.
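The interplay between the first threshold and the third threshold can be illustrated with a small decision function. The sketch below is an assumption-laden model, not the disclosed circuit: `BeatAction` and `beat_action` are hypothetical names, the last-task check is omitted for brevity, and the values 500 and 1000 are simply the clock counts used in the example above.

```python
from enum import Enum


class BeatAction(Enum):
    WAIT = "wait"              # keep counting, nothing to do yet
    NEXT_CYCLE = "next_cycle"  # task finished in time: start the next beat cycle
    FORCE_END = "force_end"    # task overran the third threshold: abort it


def beat_action(first_timer: int, first_threshold: int,
                third_threshold: int, task_done: bool) -> BeatAction:
    """Decide what the first trigger module does at the current clock count."""
    if task_done and first_timer >= first_threshold:
        return BeatAction.NEXT_CYCLE
    if not task_done and first_timer >= third_threshold:
        return BeatAction.FORCE_END
    return BeatAction.WAIT
```

With a first threshold of 500 and a third threshold of 1000, a task that is still running at clock 700 simply waits, while one that is still running at clock 1000 is forcibly ended.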
In a possible implementation manner, as shown in fig. 5, the apparatus may further include a second timing module 60 electrically connected to the second triggering module 20, where the second timing module 60 may include one or more second timing clocks, and the second timing module is configured to start the one or more second timing clocks when receiving the second triggering signal, so as to time an execution cycle of each sub task of the current task,
in a possible implementation manner, the second triggering module 20 may be further configured to, when the second timing clock reaches a second threshold and the functional cores corresponding to the second trigger signal have all finished executing their current subtasks, determine that the condition for triggering the execution cycle of the next subtask of each current subtask in the current task is satisfied, and generate the second trigger signal corresponding to each next subtask.
According to the embodiment of the disclosure, when the functional core set corresponding to the current subtask has not yet finished executing, the execution cycle of the next subtask is triggered only after the functional core set has finished. This prevents the current task from getting stuck at a subtask because the next subtask was started before the current one completed, avoids the resulting errors, allows the deployment and scheduling of each task to proceed smoothly, and improves performance.
In one example, the second threshold may be preset, and the functional core sets corresponding to different second trigger signals may each correspond to their own second ending clock number.
In one example, the register module 70 may include a second threshold register to receive a second threshold value of the external input.
In an example, the second ending clock number may be a variable value that changes according to different subtasks, and may be received each time the execution period of the subtask under the current task is triggered, or may be a preset fixed value, and only needs to be received once, and may be reused subsequently.
The second threshold value is determined according to the preset second ending clock number, so that the execution periods of different subtasks can be controlled and deployed in advance, asynchronous independent operation of the different subtasks is realized, and the operation speed is accelerated.
In one example, when the condition for triggering the execution cycle of the next subtask of each current subtask is satisfied, the execution cycle of the next subtask of each current subtask is triggered, that is, each next subtask starts to be executed. The next subtask may be executed in the same manner as the current subtask, and so on, until all subtasks of the current task have been executed or a forced end signal is received and the execution of the subtasks is ended.
In one example, the second timing module 60 controls the second timing clock to start timing when the second trigger signal is received, and the second timing clock may determine at every clock count whether the second threshold has been reached. Each second trigger signal may correspond to one second timing clock, and different second timing clocks may correspond to different second thresholds; for example, after the functional core set corresponding to a second trigger signal receives that signal, the second timing clock corresponding to the functional core set may be started and compared with the corresponding second threshold. The execution cycles of the functional core sets may be the same or different. The second trigger signal received here includes a second trigger signal generated based on the first trigger signal and also a second trigger signal generated automatically when the above condition is satisfied, and the corresponding second timing clock may restart counting from 0 each time a new second trigger signal is generated.
By comparing the second timing clock with the second threshold value, whether the condition for triggering the execution period of the next subtask of each current subtask is met or not is judged, and the control of the execution period of each subtask can be realized, so that each subtask can be better scheduled.
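A per-phase-group second timing clock can be modeled as a small counter object. The sketch below is illustrative only: the class `PhaseTimer` and its methods are hypothetical names, one instance stands for the second timing clock of one phase timing group, and the end-of-execution status of the group's functional cores is passed in as a plain boolean.

```python
class PhaseTimer:
    """Hypothetical model of one second timing clock with its own second threshold."""

    def __init__(self, second_threshold: int) -> None:
        self.second_threshold = second_threshold
        self.count = 0

    def trigger(self) -> None:
        """A new second trigger signal restarts the count from 0."""
        self.count = 0

    def tick(self, cores_done: bool) -> bool:
        """Advance by one reference clock.

        Returns True when the next subtask is triggered, i.e. the count has
        reached the second threshold and all cores of the group have finished.
        """
        self.count += 1
        if self.count >= self.second_threshold and cores_done:
            self.trigger()
            return True
        return False
```

Creating one `PhaseTimer` per phase timing group, each with its own threshold, reproduces the point that different groups may have different execution cycles and advance asynchronously after their common first trigger.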
In a possible implementation manner, the control module 40 may be further configured to receive an operation end signal output by each functional core corresponding to the second trigger signal, and generate a subtask end signal when each functional core outputs the operation end signal, so as to determine that all functional cores corresponding to the second trigger signal end the execution of the current subtask;
in one possible implementation, the apparatus further includes a first storage module (not shown) electrically connected to the control module and configured to store the subtask end signal.
In one example, each functional core of the processor generates an operation end signal after completing execution of its own operation (e.g., primitive operations such as multiplication, addition, etc.), and outputs the operation end signal to the control module.
Referring to figs. 6a and 6b, figs. 6a and 6b are schematic diagrams illustrating a control device according to an embodiment of the disclosure.
As shown in figs. 6a and 6b, the processor may include, for example, m functional cores (cores). In one example, the control module is further configured to number each functional core of the processor so that each functional core is uniquely identified; for example, the functional cores of the processor may be assigned the identifiers core[0], core[1], …, core[m-2] and core[m-1].
In one example, while a functional core is performing a primitive operation, its operation end signal (e.g., core_finish[0], corresponding to functional core core[0]) may be at a low level (0), and when core[0] completes the primitive operation, core_finish[0] may be at a high level (1). In this case, the control module may determine the state of a functional core (idle or operating) by detecting its operation end signal. When the operation end signals of all functional cores in the functional core set corresponding to the second trigger signal are at a high level, the control module generates a subtask end signal to determine that the functional cores corresponding to the second trigger signal have all finished executing the current subtask.
In one example, the control module may perform an AND operation on the operation end signals of the respective functional cores to determine the subtask end signal corresponding to the second trigger signal.
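The AND reduction of operation end signals can be written out as a short Python sketch. The function name and the representation of a phase timing group as a list of core indices are assumptions made for illustration; treating an empty group as "not finished" is likewise a choice of this sketch, since the disclosure does not discuss that case.

```python
from typing import Sequence


def subtask_end_signal(core_finish: Sequence[int],
                       members: Sequence[int]) -> int:
    """AND together the operation end signals of the cores in one phase timing group.

    core_finish[i] is 1 when functional core i has finished its operation.
    members lists the core indices that belong to the group (hypothetical encoding).
    """
    if not members:
        return 0  # assumption: an empty group never reports completion
    return int(all(core_finish[i] == 1 for i in members))
```

For the five cores of fig. 4, with core2 still running, the group {core0, core1, core2} yields 0 while the group {core3, core4} yields 1.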
In one example, the control module may be further configured to allocate functional cores to each task, and may allocate idle functional cores to a task according to the operation requirements of that task. For example, a plurality of functional cores may be allocated to the current task corresponding to a first trigger signal to obtain a beat timing group. The one or more second trigger signals generated according to the first trigger signal correspond to one or more subtasks of the current task, and the control module further distributes the functional cores allocated to the current task among the subtasks according to the operation requirements of each subtask to obtain one or more phase timing groups. When there are a plurality of tasks (networks or applications), the control module may thus obtain a plurality of beat timing groups and a plurality of phase timing groups, and may number each beat timing group and each phase timing group. As shown in fig. 6a, assume that a certain beat timing group includes n phase timing groups (phase groups), numbered sequentially from phase_grp[0] to phase_grp[n-1]. Correspondingly, when the control module determines that the operation end signals of all functional cores in the phase timing group corresponding to a subtask are at a high level, it generates the subtask end signals phase_grp_finish[0] (corresponding to phase timing group phase_grp[0]) to phase_grp_finish[n-1] (corresponding to phase timing group phase_grp[n-1]).
In one example, as shown in fig. 6a, the register module 70 may further include a functional core register (core_en, e.g., core_en[0,0]). The functional core register is configured as a two-dimensional register with a size of m × n bits, in which the first dimension represents the functional core number and the second dimension represents the number of the subtask (phase timing group) to which the functional core belongs; for example, core_en[0,0] indicates that functional core core[0] belongs to phase timing group phase_grp[0]. Here n and m are positive integers and n ≤ m.
In one example, the first storage module may include a subtask end signal register (phase_grp_finish) for storing the subtask end signals. The subtask end signal register may be configured with n bits, one for each phase timing group (a task may include one or more subtasks, corresponding to one or more phase timing groups); the subtask end signal of a phase timing group is 1 when all functional cores in that group have completed their operations and 0 otherwise.
In a possible implementation manner, the control module 40 may be further configured to generate a task end signal to determine that all subtasks of the current task corresponding to the first trigger signal are completely finished executing, when the subtask end signals of all subtasks of the current task are stored in the first storage module.
In one possible implementation, the apparatus may further include a second storage module (not shown in fig. 5) electrically connected to the control module and configured to store the task end signal.
In one example, as shown in fig. 6b, the control module may number each beat timing group (step_grp); for example, assuming there are s tasks, the control module may establish s beat timing groups, numbered step_grp[0] to step_grp[s-1].
In one example, the register module 70 may include a phase group register (e.g., phase_group_en[0,0]). The phase group register is configured as a two-dimensional register with a size of s × n bits, in which the first dimension represents the subtask number (also called the phase timing group number) and the second dimension represents the current task number (also called the beat timing group number), where s and n are positive integers. For example, in phase_group_en[0,0] the first dimension represents the phase timing group phase_group[0] and the second dimension represents the beat timing group step_group[0], that is, the phase timing group phase_group[0] belongs to the beat timing group step_group[0].
In one example, the second storage module may include a task end signal register (step_grp_finish) for storing the task end signals of the respective beat timing groups. The task end signal register may be configured with s bits, corresponding respectively to the s beat timing groups; for example, step_grp_finish[1] stores the task end signal of the beat timing group step_grp[1].
In one example, the control module may read the subtask end signals in the subtask end signal register phase_grp_finish and perform an AND operation on the subtask end signals of the subtasks (phase timing groups) to obtain the task end signal of the current task (beat timing group). For example, when all phase timing groups in a beat timing group have ended, that is, the subtask end signals of all phase timing groups in that beat timing group are 1, the result of the AND operation is 1 and the task end signal of the task may be set to 1; otherwise, the task end signal of the task is 0.
Referring to figs. 6a and 6b together, each functional core may send an operation end signal to the control module (or the control module may obtain the operation end signal from each functional core). When the operation end signal of every functional core in a phase timing group is 1, the AND operation yields a subtask end signal of 1, from which the control module determines that the phase timing group has completed its operation task. If every subtask (phase timing group) of the current task (beat timing group) has completed its operation, the subtask end signal of each subtask is 1 and is stored in the subtask end signal register. The control module may then read the values stored in the subtask end signal register and AND them together; if the result is 1, it may determine that the current task is complete and set the task end signal register corresponding to the current task to 1, indicating that the current task has completed its operation.
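The second level of the reduction, from subtask end signals to a task end signal, can be sketched with integer bitmasks standing in for the registers. This is a hedged illustration: the function name is hypothetical, `phase_grp_finish` is modeled as an integer whose bit j holds the subtask end signal of phase timing group j, and `phase_mask` is an assumed encoding of which phase timing groups belong to the beat timing group.

```python
def task_end_signal(phase_grp_finish: int, phase_mask: int) -> int:
    """AND the subtask end signals of the phase groups in one beat timing group.

    Bit j of phase_grp_finish is the subtask end signal of phase group j.
    Bit j of phase_mask is 1 when phase group j belongs to the beat timing
    group being checked (assumed encoding, not taken from the disclosure).
    """
    if phase_mask == 0:
        return 0  # assumption: a beat timing group with no phase groups is never "done"
    return int((phase_grp_finish & phase_mask) == phase_mask)
```

Masking before comparing means that unfinished phase timing groups belonging to other beat timing groups do not affect the result, which is what allows different tasks to complete independently.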
In one example, the register module 70 may also include a first selection register (S_sel[0:m-1][0:y-1], as shown in fig. 6b). The first selection register is configured as a two-dimensional register with a size of m × y bits, in which the first dimension represents the number of the current functional core and the second dimension represents the number of the task to which the current functional core belongs, where m and y are positive integers and y = log₂ s. The control module may configure the first selection register so that the corresponding first trigger signal s_ck(0:s-1) is output to the corresponding functional core through the multiplexer.
In one example, as shown in fig. 6b, if the current task has completed its operation (the corresponding value in the task end signal register step_grp_finish is 1) and the current task is not the last task, the next beat cycle is triggered and a first trigger signal corresponding to the execution cycle of the next task is generated. Each functional core may freely select any one of the first trigger signals as its own first trigger signal according to the configuration of the first selection register S_sel; equivalently, the control module sends any one of the first trigger signals s_ck(0:s-1) to any one of the functional cores according to the configuration of the first selection register S_sel.
In one example, the register module 70 may also include a second selection register (P_sel[0:m-1][0:x-1], as shown in fig. 6a). The second selection register is configured as a two-dimensional register with a size of m × x bits, in which the first dimension represents the number of the functional core and the second dimension represents the number of the subtask to which the current functional core belongs, where x = log₂ n. The control module may configure the second selection register so that the corresponding second trigger signal p_ck(0:n-1) is output to the corresponding functional core through the multiplexer.
In an example, as shown in fig. 6a, if the current subtask has completed its operation (the corresponding value in the subtask end signal register phase_grp_finish is 1) and the functional cores corresponding to the second trigger signals have all finished executing their current subtasks, the next phase cycle is triggered and the second trigger signals corresponding to the next subtasks are generated. Each functional core may freely select any one of the second trigger signals as its own second trigger signal according to the configuration of the second selection register P_sel; equivalently, the control module sends any one of the second trigger signals p_ck(0:n-1) to any one of the functional cores according to the configuration of the second selection register P_sel.
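The role of the selection registers can be illustrated by modeling the multiplexer as an index lookup. The sketch below rests on several assumptions: the function names are hypothetical, each core's selection field is represented as an integer rather than as x individual bits, and the width is rounded up to a whole number of bits when the number of trigger signals is not a power of two (the text itself only states x = log₂ n).

```python
from typing import List, Sequence


def select_bits(num_signals: int) -> int:
    """Bits needed to address num_signals multiplexer inputs (rounded up, at least 1)."""
    return max(1, (num_signals - 1).bit_length())


def route_triggers(p_ck: Sequence[int], p_sel: Sequence[int]) -> List[int]:
    """Return the trigger level seen by each functional core.

    p_ck[j] is the level of second trigger signal j.
    p_sel[i] is the signal index selected for functional core i
    (an integer view of that core's row in the selection register).
    """
    return [p_ck[sel] for sel in p_sel]
```

For example, with three phase trigger signals and five cores configured as `p_sel = [0, 0, 1, 2, 2]`, asserting only `p_ck[0]` starts cores 0 and 1 and leaves the others untouched. Changing `p_sel` regroups the cores without touching the trigger generation logic, which is the flexibility the selection registers are meant to provide.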
According to the embodiment of the disclosure, by setting each functional core to arbitrarily select the first trigger signal and the second trigger signal, the expansion of the functions of the subsequent processor can be realized, and the adaptability and the flexibility are increased.
In one possible implementation, the control module may be further configured to,
releasing a functional core in the processor corresponding to the first trigger signal under the condition that a preset condition is met, wherein the preset condition comprises:
the current task is the last task and the current task is finished to be executed; or
the execution of the current task is forcibly ended.
For example, if the current task is the last stage of a neural network operation task, the entire neural network operation task is complete once the current task has finished executing, and the functional core set in the processor corresponding to the first trigger signal may be released, that is, the functional core set may return to an idle state for use by other tasks. Likewise, after the current task has been forcibly ended, because it was ended by force after a timeout, the execution of the entire neural network operation task may be ended, and the functional core set in the processor corresponding to the first trigger signal may also be released. After the current task has ended, the corresponding first timing clock and each second timing clock may be reset and cleared.
By releasing the functional core set in the processor corresponding to the first trigger signal after task execution ends, functional cores that are not selected can remain dormant when only a few tasks are being executed, which reduces power consumption. At the same time, releasing functional cores promptly allows idle functional cores to be selected by other tasks, which improves operation efficiency, reduces the waiting time of other tasks, and accelerates execution.
The control device of the embodiment of the disclosure can support parallel or mixed operation of a plurality of asynchronous network applications on a chip, since different step triggers can deploy different network applications that run independently. It can reduce power consumption, since the non-selected cores are in a dormant state when only a few network applications are executed. It can also reduce running time, since for a given network application the cores with similar operation tasks are grouped into one phase timing group, which accelerates execution.
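The release condition and its effect on the functional cores can be sketched as follows. The names `should_release` and `release_core_set`, and the use of the strings "busy" and "idle" to represent core state, are assumptions made purely for illustration; resetting the timing clocks is left out of the sketch.

```python
from typing import List, Sequence


def should_release(is_last_task: bool, task_done: bool, forced_end: bool) -> bool:
    """Preset condition: the last task finished, or the current task was forcibly ended."""
    return forced_end or (is_last_task and task_done)


def release_core_set(core_state: List[str], members: Sequence[int]) -> List[str]:
    """Mark every core of the released beat timing group as idle (in place)."""
    for i in members:
        core_state[i] = "idle"
    return core_state
```

Cores outside the released set keep their state, so other beat timing groups continue to run while the freed cores become available for new tasks.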
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A control device, characterized in that the device comprises:
a first trigger module for generating one or more first trigger signals, wherein each first trigger signal corresponds to a respective task;
a second trigger module, electrically connected to the first trigger module, for generating one or more second trigger signals according to the first trigger signal, wherein the second trigger signals correspond to subtasks of the task;
a multiplexer electrically connected to the first trigger module and the second trigger module; and
a control module, electrically connected to the multiplexer, for controlling the multiplexer to transmit any one of the one or more first trigger signals and any one of the one or more second trigger signals to one or more functional cores in the processor, so that the one or more functional cores execute the subtasks of the task according to the received first trigger signal and the received second trigger signal.
2. The apparatus of claim 1,
the device also comprises a first timing module which is electrically connected with the first trigger module and comprises a first timing clock, wherein the first timing module is used for starting the first timing clock when receiving the first trigger signal so as to time the execution period of the current task;
the first triggering module is further configured to determine that a condition for triggering an execution period of a next task is met and generate a first triggering signal corresponding to the execution period of the next task when the first timing clock reaches a first threshold, all subtasks of the current task are finished executing, and the current task is not a last task.
3. The apparatus of claim 2,
the first triggering module is further configured to generate a forced end signal when the first timing clock reaches a third threshold, where the forced end signal is used to forcibly end the execution of each current sub-task in the current task, so as to forcibly end the execution of the current task;
the control module is further configured to transmit the forced end signal to each functional core of the current task by using the multiplexer.
4. The apparatus of claim 1,
the device also comprises a second timing module which is electrically connected with the second triggering module and comprises one or more second timing clocks, the second timing module is used for starting the one or more second timing clocks when receiving the second triggering signal so as to time the execution cycle of each subtask of the current task,
the second triggering module is further configured to, when the second timing clock reaches a second threshold and the functional cores corresponding to the second trigger signals have all finished executing their current subtasks, determine that a condition for triggering the execution period of the next subtask of each current subtask in the current task is satisfied, and generate the second trigger signals corresponding to each next subtask.
5. The apparatus of claim 1,
the control module is further configured to receive operation end signals output by the functional cores corresponding to the second trigger signal, and generate a subtask end signal when each functional core outputs the operation end signal, so as to determine that all the functional cores corresponding to the second trigger signal end execution of the current subtask;
the device further comprises:
a first storage module electrically connected to the control module and used for storing the subtask end signal.
6. The apparatus of claim 5,
the control module is further configured to generate a task end signal when the subtask end signals of all subtasks of the current task are stored in the first storage module, so as to determine that all subtasks of the current task corresponding to the first trigger signal have finished executing;
the device further comprises:
a second storage module electrically connected to the control module and used for storing the task end signal.
7. The apparatus of claim 1, wherein the control module is further configured to assign functional cores to each task and subtasks of each task, number functional cores in the processor, number a first set of functional cores assigned to each task, and number a second set of functional cores corresponding to each subtask of each task.
8. The apparatus of claim 1, wherein the apparatus comprises a plurality of registers, including a phase group register, a first selection register, a functional core register, and a second selection register, wherein,
the phase group register is configured to be a two-dimensional register with the size of s × n bits, the first dimension represents the number of a current task, the second dimension represents the number of subtasks included in the current task, and s and n are positive integers;
the first selection register is configured to be a two-dimensional register with the size of m × y bits, the first dimension represents the number of the current functional core, the second dimension represents the task number to which the current functional core belongs, wherein m and y are positive integers, and y = log2(s);
the functional core register is configured to be a two-dimensional register with the size of n × m bits, the first dimension represents a subtask number, and the second dimension represents the functional cores included in the subtask;
the second selection register is configured to be a two-dimensional register with the size of m × x bits, the first dimension represents the number of the current functional core, the second dimension represents the subtask number to which the current functional core belongs, where x = log2(n).
9. The apparatus of claim 1 or 3, wherein the control module is further configured to,
releasing a functional core in the processor corresponding to the first trigger signal under the condition that a preset condition is met, wherein the preset condition comprises:
the current task is the last task and has finished executing; or
execution of the current task is forcibly ended.
10. A brain-like computing system, characterized in that the system comprises a control device according to any one of claims 1-9.
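The register sizes recited in claim 8 can be worked through with a short Python sketch. This is an illustration only: the function name `register_shapes` is hypothetical, and rounding log2 up to a whole number of bits (with a minimum of one bit) is an assumption for values of s or n that are not powers of two, which the claim does not address.

```python
from __future__ import annotations


def register_shapes(s: int, n: int, m: int) -> dict[str, tuple[int, int]]:
    """Return the (rows, bits per row) shape of each register in claim 8.

    s: number of tasks, n: number of subtasks per task, m: number of functional cores.
    """
    if min(s, n, m) < 1:
        raise ValueError("s, n and m must be positive integers")
    # (k - 1).bit_length() equals ceil(log2(k)) for k >= 1; at least 1 bit is kept (assumption).
    y = max(1, (s - 1).bit_length())   # bits needed to encode a task number
    x = max(1, (n - 1).bit_length())   # bits needed to encode a subtask number
    return {
        "phase_group": (s, n),         # task number x subtasks of that task
        "first_select": (m, y),        # core number x task number of that core
        "functional_core": (n, m),     # subtask number x cores in that subtask
        "second_select": (m, x),       # core number x subtask number of that core
    }


# Usage: 4 tasks, 8 subtasks per task, 16 functional cores.
shapes = register_shapes(s=4, n=8, m=16)
```

For these values each core needs 2 bits to name its task and 3 bits to name its subtask, so the two selection registers are 16 × 2 and 16 × 3 bits respectively.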
CN202011313163.7A 2020-11-20 2020-11-20 Control device and brain-like computing system Active CN112434800B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011313163.7A CN112434800B (en) 2020-11-20 2020-11-20 Control device and brain-like computing system
PCT/CN2020/137469 WO2022104991A1 (en) 2020-11-20 2020-12-18 Control apparatus and brain-inspired computing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011313163.7A CN112434800B (en) 2020-11-20 2020-11-20 Control device and brain-like computing system

Publications (2)

Publication Number Publication Date
CN112434800A true CN112434800A (en) 2021-03-02
CN112434800B CN112434800B (en) 2024-02-20

Family

ID=74692879

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011313163.7A Active CN112434800B (en) 2020-11-20 2020-11-20 Control device and brain-like computing system

Country Status (2)

Country Link
CN (1) CN112434800B (en)
WO (1) WO2022104991A1 (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100076916A1 (en) * 2008-09-21 2010-03-25 Van Der Made Peter Aj Autonomous Learning Dynamic Artificial Neural Computing Device and Brain Inspired System
CN107578102A (en) * 2017-07-21 2018-01-12 韩永刚 One species neurode information processing method and smart machine
CN109376843A (en) * 2018-10-12 2019-02-22 山东师范大学 EEG signals rapid classification method, implementation method and device based on FPGA
CN109901878A (en) * 2019-02-25 2019-06-18 北京灵汐科技有限公司 One type brain computing chip and calculating equipment
CN110163016A (en) * 2019-04-29 2019-08-23 清华大学 Hybrid system and mixing calculation method
CN110502330A (en) * 2018-05-16 2019-11-26 上海寒武纪信息科技有限公司 Processor and processing method
CN110623663A (en) * 2019-08-19 2019-12-31 北京信息科技大学 Electroencephalogram signal acquisition system and control method thereof
CN110909869A (en) * 2019-11-21 2020-03-24 浙江大学 Brain-like computing chip based on impulse neural network
CN211187235U (en) * 2019-08-19 2020-08-07 北京信息科技大学 Electroencephalogram signal acquisition system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200057948A1 (en) * 2016-10-31 2020-02-20 Nec Corporation Automatic prediction system, automatic prediction method and automatic prediction program
CN108073982B (en) * 2016-11-18 2020-01-03 上海磁宇信息科技有限公司 Brain-like computing system
CN107729050B (en) * 2017-09-22 2021-01-22 中国科学技术大学苏州研究院 Real-time system based on LET programming model and task construction method
CN109933204A (en) * 2019-03-22 2019-06-25 河北雄安有份儿智慧科技有限公司 A kind of man-machine interaction method of Behavior-based control action triggers and brain wave perception
CN110322010B (en) * 2019-07-02 2021-06-25 深圳忆海原识科技有限公司 Pulse neural network operation system and method for brain-like intelligence and cognitive computation
CN110621052B (en) * 2019-09-29 2020-11-10 广东电网有限责任公司 Multipath routing optimization method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114172644A (en) * 2021-12-03 2022-03-11 三未信安科技股份有限公司 Method and system for optimizing elliptic curve public key password of PCI (peripheral component interconnect) password card
CN114172644B (en) * 2021-12-03 2023-04-25 三未信安科技股份有限公司 Method and system for optimizing elliptic curve public key cryptography of PCI (peripheral component interconnect) cryptographic card
CN116707493A (en) * 2023-07-31 2023-09-05 苏州萨沙迈半导体有限公司 Trigger signal generating device, power driving module and motor control chip
CN116707493B (en) * 2023-07-31 2023-10-20 苏州萨沙迈半导体有限公司 Trigger signal generating device, power driving module and motor control chip

Also Published As

Publication number Publication date
WO2022104991A1 (en) 2022-05-27
CN112434800B (en) 2024-02-20

Similar Documents

Publication Publication Date Title
CN112434800B (en) Control device and brain-like computing system
US11003429B1 (en) Compile-time scheduling
JP6834097B1 (en) Hardware-specific partitioning of inference neural network accelerators
US20150134884A1 (en) Method and system for communicating with non-volatile memory
CN112418412B (en) Trigger device and brain-like computing system
CN113434284B (en) Privacy computation server side equipment, system and task scheduling method
TW202246977A (en) Task scheduling method and apparatus, computer device and storage medium
US9390033B2 (en) Method and system for communicating with non-volatile memory via multiple data paths
EP2551768A1 (en) Multi-core system and start-up method
US11175919B1 (en) Synchronization of concurrent computation engines
US20030177163A1 (en) Microprocessor comprising load monitoring function
CN111767121B (en) Operation method, device and related product
US9377968B2 (en) Method and system for using templates to communicate with non-volatile memory
US10922146B1 (en) Synchronization of concurrent computation engines
CN113641476B (en) Task scheduling method, game engine, device and storage medium
CN111985634B (en) Operation method and device of neural network, computer equipment and storage medium
CN114330686A (en) Configurable convolution processing device and convolution calculation method
US20230067432A1 (en) Task allocation method, apparatus, electronic device, and computer-readable storage medium
CN114764346A (en) Data transmission method, system and computing node
CN107832154B (en) Multi-process processing method, processing device and application
US11625453B1 (en) Using shared data bus to support systolic array tiling
CN112416475A (en) Triggering method
WO2015073608A1 (en) Method and system for communicating with non-volatile memory
EP4120093A1 (en) Task allocation method and apparatus, and electronic device and computer-readable storage medium
US11442794B1 (en) Event assignment for synchronization of concurrent execution engines

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant