WO2022105664A1 - Computing system and operation method therefor, and electronic device and computer-readable medium - Google Patents


Info

Publication number
WO2022105664A1
Authority
WO
WIPO (PCT)
Prior art keywords
temperature
computing unit
analog
unit
processing core
Prior art date
Application number
PCT/CN2021/130002
Other languages
French (fr)
Chinese (zh)
Inventor
沈杨书
何伟
祝夭龙
Original Assignee
北京灵汐科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202011312225.2A external-priority patent/CN112416587A/en
Priority claimed from CN202011310981.1A external-priority patent/CN112416586A/en
Application filed by 北京灵汐科技有限公司 filed Critical 北京灵汐科技有限公司
Publication of WO2022105664A1 publication Critical patent/WO2022105664A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • the present disclosure relates to the field of computer technology, and in particular, to a computing system, a method for running the same, an electronic device, and a computer-readable medium.
  • Computing systems are devices that can be used to perform various complete computing tasks.
  • the computing system may include one or more relatively independent computing units (e.g., processing cores), and each computing unit may perform all or part of the operations in the overall computing task.
  • the performance of a computing unit in a computing system depends on its temperature, but the related art does not take the temperature of the computing unit into account, so the overall performance of the computing system is easily degraded when the temperature of a computing unit is unsuitable.
  • the present disclosure provides a computing system, a method for operating the same, an electronic device, and a computer-readable medium.
  • an embodiment of the present disclosure provides a method for operating a computing system, where the computing system includes a computing unit; the method includes: acquiring a temperature of the computing unit; and adjusting a working state of the computing unit according to the temperature of the computing unit.
  • the computing system includes a plurality of computing units, and the computing units are processing cores; the adjusting the working state of the computing units according to the temperature of the computing units includes: assigning tasks to the processing cores according to the temperature of the processing cores.
  • the obtaining the temperature of the computing unit includes: obtaining the temperature of the processing core in response to a new task waiting to be allocated.
  • the acquiring the temperature of the computing unit includes: in response to the existence of an idle processing core, acquiring the temperature of the processing core.
  • the acquiring the temperature of the computing unit includes: acquiring the temperatures of a plurality of the processing cores; and the assigning tasks to the processing cores according to the temperature of the processing cores includes: performing task assignment on at least one of the processing cores according to the temperatures of the plurality of processing cores.
  • assigning tasks to the processing cores according to the temperature of the processing cores includes at least one of the following: in response to the temperature of a first processing core being lower than a preset first temperature threshold, allocating a computation-intensive task to the first processing core; in response to the temperature of a second processing core being higher than or equal to the first temperature threshold and lower than or equal to a preset second temperature threshold, allocating a data-intensive task to the second processing core; in response to the temperature of a third processing core being higher than the second temperature threshold, stopping allocating tasks to the third processing core; wherein the second temperature threshold is higher than the first temperature threshold, and the heat generated by a processing core when running computation-intensive tasks is greater than the heat generated when running data-intensive tasks.
  • the method further includes: moving at least part of the tasks processed by the third processing core out of the third processing core.
  • the computing unit is an analog computing unit
  • the computing system further includes a temperature sensor and a circuit unit corresponding to the analog computing unit, and the circuit unit is arranged within the first range of the corresponding analog computing unit.
  • the analog computing unit is set within the second range of its corresponding temperature sensor;
  • the acquiring the temperature of the computing unit includes: acquiring the temperature collected by the temperature sensor, and determining the temperature of the analog computing unit according to the temperature collected by the temperature sensor corresponding to the analog computing unit;
  • the adjusting the working state of the computing unit according to the temperature of the computing unit includes: adjusting the operating state of the analog computing unit and its corresponding circuit unit according to the comparison result between the temperature of the analog computing unit and a preset temperature range.
  • the adjusting the operating state of the analog computing unit and its corresponding circuit unit according to the comparison result between the temperature of the analog computing unit and the preset temperature range includes: in response to the temperature of the analog computing unit being within the preset temperature range, maintaining the current operating state of the analog computing unit and its corresponding circuit unit; in response to the temperature of the analog computing unit being higher than the highest temperature of the preset temperature range, controlling the analog computing unit and its corresponding circuit unit to stop operating; in response to the temperature of the analog computing unit being lower than the lowest temperature of the preset temperature range, controlling the operating state of the analog computing unit and its corresponding circuit unit according to whether the analog computing unit needs to start running.
  • the controlling the operating state of the analog computing unit and its corresponding circuit unit according to whether the analog computing unit needs to start running includes: determining whether the analog computing unit needs to start running; in response to the analog computing unit needing to start running, controlling the analog computing unit to start running and controlling its corresponding circuit unit to run; in response to the analog computing unit not needing to start running, maintaining the current operating state of the analog computing unit and its corresponding circuit unit.
  • there are a plurality of analog computing units; the method further includes: determining the historical temperature of an analog computing unit according to a plurality of temperatures collected at multiple times by the temperature sensor corresponding to the analog computing unit; and determining, according to the historical temperature of the analog computing unit, the recommended number of circuit units corresponding to the analog computing unit.
  • the number of the analog computing units is multiple; at least some of the temperature sensors correspond to multiple analog computing units.
  • the determining the temperature of the analog computing unit according to the temperature collected by the temperature sensor corresponding to the analog computing unit includes: in the case that the analog computing unit corresponds to only one temperature sensor, determining that the temperature of the analog computing unit is the temperature collected by its corresponding temperature sensor; in the case that the analog computing unit corresponds to multiple temperature sensors, determining that the temperature of the analog computing unit is the average of the temperatures collected by its corresponding temperature sensors.
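  • the rule above for deriving the temperature of an analog computing unit from its sensor readings can be sketched as follows. This is only a minimal illustration under stated assumptions, not the disclosed implementation: the function name `unit_temperature` and the representation of the readings as a plain sequence of numbers are hypothetical.

```python
from statistics import fmean
from typing import Sequence


def unit_temperature(sensor_readings: Sequence[float]) -> float:
    """Return the temperature of one analog computing unit.

    sensor_readings (hypothetical representation) holds the temperatures
    collected by the sensor(s) corresponding to that unit.
    """
    if not sensor_readings:
        raise ValueError("an analog computing unit needs at least one sensor")
    if len(sensor_readings) == 1:
        # Only one corresponding sensor: use its reading directly.
        return float(sensor_readings[0])
    # Several corresponding sensors: use the average of their readings.
    return fmean(sensor_readings)
```

  • for instance, `unit_temperature([40.0])` uses the single reading as is, while `unit_temperature([40.0, 50.0])` averages the two readings.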
  • an embodiment of the present disclosure provides a method for laying out a computing system, wherein an analog computing unit has been determined to be arranged at a first target position of the computing system; the method includes: determining, according to the temperature of the analog computing unit, that the circuit unit corresponding to the analog computing unit is located within the first range of the analog computing unit; and determining that the temperature sensor corresponding to each analog computing unit is located within the second range of the analog computing unit.
  • there are a plurality of analog computing units; the determining, according to the temperature of the analog computing unit, that the circuit unit corresponding to the analog computing unit is located within the first range of the analog computing unit includes: running a preset computing system, the preset computing system including an analog computing unit set at the first target position and a temperature sensor corresponding to the analog computing unit; acquiring multiple temperatures collected by the temperature sensor at multiple times, and determining the historical temperature of the analog computing unit according to the multiple temperatures collected at multiple times by the temperature sensor corresponding to the analog computing unit; determining the recommended number of circuit units corresponding to the analog computing unit according to the historical temperature of the analog computing unit; and determining that the number of circuit units corresponding to the analog computing unit is the recommended number, and that these circuit units are located within the first range of the analog computing unit.
  • an embodiment of the present disclosure provides a computing system, including: a computing unit and a controller; the controller is configured to execute any one of the operating methods of the computing system of the embodiments of the present disclosure.
  • the computing unit is an analog computing unit; the computing system further includes: a temperature sensor and a circuit unit corresponding to the analog computing unit.
  • an embodiment of the present disclosure provides an electronic device, including: one or more processors; and one or more memories on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors implement any one of the operating methods of the computing system of the embodiments of the present disclosure, or implement any one of the layout methods of the computing system of the embodiments of the present disclosure.
  • an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processor, it implements any one of the operating methods of the computing system of the embodiments of the present disclosure, or implements any one of the layout methods of the computing system of the embodiments of the present disclosure.
  • the temperature of the computing unit can be obtained when the computing system is running, and its working state can be adjusted according to the temperature of the computing unit, thereby ensuring that the temperature of the computing unit is always in a suitable range and improving the performance of the computing system.
  • FIG. 1 is a flowchart of a method for running a computing system according to an embodiment of the present disclosure
  • FIG. 2 is a flowchart of another operating method of a computing system provided by an embodiment of the present disclosure
  • FIG. 3 is a flowchart of another operating method of a computing system provided by an embodiment of the present disclosure
  • FIG. 4 is a block diagram of a computing system (many-core system) provided by an embodiment of the present disclosure
  • FIG. 5 is a flowchart of another operating method of a computing system provided by an embodiment of the present disclosure.
  • FIG. 6 is a flowchart of another operating method of a computing system provided by an embodiment of the present disclosure.
  • FIG. 7 shows volt-ampere characteristic curves of a diode at different temperatures in the related art
  • FIG. 8 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure.
  • FIG. 12 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure.
  • FIG. 13 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure.
  • FIG. 14 is a flowchart of a layout method of a computing system provided by an embodiment of the present disclosure.
  • FIG. 15 is a block diagram of the composition of a computing system provided by an embodiment of the present disclosure.
  • FIG. 16 is a block diagram of the composition of an electronic device according to an embodiment of the present disclosure.
  • FIG. 17 is a block diagram of the composition of a computer-readable medium provided by an embodiment of the present disclosure.
  • the terms "first" and "second" are for descriptive purposes only, and should not be construed as indicating or implying relative importance or implying the number of indicated technical features. Thus, a feature qualified by "first" or "second" may expressly or implicitly include one or more of that feature. In the description of the present invention, unless otherwise specified, "plurality" means two or more.
  • an embodiment of the present disclosure provides a method for operating a computing system, where the computing system includes a computing unit.
  • the embodiments of the present disclosure are used to run a computing system, which is a system with a certain data processing capability, which includes one or more computing units; and each computing unit is a relatively independent structure and can perform certain operations.
  • the operation method of the computing system includes:
  • the temperature of at least some of the computing units is acquired, and the working state of the corresponding computing unit is adjusted according to the temperature of the computing unit.
  • the temperature of the computing unit refers to a parameter that can characterize the temperature characteristics of the computing unit; it may be a temperature value at a certain position on the computing unit, or a temperature value at a specific position away from the computing unit, which will not be described in detail here.
  • the temperature of the computing unit can be acquired through a temperature-sensing device located inside or outside the computing unit (such as a temperature-sensing device built into the computing unit, the temperature sensor described below, or an infrared thermometer), or by reading and parsing the corresponding temperature data from a temperature database or a cache, which will not be described in detail here.
  • the temperature of the computing unit can be obtained when the computing system is running, and its working state can be adjusted according to the temperature of the computing unit, thereby ensuring that the temperature of the computing unit is always in a suitable range and improving the performance of the computing system.
  • the computing system includes a plurality of computing units, the computing units being processing cores.
  • obtaining the temperature of the computing unit (S101) includes:
  • the working state of the computing unit is adjusted (S102), including:
  • S202 Assign tasks to the processing cores according to the temperature of the processing cores.
  • each computing unit may be a "processing core".
  • a processing core, also called a core, is the smallest processing unit that can be independently scheduled and has complete computing capabilities, and may specifically be a chip (IC) or a core within a chip.
  • the computing system of the embodiment of the present disclosure as a whole is a "many-core system" including multiple processing cores.
  • the multiple processing cores of the many-core system are connected through the on-chip network to form a certain topology, wherein each processing core can process a certain number of tasks.
  • the overall program can be quickly run, and the ability of multi-task parallel processing is provided.
  • the temperature of the processing core can be used to characterize its utilization and load to a certain extent. For example, when the temperature of the processing core is within the predetermined range, it means that the processing core is in the working range of normal operation, and when the temperature of the processing core is outside the predetermined range, it means that the processing core is overloaded or is idle.
  • in the related art, the influence of temperature is not considered when assigning tasks to processing cores, so the temperature distribution across processing cores is not uniform, the temperature-dependent performance of the processing cores differs greatly, the amount of tasks processed by each processing core is uneven, and the processing progress of the processing cores is uneven, which degrades the overall performance of the many-core system.
  • in the embodiment of the present disclosure, tasks are allocated according to the temperature of each processing core, so that the temperatures of the processing cores are uniform, the amounts of tasks they process are relatively even, and their processing progress is similar, which improves the overall performance of the many-core system.
  • acquiring the temperature of the computing unit (S101) includes:
  • the temperature of the processing core may be acquired to determine which processing core should be assigned the task.
  • acquiring the temperature of the computing unit (S101) includes:
  • an idle processing core can receive tasks (such as new tasks, or tasks migrated from other processing cores). As another way of the embodiment of the present disclosure, the temperature of a processing core (not necessarily the idle processing core; it may be another processing core) may therefore be acquired when at least one idle processing core exists, in order to determine whether to assign tasks to the idle processing core, or to determine which processing core's tasks should be migrated to the idle processing core.
  • acquiring the temperature of the computing unit (S101) includes:
  • tasks are allocated to the processing core (S202), including:
  • the temperature of multiple processing cores may be acquired at one time, and based on this, it is determined to perform task assignment on at least some of the processing cores. For example, when there are new tasks waiting to be assigned, the temperature of multiple processing cores can be obtained, so as to decide which processing core to assign the task to.
  • alternatively, the temperature of one processing core can be obtained first; if the temperature indicates that a new task can be allocated to that processing core, the task is allocated to it, and if not, the temperature of another processing core is obtained.
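  • the one-core-at-a-time alternative can be sketched as below. The names `pick_core_sequentially` and `read_temperature`, and the use of a single limit `max_temperature` as the test for "a new task can be allocated", are assumptions made only for illustration; the text does not fix the acceptance criterion.

```python
from typing import Callable, Iterable, Optional


def pick_core_sequentially(
    core_ids: Iterable[int],
    read_temperature: Callable[[int], float],
    max_temperature: float,
) -> Optional[int]:
    """Probe cores one at a time and return the first one cool enough.

    read_temperature is a hypothetical callback that returns the current
    temperature of a core; max_temperature is an assumed acceptance limit.
    """
    for core_id in core_ids:
        # Read only this core's temperature; later cores are not probed
        # unless this one turns out to be unsuitable.
        if read_temperature(core_id) <= max_temperature:
            return core_id
    # No probed core can currently accept the task.
    return None
```

  • compared with reading the temperatures of all processing cores at once, this variant stops probing as soon as a suitable processing core is found.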
  • task assignment is performed on the processing core (S202), including at least one of the following:
  • the second temperature threshold is higher than the first temperature threshold, and the heat generated by a processing core when running computation-intensive tasks is greater than the heat generated when running data-intensive tasks.
  • a predetermined interval from a first temperature threshold to a second temperature threshold may be set, and tasks may be divided into computation-intensive tasks and data-intensive tasks.
  • computation-intensive tasks involve a large amount of calculation, which makes the processing cores performing them generate more heat (for example, tasks related to convolution operations); data-intensive tasks involve a large amount of data but a small amount of computation, which makes the processing cores performing them generate less heat (for example, fully connected tasks).
  • the first temperature threshold and the second temperature threshold may also have a certain mapping relationship with computation-intensive tasks and data-intensive tasks.
  • the processing core can be assigned tasks in different ways:
  • the above first processing core, second processing core, and third processing core are only used to represent that the processing cores are at different temperatures, and do not represent specific processing cores.
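  • the threshold rule of this embodiment (below the first temperature threshold: computation-intensive tasks; from the first to the second temperature threshold inclusive: data-intensive tasks; above the second temperature threshold: stop allocating) can be sketched as follows. The enum `Allocation` and the function `allowed_allocation` are hypothetical names, and the threshold values in the usage note are made-up numbers.

```python
from enum import Enum


class Allocation(Enum):
    COMPUTE_INTENSIVE = "compute-intensive"
    DATA_INTENSIVE = "data-intensive"
    NONE = "stop allocating"


def allowed_allocation(
    temperature: float, first_threshold: float, second_threshold: float
) -> Allocation:
    """Classify a processing core by temperature, as described in the text."""
    if second_threshold <= first_threshold:
        raise ValueError("second threshold must be higher than the first")
    if temperature < first_threshold:
        # "First" processing core: cool enough for heat-heavy work.
        return Allocation.COMPUTE_INTENSIVE
    if temperature <= second_threshold:
        # "Second" processing core: only lower-heat, data-intensive work.
        return Allocation.DATA_INTENSIVE
    # "Third" processing core: too hot, stop allocating tasks to it.
    return Allocation.NONE
```

  • with assumed thresholds of 50 and 70, a core at 40 would be offered computation-intensive tasks, a core at 50 or 70 data-intensive tasks, and a core at 71 no tasks.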
  • the method further includes:
  • how many tasks are moved out of the third processing core, and which ones, can be determined according to its temperature. For example, to lower the temperature of the third processing core effectively, computation-intensive tasks can be moved out of it preferentially.
  • the tasks transferred out of the third processing core may be temporarily "suspended" to stop processing, or may be allocated to other processing cores (ie, migrated to other processing cores) for processing.
  • the above task migration will occupy a certain amount of routing bandwidth, so the task can also be reassigned through the task distribution module in the system.
  • the processing core that receives the tasks migrated from the third processing core can be selected according to the above method; that is, the migrated computation-intensive tasks can be allocated to a first processing core whose temperature is lower than the first temperature threshold, and the migrated data-intensive tasks can be allocated to a second processing core whose temperature is between the first temperature threshold and the second temperature threshold, while other third processing cores whose temperature is higher than the second temperature threshold should no longer receive the migrated tasks.
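  • the selection of receiving cores for migrated tasks can be sketched as below. This is a simplified illustration under several assumptions: tasks are `(name, kind)` tuples with `kind` being `"compute"` or `"data"`, the function name `plan_migration` is hypothetical, the first matching core is always chosen (load is ignored), and a task with no suitable receiver is mapped to `None`, which stands for temporarily suspending it.

```python
from typing import Dict, List, Optional, Tuple

Task = Tuple[str, str]  # (task name, kind); kind is "compute" or "data"


def plan_migration(
    tasks: List[Task],
    core_temps: Dict[int, float],
    first_threshold: float,
    second_threshold: float,
) -> List[Tuple[str, Optional[int]]]:
    """Choose a receiving core for each task moved out of a hot core."""
    # Cores below the first threshold may take computation-intensive tasks.
    cool = sorted(c for c, t in core_temps.items() if t < first_threshold)
    # Cores between the thresholds (inclusive) may take data-intensive tasks.
    warm = sorted(
        c for c, t in core_temps.items()
        if first_threshold <= t <= second_threshold
    )
    # Move computation-intensive tasks out first; sorted() is stable, so
    # the original order is kept within each kind.
    ordered = sorted(tasks, key=lambda task: task[1] != "compute")
    plan = []
    for name, kind in ordered:
        candidates = cool if kind == "compute" else warm
        # Cores above the second threshold never appear as receivers.
        plan.append((name, candidates[0] if candidates else None))
    return plan
```

  • a real scheduler would also weigh the routing bandwidth consumed by migration and the load of the receiving cores, which this sketch leaves out.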
  • the specific manner of assigning tasks according to temperature is not limited to this.
  • for example, although the first processing core preferentially receives computation-intensive tasks, it can also receive data-intensive tasks; for another example, when the temperature of every processing core exceeds the first temperature threshold and a computation-intensive task must be allocated, that task can still be assigned to one of the processing cores (preferably one with a relatively low temperature).
  • the computing unit is an analog computing unit
  • the computing system further includes a temperature sensor and a circuit unit corresponding to the analog computing unit; the circuit unit is arranged within the first range of its corresponding analog computing unit, and the analog computing unit is arranged within the second range of its corresponding temperature sensor.
  • acquiring the temperature of the computing unit (S101) includes:
  • an analog device is a device in which at least part of the processed signal is an analog (non-digital) signal; it may specifically be a resistor, a capacitor, an inductor, a diode, a transistor, an analog amplifier, a D/A conversion circuit, an A/D conversion circuit, an analog signal conditioner, an integrated voltage regulator circuit, a sensor, an audio or video circuit, etc.
  • a computing unit including at least one analog device is called an analog computing unit.
  • the analog computing unit may be the above processing core or chip, or may be a part of circuit modules in the processing core or chip.
  • FIG. 7 shows volt-ampere (I-V) characteristic curves of a diode at different temperatures. It can be seen that the volt-ampere characteristic of the diode (an analog device) changes significantly with temperature.
  • because analog devices have the above temperature characteristics, for a single analog device, a large temperature difference between different times leads to greatly different operation results at those times; for multiple analog devices, a large temperature difference between devices at different positions leads to deviations between their operation results. In short, these differences can make the operation result produced by an analog device at one time unusable at another time, or make the operation result produced by one analog device unusable by other analog devices, so that the data finally produced by the analog devices is unreliable.
  • the operation result is also affected by the temperature change, which may cause the operation result to be unreliable.
  • in the embodiment of the present disclosure, the computing unit is an analog computing unit, a corresponding temperature sensor and a corresponding circuit unit are provided for the analog computing unit, and the distances between the analog computing unit and its corresponding circuit unit and temperature sensor do not exceed the preset ranges (the first range for the circuit unit, the second range for the temperature sensor).
  • the number of analog computing units may be one (see FIG. 8) or more than one (see FIG. 9 to FIG. 13); regardless of the number, each analog computing unit must correspond to one or more circuit units and one or more temperature sensors, and the circuit units and temperature sensors must be located within the corresponding ranges of their analog computing unit.
  • for example, referring to FIG. 8, the computing system may include 1 analog computing unit, and the analog computing unit corresponds to 1 temperature sensor; for another example, referring to FIG. 9, the computing system may include 8 analog computing units, and each temperature sensor corresponds to 1 analog computing unit; for another example, referring to FIG. 10, the computing system may include 8 analog computing units, and every 2 analog computing units correspond to the same temperature sensor; for another example, referring to the drawings, each temperature sensor may correspond to the four surrounding analog computing units, so that the 2 analog computing units on the left in the dashed frame each correspond to 2 temperature sensors (the first temperature sensor and the second temperature sensor), and the 2 analog computing units on the right in the dashed frame each also correspond to 2 temperature sensors (the second temperature sensor and the third temperature sensor); for another example, the number of circuit units corresponding to different analog computing units may be the same (see FIG. 9 to FIG. 11) or different (see FIG. 12 and FIG. 13).
  • the structure of the computing system in each drawing is just an example, and in other embodiments, the structure of the computing system can be determined according to the actual situation.
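  • the many-to-many correspondence between temperature sensors and analog computing units described above can be represented, for illustration, by inverting a sensor-to-units map into a unit-to-sensors map. The function name `sensors_per_unit` and the identifiers `"s1"` or `"u1"` in the usage note are hypothetical and do not come from the drawings.

```python
from collections import defaultdict
from typing import Dict, List


def sensors_per_unit(sensor_to_units: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert a sensor -> units map into a unit -> sensors map."""
    unit_to_sensors: Dict[str, List[str]] = defaultdict(list)
    for sensor, units in sensor_to_units.items():
        for unit in units:
            # One unit may be covered by several sensors, and one sensor
            # may cover several units.
            unit_to_sensors[unit].append(sensor)
    return dict(unit_to_sensors)
```

  • for a layout like the dashed-frame example, an input such as `{"s1": ["u1", "u2"], "s2": ["u1", "u2", "u3", "u4"], "s3": ["u3", "u4"]}` maps the two left units to the first and second sensors and the two right units to the second and third sensors.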
  • the computing system as a whole may be an "on-chip structure", that is, analog computing units, circuit units, and temperature sensors provided on a chip or a circuit board.
  • the analog computing unit is the above unit for performing operations, while the circuit unit is a circuit structure arranged in the computing system to implement "non-operation" functions, such as supplying power to the analog computing unit, processing signals of the analog computing unit, and transmitting signals of the analog computing unit.
  • the circuit unit may be a functional circuit unit with the above practical functions, or may be a flip circuit unit that can only generate heat (only used for temperature control).
  • the temperature of each analog computing unit is determined from the temperature detected by its temperature sensor, and the operating states of the analog computing unit and its circuit unit are controlled according to the result of comparing that temperature with the preset temperature range. This realizes temperature control of the computing system (on-chip structure) and prevents the temperature difference of the analog computing units from becoming too large (both for one analog computing unit at different times and between different analog computing units). The accuracy of the operation results of the analog computing units is thereby improved, which solves the problem in the related art that large temperature differences of analog devices cause large errors in the operation results or make them unreliable.
  • the computing system of the embodiment of the present disclosure may further include a controller, and the above steps S301 and S302 are performed by the controller, that is, the temperature of the computing system may be controlled by the controller.
  • adjusting the operating state of the analog computing unit and its corresponding circuit unit including:
  • the operation of the various units generates heat and causes the temperature to rise. Therefore, as a method of the embodiment of the present disclosure, when the temperature of the analog computing unit is too high (above the maximum temperature of the preset temperature range), the analog computing unit and its corresponding circuit unit stop running to prevent the temperature from rising further; when the temperature of the analog computing unit is appropriate (within the preset temperature range), the current operating state of the analog computing unit and its corresponding circuit unit is maintained to keep the temperature of the analog computing unit; and when the temperature of the analog computing unit is too low (below the minimum temperature of the preset temperature range), an attempt is made to start the analog computing unit and its corresponding circuit unit to raise the temperature of the analog computing unit.
  • the specific value of the above preset temperature range can be adjusted according to the actual situation.
  • the preset temperature range may be set according to the temperature corresponding to the best working state of the analog computing unit, or it may be set by staff based on experience, which will not be described in detail here.
  • the control of the analog computing unit and the circuit unit should not affect the overall operation of the computing system.
  • for example, when an analog computing unit and its circuit unit are stopped, the work they were performing can be taken over by other analog computing units and circuit units (such as by task migration); for another example, when an analog computing unit and its circuit unit are started, they should be assigned corresponding work (such as by task assignment); for another example, when a circuit unit has no actual work to perform, it can "flip" idly, that is, it only generates heat and processes no actual content.
  • the temperature difference of the analog computing unit can be reduced, and the uniformity of its temperature and performance can be improved.
  • controlling the running state of the analog computing unit and its corresponding circuit unit includes:
  • when the temperature of the analog computing unit is lower than the lowest temperature of the preset temperature range, it may first be determined whether the analog computing unit needs to start running (for example, whether there is a new task to be allocated), and whether the analog computing unit actually starts running is then decided based on that determination.
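The three-way comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the controller of the disclosure: the names `Action`, `decide_action`, `t_min`, `t_max`, and `needs_to_start` are hypothetical, and how the controller actually stops or starts a unit is left out.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical action names; the disclosure does not name them."""
    STOP = "stop"          # stop the analog unit and its circuit unit
    MAINTAIN = "maintain"  # keep the current operating state
    START = "start"        # start the analog unit and run its circuit unit


def decide_action(temp, t_min, t_max, needs_to_start):
    """Pick an action from the unit temperature and the preset range [t_min, t_max]."""
    if temp > t_max:
        # Too hot: stop running so the temperature does not rise further.
        return Action.STOP
    if temp < t_min:
        # Too cold: start only if the unit has work to do, otherwise keep state.
        return Action.START if needs_to_start else Action.MAINTAIN
    # Within the preset range (boundaries included): keep the current state.
    return Action.MAINTAIN
```

Treating the boundary values as inside the range is an assumption; the source only distinguishes "higher than the maximum", "lower than the minimum", and "within the range".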
  • the number of analog computing units is multiple.
  • the method of the embodiment of the present disclosure further includes:
  • S304 Determine the historical temperature of the analog computing unit according to multiple temperatures collected by the temperature sensor corresponding to the analog computing unit at multiple times.
  • the total number of circuit units is constant, but these circuit units may be distributed near different analog computing units, that is, correspond to different analog computing units.
  • a temperature sensor can be used to collect multiple temperatures at different times, so that the historical temperature of the corresponding analog computing unit (that is, the long-term temperature distribution of the analog computing unit) can be determined from these temperatures. The recommended number of circuit units corresponding to the analog computing unit can then be determined according to the historical temperature (that is, when the analog computing unit corresponds to the recommended number of circuit units, its temperature is within a relatively reasonable range with high probability), and this number is used as a basis for determining the layout of the computing system (on-chip structure).
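The source does not state how the recommended number is derived from the historical temperature. The sketch below shows one possible heuristic, purely as an assumption: because circuit units generate heat, units whose long-term mean temperature falls further below a target receive a larger share of a fixed total. The names `recommend_circuit_units`, `history`, `target_temp`, and `total_units` are hypothetical.

```python
from statistics import mean


def recommend_circuit_units(history, target_temp, total_units):
    """Split a fixed number of circuit units across analog units (assumed heuristic).

    history maps each analog unit to the temperatures its sensor collected
    at multiple times; colder units get proportionally more circuit units.
    """
    # How far each unit's long-term mean temperature falls below the target.
    deficits = {u: max(target_temp - mean(temps), 0.0) for u, temps in history.items()}
    if sum(deficits.values()) == 0:
        # No unit runs cold: fall back to an even split.
        deficits = {u: 1.0 for u in history}
    total_deficit = sum(deficits.values())
    shares = {u: total_units * d / total_deficit for u, d in deficits.items()}
    # Round down, then hand leftover units to the largest fractional parts.
    counts = {u: int(s) for u, s in shares.items()}
    leftover = total_units - sum(counts.values())
    by_remainder = sorted(shares, key=lambda u: shares[u] - counts[u], reverse=True)
    for u in by_remainder[:leftover]:
        counts[u] += 1
    return counts
```

The largest-remainder rounding keeps the total constant, which matches the statement that the total number of circuit units does not change; the proportional rule itself is only one of many reasonable choices.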
  • the number of analog computing units is multiple; at least some of the temperature sensors correspond to multiple analog computing units.
  • the temperature measured by such a temperature sensor may be used to determine the temperatures of the multiple analog computing units it corresponds to.
  • determining the temperature of the analog computing unit according to the temperature collected by the temperature sensor corresponding to the analog computing unit (S302) includes:
  • when the analog computing unit corresponds to only one temperature sensor, its temperature is the temperature collected by that temperature sensor.
  • when the analog computing unit corresponds to multiple temperature sensors, its temperature is the average value of the temperatures collected by those temperature sensors.
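Both cases reduce to an arithmetic mean, since the mean of a single reading is the reading itself. A minimal sketch, assuming hypothetical names `unit_temperatures`, `unit_to_sensors`, and `sensor_temps`:

```python
def unit_temperatures(unit_to_sensors, sensor_temps):
    """Return the temperature of each analog unit from its sensors' readings.

    unit_to_sensors maps a unit to the sensors that correspond to it;
    sensor_temps maps a sensor to the temperature it collected.
    One sensor may appear under several units.
    """
    temps = {}
    for unit, sensors in unit_to_sensors.items():
        if not sensors:
            raise ValueError(f"unit {unit!r} has no temperature sensor")
        # One sensor: its reading. Several sensors: the average of their readings.
        temps[unit] = sum(sensor_temps[s] for s in sensors) / len(sensors)
    return temps
```

For example, with sensor `s1` shared by units `u1` and `u2`, `unit_temperatures({"u1": ["s1"], "u2": ["s1", "s2"]}, {"s1": 40.0, "s2": 60.0})` gives `u1` the single reading and `u2` the mean of both readings.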
  • the number of temperature sensors corresponding to each analog computing unit is not necessarily related to the number of analog computing units corresponding to each temperature sensor.
  • an embodiment of the present disclosure provides a layout method of a computing system, in which the analog computing unit has already been determined to be arranged at a first target position of the computing system.
  • the number of analog computing units and their positions in the computing system have been predetermined, so how to arrange the corresponding circuit units within the first range of each analog computing unit, and how to arrange the corresponding temperature sensor within its second range, can be determined according to the position and temperature of each analog computing unit.
  • the corresponding circuit units and temperature sensors are arranged according to the temperature of the analog computing unit, so that a reasonable layout of the computing system can be realized, ensuring that the analog computing units in a computing system built to this layout stay within a suitable temperature range for most of the time during operation, which improves the performance of the computing system.
  • the number of analog computing units is multiple; according to the temperature of the analog computing unit, it is determined that the circuit unit corresponding to the analog computing unit is located within the first range of the analog computing unit (S401), including:
  • S4012 Acquire multiple temperatures collected by the temperature sensor at multiple times, and determine the historical temperature of the analog computing unit according to the multiple temperatures collected by the temperature sensor corresponding to the analog computing unit at multiple times.
  • a "preset computing system" with the same layout of analog computing units may be manufactured in advance and run to measure the historical temperature of each analog computing unit. Based on these historical temperatures, the recommended number of circuit units corresponding to each analog computing unit is determined, and the recommended number of circuit units is then actually laid out.
  • the temperature of the analog computing unit in the embodiment of the present disclosure is not limited to the above historical temperature; for example, it may also be a "predicted temperature" obtained through model simulation or experience.
  • an embodiment of the present disclosure provides a computing system 40 , including: a computing unit and a controller 48 .
  • the controller 48 is configured to execute any one of the operating methods of the computing system 40 in the embodiments of the present disclosure.
  • the computing system 40 of the embodiment of the present disclosure may include the above computing unit, and a controller 48 for controlling the computing unit to work according to the above operating method of the computing system.
  • the computing unit is an analog computing unit 42 ; the computing system further includes: a temperature sensor 46 and a circuit unit 44 corresponding to the analog computing unit 42 .
  • the computing unit is the above analog computing unit 42 , and the computing system also includes the above temperature sensor 46 and the circuit unit 44 .
  • an electronic device 50 including:
  • one or more processors 51;
  • one or more memories 52 having one or more programs stored thereon which, when executed by the one or more processors 51, enable the one or more processors 51 to implement any one of the methods for running a computing system in the embodiments of the present disclosure, or any one of the layout methods of a computing system in the embodiments of the present disclosure.
  • the electronic device 50 may further include one or more I/O interfaces (represented by double-sided arrows in the figure), connected between the processor 51 and the memory 52 and configured to implement information exchange between the processor 51 and the memory 52.
  • the processor is a device with data processing capability, which includes but is not limited to a central processing unit (CPU), etc.
  • the memory is a device with data storage capability, which includes but is not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH); the I/O interface (read/write interface) is connected between the processor and the memory and can realize information exchange between the memory and the processor, and it includes but is not limited to a data bus (Bus) and the like.
  • the electronic devices in the embodiments of the present disclosure include mobile electronic devices and non-mobile electronic devices.
  • an embodiment of the present disclosure provides a computer-readable medium 60 on which a computer program is stored; when executed by a processor, the computer program implements any one of the methods for running a computing system in the embodiments of the present disclosure, or any one of the layout methods of a computing system in the embodiments of the present disclosure.
  • the processor may be the processor in the electronic device in the above embodiment.
  • the computer-readable medium of the embodiment of the present disclosure includes a computer read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
  • the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed cooperatively by several physical components.
  • Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit (CPU), digital signal processor or microprocessor, or as hardware, or as an integrated circuit such as an application-specific integrated circuit.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules or other data.
  • Computer storage media include, but are not limited to, random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory (FLASH), or other memory; compact disc read-only memory (CD-ROM), digital versatile disc (DVD), or other optical disk storage; magnetic cartridges, magnetic tape, magnetic disk storage, or other magnetic storage; and any other medium that can be used to store desired information and that can be accessed by a computer.
  • communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and can include any information delivery media, as is well known to those of ordinary skill in the art.

Abstract

A computing system and an operation method therefor, and an electronic device and a computer-readable medium, wherein the computing system comprises a computing unit. The operation method of the computing system comprises: acquiring the temperature of a computing unit (S101); and adjusting a working state of the computing unit according to the temperature of the computing unit (S102).

Description

Computing system, method for operating the same, electronic device, and computer-readable medium
Technical Field
The present disclosure relates to the field of computer technology, and in particular to a computing system, a method for operating the same, an electronic device, and a computer-readable medium.
Background
Computing systems (such as many-core systems and on-chip structures) are devices that can be used to perform various complete computing tasks. A computing system may include one or more relatively independent computing units (such as processing cores), and each computing unit may perform all or part of the operations in the overall computing task.
However, the performance of a computing unit in a computing system actually depends on its temperature, and the related art does not consider the influence of this temperature, so an unsuitable computing unit temperature can easily degrade the overall performance of the computing system.
Summary
The present disclosure provides a computing system, a method for operating the same, an electronic device, and a computer-readable medium.
In a first aspect, an embodiment of the present disclosure provides a method for operating a computing system, where the computing system includes a computing unit; the method includes: acquiring the temperature of the computing unit; and adjusting the working state of the computing unit according to the temperature of the computing unit.
In some embodiments, the computing system includes a plurality of computing units, and the computing units are processing cores; adjusting the working state of the computing unit according to the temperature of the computing unit includes: assigning tasks to the processing cores according to the temperatures of the processing cores.
In some embodiments, acquiring the temperature of the computing unit includes: acquiring the temperature of the processing core in response to there being a new task waiting to be allocated.
In some embodiments, acquiring the temperature of the computing unit includes: acquiring the temperature of the processing core in response to there being an idle processing core.
In some embodiments, acquiring the temperature of the computing unit includes: acquiring the temperatures of a plurality of the processing cores; and assigning tasks to the processing cores according to the temperatures of the processing cores includes: assigning tasks to at least one of the plurality of processing cores according to their temperatures.
In some embodiments, assigning tasks to the processing cores according to the temperatures of the processing cores includes at least one of the following: in response to the temperature of a first processing core being lower than a preset first temperature threshold, assigning a compute-intensive task to the first processing core; in response to the temperature of a second processing core being higher than or equal to the first temperature threshold and lower than or equal to a preset second temperature threshold, assigning a data-intensive task to the second processing core; in response to the temperature of a third processing core being higher than the second temperature threshold, stopping the assignment of tasks to the third processing core; where the second temperature threshold is higher than the first temperature threshold, and a processing core generates more heat when running a compute-intensive task than when running a data-intensive task.
In some embodiments, after stopping the assignment of tasks to the third processing core, the method further includes: migrating at least part of the tasks processed by the third processing core out of the third processing core.
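The two-threshold assignment rule above can be sketched as a small function. This is an illustrative sketch only: the names `choose_task_type`, `t1`, and `t2`, and the string labels for task types, are hypothetical and do not come from the disclosure.

```python
def choose_task_type(core_temp, t1, t2):
    """Return which kind of task a processing core may receive, or None.

    t1 is the first (lower) temperature threshold and t2 the second (higher) one.
    """
    if t2 <= t1:
        raise ValueError("the second threshold must be higher than the first")
    if core_temp < t1:
        # Cool core: it can absorb the higher heat output of compute-intensive work.
        return "compute_intensive"
    if core_temp <= t2:
        # Warm core (t1 <= temp <= t2): give it the cooler-running data-intensive work.
        return "data_intensive"
    # Hot core: stop assigning tasks; existing tasks may be migrated elsewhere.
    return None
```

A scheduler built on this would call the function when a new task is waiting or a core becomes idle, and treat a `None` result as the signal to consider migrating tasks off that core.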
In some embodiments, the computing unit is an analog computing unit, and the computing system further includes a temperature sensor and a circuit unit corresponding to the analog computing unit; the circuit unit is arranged within a first range of its corresponding analog computing unit, and the analog computing unit is arranged within a second range of its corresponding temperature sensor; acquiring the temperature of the computing unit includes: acquiring the temperature collected by the temperature sensor, and determining the temperature of the analog computing unit according to the temperature collected by the temperature sensor corresponding to the analog computing unit; adjusting the working state of the computing unit according to the temperature of the computing unit includes: adjusting the operating states of the analog computing unit and its corresponding circuit unit according to the result of comparing the temperature of the analog computing unit with a preset temperature range.
In some embodiments, adjusting the operating states of the analog computing unit and its corresponding circuit unit according to the result of comparing the temperature of the analog computing unit with the preset temperature range includes: in response to the temperature of the analog computing unit being within the preset temperature range, maintaining the current operating states of the analog computing unit and its corresponding circuit unit; in response to the temperature of the analog computing unit being higher than the highest temperature of the preset temperature range, controlling the analog computing unit and its corresponding circuit unit to stop running; in response to the temperature of the analog computing unit being lower than the lowest temperature of the preset temperature range, controlling the operating states of the analog computing unit and its corresponding circuit unit according to whether the analog computing unit needs to start running.
In some embodiments, controlling the operating states of the analog computing unit and its corresponding circuit unit according to whether the analog computing unit needs to start running includes: determining whether the analog computing unit needs to start running; in response to the analog computing unit needing to start running, controlling the analog computing unit to start running and controlling its corresponding circuit unit to run; in response to the analog computing unit not needing to start running, maintaining the current operating states of the analog computing unit and its corresponding circuit unit.
In some embodiments, there are a plurality of analog computing units; the method further includes: determining the historical temperature of the analog computing unit according to a plurality of temperatures collected at a plurality of times by the temperature sensor corresponding to the analog computing unit; and determining the recommended number of circuit units corresponding to the analog computing unit according to the historical temperature of the analog computing unit.
In some embodiments, there are a plurality of analog computing units; at least some of the temperature sensors correspond to a plurality of analog computing units.
In some embodiments, determining the temperature of the analog computing unit according to the temperature collected by the temperature sensor corresponding to the analog computing unit includes: when the analog computing unit corresponds to only one temperature sensor, determining the temperature of the analog computing unit to be the temperature collected by that temperature sensor; when the analog computing unit corresponds to a plurality of temperature sensors, determining the temperature of the analog computing unit to be the average of the temperatures collected by those temperature sensors.
In a second aspect, an embodiment of the present disclosure provides a layout method of a computing system, in which an analog computing unit has been determined to be arranged at a first target position of the computing system; the method includes: determining, according to the temperature of the analog computing unit, that the circuit unit corresponding to the analog computing unit is located within a first range of the analog computing unit; and determining that the temperature sensor corresponding to each analog computing unit is located within a second range of the analog computing unit.
In some embodiments, there are a plurality of analog computing units; determining, according to the temperature of the analog computing unit, that the circuit unit corresponding to the analog computing unit is located within the first range of the analog computing unit includes: running a preset computing system, where the preset computing system includes an analog computing unit arranged at the first target position and a temperature sensor corresponding to the analog computing unit; acquiring a plurality of temperatures collected by the temperature sensor at a plurality of times, and determining the historical temperature of the analog computing unit according to the plurality of temperatures collected at the plurality of times by the temperature sensor corresponding to the analog computing unit; determining the recommended number of circuit units corresponding to the analog computing unit according to the historical temperature of the analog computing unit; and determining that the number of circuit units corresponding to the analog computing unit is the recommended number and that they are located within the first range of the analog computing unit.
In a third aspect, an embodiment of the present disclosure provides a computing system, including: a computing unit and a controller; the controller is configured to execute any one of the methods for operating a computing system in the embodiments of the present disclosure.
In some embodiments, the computing unit is an analog computing unit; the computing system further includes: a temperature sensor and a circuit unit corresponding to the analog computing unit.
In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; and one or more memories having one or more programs stored thereon which, when executed by the one or more processors, enable the one or more processors to implement any one of the methods for operating a computing system in the embodiments of the present disclosure, or any one of the layout methods of a computing system in the embodiments of the present disclosure.
In a fifth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored; when executed by a processor, the computer program implements any one of the methods for operating a computing system in the embodiments of the present disclosure, or any one of the layout methods of a computing system in the embodiments of the present disclosure.
In the embodiments of the present disclosure, the temperature of the computing unit can be acquired while the computing system is running, and the working state of the computing unit can be adjusted according to that temperature, so that the temperature of the computing unit is kept within a suitable range and the performance of the computing system is improved.
It should be understood that what is described in this section is not intended to identify key or critical features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.
Brief Description of the Drawings
The accompanying drawings are used to provide a further understanding of the present disclosure and constitute a part of the specification; together with the detailed embodiments, they serve to explain the present disclosure and do not limit it. The above and other features and advantages will become more apparent to those skilled in the art from the description of detailed embodiments with reference to the accompanying drawings, in which:
FIG. 1 is a flowchart of a method for operating a computing system provided by an embodiment of the present disclosure;
FIG. 2 is a flowchart of another method for operating a computing system provided by an embodiment of the present disclosure;
FIG. 3 is a flowchart of another method for operating a computing system provided by an embodiment of the present disclosure;
FIG. 4 is a block diagram of a computing system (many-core system) provided by an embodiment of the present disclosure;
FIG. 5 is a flowchart of another method for operating a computing system provided by an embodiment of the present disclosure;
FIG. 6 is a flowchart of another method for operating a computing system provided by an embodiment of the present disclosure;
FIG. 7 shows volt-ampere characteristic curves of a diode at different temperatures in the related art;
FIG. 8 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure;
FIG. 10 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure;
FIG. 11 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure;
FIG. 12 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure;
FIG. 13 is a schematic structural diagram of another computing system (on-chip structure) provided by an embodiment of the present disclosure;
FIG. 14 is a flowchart of a layout method of a computing system provided by an embodiment of the present disclosure;
FIG. 15 is a block diagram of a computing system provided by an embodiment of the present disclosure;
FIG. 16 is a block diagram of an electronic device provided by an embodiment of the present disclosure;
FIG. 17 is a block diagram of a computer-readable medium provided by an embodiment of the present disclosure.
具体实施方式Detailed ways
为使本领域的技术人员更好地理解本公开的技术方案,下面结合附图对本公开提供的计算系统及其运行方法和电子设备、计算机可读介质进行详细描述。In order for those skilled in the art to better understand the technical solutions of the present disclosure, the computing system, its operating method, electronic device, and computer-readable medium provided by the present disclosure will be described in detail below with reference to the accompanying drawings.
The present disclosure will be described more fully hereinafter with reference to the accompanying drawings. The illustrated embodiments may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The embodiments of the present disclosure, and the features of those embodiments, may be combined with one another provided that they do not conflict.
In the present disclosure, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance, or as implicitly specifying the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may expressly or implicitly include one or more of that feature. In the description of the present invention, unless otherwise specified, "a plurality of" means two or more.
The terminology used in this disclosure is for describing particular embodiments only and is not intended to limit the disclosure. As used in this disclosure, the term "and/or" includes any and all combinations of one or more of the associated listed items. As used in this disclosure, the singular forms "a" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprising" and "made of", as used in this disclosure, specify the presence of the stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used in this disclosure have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless this disclosure expressly so defines them.
The present disclosure is not limited to the embodiments shown in the drawings, but includes modifications of configurations formed on the basis of manufacturing processes. Thus, the regions illustrated in the figures are schematic, and the shapes of the regions shown in the figures illustrate specific shapes of regions of elements but are not intended to be limiting.
In a first aspect, referring to FIG. 1 to FIG. 13, an embodiment of the present disclosure provides a method for operating a computing system, the computing system including a computing unit.
The embodiment of the present disclosure is used to operate a computing system, that is, a system with a certain data processing capability that includes one or more computing units. Each computing unit is a relatively independent structure and can perform certain operations.
Referring to FIG. 1, the method for operating the computing system according to the embodiment of the present disclosure includes:
S101. Acquire the temperature of the computing unit.
S102. Adjust the working state of the computing unit according to the temperature of the computing unit.
In the embodiment of the present disclosure, while the computing system is running (that is, while at least some of its computing units are running), the temperature of at least some of the computing units is acquired, and the working state of each corresponding computing unit is adjusted according to its temperature.
Here, the temperature of a computing unit is a parameter that characterizes the temperature behavior of the computing unit. It may be the temperature value at a certain position on the computing unit, or the temperature value at a specific position away from the computing unit; this is not described in detail here.
The temperature of the computing unit may be acquired through a temperature-sensing device located inside or outside the computing unit (such as a temperature-sensing device built into the computing unit, the temperature sensor described later, or an infrared thermometer), or by retrieving and parsing temperature data held in a temperature database or in a cache for temperature data; this is not described in detail here.
In the embodiment of the present disclosure, the temperature of the computing unit can be acquired while the computing system is running, and the working state of the computing unit can be adjusted according to that temperature, so that the temperature of the computing unit is kept within a suitable range and the performance of the computing system is improved.
In some embodiments, the computing system includes a plurality of computing units, and each computing unit is a processing core.
In some embodiments, referring to FIG. 2, acquiring the temperature of the computing unit (S101) includes:
S201. Acquire the temperature of the processing core.
Adjusting the working state of the computing unit according to the temperature of the computing unit (S102) includes:
S202. Assign tasks to the processing core according to the temperature of the processing core.
As one implementation of the embodiment of the present disclosure, referring to FIG. 4, each computing unit may be a "processing core". A processing core, also simply called a core, is the smallest processing unit that can be independently scheduled and has complete computing capability; specifically, it may be a chip (IC) or a core within a chip.
Accordingly, the computing system of the embodiment of the present disclosure is, as a whole, a "many-core system" that includes a plurality of processing cores. The processing cores of the many-core system are connected through a network-on-chip to form a certain topology, and each processing core can process certain tasks (operations). Through the joint work of the processing cores, the overall program can be run quickly and multi-task parallel processing is provided.
Generally speaking, the more tasks a processing core handles, the more heat it generates and the higher its temperature (the temperature of the computing unit). The temperature of a processing core can therefore be used, to a certain extent, to characterize its utilization and load. For example, when the temperature of a processing core is within a predetermined interval, the processing core is operating within its normal working range; when the temperature is outside the predetermined interval, the processing core is either overloaded or idle.
In some related technologies, the influence of the temperature of the processing cores is not considered when tasks are assigned. As a result, the temperature distribution across the processing cores is uneven, the temperature-dependent performance of the processing cores differs considerably, and the amount of tasks handled by each processing core is also uneven. The processing progress of the processing cores therefore diverges, which degrades the overall performance of the many-core system.
In the embodiment of the present disclosure, by contrast, tasks are assigned to each processing core according to its temperature. This helps keep the temperatures of the processing cores uniform, the amount of tasks they handle relatively even, and their processing progress similar, which improves the overall performance of the many-core system.
In some embodiments, acquiring the temperature of the computing unit (S101) includes:
S2011. In response to the existence of a new task waiting to be assigned, acquire the temperature of the processing core.
As one implementation of the embodiment of the present disclosure, the temperature of the processing core may be acquired when there is a new task to be assigned, in order to determine which processing core the task should be assigned to.
In some embodiments, acquiring the temperature of the computing unit (S101) includes:
S2012. In response to the existence of an idle processing core, acquire the temperature of the processing core.
When at least one idle processing core exists in the many-core system, that idle processing core can receive tasks (such as new tasks, or tasks migrated from other processing cores). Therefore, as another implementation of the embodiment of the present disclosure, the temperature of a processing core (not necessarily the idle processing core; it may be another processing core) may be acquired when at least one idle processing core exists, in order to determine whether to assign a task to the idle processing core, or to determine which processing core's tasks should be migrated to the idle processing core.
In some embodiments, acquiring the temperature of the computing unit (S101) includes:
S2013. Acquire the temperatures of a plurality of processing cores.
Assigning tasks to the processing core according to the temperature of the processing core (S202) includes:
S2021. Assign tasks to at least one of the plurality of processing cores according to the temperatures of the plurality of processing cores.
As one implementation of the embodiment of the present disclosure, the temperatures of a plurality of processing cores may be acquired at one time, and tasks may be assigned to at least some of those processing cores on that basis. For example, when there is a new task waiting to be assigned, the temperatures of a plurality of processing cores may be acquired in order to decide which of them the task is assigned to.
Of course, the temperature of a processing core may be acquired in various specific ways. For example, when there is a new task waiting to be assigned, the temperature of one processing core may be acquired first; if that temperature indicates that the new task can be assigned to this processing core, the task is assigned to it, and otherwise the temperature of another processing core is acquired.
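The two acquisition styles described above (reading several core temperatures at once, or probing cores one at a time) can be sketched as follows. This is a minimal illustration only: the function names, the `read_temperature` callback, the single `max_ok_temperature` limit, and the choice of the coolest acceptable core in the batch variant are assumptions made for the example and are not specified in the disclosure.

```python
from typing import Callable, Dict, Iterable, Optional


def pick_core_lazily(core_ids: Iterable[int],
                     read_temperature: Callable[[int], float],
                     max_ok_temperature: float) -> Optional[int]:
    """Probe cores one at a time and return the first acceptable core.

    A core is treated as acceptable when its temperature does not exceed
    max_ok_temperature (a hypothetical criterion for this sketch).
    """
    for core_id in core_ids:
        if read_temperature(core_id) <= max_ok_temperature:
            return core_id  # stop probing as soon as a usable core is found
    return None  # no core can take the task right now


def pick_core_batch(core_ids: Iterable[int],
                    read_temperature: Callable[[int], float],
                    max_ok_temperature: float) -> Optional[int]:
    """Read all temperatures first (as in S2013), then choose one core.

    Choosing the coolest acceptable core is an assumption of this sketch.
    """
    temperatures: Dict[int, float] = {c: read_temperature(c) for c in core_ids}
    acceptable = {c: t for c, t in temperatures.items() if t <= max_ok_temperature}
    if not acceptable:
        return None
    return min(acceptable, key=acceptable.get)


# Hypothetical readings in degrees Celsius, keyed by core id.
readings = {0: 80.0, 1: 55.0, 2: 40.0}
lazy_choice = pick_core_lazily([0, 1, 2], readings.__getitem__, 60.0)
batch_choice = pick_core_batch([0, 1, 2], readings.__getitem__, 60.0)
```

The lazy variant reads fewer temperatures but accepts the first usable core, whereas the batch variant reads every temperature and can compare the cores against one another.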
In some embodiments, referring to FIG. 3, assigning tasks to the processing core according to the temperature of the processing core (S202) includes at least one of the following:
S20221. In response to the temperature of a first processing core being lower than a preset first temperature threshold, assign a computation-intensive task to the first processing core.
S20222. In response to the temperature of a second processing core being higher than or equal to the first temperature threshold and lower than or equal to a preset second temperature threshold, assign a data-intensive task to the second processing core.
S20223. In response to the temperature of a third processing core being higher than the second temperature threshold, stop assigning tasks to the third processing core.
Here, the second temperature threshold is higher than the first temperature threshold, and a processing core generates more heat when running a computation-intensive task than when running a data-intensive task.
As one implementation of the embodiment of the present disclosure, a predetermined interval from the first temperature threshold to the second temperature threshold may be set, and tasks may be divided into computation-intensive tasks and data-intensive tasks.
A computation-intensive task involves a large amount of computation and causes the processing core executing it to generate more heat; examples include tasks related to convolution operations. A data-intensive task involves a large amount of data but a small amount of computation and causes the processing core executing it to generate less heat; examples include tasks related to fully connected operations.
Alternatively, the first temperature threshold and the second temperature threshold may have a certain mapping relationship with computation-intensive tasks and data-intensive tasks.
Accordingly, tasks may be assigned to a processing core in different ways depending on how its temperature relates to the predetermined interval:
(1) A first processing core, whose temperature is lower than the first temperature threshold, has considerable headroom for its temperature to rise and can therefore take on tasks that generate more heat (computation-intensive tasks).
(2) A second processing core, whose temperature is between the first temperature threshold and the second temperature threshold, is still within a suitable working range but has little headroom for its temperature to rise. It should therefore not take on tasks that generate more heat, but tasks that generate less heat (data-intensive tasks) may still be assigned to it.
(3) A third processing core, whose temperature is higher than the second temperature threshold, is already overheated. To prevent its temperature from rising further, tasks should be assigned to it only after it has cooled down.
Here, "overheated" only describes a temperature that has exceeded the suitable working range; it does not mean that the temperature has exceeded the upper temperature limit and caused a failure.
Here, the first, second, and third processing cores only denote processing cores at different temperatures; they do not denote specific processing cores.
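The threshold rule of steps S20221 to S20223 can be sketched as a small classification function. This is a minimal illustration under stated assumptions: the names `Assignment` and `allowed_task_type` and the threshold values used in the example (50 and 80 degrees Celsius) are hypothetical and do not come from the disclosure.

```python
from enum import Enum


class Assignment(Enum):
    COMPUTE_INTENSIVE = "assign computation-intensive tasks"
    DATA_INTENSIVE = "assign data-intensive tasks"
    NONE = "stop assigning tasks"


def allowed_task_type(temperature: float,
                      first_threshold: float,
                      second_threshold: float) -> Assignment:
    """Map a core temperature to the kind of task it may receive."""
    if not first_threshold < second_threshold:
        raise ValueError("second threshold must be higher than first threshold")
    if temperature < first_threshold:
        return Assignment.COMPUTE_INTENSIVE   # S20221: ample thermal headroom
    if temperature <= second_threshold:
        return Assignment.DATA_INTENSIVE      # S20222: limited thermal headroom
    return Assignment.NONE                    # S20223: core is overheated


# Hypothetical thresholds, in degrees Celsius, chosen only for illustration.
FIRST_THRESHOLD = 50.0
SECOND_THRESHOLD = 80.0
decisions = {t: allowed_task_type(t, FIRST_THRESHOLD, SECOND_THRESHOLD)
             for t in (40.0, 50.0, 80.0, 80.5)}
```

Note that the boundary values follow the wording of the steps: a temperature equal to either threshold falls into the data-intensive band.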
In some embodiments, referring to FIG. 3, after stopping the assignment of tasks to the third processing core (S20223), the method further includes:
S202231. Migrate at least some of the tasks being processed by the third processing core out of the third processing core.
As one implementation of the embodiment of the present disclosure, after task assignment to the overheated third processing core has stopped, at least some of the tasks already on that core may additionally be migrated out.
How many tasks are migrated out of the third processing core, and which ones, may be determined according to its temperature. For example, to lower the temperature of the third processing core effectively, its computation-intensive tasks may be migrated out with priority.
A task migrated out of the third processing core may be temporarily suspended so that its processing stops, or it may be assigned to another processing core (that is, migrated to another processing core) for processing.
Because the task migration described above occupies a certain amount of routing bandwidth, the tasks may alternatively be reassigned through a task dispatch module in the system.
The processing core that receives a task migrated out of the third processing core may be selected in the manner described above. That is, migrated computation-intensive tasks may be given to a first processing core whose temperature is lower than the first temperature threshold, and migrated data-intensive tasks may be given to a second processing core whose temperature is between the first temperature threshold and the second temperature threshold, while other third processing cores whose temperature is higher than the second temperature threshold should not receive migrated tasks.
The specific manner of assigning tasks according to temperature is not limited to the above. For example, although a first processing core receives computation-intensive tasks with priority, it may of course also receive data-intensive tasks. As another example, when the temperatures of all processing cores exceed the first temperature threshold and there is a computation-intensive task that must be assigned, that task may still be assigned to one of the processing cores (preferably one whose temperature is relatively low).
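The migration of step S202231 can be sketched as follows. This is a simplified illustration, not the method of the disclosure: the `Core` record, the task tuples, the `max_moves` limit, and the choice of the coolest eligible destination are all assumptions of the example. The sketch applies the strict destination rule (computation-intensive tasks to cores below the first threshold, data-intensive tasks to cores between the thresholds) and suspends a task when no eligible destination exists.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Task = Tuple[str, str]  # (task name, kind), where kind is "compute" or "data"


@dataclass
class Core:
    core_id: int
    temperature: float
    tasks: List[Task] = field(default_factory=list)


def migrate_from_overheated(hot: Core, others: List[Core],
                            first_threshold: float, second_threshold: float,
                            max_moves: int) -> List[Task]:
    """Move up to max_moves tasks off `hot`; return the tasks that were suspended."""
    # Computation-intensive tasks are migrated out first (stable sort keeps order).
    ordered = sorted(hot.tasks, key=lambda task: task[1] != "compute")
    suspended: List[Task] = []
    for task in ordered[:max_moves]:
        hot.tasks.remove(task)
        if task[1] == "compute":
            eligible = [c for c in others if c.temperature < first_threshold]
        else:
            eligible = [c for c in others
                        if first_threshold <= c.temperature <= second_threshold]
        if eligible:
            # Assumption: prefer the coolest eligible core.
            min(eligible, key=lambda c: c.temperature).tasks.append(task)
        else:
            suspended.append(task)  # no suitable core: suspend the task
    return suspended


hot = Core(0, 95.0, [("fc", "data"), ("conv_a", "compute"), ("conv_b", "compute")])
cool, warm, also_hot = Core(1, 40.0), Core(2, 65.0), Core(3, 90.0)
suspended_first = migrate_from_overheated(hot, [cool, warm, also_hot], 50.0, 80.0, 2)
```

In this example the two computation-intensive tasks leave the overheated core first and land on the core below the first threshold, while the other overheated core receives nothing.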
In some embodiments, the computing unit is an analog computing unit, and the computing system further includes a temperature sensor and a circuit unit corresponding to the analog computing unit. The circuit unit is arranged within a first range of its corresponding analog computing unit, and the analog computing unit is arranged within a second range of its corresponding temperature sensor.
In some embodiments, referring to FIG. 5, acquiring the temperature of the computing unit (S101) includes:
S301. Acquire the temperature collected by the temperature sensor, and determine the temperature of the analog computing unit according to the temperature collected by the temperature sensor corresponding to the analog computing unit.
Adjusting the working state of the computing unit according to the temperature of the computing unit (S102) includes:
S302. Adjust the operating state of the analog computing unit and its corresponding circuit unit according to the result of comparing the temperature of the analog computing unit with a preset temperature range.
Here, an analog device is a device for which at least part of the processed signals are analog (non-digital) signals. Specifically, it may be a resistor, a capacitor, an inductor, a diode, a transistor, an analog amplifier, a D/A conversion circuit, an A/D conversion circuit, an analog signal conditioner, an integrated voltage regulator circuit, a sensor, an audio/video circuit, or the like.
Correspondingly, a computing unit that includes at least one analog device is called an analog computing unit. For example, the analog computing unit may be the processing core or chip described above, or it may be some of the circuit modules within a processing core or chip.
Research has found that temperature has a considerable influence on the characteristics of analog devices. For example, as the temperature rises, the kinetic energy of electrons generally increases, which changes the characteristics of the analog device. FIG. 7 shows the volt-ampere (I-V) characteristic curves of a diode at different temperatures; it can be seen that the volt-ampere characteristics of the diode (an analog device) change noticeably with temperature.
Because analog devices have the temperature characteristics described above, a single analog device whose temperature differs greatly between moments will produce operating results that differ greatly between those moments. For a plurality of analog devices, a large temperature difference between devices at different positions will cause their operating results to deviate from one another. In short, these differences mean that an operating result produced by an analog device at one moment cannot be used at another moment, or that an operating result produced by one analog device cannot be used by other analog devices, so that the data finally produced by the analog devices is unreliable.
Therefore, in the related art, the operation result of an analog computing unit that includes analog devices is also affected by temperature changes and may be unreliable.
As another implementation of the embodiment of the present disclosure, when the computing unit is an analog computing unit, referring to FIG. 8 to FIG. 13, a corresponding temperature sensor and a corresponding circuit unit are provided for the analog computing unit, and the distances between the analog computing unit and its corresponding temperature sensor and circuit unit do not exceed preset ranges (the first range and the second range).
There may be one analog computing unit, as in FIG. 8, or a plurality of analog computing units, as in FIG. 9 to FIG. 13. Regardless of the number of analog computing units, each analog computing unit corresponds to one or more circuit units and to one or more temperature sensors; and regardless of this correspondence, the circuit units and temperature sensors must be located within the respective ranges of the corresponding analog computing unit.
For example, referring to FIG. 8, the computing system (on-chip structure) may include one analog computing unit, which corresponds to one temperature sensor. As another example, referring to FIG. 9, the computing system may include eight analog computing units, with each temperature sensor corresponding to one analog computing unit. As another example, referring to FIG. 10, the computing system may include eight analog computing units, with every two analog computing units corresponding to the same temperature sensor. As another example, referring to FIG. 11, each temperature sensor may correspond to the four surrounding analog computing units, so that the two analog computing units on the left inside the virtual box correspond to two temperature sensors (the first temperature sensor and the second temperature sensor), and the two analog computing units on the right inside the virtual box also correspond to two temperature sensors (the second temperature sensor and the third temperature sensor). As a further example, the number of circuit units corresponding to each analog computing unit may be the same, as in FIG. 9 to FIG. 11, or may differ between analog computing units, as in FIG. 12 and FIG. 13.
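The sensor-to-unit correspondences above (for example, the arrangement of FIG. 11, in which one analog computing unit corresponds to two temperature sensors) can be sketched as a lookup from each unit to its sensors. The disclosure does not state how readings from several sensors are combined into one unit temperature; averaging is used here purely as an assumption, and the unit names, sensor names, and readings are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def unit_temperatures(sensor_readings: Dict[str, float],
                      unit_to_sensors: Dict[str, List[str]]) -> Dict[str, float]:
    """Estimate each analog computing unit's temperature from its sensors (S301).

    Assumption: a unit with several corresponding sensors uses their mean.
    """
    return {
        unit: mean(sensor_readings[sensor] for sensor in sensors)
        for unit, sensors in unit_to_sensors.items()
    }


# Hypothetical layout resembling the FIG. 11 description: the two left units
# share sensors s1 and s2, and the two right units share sensors s2 and s3.
layout = {
    "left_top": ["s1", "s2"],
    "left_bottom": ["s1", "s2"],
    "right_top": ["s2", "s3"],
    "right_bottom": ["s2", "s3"],
}
readings = {"s1": 40.0, "s2": 50.0, "s3": 60.0}
estimated = unit_temperatures(readings, layout)
```

Other combination rules, such as taking the maximum reading, would fit the same mapping structure.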
The structures of the computing system shown in the drawings are only examples; in other implementations, the structure of the computing system may be determined according to the actual situation.
Thus, the computing system as a whole may be an "on-chip structure", that is, analog computing units, circuit units, and temperature sensors provided on a chip or a circuit board.
In the computing system (on-chip structure), the analog computing unit is the unit described above that performs operations, whereas the circuit unit is a circuit structure provided in the computing system to implement "non-operation" functions, such as supplying power to the analog computing unit, processing the signals of the analog computing unit, or transmitting the signals of the analog computing unit. For example, the circuit unit may be a functional circuit unit that has one of these practical functions, or it may be a "flip" circuit unit that only generates heat (and is used only for temperature control).
In the embodiment of the present disclosure, the temperature of an analog computing unit is determined according to the temperature detected by its corresponding temperature sensor, and the operating states of the analog computing unit and its corresponding circuit unit are controlled according to the result of comparing that temperature with the preset temperature range. This provides temperature control of the computing system (on-chip structure) and prevents excessive temperature differences of the analog computing units (both an excessive temperature difference of one analog computing unit at different times and an excessive temperature difference between different analog computing units). The accuracy of the operation results of the analog computing units is thereby improved, which addresses the problem in the related art that large temperature differences of analog devices lead to large errors in, or unreliable, operation results.
The computing system of the embodiment of the present disclosure may further include a controller that performs steps S301 and S302 above; that is, the temperature of the computing system may be controlled by the controller.
In some embodiments, referring to FIG. 6, adjusting the operating state of the analog computing unit and its corresponding circuit unit according to the result of comparing the temperature of the analog computing unit with the preset temperature range (S302) includes:
S3021. In response to the temperature of the analog computing unit being within the preset temperature range, maintain the current operating state of the analog computing unit and its corresponding circuit unit.
S3022. In response to the temperature of the analog computing unit being higher than the highest temperature of the preset temperature range, control the analog computing unit and its corresponding circuit unit to stop running.
S3023. In response to the temperature of the analog computing unit being lower than the lowest temperature of the preset temperature range, control the operating state of the analog computing unit and its corresponding circuit unit according to whether the analog computing unit needs to start running.
The operation of any unit generates heat and raises the temperature. Therefore, as one implementation of the embodiment of the present disclosure, when the temperature of the analog computing unit is too high (above the highest temperature of the preset temperature range), the analog computing unit and its corresponding circuit unit may be controlled to stop running to prevent the temperature from rising further. When the temperature of the analog computing unit is suitable (within the preset temperature range), the current operating state of the analog computing unit and its corresponding circuit unit is maintained so as to hold the temperature of the analog computing unit. When the temperature of the analog computing unit is too low (below the lowest temperature of the preset temperature range), an attempt is made to start the analog computing unit and its corresponding circuit unit so as to raise the temperature of the analog computing unit.
The specific values of the preset temperature range may be adjusted according to the actual situation. For example, the preset temperature range may be set according to the temperature at which the analog computing unit works best, or it may be set by an operator based on experience; this is not described in detail here.
The control of the analog computing units and circuit units should not affect the operation of the computing system as a whole. For example, when an analog computing unit or circuit unit stops running, its work may be handled by other analog computing units or circuit units (for example, through task migration). As another example, when an analog computing unit or circuit unit is started, corresponding work should be assigned to it (for example, by assigning tasks). As a further example, when a circuit unit has no actual work to perform, it may run in a "flip" mode, that is, it only generates heat and does not process actual content.
In this way, the temperature differences of the analog computing units can be reduced, and the uniformity of their temperature and performance can be improved.
In some embodiments, controlling the operating state of the analog computing unit and its corresponding circuit unit according to whether the analog computing unit needs to start running (S3023) includes:
S30231、判断模拟计算单元是否需要启动运行。S30231. Determine whether the simulation computing unit needs to start running.
S30232、响应于模拟计算单元需要启动运行,控制模拟计算单元启动运行,并控制其对应的电路单元运行。S30232. In response to the need to start the operation of the simulation computing unit, control the simulation computing unit to start the operation, and control the corresponding circuit unit to operate.
S30233、响应于模拟计算单元不需要启动运行,维持模拟计算单元和其对应的电路单元的当前运行状态。S30233. In response to the simulation computing unit not needing to start running, maintain the current operating state of the simulation computing unit and its corresponding circuit unit.
作为本公开实施例的一种方式,模拟计算单元的温度低于预设温度范围的最低温度时,可以是先判断模拟计算单元是否需要启动运行(如是否有待分配的新任务),再据此决定模拟计算单元是否启动运行。As a mode of the embodiment of the present disclosure, when the temperature of the simulation computing unit is lower than the lowest temperature of the preset temperature range, it may be determined whether the simulation computing unit needs to start running (for example, whether there is a new task to be allocated), and then based on the Determines whether the simulation computing unit should start running.
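The decision flow of S30231 to S30233 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the names `Action`, `decide_action`, `t_min`, `t_max`, and `needs_start` are hypothetical, and the in-range and over-temperature branches follow the comparison step recited in claim 9 of this disclosure.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the three possible control outcomes."""
    MAINTAIN = "maintain current running state"
    STOP = "stop the analog unit and its circuit unit"
    START = "start the analog unit and run its circuit unit"


def decide_action(temperature, t_min, t_max, needs_start):
    """Pick a control action for one analog computing unit.

    t_min and t_max bound the preset temperature range; needs_start says
    whether the unit has work waiting (for example, a new task to allocate).
    """
    if temperature > t_max:
        # Above the preset range: stop the unit and its circuit unit.
        return Action.STOP
    if temperature < t_min:
        # Below the preset range: S30231 checks whether a start is needed,
        # S30232 starts the units, S30233 leaves them as they are.
        return Action.START if needs_start else Action.MAINTAIN
    # Within the preset range: keep the current running state.
    return Action.MAINTAIN
```

For example, with a hypothetical preset range of 30 to 60 degrees, a unit at 20 degrees with a pending task would be started, while the same unit without pending work would be left in its current state.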
In some embodiments, there are multiple analog computing units. The method of the embodiments of the present disclosure further includes:
S304. Determine the historical temperature of the analog computing unit according to multiple temperatures collected at multiple times by the temperature sensor corresponding to the analog computing unit.
S305. Determine the recommended number of circuit units corresponding to the analog computing unit according to the historical temperature of the analog computing unit.
As one mode of the embodiments of the present disclosure, when there are multiple analog computing units, the total number of their corresponding circuit units is fixed, but these circuit units may be distributed near different analog computing units, that is, may correspond to different analog computing units.
Thus, a temperature sensor can collect multiple temperatures at different times, and the historical temperature of the corresponding analog computing unit (that is, its long-term temperature distribution pattern) can be determined from these temperatures. The recommended number of circuit units corresponding to the analog computing unit can then be determined from the historical temperature (that is, when the analog computing unit corresponds to the recommended number of circuit units, its temperature is likely to stay within a reasonable range), and this number serves as a basis for determining the layout of the computing system (on-chip structure).
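Steps S304 and S305 can be illustrated with the following Python sketch. The disclosure does not specify how the historical temperature is summarized or how it maps to a recommended number, so everything here is an assumption made for illustration: the historical temperature is taken as the mean of the samples, and a fixed total of circuit units is split in proportion to how far each unit's historical temperature falls below a hypothetical `target_temp`, on the premise that circuit units act as nearby heat sources.

```python
from statistics import mean


def historical_temperature(samples):
    """Summarize sensor samples taken at multiple times (assumed: plain mean)."""
    return mean(samples)


def recommend_circuit_units(history, target_temp, total_units):
    """Split a fixed total of circuit units among analog computing units.

    history maps a unit name to its temperature samples. Units whose
    historical temperature is further below target_temp receive more
    circuit units (an assumed rule, not taken from the disclosure).
    """
    names = list(history)
    deficits = [max(target_temp - historical_temperature(history[n]), 0.0)
                for n in names]
    total_deficit = sum(deficits)
    if total_deficit == 0:
        # No unit runs cold: fall back to an even split.
        weights = [1.0 / len(names)] * len(names)
    else:
        weights = [d / total_deficit for d in deficits]
    shares = [w * total_units for w in weights]
    counts = [int(s) for s in shares]
    # Hand out the units lost to rounding, largest remainder first.
    leftover = total_units - sum(counts)
    order = sorted(range(len(names)),
                   key=lambda i: shares[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return dict(zip(names, counts))


history = {"a": [30, 32, 34], "b": [44, 46, 48], "c": [50, 50, 50]}
recommended = recommend_circuit_units(history, target_temp=50, total_units=8)
```

With these made-up samples, unit "a" (the coldest on average) receives most of the eight circuit units and unit "c" receives none. A real layout flow would likely use a thermal model rather than this proportional rule.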
In some embodiments, there are multiple analog computing units; at least some of the temperature sensors correspond to multiple analog computing units.
As one mode of the embodiments of the present disclosure, when there are multiple analog computing units, referring to FIG. 10 to FIG. 13, a temperature sensor may correspond to multiple analog computing units; that is, the temperature measured by one temperature sensor can be used to determine the temperatures of multiple analog computing units.
In some embodiments, determining the temperature of the analog computing unit according to the temperature collected by the temperature sensor corresponding to the analog computing unit (S302) includes:
S3024. In the case where the analog computing unit corresponds to only one temperature sensor, determine the temperature of the analog computing unit to be the temperature collected by its corresponding temperature sensor.
S3025. In the case where the analog computing unit corresponds to multiple temperature sensors, determine the temperature of the analog computing unit to be the average of the multiple temperatures collected by its corresponding temperature sensors.
As one mode of the embodiments of the present disclosure, referring to FIG. 8, FIG. 9, FIG. 10, the part of FIG. 11 outside the dashed box, FIG. 12, and FIG. 13, if an analog computing unit corresponds to only one temperature sensor, its temperature is the temperature collected by that temperature sensor; and referring to the part of FIG. 11 inside the dashed box, when an analog computing unit corresponds to multiple temperature sensors, its temperature is the average of the temperatures collected by those temperature sensors.
It should be understood that the number of temperature sensors corresponding to each analog computing unit is not necessarily related to the number of analog computing units corresponding to each temperature sensor.
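The rule of S3024 and S3025 can be sketched in Python as follows. The function name `unit_temperature`, the sensor and unit identifiers, and the readings are hypothetical; the sketch only shows that one sensor may serve several units and that a unit with several sensors takes the average.

```python
def unit_temperature(sensor_ids, readings):
    """Temperature of one analog computing unit from its sensors.

    sensor_ids lists the sensors that correspond to the unit;
    readings maps each sensor id to the temperature it collected.
    """
    if not sensor_ids:
        raise ValueError("an analog computing unit needs at least one sensor")
    values = [readings[s] for s in sensor_ids]
    if len(values) == 1:
        # S3024: a single sensor gives the unit temperature directly.
        return values[0]
    # S3025: several sensors are averaged.
    return sum(values) / len(values)


# Sensor "s1" is shared by both units; unit "u2" also uses sensor "s2".
readings = {"s1": 40.0, "s2": 50.0}
unit_sensors = {"u1": ["s1"], "u2": ["s1", "s2"]}
temperatures = {u: unit_temperature(s, readings) for u, s in unit_sensors.items()}
```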
In a second aspect, referring to FIG. 14, an embodiment of the present disclosure provides a layout method for a computing system, in which the analog computing unit has been determined to be arranged at a first target position of the computing system.
The layout method for a computing system according to the embodiment of the present disclosure includes:
S401. According to the temperature of the analog computing unit, determine that the circuit unit corresponding to the analog computing unit is located within a first range of that analog computing unit.
S402. Determine that the temperature sensor corresponding to each analog computing unit is located within a second range of that analog computing unit.
When laying out (or "designing") the above computing system (on-chip structure), the number of analog computing units and their positions in the computing system have usually been determined in advance. Therefore, how to arrange the corresponding circuit units within the first range of each analog computing unit, and how to arrange the corresponding temperature sensors within its second range, can be determined according to the position and temperature of each analog computing unit.
In the embodiments of the present disclosure, the corresponding circuit units and temperature sensors are arranged according to the temperature of the analog computing unit, so that a reasonable layout of the computing system can be achieved, the analog computing units in a computing system built to this layout stay within a suitable temperature range for most of their running time, and the performance of the computing system is improved.
In some embodiments, there are multiple analog computing units; determining, according to the temperature of the analog computing unit, that the circuit unit corresponding to the analog computing unit is located within the first range of that analog computing unit (S401) includes:
S4011. Run a preset computing system, where the preset computing system includes the analog computing unit arranged at the first target position and a temperature sensor corresponding to the analog computing unit.
S4012. Acquire multiple temperatures collected by the temperature sensor at multiple times, and determine the historical temperature of the analog computing unit according to the multiple temperatures collected at multiple times by the temperature sensor corresponding to the analog computing unit.
S4013. Determine the recommended number of circuit units corresponding to the analog computing unit according to the historical temperature of the analog computing unit.
S4014. Determine that the number of circuit units corresponding to the analog computing unit is the recommended number, and that these circuit units are located within the first range of that analog computing unit.
As one mode of the embodiments of the present disclosure, when there are multiple analog computing units, a "preset computing system" with the same layout of analog computing units may be manufactured in advance and run, so as to measure the historical temperature of each analog computing unit in it. On this basis, the recommended number of circuit units corresponding to each analog computing unit is determined, and the recommended number of circuit units is actually laid out.
The temperature of the analog computing unit in the embodiments of the present disclosure is not limited to the above historical temperature; for example, it may also be a "predicted temperature" obtained through model simulation or experience.
In a third aspect, referring to FIG. 15, an embodiment of the present disclosure provides a computing system 40, including: a computing unit and a controller 48.
The controller 48 is configured to execute any one of the methods for operating the computing system 40 according to the embodiments of the present disclosure.
The computing system 40 of the embodiment of the present disclosure may include the above computing unit, and the controller 48 for controlling the computing unit to work according to the above method for operating the computing system.
In some embodiments, referring to FIG. 8 to FIG. 13, the computing unit is an analog computing unit 42; the computing system further includes: a temperature sensor 46 and a circuit unit 44 corresponding to the analog computing unit 42.
When the computing system is the above on-chip structure, the computing unit is the above analog computing unit 42, and the computing system further includes the above temperature sensor 46 and circuit unit 44.
In a fourth aspect, referring to FIG. 16, an embodiment of the present disclosure provides an electronic device 50, including:
one or more processors 51;
one or more memories 52 storing one or more programs which, when executed by the one or more processors 51, enable the one or more processors 51 to implement any one of the methods for operating a computing system according to the embodiments of the present disclosure, or any one of the layout methods for a computing system according to the embodiments of the present disclosure.
In some embodiments, the electronic device 50 may further include one or more I/O interfaces (represented by bidirectional arrows in the figure), connected between the processor 51 and the memory 52 and configured to implement information exchange between the processor 51 and the memory 52.
The processor is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH); the I/O interface (read/write interface) is connected between the processor and the memory and enables information exchange between them, including but not limited to a data bus (Bus).
It should be noted that the electronic devices in the embodiments of the present disclosure include mobile electronic devices and non-mobile electronic devices.
In a fifth aspect, referring to FIG. 17, an embodiment of the present disclosure provides a computer-readable medium 60 storing a computer program which, when executed by a processor, implements any one of the methods for operating a computing system according to the embodiments of the present disclosure, or any one of the layout methods for a computing system according to the embodiments of the present disclosure.
The processor may be the processor of the electronic device in the above embodiment. The computer-readable medium of the embodiments of the present disclosure includes a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.
Those of ordinary skill in the art will understand that all or some of the steps, systems, and functional modules/units of the apparatus disclosed above may be implemented as software, firmware, hardware, or appropriate combinations thereof.
In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components.
Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit (CPU), a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer-readable storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer-readable medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer-readable media include, but are not limited to, random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory (FLASH), or other disk memory; compact disc read-only memory (CD-ROM), digital versatile disc (DVD), or other optical disc storage; magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage; and any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The present disclosure has disclosed example embodiments, and although specific terms are employed, they are used and should be interpreted only in a generic and descriptive sense and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Claims (15)

  1. A method for operating a computing system, characterized in that the computing system includes a computing unit; the method includes:
    acquiring the temperature of the computing unit;
    adjusting the working state of the computing unit according to the temperature of the computing unit.
  2. The method according to claim 1, characterized in that the computing system includes a plurality of computing units, and the computing units are processing cores; the adjusting the working state of the computing unit according to the temperature of the computing unit includes:
    performing task assignment on the processing cores according to the temperature of the processing cores.
  3. The method according to claim 2, characterized in that the acquiring the temperature of the computing unit includes:
    acquiring the temperature of the processing core in response to there being a new task waiting to be assigned.
  4. The method according to claim 2, characterized in that the acquiring the temperature of the computing unit includes:
    acquiring the temperature of the processing core in response to there being an idle processing core.
  5. The method according to claim 2, characterized in that,
    the acquiring the temperature of the computing unit includes: acquiring the temperatures of a plurality of the processing cores;
    the performing task assignment on the processing cores according to the temperature of the processing cores includes: performing task assignment on at least one of the processing cores according to the temperatures of the plurality of processing cores.
  6. The method according to claim 2, characterized in that the performing task assignment on the processing cores according to the temperature of the processing cores includes at least one of the following:
    in response to the temperature of a first processing core being lower than a preset first temperature threshold, assigning a computation-intensive task to the first processing core;
    in response to the temperature of a second processing core being higher than or equal to the first temperature threshold and lower than or equal to a preset second temperature threshold, assigning a data-intensive task to the second processing core;
    in response to the temperature of a third processing core being higher than the second temperature threshold, stopping assigning tasks to the third processing core;
    wherein the second temperature threshold is higher than the first temperature threshold, and the heat generated by a processing core when running a computation-intensive task is greater than the heat generated when running a data-intensive task.
  7. The method according to claim 6, characterized in that, after the stopping assigning tasks to the third processing core, the method further includes:
    migrating at least some of the tasks processed by the third processing core out of the third processing core.
  8. The method according to claim 1, characterized in that the computing unit is an analog computing unit, the computing system further includes a temperature sensor and a circuit unit corresponding to the analog computing unit, the circuit unit is arranged within a first range of its corresponding analog computing unit, and the analog computing unit is arranged within a second range of its corresponding temperature sensor;
    the acquiring the temperature of the computing unit includes: acquiring the temperature collected by the temperature sensor, and determining the temperature of the analog computing unit according to the temperature collected by the temperature sensor corresponding to the analog computing unit;
    the adjusting the working state of the computing unit according to the temperature of the computing unit includes: adjusting the running state of the analog computing unit and its corresponding circuit unit according to a result of comparing the temperature of the analog computing unit with a preset temperature range.
  9. The method according to claim 8, characterized in that the adjusting the running state of the analog computing unit and its corresponding circuit unit according to the result of comparing the temperature of the analog computing unit with the preset temperature range includes:
    in response to the temperature of the analog computing unit being within the preset temperature range, maintaining the current running state of the analog computing unit and its corresponding circuit unit;
    in response to the temperature of the analog computing unit being higher than the highest temperature of the preset temperature range, controlling the analog computing unit and its corresponding circuit unit to stop running;
    in response to the temperature of the analog computing unit being lower than the lowest temperature of the preset temperature range, controlling the running state of the analog computing unit and its corresponding circuit unit according to whether the analog computing unit needs to start running.
  10. The method according to claim 9, characterized in that the controlling the running state of the analog computing unit and its corresponding circuit unit according to whether the analog computing unit needs to start running includes:
    determining whether the analog computing unit needs to start running;
    in response to the analog computing unit needing to start running, controlling the analog computing unit to start running, and controlling its corresponding circuit unit to run;
    in response to the analog computing unit not needing to start running, maintaining the current running state of the analog computing unit and its corresponding circuit unit.
  11. The method according to claim 8, characterized in that there are a plurality of the analog computing units; the method further includes:
    determining the historical temperature of the analog computing unit according to multiple temperatures collected at multiple times by the temperature sensor corresponding to the analog computing unit;
    determining the recommended number of circuit units corresponding to the analog computing unit according to the historical temperature of the analog computing unit.
  12. A computing system, characterized by including: a computing unit and a controller;
    the controller is configured to execute the method according to any one of claims 1 to 11.
  13. The computing system according to claim 12, characterized in that the computing unit is an analog computing unit; the computing system further includes:
    a temperature sensor and a circuit unit corresponding to the analog computing unit.
  14. An electronic device, characterized by including:
    one or more processors;
    one or more memories storing one or more programs which, when executed by the one or more processors, enable the one or more processors to implement the method according to any one of claims 1 to 11.
  15. A computer-readable medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1 to 11.
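
As an illustration of the temperature-based task assignment recited in claims 6 and 7, the following Python sketch maps a processing core's temperature to the kind of task it may receive. It is a minimal sketch under stated assumptions, not the claimed implementation: the function names, the string labels, and the threshold values in the usage lines are hypothetical.

```python
def choose_task_type(core_temp, first_threshold, second_threshold):
    """Return the task type a core may receive, or None to stop assignment."""
    if second_threshold <= first_threshold:
        raise ValueError("second threshold must be higher than the first")
    if core_temp < first_threshold:
        # Cool core: it can absorb the extra heat of computation-intensive work.
        return "compute_intensive"
    if core_temp <= second_threshold:
        # Warm core: give it data-intensive work, which generates less heat.
        return "data_intensive"
    # Hot core: stop assigning tasks to it.
    return None


def plan_assignment(core_temps, first_threshold, second_threshold):
    """Classify every core and list the cores whose tasks should migrate out."""
    assignments = {}
    migrate_from = []
    for core, temp in core_temps.items():
        kind = choose_task_type(temp, first_threshold, second_threshold)
        assignments[core] = kind
        if kind is None:
            # Claim 7: move at least some tasks off a core that is too hot.
            migrate_from.append(core)
    return assignments, migrate_from


core_temps = {"c0": 35.0, "c1": 60.0, "c2": 85.0, "c3": 50.0}
assignments, migrate_from = plan_assignment(core_temps, 50.0, 80.0)
```

With these made-up temperatures, core "c0" is eligible for computation-intensive tasks, cores "c1" and "c3" for data-intensive tasks, and core "c2" receives no new tasks and is flagged for migration.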
PCT/CN2021/130002 2020-11-20 2021-11-11 Computing system and operation method therefor, and electronic device and computer-readable medium WO2022105664A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202011312225.2A CN112416587A (en) 2020-11-20 2020-11-20 Temperature control method of on-chip structure and layout method of on-chip structure
CN202011310981.1 2020-11-20
CN202011310981.1A CN112416586A (en) 2020-11-20 2020-11-20 Task allocation method, processing core, electronic device, and computer-readable medium
CN202011312225.2 2020-11-20

Publications (1)

Publication Number Publication Date
WO2022105664A1 true WO2022105664A1 (en) 2022-05-27

Family

ID=81708343

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/130002 WO2022105664A1 (en) 2020-11-20 2021-11-11 Computing system and operation method therefor, and electronic device and computer-readable medium

Country Status (1)

Country Link
WO (1) WO2022105664A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030018680A1 (en) * 2001-07-18 2003-01-23 Lino Iglesias Smart internetworking operating system for low computational power microprocessors
CN106940657A (en) * 2017-02-20 2017-07-11 深圳市金立通信设备有限公司 A kind of method and terminal that task distribution is carried out to processor
CN111611098A (en) * 2020-05-22 2020-09-01 深圳忆联信息系统有限公司 Solid state disk overheating protection method and device, computer equipment and storage medium
CN112416586A (en) * 2020-11-20 2021-02-26 北京灵汐科技有限公司 Task allocation method, processing core, electronic device, and computer-readable medium
CN112416587A (en) * 2020-11-20 2021-02-26 北京灵汐科技有限公司 Temperature control method of on-chip structure and layout method of on-chip structure

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21893811

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21893811

Country of ref document: EP

Kind code of ref document: A1