WO2024138796A1 - Data collection structure, method and system, and chip

Info

Publication number: WO2024138796A1
Application number: PCT/CN2023/071362
Authority: WO
Inventors: 汪福全; 刘明
Original assignee: 声龙(新加坡)私人有限公司
Priority date: 2022-12-27
Filing date: 2023-01-09
Publication date: 2024-07-04
Also published as: CN115905088A; CN115905088B

Abstract

Disclosed in the embodiments of the present disclosure are a data collection structure, method and system, and a chip. The data collection structure comprises: a plurality of computing units (1) and a plurality of arbiters (2), the plurality of arbiters (2) comprising first arbiters (21) each corresponding to a computing unit (1), and the plurality of computing units (1) being successively connected by means of the plurality of first arbiters (21) to form a data collection chain, and the computing unit (1) located at the chain end being connected to a control unit (3), wherein the arbiters (2) are configured to receive computing results sent by the computing units (1) and transmit along the data collection chain the computing results to the control unit (3).

Description

A data collection structure, method, chip and system

Technical Field

The embodiments of the present disclosure relate to data collection technology, and in particular to a data collection structure, method, chip and system.

Background

A computing-power chip contains many computing units. When data is processed, the calculation results of multiple computing units must be collected and gathered at a control unit. The current approach adds the calculation results to a preset buffer in sequence. When the buffer is full, the control unit can receive every calculation result submitted by every computing unit only if it applies back pressure to the computing units, which greatly increases the difficulty of designing the computing units. In addition, the current approach does not take the spatial layout of the computing units into account.

SUMMARY OF THE INVENTION

The following is a summary of the subject matter described in detail herein; this summary is not intended to limit the scope of the claims.

Embodiments of the present disclosure provide a data collection structure, method, chip and system.

The present disclosure provides a data collection structure, which may include: a plurality of computing units and a plurality of arbitrators; the plurality of arbitrators include a first arbitrator corresponding to each computing unit;

The plurality of computing units are sequentially connected through the plurality of the first arbitrators to form a data collection chain, and the computing unit at the end of the chain is connected to the control unit;

The arbitrator is configured to receive the calculation result sent by the computing unit and transmit it to the control unit along the data collection chain.

In an exemplary embodiment of the present disclosure, the plurality of computing units are sequentially connected through a plurality of the first arbitrators to form a data collection chain, which may include:

Each computing unit is connected to the next computing unit through a first arbitrator corresponding to the computing unit, so that the multiple computing units are connected in a chain to form the data collection chain.

In an exemplary embodiment of the present disclosure, each of the first arbitrators may include a first inlet, a second inlet, and an outlet;

Each computing unit is connected to the next computing unit through a first arbitrator corresponding to the computing unit, including:

The first inlet of each of the first arbitrators is connected to the output interface of the computing unit corresponding to the first arbitrator;

The second inlet of each of the first arbitrators is connected to the outlet of the first arbitrator corresponding to the previous computing unit.

In an exemplary embodiment of the present disclosure, when there is one data collection chain, the computing unit at the end of the chain is connected to the control unit, which may include:

The first arbitrator corresponding to the computing unit at the end of the chain is directly connected to the control unit through the outlet of the first arbitrator;

When there are multiple data collection chains, the multiple arbitrators may also include: a second arbitrator; each second arbitrator includes a first inlet, a second inlet and an outlet;

The computing unit at the end of the chain is connected to the control unit and may include:

The outlets of the first arbitrators corresponding to the computing units at the tails of the multiple data collection chains are connected to the control unit through the second arbitrator.

In an exemplary embodiment of the present disclosure, the outlet of the first arbitrator corresponding to the computing unit at the end of the plurality of data collection chains is connected to the control unit through the second arbitrator, which may include:

The outlet of the first arbitrator corresponding to the computing unit at the end of each data collection chain is connected to the first inlet of the second arbitrator or the second inlet of the second arbitrator;

The outlet of the second arbitrator is connected to the input interface of the control unit, or is connected to the first inlet or the second inlet of the next second arbitrator.

In an exemplary embodiment of the present disclosure, a buffer may be provided at the first inlet of each of the first arbitrators;

The buffer may be configured to cache calculation results sent by the computing unit corresponding to the first arbitrator.

The embodiment of the present disclosure further provides a data collection method, which is based on the data collection structure and applied to an arbitrator in the data collection structure; the arbitrator includes a first arbitrator, or includes the first arbitrator and a second arbitrator; the method may include:

Obtaining the calculation results submitted by the corresponding computing unit and/or transmitted by the upper-level arbitrator;

The calculation results are transmitted directly or selectively along a data collection chain until they are transmitted to the control unit.

In an exemplary embodiment of the present disclosure, directly or selectively transmitting the calculation result along a data collection chain may include:

When any one of the first entry and the second entry included in the arbitrator receives the calculation result, the arbitrator directly sends the calculation result to the exit of the arbitrator for output;

When both the first entry and the second entry of the arbitrator receive calculation results, the arbitrator selects one of the two calculation results and sends it to the exit of the arbitrator for output according to a preset selection strategy.

In an exemplary embodiment of the present disclosure, when the number of the computing units is less than a preset number threshold, the selection strategy may include:

No buffer is set at the first entrance of the first arbitrator, and one of the calculation results received from the first entrance and the second entrance is selected at random, and the selected calculation result is sent to the exit of the arbitrator;

When the number of the computing units is greater than or equal to a preset number threshold, the selection strategy includes:

Pre-setting a buffer at a first entrance of the first arbitrator, and buffering the calculation result included in the calculation result sending request received by the first entrance into the buffer pre-set at the first entrance;

After the calculation result received by the second entry has been sent, the calculation result cached in the buffer is sent, or the new calculation result received by the first entry at that time is sent.

The embodiment of the present disclosure also provides a chip which may include the data collection structure.

An embodiment of the present disclosure also provides a data collection system, comprising the chip, a processor and a computer-readable storage medium, wherein the processor stores the calculation results in the chip in the computer-readable storage medium.

Other aspects will be apparent upon reading and understanding the drawings and detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings are used to provide an understanding of the technical solution of the present disclosure and constitute a part of the specification. Together with the embodiments of the present disclosure, they are used to explain the technical solution of the present disclosure and do not constitute a limitation on the technical solution of the present disclosure.

FIG. 1 is a schematic diagram of a data collection structure including a single data collection chain and a first arbitrator disposed inside a computing unit according to an embodiment of the present disclosure;

FIG. 2 is a schematic diagram of a data collection structure including a single data collection chain and a first arbitrator disposed outside a computing unit according to an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of a data collection structure including multiple data collection chains and a first arbitrator disposed inside a computing unit according to an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of a data collection structure including multiple data collection chains and a first arbitrator disposed outside a computing unit according to an embodiment of the present disclosure;

FIG. 5 is a flow chart of a data collection method according to an embodiment of the present disclosure;

FIG. 6 is a block diagram of a data collection device according to an embodiment of the present disclosure;

FIG. 7 is a block diagram of the data collection system according to an embodiment of the present disclosure.

Detailed Description

The present disclosure describes multiple embodiments, but the description is exemplary rather than restrictive, and it is obvious to those skilled in the art that there may be more embodiments and implementations within the scope of the embodiments described in the present disclosure. Although many possible feature combinations are shown in the drawings and discussed in the detailed embodiments, many other combinations of the disclosed features are also possible. Unless specifically limited, any feature or element of any embodiment may be used in combination with any other feature or element in any other embodiment, or may replace any other feature or element in any other embodiment.

The present disclosure includes and contemplates combinations of features and elements known to those of ordinary skill in the art. The embodiments, features, and elements disclosed in the present disclosure may also be combined with any conventional features or elements to form a unique invention scheme defined by the claims. Any features or elements of any embodiment may also be combined with features or elements from other invention schemes to form another unique invention scheme defined by the claims. Therefore, it should be understood that any feature shown and/or discussed in the present disclosure may be implemented individually or in any appropriate combination. Therefore, except for the limitations made according to the attached claims and their equivalents, the embodiments are not subject to other limitations. In addition, various modifications and changes may be made within the scope of protection of the attached claims.

In addition, when describing representative embodiments, the specification may have presented the method and/or process as a specific sequence of steps. However, to the extent that the method or process does not rely on the specific order of the steps described herein, the method or process should not be limited to the steps in the specific order described. As will be appreciated by those of ordinary skill in the art, other orders of steps are also possible. Therefore, the specific order of the steps set forth in the specification should not be interpreted as a limitation to the claims. In addition, the claims for the method and/or process should not be limited to the steps performed in the order written, and those skilled in the art can easily understand that these orders can be changed and still remain within the spirit and scope of the disclosed embodiments.

The present disclosure provides a data collection structure A, as shown in FIG. 1, FIG. 2, FIG. 3 and FIG. 4, which may include: a plurality of computing units 1 and a plurality of arbitrators 2; the plurality of arbitrators 2 include a first arbitrator 21 corresponding to each computing unit 1;

The plurality of computing units 1 are connected in sequence through the plurality of the first arbitrators 21 to form a data collection chain, and the computing unit at the end of the chain is connected to the control unit;

The arbitrator 2 is configured to receive the calculation result sent by the computing unit 1 and transmit it to the control unit 3 along the data collection chain.

In an exemplary embodiment of the present disclosure, a data collection chain structure without back pressure is provided to collect the calculation results of all computing units, which can simplify the design of the computing units, increase the operating frequency of the computing units, and arrange the computing units closely to make full use of the chip space.

In an exemplary embodiment of the present disclosure, the first arbitrator 21 can be arranged inside the corresponding computing unit 1 (as shown in FIG. 1 and FIG. 3), or can be arranged outside the corresponding computing unit 1 (as shown in FIG. 2 and FIG. 4). When the first arbitrator 21 is designed inside the computing unit 1, the computing units 1 can be arranged closely to make full use of the chip space.

In the exemplary embodiments of the present disclosure, the multiple computing units 1 shown in Figures 1, 2, 3, and 4 are different computing units 1 and do not refer to the same computing unit 1, the multiple first arbitrators 21 are different first arbitrators 21 and do not refer to the same first arbitrator 21, and the multiple second arbitrators 22 are different second arbitrators 22 and do not refer to the same second arbitrator 22.

Each computing unit 1 is connected to the next computing unit 1 through a first arbitrator 21 corresponding to the computing unit, so that the multiple computing units 1 are connected in a chain to form the data collection chain.

In an exemplary embodiment of the present disclosure, each arbitrator may include two inlets and one outlet: one inlet is connected to the current computing unit, the other inlet is connected to the previous computing unit, and the outlet is connected to the next computing unit or the control unit.

In an exemplary embodiment of the present disclosure, each of the first arbitrators 21 may include a first inlet a1, a second inlet b1, and an outlet c1;

Each computing unit 1 is connected to the next computing unit 1 through the corresponding first arbiter 21, including:

The first inlet a1 of each first arbitrator 21 is connected to the output interface of the computing unit 1 corresponding to the first arbitrator 21;

The second inlet b1 of each of the first arbitrators 21 is connected to the outlet c1 of the first arbitrator 21 corresponding to the previous computing unit 1 .
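The wiring described above can be sketched as a small data model. This is only an illustrative sketch: the class name `FirstArbiter`, the field names, and the helper functions are hypothetical and do not come from the disclosure. Inlet a1 is represented by the index of the attached computing unit, and inlet b1 by a reference to the previous arbitrator whose outlet c1 feeds it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FirstArbiter:
    """Hypothetical model of a first arbitrator 21 (inlets a1, b1; outlet c1)."""
    unit_index: int                            # computing unit wired to inlet a1
    upstream: Optional["FirstArbiter"] = None  # arbitrator whose outlet c1 feeds inlet b1


def build_chain(num_units: int) -> List[FirstArbiter]:
    """Wire one first arbitrator per computing unit into a single chain."""
    chain: List[FirstArbiter] = []
    for i in range(num_units):
        prev = chain[-1] if chain else None    # the head arbitrator has no upstream
        chain.append(FirstArbiter(unit_index=i, upstream=prev))
    return chain


def path_to_head(arbiter: FirstArbiter) -> List[int]:
    """Follow the b1 links from an arbitrator back to the head of the chain."""
    path: List[int] = []
    node: Optional[FirstArbiter] = arbiter
    while node is not None:
        path.append(node.unit_index)
        node = node.upstream
    return path


chain = build_chain(4)
tail = chain[-1]            # its outlet c1 would connect to the control unit
print(path_to_head(tail))   # unit indices from the chain end back to the head
```

In this model, the arbitrator at the chain end is the only one whose outlet is not consumed by another first arbitrator, which matches the description of the chain end being connected to the control unit.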

In an exemplary embodiment of the present disclosure, there may be only one data collection chain, or a plurality of data collection chains may be arranged in parallel, and the end of each data collection chain is connected to the control unit 3 .

In an exemplary embodiment of the present disclosure, when there is one data collection chain, the computing unit 1 at the end of the chain is connected to the control unit 3, which may include:

The first arbitrator 21 corresponding to the computing unit 1 at the end of the chain is directly connected to the control unit 3 through the outlet c1.

In an exemplary embodiment of the present disclosure, as shown in FIG. 3 and FIG. 4, when there are multiple data collection chains, the multiple arbitrators 2 may further include a second arbitrator 22; each second arbitrator 22 may include a first inlet a2, a second inlet b2 and an outlet c2;

The computing unit 1 at the end of the chain is connected to the control unit 3 and may include:

The outlet c1 of the first arbitrator 21 corresponding to the computing unit 1 at the end of the chain of the plurality of data collection chains is connected to the control unit 3 through the second arbitrator 22 .

In an exemplary embodiment of the present disclosure, when there are one or more second arbitrators 22 , the second arbitrators 22 are also added to the connected data collection chain as a part of the data collection chain.

In an exemplary embodiment of the present disclosure, the outlet c1 of the first arbitrator 21 corresponding to the computing unit 1 at the end of the chain of the plurality of data collection chains is connected to the control unit 3 through the second arbitrator 22, which may include:

The exit c1 of the first arbitrator 21 corresponding to the computing unit 1 at the end of each data collection chain is connected to the first entrance a2 or the second entrance b2 of the second arbitrator 22;

The outlet c2 of the second arbitrator 22 is connected to the input interface of the control unit 3 , or is connected to the first inlet a2 or the second inlet b2 of the next second arbitrator 22 .

In an exemplary embodiment of the present disclosure, when there are two data collection chains, there can be one second arbitrator 22; the first inlet a2 of the second arbitrator 22 is connected to the outlet c1 of the first arbitrator 21 corresponding to the computing unit 1 at the end of one data collection chain, and the second inlet b2 of the second arbitrator 22 is connected to the outlet c1 of the first arbitrator 21 corresponding to the computing unit 1 at the end of the other data collection chain; the outlet c2 of the second arbitrator 22 is directly connected to the input interface of the control unit 3.

In an exemplary embodiment of the present disclosure, when there are more than two data collection chains, there may be a plurality of second arbitrators 22; the plurality of second arbitrators 22 are also connected in a chain, recorded as a second arbitrator chain, constituting a part of the connected data collection chain; at least one of the first inlet a2 and the second inlet b2 of each second arbitrator 22 is connected to the outlet c1 of the first arbitrator 21 corresponding to the last computing unit 1 of a data collection chain, and the outlet c2 of the second arbitrator 22 at the end of the second arbitrator chain is directly connected to the input interface of the control unit 3.

In an exemplary embodiment of the present disclosure, as shown in FIG. 3 and FIG. 4 , the detailed connection mode may include: when m (m is a positive integer) second arbitrators 22 are included, the first inlet a2 of the first second arbitrator 22 may be connected to the outlet c1 of the first arbitrator 21 corresponding to the last computing unit 1 of the first data collection chain, the second inlet b2 of the first second arbitrator 22 may be connected to the outlet c1 of the first arbitrator 21 corresponding to the last computing unit 1 of the second data collection chain, and the outlet c2 of the first second arbitrator 22 may be connected to the second inlet b2 of the next second arbitrator 22 (the second second arbitrator 22). The first inlet a2 of the second to m-th second arbitrators 22 may be connected in sequence to the outlet c1 of the first arbitrator 21 corresponding to the last computing unit 1 in the third data collection chain to the last data collection chain, and the second inlet b2 of the second to m-th second arbitrators 22 may be connected in sequence to the outlet c2 of the corresponding previous second arbitrator 22. The outlet c2 of the m-th second arbitrator 22 is directly connected to the input interface of the control unit 3.
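The connection rule for the second arbitrators can be sketched as a function that lists, for each second arbitrator in order, what feeds its inlets a2 and b2. This is a sketch under assumptions: the string labels (`chain1.c1`, `second1.c2`) and the function name are hypothetical, and the relation "number of second arbitrators = number of chains minus one" is inferred from the text above rather than stated in it.

```python
from typing import List, Tuple


def wire_second_arbiters(num_chains: int) -> List[Tuple[str, str]]:
    """Return (a2_source, b2_source) for each second arbitrator, in order."""
    if num_chains < 2:
        raise ValueError("second arbitrators are only needed for two or more chains")
    # The first second arbitrator merges the ends of chain 1 and chain 2.
    wiring = [("chain1.c1", "chain2.c1")]
    # The j-th second arbitrator (j >= 2) takes the end of chain j+1 on a2
    # and the outlet c2 of the previous second arbitrator on b2.
    for j in range(2, num_chains):
        wiring.append((f"chain{j + 1}.c1", f"second{j - 1}.c2"))
    return wiring


for index, (a2, b2) in enumerate(wire_second_arbiters(4), start=1):
    print(f"second{index}: a2 <- {a2}, b2 <- {b2}")
```

The outlet c2 of the last entry in the returned list is the one that would connect to the input interface of the control unit 3.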

In the exemplary embodiments of the present disclosure, the number of computing units 1 in FIG. 1 , FIG. 2 , FIG. 3 , and FIG. 4 is only an example, and may actually be any number, for example, generally 1 to 65536.

The embodiment of the present disclosure further provides a data collection method, which is based on the above data collection structure and is applied to an arbitrator in the data collection structure; the arbitrator may include a first arbitrator, or include the first arbitrator and a second arbitrator; as shown in FIG. 5, the method may include steps S101-S102:

S101, obtaining a calculation result submitted by a corresponding computing unit 1 and/or transmitted by an upper-level arbitrator 2;

S102, directly or selectively transmitting the calculation result along a data collection chain until it is transmitted to the control unit 3.

When any one of the first entry and the second entry included in the arbitrator 2 receives the calculation result, the arbitrator 2 directly sends the calculation result to the exit of the arbitrator for output;

When both the first entry and the second entry included in the arbitrator 2 receive calculation results, the arbitrator selects one of the two calculation results and sends it to the output of the arbitrator for output according to a preset selection strategy.

In an exemplary embodiment of the present disclosure, when the number of the computing units 1 is less than a preset number threshold, the selection strategy may include:

No buffer is set at the first entrance of the first arbitrator, and one of the calculation results received from the first entrance and the second entrance is selected at random, and the selected calculation result is sent to the exit of the arbitrator.

In an exemplary embodiment of the present disclosure, if the number of computing units 1 is small, when both the first inlet and the second inlet receive computing results, one computing result can be randomly selected and discarded, and the other computing result can be transmitted downward through the outlet.

In an exemplary embodiment of the present disclosure, in theory, the control unit 3 needs to receive all calculation results submitted by all computing units 1, but this requires back pressure on the computing unit 1, which will greatly increase the design difficulty of the computing unit 1. Therefore, the circuit for the computing unit 1 to submit the calculation results is designed as a chain. Any computing unit 1 can submit the calculation results to the entire chain through an arbitrator 2. If both entrances of the arbitrator 2 have requests, one of them will be discarded.
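The bufferless selection strategy can be sketched as a single function. This is a minimal sketch, not the disclosed circuit: the function name `arbitrate`, the use of `None` to mean "no result on this inlet", and the use of Python's `random` module for the random choice are all assumptions made for illustration.

```python
import random
from typing import Optional


def arbitrate(a1: Optional[int], b1: Optional[int], rng: random.Random) -> Optional[int]:
    """Return the result forwarded to the outlet, or None if both inlets are idle."""
    if a1 is None:
        return b1          # only the upstream inlet (or neither) has data
    if b1 is None:
        return a1          # only the local computing unit has data
    # Both inlets carry a result: keep one at random, the other is discarded.
    return rng.choice((a1, b1))


rng = random.Random(0)           # seeded only to make the demo repeatable
print(arbitrate(None, 7, rng))   # single request passes straight through
print(arbitrate(5, 7, rng))      # conflict: one of the two survives
```

Because the function never signals anything back to its inputs, the data flow stays one-directional, which is the property the description relies on to avoid back pressure.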

In the exemplary embodiment of the present disclosure, it is assumed that there are n (n is a positive integer) computing units, and that each computing unit submits a calculation result with probability $1/2^{32}$, the submissions being independent events. The computing units from the head of the chain to the end of the chain are numbered computing unit [0] to computing unit [n-1]. Then:

The probability that the chain node connected to computing unit [0] has no data is $(1 - 1/2^{32})^{1}$,

The probability that the chain node connected to computing unit [1] has no data is $(1 - 1/2^{32})^{2}$,

The probability that the chain node connected to computing unit [2] has no data is $(1 - 1/2^{32})^{3}$,

…

The probability that the chain node connected to computing unit [i] has no data is $(1 - 1/2^{32})^{i+1}$, where i is an integer with $0 \le i \le n-1$,

…

The probability that the chain node connected to computing unit [n-1] has no data is $(1 - 1/2^{32})^{n}$.

That is, in this data collection structure, the probability that the control unit 3 receives a calculation result is:

$$1 - \left(1 - \frac{1}{2^{32}}\right)^{n}$$

The probability of the entire chip producing a calculation result is:

$$\frac{n}{2^{32}}$$

The percentage of computing power lost by this data collection structure is:

$$\frac{\dfrac{n}{2^{32}} - \left[1 - \left(1 - \dfrac{1}{2^{32}}\right)^{n}\right]}{\dfrac{n}{2^{32}}}$$

When n=256,

$$\frac{\dfrac{256}{2^{32}} - \left[1 - \left(1 - \dfrac{1}{2^{32}}\right)^{256}\right]}{\dfrac{256}{2^{32}}} \approx 3 \times 10^{-8}$$

The above value is approximately equal to 0, so this data collection structure can be regarded as having no computing power loss.

As long as the number n is unchanged, whether the computing units 1 are connected entirely in series or partly in parallel does not affect the above results.
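The loss estimate can be checked numerically. The sketch below assumes the loss is defined as (results produced minus results received) divided by results produced, with n/2^32 as the produced term and 1 - (1 - 1/2^32)^n as the received term; that reading follows the probabilities listed above, and the function name is hypothetical. Exact rational arithmetic is used because the quantities differ from 1 by amounts close to double-precision rounding error.

```python
from fractions import Fraction


def computing_power_loss(n: int, p: Fraction = Fraction(1, 2**32)) -> Fraction:
    """Fraction of produced results that never reach the control unit."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    received = 1 - (1 - p) ** n   # probability that the chain end carries a result
    produced = n * p              # expected number of results produced by n units
    return (produced - received) / produced


loss = computing_power_loss(256)
print(f"{float(loss):.3e}")
```

For n = 256 this evaluates to a little under 3e-08, which supports treating the loss as negligible. With a single computing unit the loss is exactly zero, since there is never a second request to collide with.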

In the exemplary embodiment of the present disclosure, the aforementioned function of appropriately discarding calculation results makes the design of the arbitrator 2 extremely simple, and there is no back pressure on the computing unit 1, so that the data flow of the entire computing unit 1 is unidirectional, which simplifies the structure of the computing unit 1 and is conducive to the improvement of the operating frequency of the computing unit 1.

In an exemplary embodiment of the present disclosure, when the number of the computing units 1 is greater than or equal to a preset number threshold, the selection strategy may include:

In an exemplary embodiment of the present disclosure, a number of computing units greater than or equal to the preset number threshold means that the number of computing units is large. In this case, a cache can be added in advance at the first entrance a1 of each first arbitrator 21. The capacity of the cache is less than a preset capacity threshold (for example, the cache has a capacity of 1 and can hold only one calculation result). The cache is configured to cache the calculation results sent by the computing unit 1 corresponding to the first arbitrator 21.

In an exemplary embodiment of the present disclosure, when calculation results are received in both the first entrance a1 and the second entrance b1 included in the first arbitrator 21, the calculation result received by the first entrance a1 can be cached in a corresponding cache first, and the calculation result received by the second entrance b1 can be sent to the exit c1 of the first arbitrator 21 for output.

In an exemplary embodiment of the present disclosure, after the calculation result received by the second entrance b1 is sent to the exit c1 for output, the calculation result cached in the buffer is sent to the exit c1 of the first arbitrator 21 for output; wherein, if a new calculation result sent by the computing unit 1 corresponding to the buffer is received before the calculation result cached in the buffer is sent to the exit c1 of the first arbitrator 21, the calculation result cached in the buffer is discarded, and the new calculation result is sent to the exit c1 of the first arbitrator 21 for output.
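The buffered behaviour can be sketched as a cycle-by-cycle model with a one-entry buffer at the first entrance. This is a sketch under stated assumptions: the class name, the `step` interface, and the `dropped` counter are hypothetical, and the case where a new local result and an upstream result arrive in the same cycle while the buffer is occupied is not spelled out in the text, so the model assumes the upstream result is forwarded and the new local result takes the buffer slot.

```python
from typing import Optional


class BufferedArbiter:
    """Hypothetical first arbitrator with a depth-1 buffer at entrance a1."""

    def __init__(self) -> None:
        self.slot: Optional[int] = None   # the single buffered local result
        self.dropped = 0                  # how many buffered results were discarded

    def step(self, a1: Optional[int], b1: Optional[int]) -> Optional[int]:
        """Advance one cycle and return the result placed on exit c1, if any."""
        if a1 is not None and self.slot is not None:
            # A new local result arrived before the buffered one was sent:
            # the buffered result is discarded.
            self.slot = None
            self.dropped += 1
        if b1 is not None:
            # Assumption: the upstream result is forwarded first,
            # and a concurrent local result waits in the buffer.
            if a1 is not None:
                self.slot = a1
            return b1
        if a1 is not None:
            return a1                     # exit is free: send the local result directly
        out, self.slot = self.slot, None  # exit is free: drain the buffer if occupied
        return out


arb = BufferedArbiter()
print(arb.step(1, 2))        # conflict: upstream result goes out, local one is buffered
print(arb.step(None, None))  # idle cycle: the buffered result is sent
```

Compared with the bufferless strategy, a conflict here costs a result only when a second local result arrives before the buffered one has been sent.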

In an exemplary embodiment of the present disclosure, the method also includes: determining whether to set a buffer based on the size of the recovered computing power loss and the amount of resources that need to be increased; wherein, when the recovered computing power loss is greater than or equal to a preset computing power threshold, and the proportion of resources that need to be increased to the total chip resources is less than or equal to a preset proportion threshold, determining to set the buffer; when the recovered computing power loss is less than the preset computing power threshold, and/or the proportion of resources that need to be increased to the total chip resources is greater than the preset proportion threshold, determining not to set the buffer.
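The decision rule above reduces to a two-condition test. In the sketch below the function name, parameter names, and the example threshold values are hypothetical; the disclosure does not give concrete thresholds.

```python
def should_add_buffer(
    recovered_loss: float,
    resource_share: float,
    loss_threshold: float,
    share_threshold: float,
) -> bool:
    """Add the buffer only if enough loss is recovered at an acceptable resource cost."""
    return recovered_loss >= loss_threshold and resource_share <= share_threshold


# Example with made-up thresholds: the recoverable loss is far below the
# threshold, so the buffer is not worth adding.
print(should_add_buffer(3e-8, 0.001, loss_threshold=1e-4, share_threshold=0.01))
```

Failing either condition is sufficient to reject the buffer, which corresponds to the "and/or" wording of the negative case.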

In an exemplary embodiment of the present disclosure, when the number n is large, a cache can be added at the first entrance a1 of the first arbitrator 21 to cache the calculation result of the computing unit 1, with the cached calculation result being abandoned when a newly sent calculation result arrives. This solution can reduce the computing power loss, but it may also increase chip resources. Therefore, whether to add a cache at the first entrance a1 must be decided by weighing the computing power loss to be recovered against the resources to be added. For example, when n is 256, at most 2.9802322387695312e-08 of the computing power can be recovered. If the proportion of chip resources occupied by the caches added at the first entrance a1 exceeds this value, the cache should not be added at the first entrance a1.

In the exemplary embodiments of the present disclosure, the embodiments of the present disclosure include at least the following advantages:

1. The computing units can be arranged closely to make full use of the chip space;

2. The arbitrator has no back pressure on the computing unit, which can simplify the design of the computing unit and increase the operating frequency;

3. It has almost no impact on the computing power of the chip.

The embodiment of the present disclosure further provides a chip 10, as shown in FIG6, comprising the data collection structure A described above.

In the exemplary embodiments of the present disclosure, any of the aforementioned data collection structures and methods are applicable to the chip 10 embodiment and will not be described in detail herein.

The embodiment of the present disclosure further provides a data collection system 1 , as shown in FIG. 7 , comprising a chip 10 , a processor 11 and a computer-readable storage medium 12 , wherein the processor 11 stores the calculation results in the chip 10 in the computer-readable storage medium 12 .

In the exemplary embodiments of the present disclosure, any of the aforementioned data collection structures and methods are applicable to the data collection system 1 embodiment, and will not be described one by one here.

In an exemplary embodiment of the present disclosure, the processor 11 and the computer-readable storage medium 12 may be implemented by the above-mentioned control unit.

In the exemplary embodiments of the present disclosure, any of the aforementioned data collection structures and methods are applicable to the computer-readable storage medium embodiments and will not be described in detail herein.

It will be appreciated by those skilled in the art that all or some of the steps, systems, and functional modules/units in the methods disclosed above may be implemented as software, firmware, hardware, and appropriate combinations thereof. In hardware implementations, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, a physical component may have multiple functions, or a function or step may be performed by several physical components in cooperation. Some or all components may be implemented as software executed by a processor, such as a digital signal processor or a microprocessor, or implemented as hardware, or implemented as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on a computer-readable medium, which may include a computer storage medium (or non-transitory medium) and a communication medium (or temporary medium). As known to those skilled in the art, the term computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tapes, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. 
In addition, it is well known to those of ordinary skill in the art that communication media typically contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Claims

A data collection structure, comprising: a plurality of computing units and a plurality of arbitrators, wherein the plurality of arbitrators comprise a first arbitrator corresponding to each computing unit;

The plurality of computing units are sequentially connected through the plurality of first arbitrators to form a data collection chain, and the computing unit at the end of the chain is connected to a control unit;

The arbitrators are configured to receive the calculation results sent by the computing units and transmit them to the control unit along the data collection chain.
The data collection structure according to claim 1, wherein each of the first arbitrators comprises a first inlet, a second inlet, and an outlet;

Each computing unit is connected to the next computing unit through the first arbitrator corresponding to that computing unit, which comprises:

The first inlet of each of the first arbitrators is connected to the output interface of the computing unit corresponding to the first arbitrator;

The second inlet of each of the first arbitrators is connected to the outlet of the first arbitrator corresponding to the previous computing unit.
The data collection structure according to claim 1, wherein when there is one data collection chain, the computing unit at the end of the chain being connected to the control unit comprises:

The first arbitrator corresponding to the computing unit at the end of the chain is directly connected to the control unit through the outlet of the first arbitrator;

When there are multiple data collection chains, the plurality of arbitrators further comprise a second arbitrator; each second arbitrator comprises a first inlet, a second inlet and an outlet;

The computing unit at the end of each chain being connected to the control unit comprises:

The outlets of the first arbitrators corresponding to the computing units at the ends of the multiple data collection chains are connected to the control unit through the second arbitrator.
The data collection structure according to claim 3, wherein the outlets of the first arbitrators corresponding to the computing units at the ends of the multiple data collection chains being connected to the control unit through the second arbitrator comprises:

The outlet of the first arbitrator corresponding to the computing unit at the end of each data collection chain is connected to the first inlet of the second arbitrator or the second inlet of the second arbitrator;

The outlet of the second arbitrator is connected to the input interface of the control unit, or is connected to the first inlet or the second inlet of the next second arbitrator.
The data collection structure according to claim 2, wherein a buffer is provided at the first inlet of each of the first arbitrators;

The buffer is configured to buffer the calculation results sent by the computing unit corresponding to the first arbitrator.
A data collection method, characterized in that it is based on the data collection structure according to any one of claims 1 to 5 and is applied to an arbitrator in the data collection structure; the arbitrator comprises a first arbitrator, or comprises the first arbitrator and a second arbitrator; the method comprises:

Obtaining the calculation results submitted by the corresponding computing unit and/or transmitted by the upper-level arbitrator;

Transmitting the calculation results directly or selectively along the data collection chain until they reach the control unit.
The data collection method according to claim 6, wherein the step of directly or selectively transmitting the calculation results along the data collection chain comprises:

When only one of the first inlet and the second inlet of the arbitrator receives a calculation result, the arbitrator directly sends that calculation result to the outlet of the arbitrator for output;

When both the first inlet and the second inlet of the arbitrator receive calculation results, the arbitrator selects one of the two calculation results according to a preset selection strategy and sends it to the outlet of the arbitrator for output.
The data collection method according to claim 7, wherein when the number of the computing units is less than a preset number threshold, the selection strategy comprises:

No buffer is provided at the first inlet of the first arbitrator; one of the calculation results received at the first inlet and the second inlet is selected at random, and the selected calculation result is sent to the outlet of the arbitrator;

When the number of the computing units is greater than or equal to the preset number threshold, the selection strategy comprises:

Providing a buffer in advance at the first inlet of the first arbitrator, and buffering the calculation result contained in a calculation result sending request received at the first inlet in that buffer;

After the calculation result received at the first inlet has been sent, sending either the calculation result cached in the buffer or a new calculation result received at the first inlet at that time.
A chip, characterized in that it comprises the data collection structure according to any one of claims 1 to 5.
A data collection system, characterized in that it comprises the chip according to claim 9, a processor and a computer-readable storage medium, wherein the processor stores the calculation results from the chip in the computer-readable storage medium.
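
By way of non-limiting illustration only, the selection strategy recited in the method claims above can be sketched as follows. The names `FirstArbiter`, `cycle`, and `make_arbiter` are hypothetical. Several behaviours are assumptions not fixed by the claims: in buffered mode every local result passes through the buffer and the second inlet is served before the buffer, the buffer depth is unbounded, and in unbuffered mode the result that is not selected is simply not forwarded.

```python
import random
from collections import deque
from typing import Deque, Optional


class FirstArbiter:
    """Hypothetical model of a first arbitrator with an optional first-inlet buffer."""

    def __init__(self, buffered: bool, rng: Optional[random.Random] = None) -> None:
        self.buffered = buffered
        self.buffer: Deque[str] = deque()  # assumption: unbounded depth
        self.rng = rng or random.Random()

    def cycle(self, first: Optional[str], second: Optional[str]) -> Optional[str]:
        """Return the result placed on the outlet in this cycle, if any."""
        if self.buffered:
            # Buffered mode: the local result is cached at the first inlet.
            if first is not None:
                self.buffer.append(first)
            # Assumption: the second inlet is served before the buffer.
            if second is not None:
                return second
            return self.buffer.popleft() if self.buffer else None
        # Unbuffered mode: choose at random when both inlets hold a result.
        if first is not None and second is not None:
            return self.rng.choice([first, second])
        # Otherwise pass through whichever inlet holds a result.
        return first if first is not None else second


def make_arbiter(num_units: int, threshold: int) -> FirstArbiter:
    """Use a buffer only when the number of computing units reaches the threshold."""
    return FirstArbiter(buffered=num_units >= threshold)
```

For example, `make_arbiter(16, 8)` yields a buffered arbitrator, so a local result that arrives together with a result at the second inlet is held and output in a later cycle, whereas `make_arbiter(4, 8)` yields an unbuffered arbitrator that picks one of the two at random.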