CN114021708B - Data processing method, device and system, electronic equipment and storage medium - Google Patents
- Publication number: CN114021708B
- Application number: CN202111165135.XA
- Authority
- CN
- China
- Prior art keywords
- computing
- computing core
- core
- characteristic value
- flow direction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8046—Systolic arrays
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application discloses a data processing method, apparatus, system, electronic device and computer readable storage medium, wherein the method comprises: acquiring a setting instruction and setting a computing network according to it, the setting instruction being used to set the data flow direction among the computing cores in the computing network; acquiring at least one characteristic value and respectively inputting the at least one characteristic value into at least one initial computing core in the computing network; transmitting the characteristic value along the data flow direction with the initial computing core as a starting point; and generating, with each computing core, a calculation result based on the characteristic value and the corresponding weight value. Because the setting instruction configures the data flow direction between the computing cores, data can flow between different levels or within the same level, so the whole computing network can be utilized at one hundred percent no matter what shape of network model is processed.
Description
Technical Field
The present invention relates to the field of deep learning technologies, and in particular, to a data processing method, a data processing apparatus, a data processing system, an electronic device, and a computer readable storage medium.
Background
The current development of deep learning places very high demands on computing power, and various ASIC (Application Specific Integrated Circuit) architectures have emerged one after another. A representative example is Google's Tensor Processing Unit (TPU), a custom ASIC chip dedicated to machine learning workloads. The TPU is not a general purpose processor but a matrix processor dedicated to neural network workloads; its main task is matrix processing, and the TPU's hardware designers know each step of the operation process. They place thousands of multipliers and adders and connect them directly, building a physical matrix of these operators called a systolic array architecture. A systolic array is a series of processing elements (Processing Elements, PE) regularly arranged in a grid; each PE exchanges data with its neighboring PEs in predetermined steps. Other architecture schemes similar to the TPU (e.g., the Da Vinci architecture) likewise adopt regular two-dimensional or three-dimensional mesh PE array structures. However, the shapes and specification parameters of neural networks vary widely, while the hardware scale of the TPU and the like is fixed: hardware resource utilization is maximized only when the shape of the network exactly matches the chip. For a neural network that does not match the chip, the hardware resource utilization depends on the size of the network and cannot reach a state of full utilization, leading to the problem of insufficient resource utilization.
Disclosure of Invention
In view of the foregoing, an object of the present application is to provide a data processing method, apparatus, system, electronic device, and computer readable storage medium that improve resource utilization.
In order to solve the above technical problems, the present application provides a data processing method, including:
acquiring a setting instruction, and setting a computing network according to the setting instruction; the setting instruction is used for setting the data flow direction among all computing cores in the computing network;
acquiring at least one characteristic value, and respectively inputting the at least one characteristic value into at least one initial computing core in the computing network;
transmitting the characteristic value according to the data flow by taking the initial computing core as a starting point;
generating a calculation result based on the characteristic value and the corresponding weight value by using each calculation core;
the computing network comprises a first computing core, a plurality of second computing cores and a plurality of third computing cores; the initial computing core comprises the first computing core, or comprises the first computing core and the third computing core, or comprises the first computing core, the second computing core and the third computing core; any one of the second computing cores corresponds to one upper computing core and two lower computing cores, a lower computing core being the second computing core or the third computing core; the third computing core has no lower computing core, and the first computing core is the upper computing core of the target second computing core.
Optionally, the method further comprises:
acquiring configuration information and storing the configuration information to each computing core; the configuration information comprises a plurality of identification information and a plurality of corresponding data streams;
correspondingly, the setting of the computing network according to the setting instruction comprises the following steps:
and sending the setting instruction to each computing core, and determining the data flow direction corresponding to each computing core by utilizing the target identification information and the configuration information in the setting instruction.
Optionally, the setting instruction is configured to set the computing network to one path of computation, and the data flow direction is a first flow direction from an upper level to a lower level, and the initial computing core is the first computing core;
correspondingly, the inputting of the at least one characteristic value into at least one initial computing core in the computing network and the transmitting of the characteristic value according to the data flow direction, with the initial computing core as a starting point, comprise the following steps:
inputting the characteristic value into the first computing core, and transmitting the characteristic value to the target second computing core by using the first computing core;
and based on the first flow direction, starting from the target second computing core, sequentially sending the characteristic values to corresponding lower computing cores until the characteristic values are sent to the third computing core.
Optionally, the setting instruction is configured to set the computing network to at least three computation paths, and the data flow directions include a first flow direction from an upper level to a lower level and a second flow direction flowing between peers of the same level; the initial computing cores include the first computing core and an initial third computing core, where the first computing core corresponds to the first flow direction and the initial third computing core corresponds to the second flow direction;
correspondingly, the inputting of the at least one characteristic value into at least one initial computing core in the computing network and the transmitting of the characteristic value according to the data flow direction, with the initial computing core as a starting point, comprise the following steps:
inputting a first characteristic value into the first computing core, and transmitting the first characteristic value to the target second computing core by using the first computing core;
based on the first flow direction, starting from the target second computing core, sequentially sending the first characteristic value to a corresponding lower computing core until the first characteristic value is sent to the second computing core at the end of the first flow direction;
inputting a second characteristic value into the initial third computing core, starting from the initial third computing core, and sequentially sending the second characteristic value to a subsequent peer computing core based on the second flow direction until the second characteristic value is sent to the third computing core at the end of the second flow direction.
Optionally, the initial computing cores further include an initial second computing core, the initial second computing core corresponding to the second flow direction;
correspondingly, the inputting of the at least one characteristic value into at least one initial computing core in the computing network and the transmitting of the characteristic value according to the data flow direction, with the initial computing core as a starting point, comprise the following steps:
inputting a second characteristic value into the initial second computing core, starting from the initial second computing core, and sequentially sending the second characteristic value to a subsequent peer computing core based on the second flow direction until the second characteristic value is sent to the second computing core at the end of the second flow direction.
Optionally, the method further comprises:
determining weight values corresponding to the computing cores respectively;
and sending and storing the weight value to the corresponding computing core.
Optionally, the determining the weight value corresponding to each computing core includes:
acquiring an initial weight value;
and based on the data flow direction, determining the corresponding relation between each computing core and the initial weight value, and finishing the determination of the weight value.
Optionally, the generating, with each computing core, a computing result based on the feature value and the corresponding weight value includes:
controlling each computing core to multiply the characteristic value by the weight value to obtain a target result;
and adding the target result to the historical calculation result stored by the computing core to obtain the calculation result.
Optionally, the computing core includes an arithmetic logic unit, a weight value interface, a characteristic value interface, a control module and a storage unit, where the weight value interface includes an external input port and a peer input port, the characteristic value interface includes an external input port, an upper input port and a peer input port, the control module is used to store configuration information and determine a data flow direction according to the setting instruction, and the storage unit is used to store the weight value, the characteristic value and the computing result.
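As a rough illustration (not the patent's hardware), the core structure described above can be modeled in Python. The class name, field names, and the dict-based configuration table are assumptions made for this sketch:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class ComputeCore:
    """Sketch of one computing core: an ALU (compute), a control module
    holding configuration information (configure), and a storage unit
    (weight, accumulator). Names are illustrative assumptions."""
    core_id: int
    config: Dict[str, str] = field(default_factory=dict)  # identification info -> data flow direction
    flow_direction: Optional[str] = None                   # set by the setting instruction
    weight: float = 0.0                                    # stored weight value
    accumulator: float = 0.0                               # stored historical calculation result

    def configure(self, target_id: str) -> None:
        # Control module: pick the data flow direction matching the
        # target identification information in the setting instruction.
        self.flow_direction = self.config[target_id]

    def compute(self, feature: float) -> float:
        # ALU: multiply feature by weight, then add the stored
        # historical calculation result (multiply-accumulate).
        self.accumulator += feature * self.weight
        return self.accumulator
```

In this sketch the weight/feature input ports of the claim are collapsed into plain method arguments; the separation into external, upper-level and peer ports matters for the hardware but not for the arithmetic.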
The application also provides a data processing device, comprising:
the setting module is used for acquiring a setting instruction and setting a computing network according to the setting instruction; the setting instruction is used for setting the data flow direction among all computing cores in the computing network;
the input module is used for acquiring at least one characteristic value and inputting the at least one characteristic value into at least one initial computing core in the computing network respectively;
the transmission module is used for transmitting the characteristic value according to the data flow direction by taking the initial computing core as a starting point;
the computing module is used for generating a computing result based on the characteristic value and the corresponding weight value by utilizing each computing core;
the computing network comprises a first computing core, a plurality of second computing cores and a plurality of third computing cores; the initial computing core comprises the first computing core, or comprises the first computing core and the third computing core, or comprises the first computing core, the second computing core and the third computing core; any one of the second computing cores corresponds to an upper computing core and two lower computing cores, the lower computing cores are the second computing core or the third computing core, the third computing core does not have a lower computing core, and the first computing core is an upper computing core of the target second computing core.
The application also provides a data processing system, comprising a computing network, wherein the computing network comprises a first computing core, a plurality of second computing cores and a plurality of third computing cores; any one of the second computing cores corresponds to an upper computing core and two lower computing cores, the lower computing cores are the second computing core or the third computing core, the third computing core does not have a lower computing core, and the first computing core is an upper computing core of the target second computing core.
The application also provides an electronic device comprising a memory and a processor, wherein:
the memory is used for storing a computer program;
the processor is configured to execute the computer program to implement the data processing method described above.
The present application also provides a computer readable storage medium storing a computer program, wherein the computer program implements the above-mentioned data processing method when executed by a processor.
According to the data processing method, a setting instruction is obtained, and a computing network is set according to the setting instruction; the setting instruction is used for setting the data flow direction among all computing cores in the computing network; acquiring at least one characteristic value, and respectively inputting the at least one characteristic value into at least one initial computing core in a computing network; transmitting the characteristic value according to the data flow by taking the initial computing core as a starting point; generating a calculation result based on the characteristic value and the corresponding weight value by using each calculation core; the computing network comprises a first computing core, a plurality of second computing cores and a plurality of third computing cores; the initial computing core comprises a first computing core, or comprises a first computing core and a third computing core, or comprises a first computing core, a second computing core and a third computing core; any second computing core corresponds to an upper computing core and two lower computing cores, the lower computing cores are the second computing core or a third computing core, the third computing core does not have a lower computing core, and the first computing core is an upper computing core of the target second computing core.
It can be seen that the method adopts a computing network with a special architecture: three kinds of computing cores exist in the computing network, with a preset hierarchical relationship among them. The first computing core is the first stage; one target second computing core is the lower computing core of the first computing core; two second computing cores are the lower computing cores of the target second computing core; four further second computing cores are the lower computing cores of those two, and so on, until all third computing cores, two by two, are the lower computing cores of second computing cores at some stage. All computing cores are arranged by level, with cores of the same level on the same plane, so the network is pyramid-shaped. In addition, the number of computing cores in the computing network is 1+1+2+4+…+2^n = 2^(n+1), and the number of computing cores at each level equals the sum of the numbers of all computing cores at higher levels. Because the number of channels of a neural network is usually 2^k, if not all computing cores are needed for a calculation, that is, if the shape of the network model does not match the size of the computing network, 2^k calculation results can be obtained by using the computing cores of the first several stages as one computation path, while the computing cores of the lower stages are split into at least two additional computation paths to process other characteristic values; the paths are mutually independent and computed in parallel.
Specifically, the data flow direction between the computing cores in the computing network is set through the setting instruction, so that data flows between different levels or within the same level; the whole computing network can thus be utilized at one hundred percent no matter what shape of network model is processed, which solves the problem of insufficient hardware utilization in the related art.
In addition, the application also provides a data processing apparatus, an electronic device and a computer readable storage medium, which have the above beneficial effects as well.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the related art, the drawings required in the description of the embodiments or the related art are briefly introduced below. It is apparent that the drawings in the following description show only embodiments of the present application; other drawings may be obtained from them by those of ordinary skill in the art without inventive effort.
Fig. 1 is a schematic structural diagram of a computing network according to an embodiment of the present application;
FIG. 2 is a flowchart of a data processing method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a level structure between a second computing core and a third computing core according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a specific computing core according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a computing network provided in an embodiment of the present application. The computing network includes a first computing core, a plurality of second computing cores, and a plurality of third computing cores. Any second computing core corresponds to one upper computing core and two lower computing cores, where a lower computing core may be a second computing core or a third computing core. A third computing core has no lower computing core; that is, the third computing cores are the lowest computing cores of the whole network. The first computing core is the upper computing core of the target second computing core, and the target second computing core is the highest-level one among the second computing cores.
As can be seen from fig. 1, the entire computing network is partitioned by computing core level, resulting in the pyramid structure shown in fig. 1. The uppermost computing core is the first computing core (which can be regarded as level 0); its lower computing core is the target second computing core (level 1), of which there is exactly one. The target second computing core is itself a second computing core and likewise has the characteristics of one, i.e., one upper computing core (level 0) and two lower computing cores (which can be regarded as level 2). The lower computing cores (level 2) of the target second computing core likewise each have two lower computing cores (level 3), and so on, until the third computing cores, which have no lower computing cores, serve as the lower computing cores of the last second computing cores. In addition, in fig. 1, the nodes joined by solid lines are computing cores, while the nodes joined by dotted lines are transmission nodes, which are used to transmit the weight values.
It follows that, starting from the first computing core, if the third computing cores are at the nth stage, there are 1+1+2+4+…+2^n = 2^(n+1) computing cores in the entire computing network, n being a positive integer. The number of computing cores at each stage equals the sum of the numbers of all computing cores at levels higher than that stage. Illustratively, the number of computing cores at level 5 is the sum of the numbers of all computing cores at levels 0, 1, 2, 3 and 4. Since the number of channels of a neural network is usually 2^k (k a positive integer), when performing a calculation with k = n, all computing cores of the whole computing network can participate. If not all computing cores are needed, that is, when the shape of the network model does not match the size of the computing network and k < n, 2^k calculation results can be obtained by using the computing cores of the first k−1 levels as one computation path, while the computing cores at level k and below are split into at least two additional computation paths that process other characteristic values; the paths are mutually independent and computed in parallel.
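Assuming levels sized 1, 1, 2, 4, …, 2^n (one first computing core, one target second computing core, then doubling at each stage), the counts stated above can be checked with a short sketch; the function names are illustrative:

```python
def level_sizes(n: int) -> list:
    """Cores per level for a pyramid whose third (leaf) computing cores
    are at the lowest stage: [1, 1, 2, 4, ..., 2^n]. This layout is an
    assumption chosen to reproduce the text's total of 2^(n+1)."""
    return [1] + [2 ** i for i in range(n + 1)]

def total_cores(n: int) -> int:
    # 1 + (1 + 2 + 4 + ... + 2^n) = 1 + (2^(n+1) - 1) = 2^(n+1)
    return sum(level_sizes(n))
```

Under this assumption the stated property holds: each level's size equals the sum of the sizes of all levels above it, since 2^(j-1) = 1 + 1 + 2 + … + 2^(j-2).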
For example, if n = 6 and k = 6, the number of computing cores of the whole computing network is 2^(6+1) = 2^7, and the number of channels of the network model is 2^6. The computing cores before the 6th stage in the computing network can then be used as one computation path; the computing cores of the 7th stage (in practice, third computing cores), whose number is double that of the preceding stage, can be split into two further independent computation paths, so that three computation paths are computed in parallel.
On the basis of the above computing network, the present embodiment also provides a processing method for performing data processing by using the computing network. Referring to fig. 2, fig. 2 is a flowchart of a data processing method according to an embodiment of the present application. The method comprises the following steps:
s101: and acquiring a setting instruction, and setting a computing network according to the setting instruction.
The setting instruction is used to set the data flow direction between computing cores in the computing network, where the data flow direction refers to the transmission direction of characteristic values between computing cores. It will be appreciated that once the structure of the computing network and the number of channels of the network model are determined, that is, once n and k are determined, it can be determined whether the computing network fully matches the network model. Therefore, specific settings are required before data processing with the computing network. By setting the data flow direction between the computing cores, the computing network can be configured as one or more parallel computation paths, so that the resource utilization of the computing network is fully exploited.
S102: at least one characteristic value is acquired and is input into at least one starting computing core in the computing network, respectively.
The initial computing core refers to the first computing core to receive a characteristic value input. It will be appreciated that the individual computation paths are computed in parallel and each path should have one and only one initial computing core, so the numbers of initial computing cores and computation paths are the same. After a characteristic value is obtained, it is input into an initial computing core; the initial computing core can perform data processing with the characteristic value and, at the same time, forward it according to the set data flow direction.
S103: and transmitting the characteristic value according to the data flow by taking the initial computing core as a starting point.
The data flow direction specifies the transmission direction of the characteristic value, so the initial computing core can determine the transmission target of the characteristic value and send it on to the next computing core, until all computing cores have received the characteristic value.
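A minimal sketch of this transmission step, with the data flow direction modeled as a per-core child list (the core ids and the dict-based topology are illustrative, not taken from the patent):

```python
def broadcast_feature(children: dict, start: int, feature: float) -> dict:
    """Starting from the initial computing core `start`, pass the
    feature value along the configured data flow direction (here, a
    mapping from each core to the cores it forwards to) until every
    reachable core has received it."""
    received = {}
    frontier = [start]
    while frontier:
        core = frontier.pop()
        received[core] = feature                  # this core now holds the value
        frontier.extend(children.get(core, []))   # forward to the next cores
    return received
```

With a small pyramid-shaped topology, a single call delivers the same feature value to every core on the path, mirroring step S103.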
S104: and generating a calculation result based on the characteristic value and the corresponding weight value by using each calculation core.
After obtaining the characteristic value, each computing core can perform a calculation with the characteristic value and its corresponding weight value to obtain a corresponding calculation result. The specific calculation process is not limited and may be set as needed. The weight values are stored in the computing cores in advance, and their number and size are not limited.
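The per-core calculation later detailed for S104 amounts to a multiply-accumulate; a minimal sketch, with the function names as assumptions:

```python
def core_step(feature: float, weight: float, history: float) -> float:
    """One computing core step: multiply the characteristic value by the
    weight value to get the target result, then add the historical
    calculation result kept in the core's storage unit."""
    target = feature * weight   # target result
    return target + history     # new calculation result

def accumulate(features: list, weight: float) -> float:
    """Repeated application: each step's calculation result becomes the
    next step's historical result."""
    history = 0.0
    for f in features:
        history = core_step(f, weight, history)
    return history
```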
In the present application, the execution order of steps S102, S103 and S104 is not limited. In one embodiment, the three steps are executed serially: step S102 first, then step S103, and finally step S104. In another embodiment, the steps may be executed in parallel, i.e., step S103 and/or step S104 may be executed at the same time as step S102.
By applying the data processing method provided by the embodiment of the present application, a computing network with a special architecture is adopted: three kinds of computing cores exist in the computing network, with a preset hierarchical relationship among them. The first computing core is the first stage; one target second computing core is the lower computing core of the first computing core; two second computing cores are the lower computing cores of the target second computing core; four further second computing cores are the lower computing cores of those two, and so on, until all third computing cores, two by two, are the lower computing cores of second computing cores at some stage. All computing cores are arranged by level, with cores of the same level on the same plane, so the network is pyramid-shaped. In addition, the number of computing cores in the computing network is 1+1+2+4+…+2^n = 2^(n+1), and the number of computing cores at each level equals the sum of the numbers of all computing cores at higher levels. Because the number of channels of a neural network is usually 2^k, if not all computing cores are needed for a calculation, that is, if the shape of the network model does not match the size of the computing network, 2^k calculation results can be obtained by using the computing cores of the first several stages as one computation path, while the computing cores of the lower stages are split into at least two additional computation paths to process other characteristic values; the paths are mutually independent and computed in parallel.
Specifically, the data flow direction between the computing cores in the computing network is set through the setting instruction, so that data flows between different levels or within a level. No matter what shape of network model is processed, the whole computing network can be utilized at one hundred percent, which solves the problem of insufficient hardware utilization in the related art.
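The level counts described above can be sketched as follows. This is a hypothetical illustration, not the patented implementation; the function name and parameter are assumptions.

```python
# A minimal sketch of the pyramid structure described above: level 0 holds the
# first computing core, level 1 the target second computing core, and every
# later level doubles, ending with 2**k third computing cores at the bottom.
def level_sizes(k):
    """Number of computing cores per level when the lowest level has 2**k cores."""
    return [1, 1] + [2 ** i for i in range(1, k + 1)]

sizes = level_sizes(4)      # levels 0..5 -> [1, 1, 2, 4, 8, 16]
total = sum(sizes)          # 1+1+2+4+8+16 = 32 = 2**(4+1)
# Each level's size equals the sum of the sizes of all levels above it:
assert all(sizes[i] == sum(sizes[:i]) for i in range(2, len(sizes)))
```

The final assertion checks the property stated in the text: each level holds as many cores as all higher levels combined.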
Based on the above embodiments, the present embodiment further describes the steps of the data processing method. As for the specific content of the setting instruction, in one embodiment it may directly specify the data flow direction between all computing cores in the computing network. In another embodiment, before setting the data flow direction, the method may further include the following steps:
step 11: and acquiring configuration information and storing the configuration information to each computing core.
The configuration information comprises a plurality of identification information and a plurality of corresponding data flow directions.
Accordingly, the process of setting up the computing network according to the setting instruction may include:
step 12: and sending the setting instruction to each computing core, and determining the data flow direction corresponding to each computing core by utilizing the target identification information and the configuration information in the setting instruction.
The configuration information contains a plurality of data flow directions and the unique identification information corresponding to each of them. The configuration information is preset in each computing core; after a setting instruction is acquired, the configuration information can be screened by the target identification information in the setting instruction to obtain the corresponding data flow direction. Specifically, after the data flow direction is determined, each computing core may further screen it by the identification information corresponding to itself, obtaining the data flow direction corresponding to itself, that is, the sending direction of the feature value.
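The two-stage screening can be sketched as a table lookup. The table contents and all names below are assumptions for illustration only.

```python
# Hedged sketch of the screening step: every core stores the same configuration
# table; the setting instruction's target identification selects one data flow
# plan, and each core then looks up its own entry to learn its sending direction.
CONFIG = {
    "plan_one_path":  {"core_a": "lower", "core_b": "lower"},   # hypothetical
    "plan_multipath": {"core_a": "lower", "core_b": "peer"},
}

def resolve_flow(own_id, target_identification):
    """Return this core's feature-value sending direction under the selected plan."""
    plan = CONFIG[target_identification]    # screen by target identification
    return plan[own_id]                     # screen by the core's own identity
```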
There are various possibilities for the specific content of the data flow direction. In one embodiment, the setting instruction is configured to set the computing network to one path of computation, the data flow direction is a first flow direction flowing from the upper level to the lower level, and the starting computing core is the first computing core;
correspondingly, inputting at least one characteristic value into at least one initial computing core in a computing network respectively; the process of transmitting the characteristic value according to the data flow direction by taking the initial computing core as a starting point specifically comprises the following steps:
step 21: the characteristic value is input into a first computing core, and the characteristic value is sent to a target second computing core by the first computing core.
Step 22: based on the first flow direction, starting from the target second computing core, the characteristic values are sequentially sent to the corresponding lower computing cores until the characteristic values are sent to the third computing core.
In this embodiment, the entire computing network performs computation as one computation path. In this case, there is only one starting computing core, i.e., the first computing core, and the data flow direction is the first flow direction flowing from the upper stage to the lower stage. Therefore, after the characteristic value is input into the first computing core, the first computing core can be controlled to send the characteristic value to the corresponding lower computing core, namely the target second computing core. Starting from the target second computing core, each computing core sends the obtained characteristic value to the corresponding lower computing core until the characteristic value is sent to all third computing cores. The third computing core does not have a lower computing core, and the feature value flow is ended.
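Steps 21 and 22 can be sketched as a downward traversal of the core hierarchy. The tree below and its core names are illustrative assumptions, not the patented structure.

```python
# Illustrative sketch of the one-path case: the feature value enters the first
# computing core and is forwarded along the first flow direction, level by
# level, until every third computing core has received it.
def propagate_first_flow(children, first_core, feature):
    """`children` maps a core to its (up to two) lower computing cores."""
    received, stack = {first_core: feature}, [first_core]
    while stack:
        core = stack.pop()
        for lower in children.get(core, []):    # third cores have no entry
            received[lower] = feature
            stack.append(lower)
    return received

# first core -> target second core -> two second cores -> four third cores
tree = {"c1": ["s1"], "s1": ["s2", "s3"],
        "s2": ["t1", "t2"], "s3": ["t3", "t4"]}
```

Because the third computing cores have no lower computing cores, the traversal stops there, matching the end condition in step 22.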
In a second embodiment, the setting instruction is configured to set the computing network to at least three paths of computation, and the data flow direction includes a first flow direction flowing from the upper level to the lower level and a second flow direction flowing between peers; the starting computing cores include the first computing core and starting third computing cores, the first computing core corresponding to the first flow direction and the starting third computing cores corresponding to the second flow direction;
correspondingly, inputting at least one characteristic value into at least one initial computing core in a computing network respectively; starting from the initial computing core, the process of transmitting the characteristic value according to the data flow direction may include:
step 31: the first eigenvalue is input into a first computing core, and the first eigenvalue is sent to a target second computing core by the first computing core.
Step 32: based on the first flow direction, starting from the target second computing core, sequentially sending the first characteristic value to the corresponding lower computing core until the first characteristic value is sent to the second computing core at the end of the first flow direction.
Step 33: inputting the second characteristic value into the initial third computing core, starting from the initial third computing core, and sequentially sending the second characteristic value to the subsequent peer computing cores based on the second flow direction until the second characteristic value is sent to the third computing core at the end of the second flow direction.
In this embodiment, the entire computing network may be divided into three computation paths, or more than three. In this case, besides the first flow direction from the upper level to the lower level, the data flow direction includes a second flow direction flowing between peers, i.e., the feature values flow between computing cores of the same level. It will be appreciated that since there are multiple computation paths, there are multiple starting computing cores: in addition to the first computing core, they should include at least two starting third computing cores.
The starting third computing core refers to a third computing core designated as a starting computing core, and the number of starting third computing cores may vary with the number of computation paths. It will be appreciated that when there are three computation paths, all the third computing cores are divided into two independent computation paths, so the number of starting third computing cores is at least two; when there are more computation paths, the number of starting third computing cores is larger.
In this case, with respect to the computation path in which the first computing core is the starting computing core, the present embodiment does not differ from the first embodiment described above: after the first computing core sends the first feature value to the target second computing core, the first feature value is passed down level by level according to the rule of the first flow direction until it is sent to the second computing core at the end of the first flow direction. For the computation path in which a third computing core is the starting computing core, the feature value input to the starting third computing core is the second feature value. The second flow direction likewise prescribes the data transmission order among the computing cores, so that according to the second flow direction, the subsequent peer computing core corresponding to each third computing core, starting from the starting third computing core, can be determined; each third computing core sends the second feature value to its subsequent peer computing core until the second feature value is sent to the third computing core at the end of the second flow direction.
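The second flow direction can be sketched as a chain walk among peer cores. The successor mapping and core names below are assumptions for illustration.

```python
# Hedged sketch of the second flow direction: a starting third computing core
# passes the second feature value along a chain of peer computing cores until
# the core at the end of the flow direction.
def propagate_second_flow(successor, start_core, feature):
    """`successor` maps each core to its next peer (absent at the chain end)."""
    received, core = {}, start_core
    while core is not None:
        received[core] = feature
        core = successor.get(core)
    return received

chain = {"t1": "t2", "t2": "t3"}    # t3 ends the second flow direction
```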
A third embodiment also exists in addition to the second embodiment. Specifically, the starting computing cores further include a starting second computing core, the starting second computing core corresponding to the second flow direction;
correspondingly, inputting at least one characteristic value into at least one initial computing core in a computing network respectively; starting from the initial computing core, the process of transmitting the characteristic value according to the data flow direction may include:
step 41: inputting the second characteristic value into the initial second computing core, starting from the initial second computing core, and sequentially sending the second characteristic value to the subsequent peer computing cores based on the second flow direction until the second characteristic value is sent to the second computing core at the end of the second flow direction.
In this embodiment, the starting computing cores further include a starting second computing core; that is, in addition to the third computing cores of the lowest level being split into at least two computation paths, second computing cores at certain levels higher than the third computing cores are also split into at least two computation paths. Referring to fig. 3, fig. 3 is a schematic level structure diagram between the second computing cores and the third computing cores according to an embodiment of the present application. The computing cores of the 6th level are the third computing cores, the computing cores of the 1st to 5th levels are the second computing cores, and in addition there is a first computing core at the 0th level (not shown in the figure). Illustratively, the computing cores from level 0 to level 4 may be set as one computation path, i.e. the number of channels of the network is 16. On this basis, the second computing cores of the 5th level may be divided into one computation path, and the third computing cores of the 6th level may be divided into two computation paths.
When a second computing core serves as a starting computing core, the corresponding feature value flows in the same way as when a third computing core serves as a starting computing core. For the computation path in which a second computing core is the starting computing core, the feature value input to the starting second computing core is the second feature value. The second flow direction likewise prescribes the data transmission order among the computing cores, so that according to the second flow direction, the subsequent peer computing core corresponding to each second computing core, starting from the starting second computing core, can be determined; each second computing core sends the second feature value to its subsequent peer computing core until the second feature value is sent to the second computing core at the end of the second flow direction.
After the feature value is obtained, each computing core should use the feature value and the weight value to calculate to obtain a computing result. It will be appreciated that prior to generating the calculation result using the weight values, it is necessary to assign corresponding weight values to the respective calculation cores. Specifically, the method may further include:
step 51: and determining the weight value corresponding to each computing core.
Step 52: and sending and storing the weight value to the corresponding computing core.
Each computing core needs to store its own corresponding weight value, which is used together with the feature value to generate the output value of the corresponding channel, that is, the calculation result. According to the degree of matching between the computing network and the number of model output channels, the data flow direction can be determined, and then the weight value corresponding to each computing core is determined. Each weight value is sent to its computing core and stored there, so that the computing core can directly call the weight value to perform computation after acquiring the feature value.
Further, the process of determining the weight value corresponding to each computing core may include:
step 61: an initial weight value is obtained.
Step 62: based on the data flow direction, the corresponding relation between each calculation core and the initial weight value is determined, and the determination of the weight value is completed.
The initial weight values are weight values that have not yet been allocated to computing cores; each initial weight value corresponds to a different output channel of the model, and the initial weight values can be arranged according to the data flow direction. Therefore, according to the data flow direction corresponding to the computing network, comprising the first flow direction and the second flow direction, the correspondence between the computing cores and the initial weight values is determined, and the initial weight value corresponding to a computing core is determined as the weight value of that computing core, completing the determination of the weight values.
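Under the stated assumption that the initial weight values are already arranged in data-flow order (one per output channel), the correspondence can be sketched as pairing the cores, visited in the same order, with the weights. The function name is hypothetical.

```python
# Sketch of steps 61-62: initial weight values arrive in data-flow order, so
# zipping them with the cores visited in that order yields the correspondence.
def assign_weights(cores_in_flow_order, initial_weights):
    """Map each computing core to the initial weight value of its output channel."""
    return dict(zip(cores_in_flow_order, initial_weights))
```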
This embodiment does not limit the specific manner in which the computing cores perform computation. Research in deep learning mainly takes the CNN (Convolutional Neural Networks, convolutional neural network) as its object, and because different processing scenes impose different performance requirements on the CNN, various network structures have been developed. The basic composition of a CNN is nevertheless fixed: an input layer, convolution layers, activation layers, pooling layers, and fully-connected layers. The most computationally intensive part is the convolution layer, whose main function is to perform the convolution operation between the image (feature) and the neuron (filter). Thus, in one embodiment, a convolution calculation may be performed using the computing network. After the weight values are distributed, the process of generating a calculation result based on the feature value and the corresponding weight value by each calculation core may include:
Step 71: and controlling each calculation core, and multiplying the characteristic value by the weight value to obtain a target result.
Step 72: and adding the target result and the historical calculation result stored by the calculation core to obtain a calculation result.
It will be appreciated that the convolution calculation is a multiply-add calculation. Specifically, the target result obtained by multiplying the feature value and the weight value is added to the historical calculation result stored by the calculation core, giving the final required calculation result. Accordingly, the new calculation result should also be stored in the calculation core, so that it can be used as the historical calculation result of subsequent calculations or read out after the feature values of all input channels have been processed.
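The multiply-add step in steps 71-72 can be sketched as follows; the class and field names are assumptions for illustration.

```python
# Sketch of the per-core multiply-accumulate: the target result (feature times
# weight) is added to the stored historical calculation result, and the sum is
# stored back as the new calculation result.
class ComputeCore:
    def __init__(self):
        self.history = 0.0                  # stored historical calculation result

    def step(self, feature, weight):
        self.history += feature * weight    # target result added to history
        return self.history                 # new calculation result, also stored
```

Repeated calls accumulate across input channels, which is why the result is read out only after all channels' feature values have been processed.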
The present embodiment does not limit the specific structure of the computing core. In one embodiment, please refer to fig. 4; fig. 4 is a schematic structural diagram of a specific computing core provided in an embodiment of the present application. The computing core includes an Arithmetic and Logic Unit (ALU), a weight value interface, a feature value interface, a control module, and a storage unit. The weight value interface comprises an external input port and a peer input port: the external input port is used for acquiring an externally input weight value, and the peer input port is used for acquiring a weight value input by a transmission node at the same level. The feature value interface comprises an external input port, an upper-level input port, and a peer input port: the external input port is used for acquiring an externally input feature value when the computing core serves as a starting computing core, the upper-level input port (i.e. a preceding-level input port) is used for acquiring the feature value sent by the upper-level computing core, and the peer input port is used for acquiring the feature value sent by a peer computing core. The control module is used for storing configuration information and determining the data flow direction according to the setting instruction, and the storage unit is used for storing the weight value, the feature value, and the calculation result. In addition, the computing core also comprises a selector for selecting, from the information input by the input ports, the feature value and weight value used for calculation; the selection made by the selector is related to the data flow direction.
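The selector's behavior can be sketched as a mapping from the configured flow direction to one of the input ports. The port and flow names below are assumptions, not those of fig. 4.

```python
# Illustrative selector sketch: choose which feature-value input port feeds the
# ALU according to the configured data flow direction.
def select_feature(external, upper, peer, flow_direction):
    """Pick the feature value source matching the data flow direction."""
    sources = {
        "external": external,   # core acts as a starting computing core
        "first":    upper,      # first flow direction: from the upper level
        "second":   peer,       # second flow direction: from a peer core
    }
    return sources[flow_direction]
```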
The following describes a data processing apparatus provided in an embodiment of the present application, and the data processing apparatus described below and the data processing method described above may be referred to correspondingly to each other.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application, including:
a setting module 110, configured to obtain a setting instruction, and set a computing network according to the setting instruction; the setting instruction is used for setting the data flow direction among all computing cores in the computing network;
an input module 120, configured to obtain at least one feature value, and input the at least one feature value into at least one initial computing core in a computing network respectively;
a transmission module 130, configured to transmit the characteristic value according to the data flow direction with the initial computing core as a starting point;
a calculation module 140, configured to generate a calculation result based on the feature value and the corresponding weight value by using each calculation core;
the computing network comprises a first computing core, a plurality of second computing cores and a plurality of third computing cores; the initial computing core comprises a first computing core, or comprises a first computing core and a third computing core, or comprises the first computing core, a second computing core and the third computing core; any second computing core corresponds to an upper computing core and two lower computing cores, the lower computing cores are the second computing core or a third computing core, the third computing core does not have a lower computing core, and the first computing core is an upper computing core of the target second computing core.
Optionally, the method further comprises:
the configuration storage module is used for acquiring configuration information and storing the configuration information to each computing core; the configuration information comprises a plurality of identification information and a plurality of corresponding data streams;
accordingly, the setting module 110 includes:
the configuration indicating unit is used for sending the setting instruction to each computing core, and determining the data flow direction corresponding to each computing core by utilizing the target identification information and the configuration information in the setting instruction.
Optionally, the setting instruction is configured to set the computing network to one path of computation, and the data flow is a first flow direction from an upper level to a lower level, and the initial computing core is a first computing core;
accordingly, the input module 120 includes:
a first input unit for inputting the feature value into a first calculation core;
accordingly, the transmission module 130 includes:
the first transmission unit is used for transmitting the characteristic value to the target second computing core by utilizing the first computing core;
and the second transmission unit is used for sequentially transmitting the characteristic values to the corresponding lower computing cores from the target second computing core based on the first flow direction until the characteristic values are transmitted to the third computing core.
Optionally, the setting instruction is configured to set the computing network to at least three paths of computation, and the data flow is a first flow direction from an upper level to a lower level and a second flow direction flowing between the same levels, the initial computing core includes a first computing core and an initial third computing core, the first computing core corresponds to the first flow direction, and the initial third computing core corresponds to the second flow direction;
Accordingly, the input module 120 includes:
a second input unit for inputting the first characteristic value into the first calculation core;
accordingly, the transmission module 130 includes:
the third transmission unit is used for transmitting the first characteristic value to the target second computing core by utilizing the first computing core;
the fourth transmission unit is used for sequentially transmitting the first characteristic value to the corresponding lower-level computing cores from the target second computing core based on the first flow direction until the first characteristic value is transmitted to the second computing core at the end of the first flow direction;
and a fifth transmission unit, configured to input the second feature value into an initial third computing core, and sequentially send the second feature value to a subsequent peer computing core based on the second flow direction from the initial third computing core until the second feature value is sent to a third computing core at the end of the second flow direction.
Optionally, the initiating the computing core further comprises initiating a second computing core, the second computing core corresponding to a second flow direction;
accordingly, the input module 120 includes:
a third input unit for inputting a second feature value into the starting second computing core;
accordingly, the transmission module 130 includes:
and a sixth transmission unit, configured to sequentially send, from the start second computing core, the second eigenvalue to the subsequent peer computing cores based on the second flow direction until the second eigenvalue is sent to the second computing core at the end of the second flow direction.
Optionally, the method further comprises:
the weight determining module is used for determining weight values corresponding to the computing cores respectively;
and the weight storage module is used for sending and storing the weight value to the corresponding calculation core.
Optionally, the weight determining module includes:
the weight acquisition unit is used for acquiring an initial weight value;
and the corresponding relation determining unit is used for determining the corresponding relation between each calculation core and the initial weight value based on the data flow direction and finishing the determination of the weight value.
Optionally, the computing module 140 includes:
the multiplication unit is used for controlling each calculation core and multiplying the characteristic value and the weight value to obtain a target result;
and the adding unit is used for adding the target result and the historical calculation result stored by the calculation core to obtain a calculation result.
Optionally, the computing core includes an arithmetic logic unit, a weight value interface, a feature value interface, a control module and a storage unit, the weight value interface includes an external input port and a peer input port, the feature value interface includes an external input port, a superior input port and a peer input port, the control module is used for storing configuration information, determining a data flow direction according to a setting instruction, and the storage unit is used for storing the weight value, the feature value and a computing result.
The following describes a data processing system provided in an embodiment of the present application, where the data processing system described below and the data processing method described above may be referred to correspondingly.
The application also provides a data processing system, comprising a computing network, wherein the computing network comprises a first computing core, a plurality of second computing cores and a plurality of third computing cores; any second computing core corresponds to an upper computing core and two lower computing cores, the lower computing cores are the second computing core or a third computing core, the third computing core does not have a lower computing core, and the first computing core is an upper computing core of the target second computing core.
The electronic device provided in the embodiments of the present application is described below, and the electronic device described below and the data processing method described above may be referred to correspondingly.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application. Wherein the electronic device 100 may include a processor 101 and a memory 102, and may further include one or more of a multimedia component 103, an information input/information output (I/O) interface 104, and a communication component 105.
Wherein the processor 101 is configured to control the overall operation of the electronic device 100 to perform all or part of the steps in the data processing method described above; the memory 102 is used to store various types of data to support operation at the electronic device 100, which may include, for example, instructions for any application or method operating on the electronic device 100, as well as application-related data. The Memory 102 may be implemented by any type or combination of volatile or non-volatile Memory devices, such as one or more of static random access Memory (Static Random Access Memory, SRAM), electrically erasable programmable Read-Only Memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), erasable programmable Read-Only Memory (Erasable Programmable Read-Only Memory, EPROM), programmable Read-Only Memory (Programmable Read-Only Memory, PROM), read-Only Memory (ROM), magnetic Memory, flash Memory, magnetic disk, or optical disk.
The multimedia component 103 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signals may be further stored in the memory 102 or transmitted through the communication component 105. The audio component further comprises at least one speaker for outputting audio signals. The I/O interface 104 provides an interface between the processor 101 and other interface modules, such as a keyboard, a mouse, or buttons; these buttons may be virtual buttons or physical buttons. The communication component 105 is used for wired or wireless communication between the electronic device 100 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (Near Field Communication, NFC for short), 2G, 3G or 4G, or a combination of one or more thereof; the communication component 105 may accordingly comprise a Wi-Fi part, a Bluetooth part, and an NFC part.
The electronic device 100 may be implemented by one or more application specific integrated circuits (Application Specific Integrated Circuit, abbreviated as ASIC), digital signal processors (Digital Signal Processor, abbreviated as DSP), digital signal processing devices (Digital Signal Processing Device, abbreviated as DSPD), programmable logic devices (Programmable Logic Device, abbreviated as PLD), field programmable gate arrays (Field Programmable Gate Array, abbreviated as FPGA), controllers, microcontrollers, microprocessors, or other electronic components for performing the data processing methods set forth in the above embodiments.
The following describes a computer readable storage medium provided in an embodiment of the present application, where the computer readable storage medium described below and the data processing method described above may be referred to correspondingly.
The present application also provides a computer readable storage medium having a computer program stored thereon, which when executed by a processor, implements the steps of the data processing method described above.
The computer readable storage medium may include: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Those skilled in the art may implement the described functionality using different approaches for each particular application, but such implementation should not be considered to be beyond the scope of this application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it is further noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms include, comprise, or any other variation is intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The principles and embodiments of the present application are described herein with specific examples; the above examples are provided only to assist in understanding the method of the present application and its core idea. Meanwhile, those skilled in the art may make modifications to the specific embodiments and the scope of application in accordance with the ideas of the present application; in view of the above, the contents of this description should not be construed as limiting the present application.
Claims (9)
1. A method of data processing, comprising:
acquiring a setting instruction, and setting a computing network according to the setting instruction; the setting instruction is used for setting the data flow direction among all computing cores in the computing network;
acquiring at least one characteristic value, and respectively inputting the at least one characteristic value into at least one initial computing core in the computing network;
transmitting the characteristic value according to the data flow direction, starting from the initial computing core;
generating a calculation result based on the characteristic value and the corresponding weight value by using each calculation core;
the computing network comprises a first computing core, a plurality of second computing cores, and a plurality of third computing cores; the initial computing core comprises the first computing core, or comprises the first computing core and the third computing core, or comprises the first computing core, the second computing core, and the third computing core; any one of the second computing cores corresponds to one upper computing core and two lower computing cores, each lower computing core being a second computing core or a third computing core; the third computing core has no lower computing core, and the first computing core is the upper computing core of the target second computing core;
The data processing method further comprises the following steps:
determining weight values corresponding to the computing cores respectively; the weight value is sent to and stored in the corresponding computing core;
the determining the weight value corresponding to each computing core comprises the following steps:
acquiring an initial weight value; based on the data flow direction, determining the corresponding relation between each computing core and the initial weight value, and completing the determination of the weight value;
the setting instruction is used for setting the computing network to perform at least three-path computation; the data flow direction comprises a first flow direction flowing from an upper level to a lower level and a second flow direction flowing between computing cores of the same level; the initial computing core comprises the first computing core and an initial third computing core, the first computing core corresponding to the first flow direction and the initial third computing core corresponding to the second flow direction;
correspondingly, the respectively inputting the at least one characteristic value into at least one initial computing core in the computing network and transmitting the characteristic value according to the data flow direction starting from the initial computing core comprises:
inputting a first characteristic value into the first computing core, and transmitting the first characteristic value to the target second computing core by using the first computing core; based on the first flow direction, starting from the target second computing core, sequentially sending the first characteristic value to a corresponding lower computing core until the first characteristic value is sent to the second computing core at the end of the first flow direction; inputting a second characteristic value into the initial third computing core, starting from the initial third computing core, and sequentially sending the second characteristic value to a subsequent peer computing core based on the second flow direction until the second characteristic value is sent to the third computing core at the end of the second flow direction;
the initial computing core further comprises an initial second computing core, the initial second computing core corresponding to the second flow direction;
correspondingly, the respectively inputting the at least one characteristic value into at least one initial computing core in the computing network and transmitting the characteristic value according to the data flow direction starting from the initial computing core comprises:
inputting a second characteristic value into the initial second computing core, starting from the initial second computing core, and sequentially sending the second characteristic value to a subsequent peer computing core based on the second flow direction until the second characteristic value is sent to the second computing core at the end of the second flow direction.
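The multi-path arrangement recited in claim 1 can be illustrated with a minimal software model. This is only a sketch of the claimed behavior, not the claimed hardware; every name here (`Core`, `send_first_flow`, `send_second_flow`) is invented for illustration:

```python
# Software model of the claimed computing network: each core stores a
# weight and an accumulated result; the first flow direction links an
# upper core to its lower core, the second flow direction links peers.

class Core:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight
        self.result = 0          # historical calculation result
        self.next_lower = None   # first flow direction (upper -> lower)
        self.next_peer = None    # second flow direction (peer -> peer)

    def compute(self, feature):
        # Per claim 4: multiply the characteristic value by the stored
        # weight value and accumulate with the historical result.
        self.result += feature * self.weight

def send_first_flow(start, feature):
    """Propagate a characteristic value along the first flow direction
    until the end of the upper-to-lower chain is reached."""
    core = start
    while core is not None:
        core.compute(feature)
        core = core.next_lower

def send_second_flow(start, feature):
    """Propagate a characteristic value along the second flow direction
    between computing cores of the same level."""
    core = start
    while core is not None:
        core.compute(feature)
        core = core.next_peer
```

In the three-path case of claim 1, a first characteristic value would enter at the first core and travel via `send_first_flow`, while a second characteristic value would enter at an initial third core and travel via `send_second_flow`.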
2. The data processing method according to claim 1, characterized by further comprising:
acquiring configuration information and storing the configuration information to each computing core; the configuration information comprises a plurality of identification information and a plurality of corresponding data streams;
correspondingly, the setting the computing network according to the setting instruction comprises the following steps:
and sending the setting instruction to each computing core, and determining the data flow direction corresponding to each computing core by utilizing the target identification information and the configuration information in the setting instruction.
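Claim 2's resolution of a setting instruction against stored configuration information amounts to a lookup from identification information to a data flow direction. A hedged sketch — the identifier values, flow-direction labels, and instruction layout below are all invented for illustration:

```python
# Hypothetical configuration information stored in each computing core:
# identification information -> data flow direction to adopt.
CONFIG = {
    0x01: "one_path",    # first flow direction only (cf. claim 3)
    0x03: "three_path",  # first flow direction plus second flow direction
}

def set_data_flow(setting_instruction):
    """Resolve the target identification information carried by a
    setting instruction against the stored configuration (claim 2)."""
    target_id = setting_instruction["target_id"]
    return CONFIG[target_id]
```

Broadcasting the same setting instruction to every core then lets each core determine its own data flow direction locally.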
3. The method according to claim 1, wherein the setting instruction is used for setting the computing network to perform one-path computation, the data flow direction is a first flow direction flowing from an upper level to a lower level, and the initial computing core is the first computing core;
correspondingly, the respectively inputting the at least one characteristic value into at least one initial computing core in the computing network and transmitting the characteristic value according to the data flow direction starting from the initial computing core comprises:
inputting the characteristic value into the first computing core, and transmitting the characteristic value to the target second computing core by using the first computing core;
and based on the first flow direction, starting from the target second computing core, sequentially sending the characteristic values to corresponding lower computing cores until the characteristic values are sent to the third computing core.
4. The data processing method according to claim 1, wherein the generating, with each computing core, a calculation result based on the characteristic value and the corresponding weight value comprises:
controlling each computing core to multiply the characteristic value by the weight value to obtain a target result;
and adding the target result to the historical calculation result stored in the computing core to obtain the calculation result.
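The per-core operation of claim 4 is a multiply-accumulate (MAC) step, the basic primitive of systolic neural-network accelerators. As a one-line sketch (the function name is illustrative, not from the patent):

```python
def mac_step(history, feature, weight):
    """Claim 4 as arithmetic: target result = feature * weight;
    calculation result = target result + stored historical result."""
    target = feature * weight
    return history + target
```

For example, a core holding a historical result of 10 that receives feature 3 with weight 2 stores 10 + 3 * 2 = 16 as its new calculation result.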
5. The data processing method of claim 1, wherein the computing core comprises an arithmetic logic unit, a weight value interface, a characteristic value interface, a control module, and a storage unit; the weight value interface comprises an external input port and a peer input port; the characteristic value interface comprises an external input port, an upper input port, and a peer input port; the control module is configured to store configuration information and determine the data flow direction according to the setting instruction; and the storage unit is configured to store the weight value, the characteristic value, and the calculation result.
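Claim 5 enumerates the ports and modules of one computing core; the structure can be summarised as a plain record. The field names below are paraphrases of the claim language for illustration only, not a real hardware API:

```python
from dataclasses import dataclass, field

@dataclass
class WeightInterface:
    """Claim 5: weight value interface with two input ports."""
    external_port: int = 0
    peer_port: int = 0

@dataclass
class FeatureInterface:
    """Claim 5: characteristic value interface with three input ports."""
    external_port: int = 0
    upper_port: int = 0
    peer_port: int = 0

@dataclass
class ComputingCore:
    """One computing core: ALU is implicit; the control module state and
    the storage unit contents are modelled as plain dictionaries."""
    weight_if: WeightInterface = field(default_factory=WeightInterface)
    feature_if: FeatureInterface = field(default_factory=FeatureInterface)
    config: dict = field(default_factory=dict)   # control module
    storage: dict = field(default_factory=dict)  # weight/feature/result
```

The separate peer and upper input ports are what let the same core participate in either the first (upper-to-lower) or the second (peer-to-peer) flow direction, selected by the control module.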
6. A data processing apparatus, comprising:
the setting module is used for acquiring a setting instruction and setting a computing network according to the setting instruction; the setting instruction is used for setting the data flow direction among all computing cores in the computing network;
the input module is used for acquiring at least one characteristic value and inputting the at least one characteristic value into at least one initial computing core in the computing network respectively;
the transmission module is used for transmitting the characteristic value according to the data flow direction by taking the initial computing core as a starting point;
The computing module is used for generating a computing result based on the characteristic value and the corresponding weight value by utilizing each computing core;
the computing network comprises a first computing core, a plurality of second computing cores, and a plurality of third computing cores; the initial computing core comprises the first computing core, or comprises the first computing core and the third computing core, or comprises the first computing core, the second computing core, and the third computing core; any one of the second computing cores corresponds to one upper computing core and two lower computing cores, each lower computing core being a second computing core or a third computing core; the third computing core has no lower computing core, and the first computing core is the upper computing core of the target second computing core;
wherein the data processing apparatus is specifically configured to:
determining weight values corresponding to the computing cores respectively; the weight value is sent to and stored in the corresponding computing core;
the data processing apparatus is specifically configured to:
acquiring an initial weight value; based on the data flow direction, determining the corresponding relation between each computing core and the initial weight value, and completing the determination of the weight value;
The setting instruction is used for setting the computing network to be at least three paths of computing, the data flow direction is a first flow direction from an upper level to a lower level and a second flow direction flowing between the same level, the initial computing core comprises the first computing core and an initial third computing core, the first computing core corresponds to the first flow direction, and the initial third computing core corresponds to the second flow direction;
the data processing apparatus is specifically configured to:
inputting a first characteristic value into the first computing core, and transmitting the first characteristic value to the target second computing core by using the first computing core; based on the first flow direction, starting from the target second computing core, sequentially sending the first characteristic value to a corresponding lower computing core until the first characteristic value is sent to the second computing core at the end of the first flow direction; inputting a second characteristic value into the initial third computing core, starting from the initial third computing core, and sequentially sending the second characteristic value to a subsequent peer computing core based on the second flow direction until the second characteristic value is sent to the third computing core at the end of the second flow direction;
the initial computing core further comprises an initial second computing core, the initial second computing core corresponding to the second flow direction;
the data processing apparatus is specifically configured to:
inputting a second characteristic value into the initial second computing core, starting from the initial second computing core, and sequentially sending the second characteristic value to a subsequent peer computing core based on the second flow direction until the second characteristic value is sent to the second computing core at the end of the second flow direction.
7. A data processing system comprising a computing network, wherein the computing network comprises at least one initial computing core, and the computing network comprises a first computing core, a plurality of second computing cores, and a plurality of third computing cores; the initial computing core comprises the first computing core, or comprises the first computing core and the third computing core, or comprises the first computing core, the second computing core, and the third computing core; any one of the second computing cores corresponds to one upper computing core and two lower computing cores, each lower computing core being a second computing core or a third computing core; the third computing core has no lower computing core, and the first computing core is the upper computing core of the target second computing core;
The data processing system is used for acquiring a setting instruction and setting the computing network according to the setting instruction; the setting instruction is used for setting the data flow direction among all computing cores in the computing network;
the data processing system is used for acquiring at least one characteristic value and inputting the at least one characteristic value into at least one initial computing core in the computing network respectively;
the data processing system is used for transmitting the characteristic value according to the data flow direction by taking the initial computing core as a starting point;
the data processing system is used for generating a calculation result based on the characteristic value and the corresponding weight value by utilizing each calculation core;
the data processing system is used for:
determining weight values corresponding to the computing cores respectively; the weight value is sent to and stored in the corresponding computing core;
the data processing system is used for:
acquiring an initial weight value; based on the data flow direction, determining the corresponding relation between each computing core and the initial weight value, and completing the determination of the weight value;
the setting instruction is used for setting the computing network to be at least three paths of computing, the data flow direction is a first flow direction from an upper level to a lower level and a second flow direction flowing between the same level, the initial computing core comprises the first computing core and an initial third computing core, the first computing core corresponds to the first flow direction, and the initial third computing core corresponds to the second flow direction;
The data processing system is used for:
inputting a first characteristic value into the first computing core, and transmitting the first characteristic value to the target second computing core by using the first computing core; based on the first flow direction, starting from the target second computing core, sequentially sending the first characteristic value to the corresponding lower computing core until the first characteristic value is sent to the second computing core at the end of the first flow direction; inputting a second characteristic value into the initial third computing core, starting from the initial third computing core, and sequentially sending the second characteristic value to a subsequent peer computing core based on the second flow direction until the second characteristic value is sent to the third computing core at the end of the second flow direction;
the initial computing core further comprises an initial second computing core, the initial second computing core corresponding to the second flow direction;
the data processing system is used for:
inputting a second characteristic value into the initial second computing core, starting from the initial second computing core, and sequentially sending the second characteristic value to a subsequent peer computing core based on the second flow direction until the second characteristic value is sent to the second computing core at the end of the second flow direction.
8. An electronic device comprising a memory and a processor, wherein:
the memory is used for storing a computer program;
the processor is configured to execute the computer program to implement the data processing method according to any one of claims 1 to 5.
9. A computer-readable storage medium for storing a computer program, wherein the computer program when executed by a processor implements the data processing method according to any one of claims 1 to 5.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111165135.XA CN114021708B (en) | 2021-09-30 | 2021-09-30 | Data processing method, device and system, electronic equipment and storage medium |
PCT/CN2022/090194 WO2023050807A1 (en) | 2021-09-30 | 2022-04-29 | Data processing method, apparatus, and system, electronic device, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111165135.XA CN114021708B (en) | 2021-09-30 | 2021-09-30 | Data processing method, device and system, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114021708A CN114021708A (en) | 2022-02-08 |
CN114021708B true CN114021708B (en) | 2023-08-01 |
Family
ID=80055496
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111165135.XA Active CN114021708B (en) | 2021-09-30 | 2021-09-30 | Data processing method, device and system, electronic equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114021708B (en) |
WO (1) | WO2023050807A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114021708B (en) * | 2021-09-30 | 2023-08-01 | 浪潮电子信息产业股份有限公司 | Data processing method, device and system, electronic equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108427584A (en) * | 2018-03-19 | 2018-08-21 | 清华大学 | The configuration method of the chip and the chip with parallel computation core quickly started |
CN111008243A (en) * | 2019-11-21 | 2020-04-14 | 山东爱城市网信息技术有限公司 | Block chain-based donation flow direction recording supervision method, device and storage medium |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7796527B2 (en) * | 2006-04-13 | 2010-09-14 | International Business Machines Corporation | Computer hardware fault administration |
US20080031243A1 (en) * | 2006-08-01 | 2008-02-07 | Gidon Gershinsky | Migration of Message Topics over Multicast Streams and Groups |
US20090040946A1 (en) * | 2007-08-06 | 2009-02-12 | Archer Charles J | Executing an Allgather Operation on a Parallel Computer |
CN110046704B (en) * | 2019-04-09 | 2022-11-08 | 深圳鲲云信息科技有限公司 | Deep network acceleration method, device, equipment and storage medium based on data stream |
CN111752691B (en) * | 2020-06-22 | 2023-11-28 | 深圳鲲云信息科技有限公司 | Method, device, equipment and storage medium for sorting AI (advanced technology attachment) calculation graphs |
CN114021708B (en) * | 2021-09-30 | 2023-08-01 | 浪潮电子信息产业股份有限公司 | Data processing method, device and system, electronic equipment and storage medium |
- 2021-09-30: CN application CN202111165135.XA filed; patent CN114021708B granted (status: Active)
- 2022-04-29: WO application PCT/CN2022/090194 filed (status: Application Filing)
Also Published As
Publication number | Publication date |
---|---|
CN114021708A (en) | 2022-02-08 |
WO2023050807A1 (en) | 2023-04-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6726246B2 (en) | Method and apparatus for performing operations in a convolutional neural network and non-transitory storage medium | |
TWI803663B (en) | A computing device and computing method | |
KR102443546B1 (en) | matrix multiplier | |
CN107454965B (en) | Batch processing in a neural network processor | |
US11521067B2 (en) | Decentralized distributed deep learning | |
KR101803409B1 (en) | Computing Method and Device for Multilayer Neural Network | |
CN110889439B (en) | Image feature extraction method and device, electronic equipment and storage medium | |
CN109543140A (en) | A kind of convolutional neural networks accelerator | |
CN107491416B (en) | Reconfigurable computing structure suitable for convolution requirement of any dimension and computing scheduling method and device | |
US20160187861A1 (en) | Systems and methods to adaptively select execution modes | |
CN114021708B (en) | Data processing method, device and system, electronic equipment and storage medium | |
CN116991560B (en) | Parallel scheduling method, device, equipment and storage medium for language model | |
CN113449842A (en) | Distributed automatic differentiation method and related device | |
CN114386349A (en) | Wiring method and device for system-level digital circuit, equipment and storage medium | |
EP4052188B1 (en) | Neural network instruction streaming | |
CN114065121A (en) | Calculation method and equipment for solving Itanium model | |
WO2020149178A1 (en) | Neural network compression device | |
CN111027688A (en) | Neural network calculator generation method and device based on FPGA | |
TWI817490B (en) | Computer-implemented method of propagation latency reduction in neural network | |
US11297127B2 (en) | Information processing system and control method of information processing system | |
WO2020051918A1 (en) | Neuronal circuit, chip, system and method therefor, and storage medium | |
CN116805155B (en) | LSTM network processing method, device, equipment and readable storage medium | |
CN114595068A (en) | Calculation graph scheduling method and device, electronic equipment and readable storage medium | |
CN113919489A (en) | Method and device for improving resource utilization rate of on-chip multiplier-adder of FPGA | |
WO2016082868A1 (en) | Orchestrator and method for virtual network embedding using offline feedback |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||