CN115048143A - Instruction generation method, task processing method and electronic equipment - Google Patents


Info

Publication number
CN115048143A
Authority
CN
China
Prior art keywords
data
decompression
instruction
target
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210973118.7A
Other languages
Chinese (zh)
Inventor
吴臻志
祝夭龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Lynxi Technology Co Ltd
Original Assignee
Beijing Lynxi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Lynxi Technology Co Ltd filed Critical Beijing Lynxi Technology Co Ltd
Priority to CN202210973118.7A
Publication of CN115048143A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/30003: Arrangements for executing specific machine instructions

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The disclosure provides an instruction generation method, a task processing method, and an electronic device, and belongs to the field of computer technology. The instruction generation method includes: determining, from the task data of a target task to be executed by a processing unit, target data that satisfies a compression condition and a predicted decompression duration of the target data, where the predicted decompression duration indicates the time needed to decompress the specified compressed data corresponding to the target data; and generating an instruction execution sequence according to the predicted decompression duration and the processing instruction corresponding to the target data, where the instruction execution sequence includes at least a decompression instruction corresponding to the target data. Embodiments of the disclosure can improve task processing efficiency.

Description

Instruction generation method, task processing method and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an instruction generation method, a task processing method, an electronic device, and a computer-readable storage medium.
Background
In a processing system, a processing core and its memory are tightly coupled. The memory is typically located inside the processing core, so its capacity is limited. To relieve storage pressure, infrequently used data can be compressed in advance before being stored. However, compressing and decompressing data consumes additional time, which reduces the processing efficiency of the system.
Disclosure of Invention
The disclosure provides an instruction generation method, a task processing method, an electronic device and a computer-readable storage medium.
In a first aspect, the present disclosure provides an instruction generation method applied to a compiler, including: determining, from the task data of a target task to be executed by a processing unit, target data that satisfies a compression condition and a predicted decompression duration of the target data, where the predicted decompression duration indicates the time needed to decompress the specified compressed data corresponding to the target data; and generating an instruction execution sequence according to the predicted decompression duration and the processing instruction corresponding to the target data, where the instruction execution sequence includes at least a decompression instruction corresponding to the target data.
In a second aspect, the present disclosure provides a task processing method applied to a decompressor, including: in response to a decompression request sent by a processing unit, acquiring the specified compressed data corresponding to the decompression request, where the decompression request is sent by the processing unit according to a decompression instruction in an instruction execution sequence; and decompressing the specified compressed data to obtain target decompressed data, so that the processing unit can perform task processing based on the target decompressed data when the processing instruction corresponding to the decompression instruction is triggered.
In a third aspect, the present disclosure provides a task processing method applied to a processing unit, including: when a decompression instruction in an instruction execution sequence is triggered, sending a decompression request to a decompressor, where the decompression request instructs the decompressor to decompress specified compressed data to obtain target decompressed data; and when the processing instruction corresponding to the decompression instruction is triggered, performing task processing according to the target decompressed data.
In a fourth aspect, the present disclosure provides an electronic device comprising: at least one processing unit and at least one decompressor. The processing unit is configured to send a decompression request to the decompressor when a decompression instruction in an instruction execution sequence is triggered, and to perform task processing according to target decompressed data when the processing instruction corresponding to the decompression instruction is triggered. The decompressor is configured to, in response to the decompression request sent by the processing unit, acquire the specified compressed data corresponding to the decompression request and decompress it to obtain the target decompressed data. The position of the decompression instruction in the instruction execution sequence is determined according to the predicted decompression duration of the target data and the processing instruction, where the predicted decompression duration indicates the time needed to decompress the specified compressed data corresponding to the target data.
In a fifth aspect, the present disclosure provides an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores one or more computer programs executable by the at least one processor, the one or more computer programs being executable by the at least one processor to enable the at least one processor to perform the above-mentioned task processing method.
In a sixth aspect, the present disclosure provides an electronic device comprising: a plurality of processing cores; and a network on chip configured to exchange data among the plurality of processing cores and with external devices; one or more instructions are stored in one or more of the processing cores, and the one or more instructions are executed by the one or more processing cores to enable them to perform the task processing method described above.
In a seventh aspect, the present disclosure provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processing core/processor, implements the task processing method described above.
According to the embodiments provided by the disclosure, a decompression instruction is inserted several instructions ahead of the processing instruction that uses the target data, at a position determined by the predicted decompression duration of that data. When the processing unit triggers the decompression instruction, it sends a decompression request to the decompressor, so the decompressor finishes decompressing the specified compressed data before the processing unit reaches the corresponding processing instruction, and the processing unit can then perform task processing directly on the target decompressed data. While the decompressor works, the processing unit continues executing the instructions between the decompression instruction and the processing instruction, unaffected by the decompression operation, which improves task processing efficiency. In addition, because the number of instructions between the decompression instruction and the processing instruction is kept small, the target decompressed data does not occupy storage space for a long time due to premature decompression, which effectively relieves storage pressure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. The above and other features and advantages will become more apparent to those skilled in the art by describing in detail exemplary embodiments thereof with reference to the attached drawings, which are shown below.
Fig. 1 is a flowchart of an instruction generating method according to an embodiment of the present disclosure.
Fig. 2 is a flowchart of a task processing method according to an embodiment of the present disclosure.
Fig. 3 is a schematic diagram of a single-core processing system according to an embodiment of the present disclosure.
Fig. 4 is a schematic diagram of a many-core processing system according to an embodiment of the present disclosure.
Fig. 5 is a flowchart of a task processing method according to an embodiment of the present disclosure.
Fig. 6 is a schematic diagram of a working process of a task processing method according to an embodiment of the present disclosure.
Fig. 7 is a schematic data timing diagram of a task processing method according to an embodiment of the present disclosure.
Fig. 8 is a schematic diagram of a working process of a task processing method according to an embodiment of the present disclosure.
Fig. 9 is a schematic view of an electronic device provided in an embodiment of the present disclosure.
Fig. 10 is a schematic view of an operating process of an electronic device according to an embodiment of the present disclosure.
Fig. 11 is a block diagram of an electronic device provided in an embodiment of the present disclosure.
Fig. 12 is a block diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
To facilitate a better understanding of the technical aspects of the present disclosure, exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, wherein various details of the embodiments of the present disclosure are included to facilitate an understanding, and they should be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Generally, a processing unit (e.g., a processing core) and a memory in a processing system are tightly coupled, and the memory may even be placed inside the processing unit, so the memory capacity is small, which can limit the data processing capability of the processing unit or the processing system. To relieve storage pressure, infrequently used code segments and constant regions (e.g., lookup tables) may be stored in compressed form and decompressed when the data is needed, but the decompression step can reduce processing efficiency.
In view of this, embodiments of the present disclosure provide an instruction generation method, a task processing method, and an electronic device. A compiler determines which target data should be compressed for storage according to a compression condition, determines the corresponding predicted decompression duration, and inserts a decompression instruction at a position ahead of the processing instruction that uses the target data, chosen according to that predicted duration. As a result, the decompression instruction is triggered before the processing unit needs the data, instructing the decompressor to decompress the corresponding specified compressed data, so the data is already decompressed when the processing unit uses it and no waiting for decompression is needed. Moreover, while the decompressor performs the decompression operation, the processing unit continues to execute the other instructions in the instruction execution sequence normally, so its instruction stream is not disturbed and its processing efficiency is improved.
A first aspect of the embodiments of the present disclosure provides an instruction generation method, which is applicable to a compiler.
Fig. 1 is a flowchart of an instruction generating method according to an embodiment of the present disclosure. Referring to fig. 1, the method includes the following steps.
In step S11, target data satisfying a compression condition and a predicted decompression duration of the target data are determined from the task data of a target task to be executed by a processing unit, where the predicted decompression duration indicates the time needed to decompress the specified compressed data corresponding to the target data.
In step S12, an instruction execution sequence is generated according to the predicted decompression duration and the processing instruction corresponding to the target data; the instruction execution sequence includes at least a decompression instruction corresponding to the target data.
In some alternative implementations, the processing unit may be a functional unit with data processing capabilities. For example, the processing unit may be a processing core in a single-core processing system, or may be a processing core in a many-core processing system, and the processing unit is not limited in the embodiments of the present disclosure.
In some alternative implementations, the target task may include any one of an image processing task, a voice processing task, a text processing task, and a video processing task, and the task data includes various types of data related to the target task.
Illustratively, the task data of the target task may include task type, task code, parameters required to execute the target task, and the like.
It should be noted that, the target task and the task data are only examples, and the embodiment of the disclosure does not limit this.
In some optional implementations, in order to reduce the storage space occupied by the task data, the task data may be compressed and then stored.
In some alternative implementations, the task data may include multiple items that differ in usage frequency, importance, data type, and so on. Part of the task data can therefore be compressed for storage while the rest is stored in its ordinary form. This arrangement reduces the storage space occupied by the task data while avoiding compressing data that is frequently used, highly important, or hard to compress, preserving convenient access to it.
In some alternative implementations, the target data to be compressed may be determined from the task data of the target task by setting a compression condition. The compression condition may be set according to any one or more of experience, statistical data, task requirements, and the like, in combination with the data type, the usage frequency, the importance, and the like of the task data, and the setting manner of the compression condition and the content of the compression condition are not limited in the embodiment of the present disclosure. In the process of executing precompilation, the compiler can analyze and process the task data of the target task, so as to determine the target data meeting the compression condition.
For example, task data with a low usage frequency may be treated as satisfying the compression condition, so that frequently used task data is not repeatedly compressed and decompressed, relieving compression and decompression pressure. The compiler can analyze the task data of the target task and determine the usage frequency of each item, identifying the infrequently used items as target data.
For example, a frequency threshold may be set in the compression condition, and the task data satisfying the compression condition may be determined by comparing the frequency threshold with the usage frequency of each item of task data.
For example, certain specific types of task data may be treated as satisfying the compression condition. By analyzing the task data, the compiler can determine the type of each item and identify the items of those specific types as target data.
For example, considering that sparse data has a better compression effect (the compression effect means that the amount of data after compression is effectively reduced compared to that before compression), sparse data may be set as task data that satisfies the compression condition. Accordingly, the compiler may determine whether each item of task data is sparse data, thereby determining task data satisfying the compression condition.
It should be noted that the above compression conditions are only examples, and the embodiments of the present disclosure do not limit this.
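The compression-condition checks described above (a usage-frequency threshold and a sparsity test) can be sketched as follows. This is a minimal Python illustration, not the patent's implementation; the `TaskData` fields, the `zero_ratio` heuristic, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TaskData:
    name: str
    values: list          # raw payload of this task-data item
    usage_frequency: int  # expected number of uses during the task

def is_sparse(data: TaskData, zero_ratio: float = 0.7) -> bool:
    """Treat data as sparse when most of its elements are zero
    (sparse data compresses well)."""
    if not data.values:
        return False
    zeros = sum(1 for v in data.values if v == 0)
    return zeros / len(data.values) >= zero_ratio

def select_target_data(task_data: list, freq_threshold: int = 5) -> list:
    """Apply the compression condition: keep items that are used
    infrequently or that are sparse."""
    return [d for d in task_data
            if d.usage_frequency < freq_threshold or is_sparse(d)]
```

A compiler pass would run such a filter over the analyzed task data and hand only the selected items to the compression stage.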
It should be further noted that after the target data satisfying the compression condition is determined, it may be compressed to obtain the specified compressed data, and decompressing the specified compressed data yields the target decompressed data. Because compression and/or decompression may be lossy, the target decompressed data may not be exactly identical to the target data, but any such discrepancy either does not affect execution of the target task or affects it only within an acceptable range.
As described above, after the target data satisfying the compression condition is determined, the decompression prediction time of each target data is also determined.
In some optional implementations, the predicted decompression duration indicates the time needed to decompress the specified compressed data corresponding to the target data; in other words, it is the compiler's prediction of how long that decompression will take. The predicted decompression duration is related to at least one of the data volume, data type, decompression mode, and storage space information of the specified compressed data.
For example, when the data size of the specified compressed data is larger, and/or the data type of the specified compressed data is more complicated, and/or the decompression manner is more cumbersome, it may be determined that the decompression prediction time duration is also relatively longer.
Illustratively, the storage space information includes an arrangement of the data areas; accordingly, the compiler may obtain the arrangement of the data region in advance, and when determining the decompression prediction duration, it is further required to combine the arrangement of the data region to ensure that the data region has enough storage space for storing the decompressed data (i.e., the target decompressed data) after the specified compressed data is decompressed.
It should be noted that, the above determination manner for the decompression prediction time duration is only an example, and the embodiment of the present disclosure does not limit this.
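As one illustration of such a prediction, the duration can be modeled as a fixed setup overhead plus a term proportional to the compressed data volume, with a per-mode throughput calibrated offline. The modes, throughput figures, and overhead below are hypothetical placeholders, not values from the disclosure.

```python
# Hypothetical decompression throughput per mode, in bytes per nanosecond,
# e.g. measured offline for each supported decompression algorithm.
THROUGHPUT_BYTES_PER_NS = {
    "rle": 4.0,      # run-length decoding: fast
    "huffman": 1.0,  # entropy decoding: slower
}

def predict_decompression_ns(compressed_size_bytes: int,
                             mode: str,
                             fixed_overhead_ns: float = 10.0) -> float:
    """Predict decompression time as a fixed setup cost plus a
    size-proportional term for the chosen decompression mode."""
    return fixed_overhead_ns + compressed_size_bytes / THROUGHPUT_BYTES_PER_NS[mode]
```

Under this model, 80 bytes of run-length-encoded data would be predicted to take 30 ns, while the same volume of Huffman-coded data would take 90 ns, reflecting the text's observation that larger volumes and more cumbersome decompression modes lengthen the prediction.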
After the target data and the predicted decompression duration of the target data are determined, in step S12, an instruction execution sequence may be generated according to the predicted decompression duration and the processing instruction corresponding to the target data, and the instruction execution sequence at least includes the decompression instruction corresponding to the target data.
The decompression instruction may instruct the processing unit to decompress the specified compressed data, and the processing instruction may instruct the processing unit to process the decompressed data (i.e., the target decompressed data).
In some alternative implementations, the instruction execution sequence may be a queue formed by a plurality of instructions arranged in an order. As described above, the processing unit may execute the target task, the instruction execution sequence is a command that can be recognized by the processing unit, and the processing unit may complete the target task by executing the instruction execution sequence. Due to the difference of target tasks, the length of the instruction execution sequence, the content of the instruction, and the like have corresponding differences, and the embodiment of the present disclosure does not limit this.
It should be emphasized that the instruction execution sequence in the embodiments of the present disclosure includes a decompression instruction located before the corresponding processing instruction (possibly several instructions ahead of it). The decompression instruction causes the specified compressed data to be decompressed, so by the time the processing instruction needs the data it is already decompressed, reducing waiting time. Meanwhile, because the insertion position of the decompression instruction is determined from the predicted decompression duration, the target decompressed data does not occupy storage space for a long time due to premature decompression. In the related art, by contrast, the instruction execution sequence contains no decompression instruction, so when task data is stored in compressed form the processing unit must wait for it to be decompressed, resulting in low data processing efficiency.
In some optional implementations, step S12 includes: and determining the position of the decompression instruction corresponding to the target data in the initial instruction execution sequence according to the decompression predicted time length and the position of the processing instruction corresponding to the target data in the initial instruction execution sequence, and inserting the decompression instruction into the initial instruction execution sequence to generate an instruction execution sequence.
The initial instruction execution sequence is an instruction sequence into which no decompression instruction has been inserted. When the target task is executed based on the initial instruction execution sequence, decompression of the specified compressed data begins only when the corresponding processing instruction is reached; during decompression, the processing unit is blocked at that processing instruction and cannot execute other instructions, resulting in low processing efficiency.
Correspondingly, in the embodiment of the present disclosure, after the position of a certain processing instruction that needs to use the compressed data in the instruction execution sequence is known, according to the predicted decompression duration of the compressed data, it may be determined where (or when) to perform the data decompression operation, and it may be ensured that when the processing instruction is executed, the specified compressed data is already decompressed (i.e., the data is in a usable state), thereby ensuring smooth execution of the processing instruction. Meanwhile, long-term occupation of the storage space by the decompressed data (namely, the target decompressed data) due to premature decompression of the data is avoided as much as possible.
In some alternative implementations, the insertion location of the decompression instruction may be determined based on a predicted execution duration of at least one instruction preceding the processing instruction. The predicted execution duration is obtained by predicting the execution time length of the instruction, and may be related to the type of the instruction, the data processing amount, and the like.
For example, suppose the initial instruction execution sequence includes five instructions z1, z2, z3, z4, and z5 with predicted execution durations of 26 ns (nanoseconds), 10 ns, 6 ns, 18 ns, and 10 ns, where z5 is the processing instruction corresponding to the target data and the predicted decompression duration of the target data is 30 ns. A decompression instruction can be inserted into the initial instruction execution sequence so that the specified compressed data is fully decompressed before z5 executes. Since the predicted execution durations of z2, z3, and z4 sum to 34 ns, which exceeds the predicted decompression duration of 30 ns, the decompression instruction z5' corresponding to z5 can be placed between z1 and z2; the decompressor then completes the decompression operation before z5 executes, and the processing unit can execute z5 on the target decompressed data.
It should be noted that if z5' were inserted before z1, then since the predicted execution durations of z1 and z2 sum to 36 ns, the target decompressed data would be generated before z2 has even finished executing, so it would occupy storage space for a long time and waste storage resources. Conversely, if z5' were inserted between z2 and z3, the predicted execution durations of z3 and z4 sum to only 24 ns, so the decompression operation would not be finished when the processing unit reaches z5, forcing the processing unit to wait for decompression and lowering processing efficiency.
In some optional implementations, at least one instruction is included between the decompression instruction and the processing instruction corresponding to the target data, and a difference between a predicted execution time length of the at least one instruction and a decompressed predicted time length of the target data is smaller than a preset threshold.
The preset threshold may be set according to any one or more of experience, statistical data, and task requirements, which is not limited in this disclosure.
Continuing with the initial instruction execution sequence above: if the preset threshold is 5 ns, then since the predicted execution durations of z2, z3, and z4 sum to 34 ns, differing from the predicted decompression duration of 30 ns by 4 ns, which is below the threshold, the decompression instruction z5' may be inserted between z1 and z2.
It should be understood that the predicted execution durations of z3 and z4 sum to 24 ns, whose difference (in absolute value) from the predicted decompression duration of 30 ns is 6 ns, exceeding the threshold, so z5' cannot be inserted between z2 and z3. Likewise, the predicted execution durations of z1 through z4 sum to 60 ns, whose difference from the predicted decompression duration also exceeds the threshold, so the decompression instruction cannot be inserted before z1.
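The placement rule illustrated by z1 through z5 can be sketched as a backward walk from the processing instruction: accumulate the predicted execution durations of the intervening instructions until they cover the predicted decompression duration, and accept the slot only if the overshoot stays below the preset threshold. This is a simplified model assuming a single decompression that runs fully in parallel with instruction execution; the function and parameter names are illustrative.

```python
def find_insert_position(exec_ns, proc_index, decompress_ns, threshold_ns):
    """Return i such that the decompression instruction should be inserted
    just before instruction i, or None if no slot qualifies.

    exec_ns[i]    -- predicted execution duration of instruction i (ns)
    proc_index    -- index of the processing instruction using the data
    decompress_ns -- predicted decompression duration of the target data
    threshold_ns  -- preset limit on how early decompression may finish
    """
    covered = 0.0
    # Walk backwards from the processing instruction, accumulating the
    # predicted execution durations of the intervening instructions.
    for i in range(proc_index - 1, -1, -1):
        covered += exec_ns[i]
        if covered >= decompress_ns:
            # Decompression finishes in time; accept the slot only if the
            # decompressed data would not sit idle beyond the threshold.
            return i if covered - decompress_ns < threshold_ns else None
    return None  # even the front of the sequence cannot cover decompression
```

With the example above, `find_insert_position([26, 10, 6, 18, 10], 4, 30, 5)` returns `1`, meaning z5' is inserted just before z2, i.e. between z1 and z2, matching the placement derived in the text.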
In some optional implementations, after step S12, the method further includes: the instruction execution sequence is sent to the processing unit. After receiving the instruction execution sequence, the processing unit may execute the target task according to the instruction execution sequence.
In some optional implementations, to ensure that the target task completes correctly, if the decompressor fails for any reason to decompress the specified compressed data in time, execution of the corresponding processing instruction is suspended until the decompressor completes the decompression operation, after which the processing unit executes the processing instruction using the decompressed data.
It should be noted that the instruction generation method is applicable to a general single-core processing system and also applicable to a many-core processing system, and the embodiment of the present disclosure does not limit this.
A second aspect of the embodiments of the present disclosure provides a task processing method, which is applicable to a decompressor.
Fig. 2 is a flowchart of a task processing method according to an embodiment of the present disclosure. Referring to fig. 2, the method includes the following steps.
In step S21, in response to the decompression request sent by the processing unit, the specified compressed data corresponding to the decompression request is obtained, and the decompression request is a request sent by the processing unit according to the decompression instruction in the instruction execution sequence.
In step S22, the specified compressed data is decompressed to obtain target decompressed data, so that the processing unit can perform task processing based on the target decompressed data when the processing instruction corresponding to the decompression instruction is triggered.
In some alternative implementations, the processing unit executes the instructions in the instruction execution sequence in order, generates the decompression request when a decompression instruction is executed, and sends the decompression request to the decompressor.
In some alternative implementations, the decompression request may include the address of the specified compressed data, which represents its storage address and instructs the decompressor to retrieve the specified compressed data from the corresponding storage space.
In some optional implementations, the decompression request may include the address of the specified compressed data and a decompression identifier, where the decompression identifier identifies the current decompression operation, so that when the processing unit executes the processing instruction, it acquires the correct target decompressed data according to the address and the decompression identifier.
It should be noted that, the above contents of the decompression request are only examples, and the embodiment of the present disclosure does not limit this.
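As an illustration only, such a request could be modeled as a simple record; the field names here are assumptions for readability, not the patent's notation.

```python
from dataclasses import dataclass

@dataclass
class DecompressionRequest:
    addr: int     # storage address of the specified compressed data
    dec_id: int   # decompression identifier for the current operation

# The processing unit would build one request per decompression instruction;
# the identifier lets it later match up the right target decompressed data.
req = DecompressionRequest(addr=0x1000, dec_id=7)
print(req)
```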
It should be further noted that the specified compressed data may be stored inside the processing unit or in an external storage space of the processing unit, and the storage location of the specified compressed data is not limited in the embodiments of the present disclosure.
For example, if the processing unit belongs to a single-core processing system, the specified compressed data may be stored in a memory of the processing unit or in a storage space external to it; if the processing unit belongs to a many-core processing system, the specified compressed data may be stored in an on-chip or an off-chip storage space. In other words, the processing unit may instruct the decompressor to decompress specified compressed data in either the on-chip or the off-chip storage space.
It should be noted that the on-chip storage space mainly applies to the case where the decompressor is located inside the processing unit, and includes, but is not limited to, any one or more of a temporary storage space inside the processing unit, a FIFO (First In First Out) buffer, and a cache buffer; the off-chip storage space mainly applies to the case where the decompressor is located outside the processing unit, and is typically accessible by one or more processing units.
As described above, after the specified compressed data corresponding to the decompression request is obtained, it may be decompressed to obtain the target decompressed data, so that when the processing unit executes the processing instruction corresponding to the decompression instruction, it can perform task processing based on the target decompressed data. Any decompression mode may be adopted to decompress the specified compressed data, which is not limited in the embodiments of the present disclosure.
In some optional implementations, the decompression mode for the specified compressed data may be indicated by a preset decompression mode.
In some optional implementations, in step S22, decompressing the specified compressed data to obtain target decompressed data includes: decompressing the specified compressed data according to a preset decompression mode to obtain the target decompressed data, where the decompression mode indicates the manner in which the decompression operation is to be performed.
In some alternative implementations, the decompression modes include a lossless decompression mode and a lossy decompression mode, and the specified compressed data includes program segments and/or parameters. Correspondingly, decompressing the specified compressed data according to the preset decompression mode to obtain target decompressed data includes: when the specified compressed data belongs to the first class of data, decompressing it in the lossless decompression mode to obtain first target decompressed data; when the specified compressed data belongs to the second class of data, decompressing it in the lossy decompression mode to obtain second target decompressed data. The first class of data comprises program segments or first parameters, the second class comprises second parameters, and the first parameters influence task processing more strongly than the second parameters.
Illustratively, the first parameter includes a key parameter and the second parameter includes a general parameter, where a key parameter is one that significantly influences the processing result of the target task (e.g., a regularization parameter), and a general parameter is one that does not significantly influence the processing result (e.g., a weight coefficient of a convolutional layer, a normalization coefficient of a batch normalization layer, a firing threshold of a neuron, or a reset membrane voltage). The first and second parameters here are only examples; those skilled in the art can determine which parameters are first parameters and which are second parameters according to actual needs, statistical data, and empirical data, and the embodiments of the present disclosure do not limit this.
It should be appreciated that, in general, the lossy decompression mode decompresses faster than the lossless mode but with relatively lower quality. Selecting different decompression modes for different types of data can further increase decompression speed and efficiency while effectively preserving the decompression quality of critical data, so that when a processing instruction is executed on the corresponding target decompressed data, the quality of the processing result is unaffected or only slightly affected, guaranteeing the accuracy of the task processing result.
It should be noted that, when decompressing the specified compressed data, any decompression algorithm matching with the decompression mode may be used for decompression, and the embodiment of the present disclosure does not limit the type of the decompression algorithm.
For example, if the specified compressed data is stored bitwise, decompressing it may expand it to int (integer type) or fp16 (floating point type) data to obtain the target decompressed data. For example, if the specified compressed data is 0b10101011, occupying 1 Byte in total, then after it is decompressed based on the int method, the target decompressed data is 0x01, 0x00, 0x01, 0x00, ..., 0x01, occupying 8 Bytes in total. As another example, when the same specified compressed data is decompressed based on fp16, the target decompressed data is 0x3C00, 0x0000, 0x3C00, 0x0000, ..., 0x3C00, 0x3C00, occupying 16 Bytes in total.
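The bitwise expansion described above can be sketched as follows; a minimal illustration, assuming an int8 target type and a little-endian fp16 layout (0x3C00 is 1.0 in IEEE 754 half precision).

```python
def expand_bitwise(byte_value: int, mode: str) -> bytes:
    """Expand one byte of bit-packed data: each bit becomes one element of
    the target type, MSB first."""
    bits = [(byte_value >> (7 - i)) & 1 for i in range(8)]
    if mode == "int":
        # one byte per bit: 0x01 or 0x00
        return bytes(bits)
    if mode == "fp16":
        # two bytes per bit: 0x3C00 (1.0 in half precision) or 0x0000
        out = bytearray()
        for b in bits:
            out += (0x3C00 if b else 0x0000).to_bytes(2, "little")
        return bytes(out)
    raise ValueError(mode)

packed = 0b10101011
print(expand_bitwise(packed, "int").hex())   # 0100010001000101, 8 bytes
print(len(expand_bitwise(packed, "fp16")))   # 16 bytes
```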
It should be appreciated that in the data compression process, the raw data may be compressed in a reverse manner to the decompression process described above, thereby generating corresponding specified compressed data.
It should be noted that if a lossy compression method was used when compressing the original data, then even though a lossless decompression mode guarantees that no additional loss is introduced during decompression, the generated decompressed data still carries a certain loss compared with the original data; that loss was introduced during compression.
In some optional implementations, decompression of the specified compressed data may begin only after the complete specified compressed data has been obtained. Decompressing in this manner is simple and convenient.
However, considering that the specified compressed data may be large and reading it in full may take a long time in some cases, the specified compressed data may instead be obtained by streaming reads and decompressed by a corresponding streaming decompression method; that is, data acquisition and data decompression may proceed in a read-while-decompressing manner.
In some alternative implementations, the specified compressed data is stored in a storage space external to the processing unit. Accordingly, in step S21, obtaining the specified compressed data corresponding to the decompression request includes: acquiring the specified compressed data from the external storage space by streaming reads. In step S22, decompressing the specified compressed data to obtain target decompressed data includes: decompressing the specified compressed data in a streaming manner to obtain the target decompressed data.
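A read-while-decompressing loop can be sketched with a generic stream decompressor; zlib stands in here for whatever codec the hardware actually uses, and the in-memory reader stands in for the off-chip storage.

```python
import io
import zlib

def stream_decompress(read_chunk, chunk_size=4096):
    """Read compressed data chunk by chunk and decompress each chunk as it
    arrives ("reading while decompressing"), yielding decompressed pieces
    instead of waiting for the complete compressed payload."""
    d = zlib.decompressobj()
    while True:
        chunk = read_chunk(chunk_size)
        if not chunk:
            break
        yield d.decompress(chunk)
    yield d.flush()  # emit any data still buffered in the decompressor

# Usage: simulate a streaming read from off-chip storage.
compressed = zlib.compress(b"parameter block " * 1000)
reader = io.BytesIO(compressed)
result = b"".join(stream_decompress(reader.read))
print(result == b"parameter block " * 1000)  # True
```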
It should be noted that the decompressor provided in the embodiments of the present disclosure may be a functional module disposed inside the processing unit, or may be a functional module disposed outside the processing unit. The relationship between the decompressor and the processing unit will be described in conjunction with fig. 3 and 4.
Fig. 3 is a schematic diagram of a single-core processing system according to an embodiment of the present disclosure. Referring to fig. 3, in the single-core processing system shown in fig. 3 (a), the decompressor is located outside the processing unit and the two are relatively independent; when necessary, a communication connection may be established between them, over which data decompression, data interaction, and similar operations can be performed. In the single-core processing system shown in fig. 3 (b), the decompressor is located within the processing unit, and the processing unit can perform data decompression using this internal decompressor.
Fig. 4 is a schematic diagram of a many-core processing system according to an embodiment of the disclosure. Referring to fig. 4, in the many-core processing system shown in fig. 4 (a), a decompressor is provided in each processing unit, and the respective processing units can perform data decompression processing based on the internal decompressor; in the many-core processing system shown in fig. 4 (b), all the processing units share one decompressor, and each processing unit can request the decompressor to perform data decompression processing according to the demand; in the many-core processing system shown in fig. 4 (c), a plurality of decompressors are provided, and one decompressor corresponds to a plurality of processing units (i.e., one decompressor is shared by a plurality of processing units), and when a certain processing unit needs to perform data decompression processing, a request may be issued to the corresponding decompressor.
It should be noted that when the decompressor is a functional module inside the processing unit and no interaction between processing units is involved, the processing unit may directly use its internal decompressor to perform the corresponding decompression operations; when the decompressor is a functional module outside the processing unit shared by multiple processing units, arbitration and similar operations must be performed over the communication network between the processing units (e.g., a many-core network-on-chip) to prevent the decompressor from being used by multiple processing units at the same time. In the embodiments of the disclosure, the task processing method performed by the decompressor assumes that the processing unit to be served at the current time has already been determined.
In some alternative implementations, considering that decompression may complete some time before the target decompressed data is actually used, the target decompressed data may be stored in a storage space for later use once it is obtained. The storage space may be outside the decompressor or internal to it, which is not limited in the embodiments of the present disclosure.
In some optional implementations, after step S22, the method further includes: sending the target decompressed data to a temporary storage space of the processing unit, so that the processing unit acquires the target decompressed data from the temporary storage space when the processing instruction corresponding to the decompression instruction is triggered. In other words, after the decompressor completes decompression, it sends the target decompressed data to the temporary storage space of the processing unit; when the data is needed, the processing unit simply reads it directly from the temporary storage space.
In some optional implementations, a buffer space is provided in the decompressor. Correspondingly, after step S22, the method further includes: storing the target decompressed data in the buffer space; sending a connection request to the processing unit when the target decompressed data in the buffer space is in a readable state; and, once a communication connection with the processing unit is established based on the connection request, transmitting the target decompressed data to the processing unit by streaming transmission. In other words, the decompressor decompresses the specified compressed data and caches the target decompressed data in an internal buffer space; when enough readable target decompressed data has accumulated, it sends a connection request to the processing unit to indicate that the data can be read. When the processing unit determines that the target decompressed data is about to be used, it establishes a communication connection with the decompressor according to the connection request and reads the target decompressed data from the decompressor's buffer space as a data stream over that connection.
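The buffer-and-notify handshake can be sketched with a thread standing in for the decompressor, a queue for its internal buffer space, and an event for the connection request; zlib again stands in for the real codec.

```python
import queue
import threading
import zlib

def decompressor_task(compressed: bytes, buf: queue.Queue, ready: threading.Event):
    """Decompress into an internal FIFO buffer; signal the processing unit
    (the 'connection request') once readable data is available."""
    d = zlib.decompressobj()
    for i in range(0, len(compressed), 256):
        piece = d.decompress(compressed[i:i + 256])
        if piece:
            buf.put(piece)
            ready.set()      # enough readable data: notify the processing unit
    tail = d.flush()
    if tail:
        buf.put(tail)
    buf.put(None)            # end-of-stream marker

compressed = zlib.compress(b"weights " * 500)
buf, ready = queue.Queue(), threading.Event()
t = threading.Thread(target=decompressor_task, args=(compressed, buf, ready))
t.start()
ready.wait()                 # processing unit waits for the connection request
chunks = []
while (p := buf.get()) is not None:  # stream the data out of the buffer
    chunks.append(p)
t.join()
print(b"".join(chunks) == b"weights " * 500)  # True
```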
Illustratively, the Buffer space may include a FIFO memory-based region and/or a Buffer region.
In the embodiments of the disclosure, before the processing instruction is executed, a decompression request is sent to the decompressor based on the corresponding decompression instruction, and the decompressor decompresses the specified compressed data in response, so that the target decompressed data is already available when the processing unit executes the processing instruction. This reduces the time the processing unit spends waiting for decompression and effectively improves task processing efficiency. Moreover, the decompressor may perform decompression in a read-while-decompressing manner, further improving decompression efficiency.
A third aspect of the embodiments of the present disclosure provides a task processing method, which is applicable to a processing unit.
Fig. 5 is a flowchart of a task processing method according to an embodiment of the present disclosure. Referring to fig. 5, the method includes the following steps.
In step S51, in the case where a decompression instruction in the instruction execution sequence is triggered, a decompression request is sent to the decompressor, the decompression request being used to instruct the decompressor to decompress the specified compressed data to obtain the target decompressed data.
In step S52, when a processing command corresponding to a decompression command is triggered, a task process is performed based on the target decompression data.
In some alternative implementations, the processing unit is a functional unit with data processing capabilities. For example, the processing unit may be a processing core in a single-core processing system, or may be a processing core in a many-core processing system, and the processing unit is not limited in the embodiments of the present disclosure.
In some alternative implementations, the sequence of instruction execution may be for performing at least one of an image processing task, a speech processing task, a text processing task, and a video processing task.
It should be noted that the triggering of an instruction may be understood as the execution of that instruction by the processing unit. This distinction matters because a processing unit may not execute an instruction immediately upon receiving it, but only some time later.
In some alternative implementations, the processing unit receives the instruction execution sequence as a queue: after a decompression instruction is appended at the tail of the queue, it is not executed immediately, but only when it is read from the head of the queue (i.e., when the decompression instruction is triggered). Processing instructions behave similarly and are not described again here.
In some optional implementations, the compiler determines the position of the decompression instruction corresponding to the target data in the initial instruction execution sequence according to the predicted decompression time of the target data and the position of the corresponding processing instruction, then inserts the decompression instruction into the initial instruction execution sequence to generate the instruction execution sequence. The initial instruction execution sequence is the instruction sequence before any decompression instruction has been inserted.
It should be noted that, to ensure that the target decompressed data is already in a readable state (i.e., the specified compressed data has been decompressed) when the processing unit executes the processing instruction, there should be a certain time interval between the execution of the decompression instruction and that of the processing instruction, so the decompressor can complete the decompression operation within that interval. Meanwhile, since the target decompressed data occupies storage space, the interval between the two instructions should not be too long, to limit how long that storage is held.
Illustratively, at least one instruction (decompression instruction is before and processing instruction is after) is separated between the decompression instruction and the corresponding processing instruction, so that the processing unit executes the decompression instruction first, instructs the decompressor to decompress the specified compressed data to obtain target decompressed data, and ensures that the target decompressed data is in a readable state when the processing instruction is executed subsequently, thereby ensuring smooth execution of the processing instruction. Meanwhile, the number of the instructions spaced between the decompression instruction and the corresponding processing instruction should not be too large, so as to avoid long-term occupation of the storage space by the decompressed data (namely, the target decompressed data) due to premature decompression of the data as much as possible.
In some optional implementations, after sending the decompression request to the decompressor and before performing task processing according to the target decompressed data, the method further includes: executing the instructions located between the decompression instruction and the processing instruction in the instruction execution sequence.
In other words, after the processing unit sends the decompression request to the decompressor, it can still execute instructions normally (namely, the instructions between the decompression instruction and the processing instruction) while the decompressor performs decompression in response to the request, improving the processing efficiency of the processing unit to some extent.
In some alternative implementations, the decompressor stores the target decompressed data to a temporary storage space of the processing unit; correspondingly, the task processing is carried out according to the target decompression data, and the task processing comprises the following steps: acquiring target decompressed data from the temporary storage space; and executing the processing instruction according to the target decompressed data. In other words, after the decompressor completes the decompression processing, the target decompressed data is sent to the temporary storage space of the processing unit, and when the target decompressed data needs to be used, the processing unit only needs to directly read the corresponding data from the temporary storage space.
In some alternative implementations, after the processing instruction is executed, the processing unit may invalidate the target decompressed data corresponding to the processing instruction in the temporary storage space, so as to use the temporary storage space to store new, unused target decompressed data.
In some optional implementations, a buffer space is provided in the decompressor, and the decompressor stores the target decompressed data in that buffer space. Correspondingly, performing task processing according to the target decompressed data includes: establishing a communication connection with the decompressor according to a connection request sent by the decompressor, the connection request being sent when the target decompressed data in the buffer space is in a readable state; receiving the target decompressed data transmitted by the decompressor as a data stream; and executing the processing instruction on the target decompressed data. In other words, the decompressor decompresses the specified compressed data and caches the target decompressed data in an internal buffer space; when enough readable target decompressed data has accumulated, it sends a connection request to the processing unit to indicate that the data can be read. When the processing unit determines that the target decompressed data is about to be used, it establishes a communication connection with the decompressor according to the connection request and reads the target decompressed data from the decompressor's buffer space as a data stream over that connection.
In the embodiments of the disclosure, in the instruction execution sequence the decompression instruction comes first, the corresponding processing instruction comes after it, and at least one instruction lies between them. The processing unit sends a decompression request to the decompressor when it executes the decompression instruction, instructing the decompressor to decompress the specified compressed data, so that the target decompressed data is already readable when the processing unit executes the corresponding processing instruction, ensuring its smooth execution. During decompression, the processing unit can execute the instructions between the decompression instruction and the processing instruction, improving both processing effectiveness and task execution efficiency.
The following describes an instruction generation method and a task processing method according to an embodiment of the present disclosure with reference to fig. 6 to 8.
Fig. 6 is a schematic diagram of a working process of a task processing method according to an embodiment of the present disclosure. Referring to fig. 6, the processing unit receives the instruction execution sequence, decodes the instruction execution sequence by using the instruction decoding module, and executes the instruction according to the decoding result. The decoding result may include microcode, input operands, and result addresses corresponding to the instruction.
In some optional implementations, the data segment is an on-chip storage space used for storing the specified compressed data and the target decompressed data, and the instruction execution sequence corresponding to the specified compressed data and the target decompressed data includes the following contents.
……
Decompress(addr, id, addr_data) % decompression instruction, where addr is the address of the specified compressed data, id is the decompression identifier, and addr_data is the start address of the target decompressed data
……
Use(addr_data) % processing instruction
……
The instruction decoding module of the processing unit decodes the instruction execution sequence; when the decoding result of the decompression instruction Decompress(addr, id, addr_data) is triggered, the processing unit generates a decompression request from the address addr of the specified compressed data, the decompression identifier id, and the start address addr_data of the target decompressed data, sends the decompression request to the decompressor, and drives the enable signal en high to instruct the decompressor to execute the decompression operation. The decompressor reads the specified compressed data from the data segment according to addr in the decompression request, decompresses it to obtain the target decompressed data de_data, and stores de_data in the storage space of the data segment indicated by addr_data.
When the processing unit executes the processing instruction Use(addr_data), it reads the de_data located at addr_data from the data segment and processes it according to the processing instruction. After Use(addr_data) is executed, the processing unit marks de_data in the data segment as invalid and releases the corresponding storage space, so the freed space can hold new, unused target decompressed data.
It should be noted that while the decompressor decompresses the specified compressed data, the processing unit may execute other instructions according to the instruction execution sequence (namely, the instructions located between Decompress(addr, id, addr_data) and Use(addr_data)) rather than staying idle; when the processing unit reaches Use(addr_data) in the instruction execution sequence, it acquires de_data via addr_data and processes it.
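The Decompress/Use flow above can be simulated in a few lines; zlib stands in for the hardware decompressor, the data segment is modeled as a dictionary, and the string-uppercasing step is a placeholder for the real processing.

```python
import zlib

data_segment = {}  # models the on-chip data segment

def decompress_instr(addr, dec_id, addr_data):
    """Decompress(addr, id, addr_data): read the specified compressed data at
    addr, decompress it, and store the result at addr_data.
    dec_id identifies this decompression operation (unused in this sketch)."""
    de_data = zlib.decompress(data_segment[addr])
    data_segment[addr_data] = de_data

def use_instr(addr_data):
    """Use(addr_data): read the target decompressed data, process it, then
    invalidate it so the storage space can be reused."""
    de_data = data_segment[addr_data]
    result = de_data.upper()      # stand-in for the real processing
    del data_segment[addr_data]   # mark de_data invalid, free the space
    return result

data_segment["addr0"] = zlib.compress(b"target data")
decompress_instr("addr0", dec_id=1, addr_data="addr1")
# ... other instructions may execute here while decompression is in flight ...
print(use_instr("addr1"))         # b'TARGET DATA'
print("addr1" in data_segment)    # False: storage released after Use
```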
It should be further noted that in the working process above, the decompressor responds to the decompression request by first reading the complete specified compressed data from the data segment and only then entering the decompression process. Alternatively, in some implementations the task processing method may be performed in a read-while-decompressing manner, described below with reference to fig. 7.
Fig. 7 is a schematic data timing diagram of a task processing method according to an embodiment of the disclosure. Referring to fig. 7, the compiler generates the instruction execution sequence corresponding to the target task and sends it to the processing unit, which then executes the instructions in order. First, the processing unit instructs the decompressor, based on the first decompression instruction, to decompress the first specified compressed data; the decompressor fetches that data from the off-chip storage space by streaming reads and simultaneously decompresses the fetched data in a streaming manner, obtaining the first target decompressed data and storing it in the temporary storage space of the processing unit. Meanwhile, the processing unit executes other instructions in the instruction execution sequence; when it reaches the first processing instruction corresponding to the first specified compressed data, it reads the first target decompressed data from the temporary storage space and processes it, then enters an idle state because the instruction execution sequence contains no instruction to execute at that point. Furthermore, after the first processing instruction is executed, the first target decompressed data in the temporary storage space is released to provide storage for target decompressed data (e.g., the second target decompressed data) that may be generated while subsequent instructions execute.
The second decompression instruction executes similarly to the first: when it is executed, the processing unit instructs the decompressor to decompress the second specified compressed data; the decompressor likewise fetches it from the off-chip storage space by streaming reads and simultaneously decompresses it in a streaming manner, obtaining the second target decompressed data and storing it in the temporary storage space of the processing unit. Meanwhile, the processing unit executes other instructions in the instruction execution sequence; when it reaches the second processing instruction corresponding to the second specified compressed data, it reads the second target decompressed data from the temporary storage space and processes it, then continues executing the remaining instructions in the sequence. After the second processing instruction is executed, the second target decompressed data in the temporary storage space is released, providing storage for target decompressed data that may be generated while subsequent instructions execute.
In the working processes shown in figs. 6 and 7, after the decompressor obtains the target decompressed data, it stores the data in the temporary storage space of the processing unit, and the processing unit fetches the data from that space when executing the corresponding processing instruction. There is also another optional implementation: the decompressor is directly interfaced with the processing unit and sends the target decompressed data to it directly, so that no temporary storage space is used. This further reduces time consumption and improves processing efficiency.
The working process of the task processing method implemented by directly interfacing the decompressor and the processing unit is described below with reference to fig. 8.
Fig. 8 is a schematic diagram of a working process of a task processing method according to an embodiment of the present disclosure. Referring to fig. 8, the decompressor is connected to the storage space and to the processing unit; the processing unit is in turn connected to the compressor, and the compressor to the storage space. It should be noted that the processing unit shown in the drawings is a broad concept: it may refer to a single processing core, a processing array composed of multiple processing cores, or an arithmetic unit within a single processing core, which is not limited in this disclosure.
In the working process of task processing, the decompressor acquires specified compressed data from the storage space, decompresses the data to generate target decompressed data, sends the target decompressed data to the processing unit, and the processing unit processes the target decompressed data through corresponding processing cores to obtain processing result data and sends the processing result data to the compressor. The compressor receives the processing result data, compresses the processing result data to generate result compressed data, and stores the result compressed data into the storage space.
In other words, the target decompressed data is sent directly from the decompressor to the processing unit (or to a buffer of the processing unit); the temporary storage space is no longer used for storing target decompressed data, and only a storage space for the compressed data (both the specified compressed data and the resulting compressed data) needs to be provided.
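The direct-interface data path — decompressor feeding the processing unit, compressor writing results back, with only compressed data held in the storage space — might be sketched as follows. zlib is used as a placeholder codec (the disclosure does not specify one) and the computation is a trivial placeholder:

```python
import zlib

# Illustrative stand-in for the decompress -> process -> compress data path.
# The codec (zlib), the key names, and the computation are all assumptions.

storage = {"weights": zlib.compress(bytes(range(16)))}  # only compressed data stored

def decompressor(key):
    # Target decompressed data goes straight to the processing unit,
    # bypassing any temporary storage space.
    return zlib.decompress(storage[key])

def processing_unit(data):
    return bytes(b * 2 % 256 for b in data)  # placeholder computation

def compressor(key, result):
    storage[key] = zlib.compress(result)  # result written back compressed

raw = decompressor("weights")
result = processing_unit(raw)
compressor("weights_out", result)
```

Note that the storage space only ever holds compressed blobs; uncompressed data exists solely on the wire between the three units.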
In some optional implementation manners, a FIFO memory or a Buffer is arranged in the decompressor, so that the decompressor does not need to write back decompressed data to a temporary storage area, but directly provides the decompressed data to the processing unit, and the processing unit executes a corresponding processing instruction according to the decompressed data, thereby implementing a task execution manner of calculating while decompressing, and effectively improving task execution efficiency.
In some alternative implementations, when a decompression instruction in the instruction execution sequence is triggered, the processing unit sends a decompression request to the decompressor, the request including the address raddr of the specified compressed data rdata in the storage space. The decompressor receives the decompression request, reads rdata from the storage space according to raddr, and decompresses rdata. During this process, the decompressor decompresses rdata in a streaming data decompression manner and transmits the target decompressed data rdata_raw to the processing unit in a streaming data transmission manner. In other words, the decompressor does not decompress the entire rdata into the complete rdata_raw and transmit it to the processing unit in one pass; instead, each time part of rdata is decompressed into part of rdata_raw, that part is written to the FIFO/Buffer, and the decompressed data is delivered to the processing unit from the FIFO/Buffer.
To ensure that such decompression proceeds smoothly, an rdy signal may be provided to govern when the decompressor sends rdata_raw to the processing unit. That is, when rdata_raw is in a readable state (i.e., the decompressor has decompressed part of rdata and obtained a part of rdata_raw available for transmission), the decompressor sends the rdy signal to the processing unit; the processing unit establishes a connection with the decompressor based on the rdy signal, and the decompressor sends rdata_raw to the processing unit over this connection. Free storage space thereby becomes available in the FIFO/Buffer, which is used to buffer the data newly decompressed by the decompressor, realizing streaming data transmission. After receiving rdata_raw, the processing unit processes it based on the processing instruction to obtain processing result data wdata_raw, and sends wdata_raw to the compressor under the control of a first enable signal wen_raw. The compressor compresses wdata_raw to obtain the resulting compressed data wdata, which is written to the storage space under the control of a second enable signal wen.
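A software sketch of the rdy/FIFO handshake, under the assumption of a bounded FIFO that is refilled chunk by chunk as the processing unit drains it. The signal name rdy follows the description above; the chunk size, FIFO depth, and the use of zlib as the codec are illustrative assumptions:

```python
from collections import deque
import zlib

class StreamingDecompressor:
    """Decompress in chunks and expose output through a bounded FIFO,
    raising 'rdy' as soon as a chunk is available for reading."""

    def __init__(self, compressed, chunk=4, fifo_depth=2):
        self._d = zlib.decompressobj()      # incremental decompression state
        self._src = memoryview(compressed)
        self._pos = 0
        self._chunk = chunk                 # compressed bytes fed per step
        self.fifo = deque(maxlen=fifo_depth)

    def rdy(self):
        # Refill the FIFO while it has free space and input remains;
        # rdy is asserted whenever at least one piece is readable.
        while len(self.fifo) < self.fifo.maxlen and self._pos < len(self._src):
            piece = self._d.decompress(self._src[self._pos:self._pos + self._chunk])
            self._pos += self._chunk
            if piece:
                self.fifo.append(piece)
        return bool(self.fifo)

    def read(self):
        # Consuming a piece frees FIFO space for newly decompressed data.
        return self.fifo.popleft()

rdata = bytes(range(64)) * 4
dec = StreamingDecompressor(zlib.compress(rdata))
received = b""
while dec.rdy():            # processing unit polls rdy
    received += dec.read()  # streaming consumption, piece by piece
```

The bounded `deque` plays the role of the FIFO/Buffer: the decompressor never runs further ahead than the FIFO depth allows, which is what makes the transfer streaming rather than one-shot.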
It should be noted that, in this implementation, to ensure that the decompressor can send rdata_raw to the processing unit as streaming data, a buffer (a FIFO or a Buffer) is provided in the decompressor for data buffering, where a FIFO corresponds to a dedicated hardware storage device and a Buffer corresponds to a reserved region of memory used for buffering. Similarly, a FIFO or Buffer may be provided in the compressor, or a corresponding buffer may be provided in the processing unit; their function is similar to that of the buffer in the decompressor and will not be described again here.
A fourth aspect of the embodiments of the present disclosure provides an electronic device.
Fig. 9 is a schematic view of an electronic device provided in an embodiment of the present disclosure. Referring to fig. 9, the electronic device includes: at least one processing unit 901 and at least one decompressor 902.
The processing unit 901 is configured to send a decompression request to the decompressor when a decompression instruction in the instruction execution sequence is triggered, and perform task processing according to target decompressed data when a processing instruction corresponding to the decompression instruction is triggered.
The decompressor 902 is configured to, in response to the decompression request sent by the processing unit, obtain specified compressed data corresponding to the decompression request, and decompress the specified compressed data to obtain target decompressed data.
The position of the decompression instruction in the instruction execution sequence is determined according to the decompression predicted time length of the target data and the processing instruction, and the decompression predicted time length is used for indicating the time for decompressing the specified compressed data corresponding to the target data.
In some optional implementations, the electronic device further includes: at least one compiler. The compiler is used for determining target data meeting compression conditions and decompression prediction duration of the target data from task data of a target task to be executed by the processing unit; generating an instruction execution sequence according to the decompression predicted time length and a processing instruction corresponding to the target data; the instruction execution sequence at least comprises a decompression instruction corresponding to the target data.
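Purely as an illustration of what a decompression prediction duration might look like, one simple model is a fixed access latency for the storage region holding the compressed data plus the compressed size divided by the decompressor throughput. The model and all parameter names below are assumptions, not taken from the disclosure:

```python
def predict_decompress_time(compressed_size, region_latency, throughput):
    """Illustrative model only: fixed access latency for the storage
    region plus compressed_size / throughput cycles of streaming
    decompression. The real estimate would come from the attribute
    information of the target data and the storage-space layout."""
    return region_latency + compressed_size / throughput

t = predict_decompress_time(compressed_size=4096, region_latency=100, throughput=64)
```

Any model of this shape gives the compiler a per-datum duration it can compare against the predicted execution times of surrounding instructions.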
The operation of the electronic device will be described with reference to fig. 10.
Fig. 10 is a schematic view of an operating process of an electronic device according to an embodiment of the present disclosure. Referring to fig. 10, the operation includes the following steps.
In step S1001, the compiler determines target data from task data of a target task to be executed by the processing unit according to a preset compression condition.
In step S1002, the compiler determines the decompression prediction duration of the target data according to the attribute information of the target data and the layout of the storage space.
Step S1003, the compiler inserts the decompression instruction into the initial instruction execution sequence according to the decompression predicted time length and the position of the processing instruction corresponding to the target data in the initial instruction execution sequence, so as to generate an instruction execution sequence.
At least one instruction is included between the decompression instruction and the processing instruction, and the difference between the predicted execution time length of the at least one instruction and the decompression predicted time length of the target data is smaller than a preset threshold value.
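The placement rule just stated — the cumulative predicted execution time of the instructions between the decompression instruction and the processing instruction should differ from the decompression predicted time length by less than a threshold — can be sketched as follows. Instruction names and the cost model are illustrative, and the fallback behavior when no position satisfies the threshold is an assumption:

```python
def insert_decompress(sequence, process_idx, predicted_costs,
                      decompress_time, threshold):
    """Insert a 'decompress' marker before sequence[process_idx] so that
    the cumulative predicted cost of the instructions in between is as
    close as possible to decompress_time (assumed fallback: issue the
    decompression as early as possible if no position meets threshold)."""
    best_idx, best_diff = process_idx, float("inf")
    covered = 0
    # Walk backwards from the processing instruction, accumulating the
    # predicted cost of each intervening instruction.
    for i in range(process_idx - 1, -1, -1):
        covered += predicted_costs[i]
        diff = abs(covered - decompress_time)
        if diff < best_diff:
            best_idx, best_diff = i, diff
    if best_diff >= threshold:
        best_idx = 0  # assumed fallback: earliest possible issue point
    return sequence[:best_idx] + ["decompress"] + sequence[best_idx:]

seq = ["a", "b", "c", "process"]
costs = [4, 6, 5, 3]  # predicted execution cost of each instruction
out = insert_decompress(seq, 3, costs, decompress_time=11, threshold=2)
```

Here instructions "b" and "c" (predicted cost 6 + 5 = 11) exactly cover the 11-cycle predicted decompression time, so the decompression instruction is inserted before "b".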
In step S1004, the compiler sends the instruction execution sequence to the corresponding processing unit.
In step S1005, the processing unit receives the instruction execution sequence and sequentially executes the respective instructions based on the instruction execution sequence.
In step S1006, in the case of execution of the decompression instruction, the processing unit generates a decompression request and sends the decompression request to the decompressor.
In step S1007, the decompressor receives the decompression request and obtains the specified compressed data from the corresponding storage space.
It should be noted that the storage space may be an internal storage space of the processing unit or an external storage space, which is not limited in this disclosure.
In step S1008, the decompressor decompresses the specified compressed data according to the preset decompression mode to obtain the target decompressed data.
In step S1009, the decompressor transmits the target decompressed data to the processing unit after the data decompression processing is completed.
In step S1010, the processing unit receives the target decompressed data and executes a processing instruction based on the target decompressed data.
It should be noted that, while the decompressor performs the data decompression, the processing unit executes the instructions located between the decompression instruction and the processing instruction in the instruction execution sequence.
In step S1011, in the case where the processing unit has executed all the instructions in the instruction execution sequence, a processing result is obtained.
For example, the compression and decompression processes may be completed automatically: the user programs in an uncompressed manner; in a pre-compilation stage the task data is analyzed, any task data determined to be compressible is automatically compressed and stored in the data area, and a decompression instruction is inserted several instructions before the data is needed (that is, several instructions before the corresponding processing instruction), thereby generating the instruction execution sequence. When the task is executed based on this instruction execution sequence, because each decompression instruction is positioned before its corresponding processing instruction, decompression can be performed in advance, so the decompression process does not delay the execution of the processing instruction. Moreover, during decompression the processing unit can run the instructions between the decompression instruction and the processing instruction, which further ensures the processing efficiency of the processing unit.
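The automatic pre-compilation pass described above might look like the following sketch, where the compression condition is assumed to be a minimum size reduction and zlib stands in for the actual codec; both are assumptions, since the disclosure leaves the condition and codec unspecified:

```python
import zlib

# Illustrative pre-compilation pass: compress task data whose compressed
# form is sufficiently smaller (assumed compression condition), and record
# which data will need a decompression instruction inserted before use.

def precompile(task_data, min_ratio=0.8):
    data_area, needs_decompress = {}, set()
    for name, blob in task_data.items():
        packed = zlib.compress(blob)
        if len(packed) < min_ratio * len(blob):   # assumed compression condition
            data_area[name] = ("compressed", packed)
            needs_decompress.add(name)            # decompress instruction required
        else:
            data_area[name] = ("raw", blob)       # not worth compressing
    return data_area, needs_decompress

task = {
    "weights": bytes(1000),        # highly compressible (all zeros)
    "noise": bytes(range(256)),    # effectively incompressible
}
area, marks = precompile(task)
```

The `marks` set is what a compiler stage would then consult when inserting decompression instructions ahead of the corresponding processing instructions.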
Fig. 11 is a block diagram of an electronic device provided in an embodiment of the present disclosure.
Referring to fig. 11, an embodiment of the present disclosure provides an electronic device including: at least one processor 1101; at least one memory 1102, and one or more I/O interfaces 1103 coupled between the processor 1101 and the memory 1102; the memory 1102 stores one or more computer programs executable by the at least one processor 1101, and the one or more computer programs are executed by the at least one processor 1101 to enable the at least one processor 1101 to perform the above-described task processing method.
Illustratively, the processor 1101 corresponds to the processing unit in the embodiments of the present disclosure; the processor may be connected to, or internally provided with, a decompressor so as to perform data decompression by means of the decompressor. The memory 1102 may be used to store the instruction execution sequence and the target decompressed data.
Fig. 12 is a block diagram of an electronic device provided in an embodiment of the present disclosure.
Referring to fig. 12, an electronic device according to an embodiment of the present disclosure includes a plurality of processing cores 1201 and a network on chip 1202, where the plurality of processing cores 1201 are all connected to the network on chip 1202, and the network on chip 1202 is configured to exchange data among the plurality of processing cores 1201 and with external devices.
One or more instructions are stored in the one or more processing cores 1201, and the one or more instructions are executed by the one or more processing cores 1201, so that the one or more processing cores 1201 can execute the above task processing method.
For example, the processing cores 1201 correspond to the processing units in the embodiments of the present disclosure, a decompressor may be disposed in at least one processing core 1201, so as to perform data decompression processing based on the decompressor, and data interaction between the processing cores 1201 may also be implemented through the network on chip 1202.
In some embodiments, the electronic device may be a brain-inspired chip. Such a chip may adopt a vectorized calculation manner and needs to load parameters such as the weight information of a neural network model from an external memory, for example a Double Data Rate (DDR) synchronous dynamic random access memory. The embodiments of the disclosure therefore achieve high operation efficiency for such batch processing.
Embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor/processing core, implements the above-mentioned task processing method based on a many-core system or instruction generation method based on a many-core system. The computer readable storage medium may be a volatile or non-volatile computer readable storage medium.
The disclosed embodiments also provide a computer program product, which includes computer readable code or a non-volatile computer readable storage medium carrying computer readable code, and when the computer readable code runs in a processor of an electronic device, the processor in the electronic device executes the above-mentioned many-core system-based task processing method or many-core system-based instruction generation method.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable program instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, Random Access Memory (RAM), Read Only Memory (ROM), Erasable Programmable Read Only Memory (EPROM), Static Random Access Memory (SRAM), flash memory or other memory technology, portable compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer. In addition, communication media typically embodies computer readable program instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the disclosure are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.
The computer program product described herein may be implemented in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product may be embodied as a computer storage medium; in another alternative embodiment, the computer program product may be embodied as a software product, such as a Software Development Kit (SDK).
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Example embodiments have been disclosed herein, and although terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, features, characteristics and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise, as would be apparent to one skilled in the art. It will, therefore, be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims (15)

1. An instruction generation method, comprising:
determining target data meeting compression conditions and decompression predicted time of the target data from task data of a target task to be executed by a processing unit, wherein the decompression predicted time is used for indicating the time for decompressing specified compressed data corresponding to the target data;
generating an instruction execution sequence according to the decompression predicted time length and the processing instruction corresponding to the target data;
the instruction execution sequence at least comprises a decompression instruction corresponding to the target data.
2. The method of claim 1, wherein generating an instruction execution sequence according to the decompression prediction time length and the processing instruction corresponding to the target data comprises:
and determining the position of a decompression instruction corresponding to the target data in an initial instruction execution sequence according to the decompression predicted time length and the position of a processing instruction corresponding to the target data in the initial instruction execution sequence, and inserting the decompression instruction into the initial instruction execution sequence to generate the instruction execution sequence.
3. The method according to claim 1, wherein at least one instruction is included between the decompression instruction and the processing instruction corresponding to the target data, and a difference value between a predicted execution time length of the at least one instruction and the decompression predicted time length of the target data is smaller than a preset threshold value, wherein the predicted execution time length is an execution time length of an instruction obtained through prediction.
4. A task processing method, comprising:
responding to a decompression request sent by a processing unit, and acquiring specified compressed data corresponding to the decompression request, wherein the decompression request is a request sent by the processing unit according to a decompression instruction in an instruction execution sequence;
and decompressing the specified compressed data to obtain target decompressed data, so that the processing unit can perform task processing based on the target decompressed data under the condition that the processing instruction corresponding to the decompressed instruction is triggered.
5. The method of claim 4, wherein the decompressing the specified compressed data to obtain target decompressed data comprises:
and according to a preset decompression mode, carrying out decompression processing on the specified compressed data to obtain the target decompressed data, wherein the decompression mode is used for indicating a mode followed by the execution of decompression operation.
6. The method of claim 5, wherein the decompression modes comprise a lossless decompression mode and a lossy decompression mode, and the specified compressed data comprises program segments and/or parameters;
the decompressing the specified compressed data according to a preset decompressing mode to obtain the target decompressed data includes:
under the condition that the specified compressed data belong to a first class of data, decompressing the specified compressed data by adopting the lossless decompression mode to obtain first target decompressed data;
under the condition that the specified compressed data belongs to second-class data, decompressing the specified compressed data by adopting the lossy decompression mode to obtain second target decompressed data;
the first class of data comprises a program segment or a first parameter, the second class of data comprises a second parameter, and the degree of influence of the first parameter on task processing is greater than that of the second parameter.
7. The method of claim 4, wherein the specified compressed data is stored in an external memory space of the processing unit;
the obtaining of the specified compressed data corresponding to the decompression request includes:
acquiring the specified compressed data from the external storage space in a streaming data reading mode;
the decompressing the specified compressed data to obtain target decompressed data includes:
decompressing the specified compressed data in a streaming data decompression mode to obtain the target decompressed data.
8. The method according to claim 4, wherein the decompressing the specified compressed data, after obtaining target decompressed data, further comprises:
and sending the target decompressed data to a temporary storage space of the processing unit, so that the processing unit acquires the target decompressed data from the temporary storage space under the condition that a processing instruction corresponding to the decompressed instruction is triggered.
9. The method of claim 4, wherein a buffer space is provided in the decompressor;
the decompressing the specified compressed data, after obtaining the target decompressed data, further comprises:
storing the target decompressed data to the buffer space;
sending a connection request to the processing unit when the target decompressed data of the buffer space is in a readable state;
and in the case of establishing a communication connection with the processing unit based on the connection request, transmitting the target decompressed data to the processing unit by a streaming data transmission manner.
10. A task processing method, comprising:
under the condition that a decompression instruction in the instruction execution sequence is triggered, sending a decompression request to the decompressor, wherein the decompression request is used for instructing the decompressor to decompress specified compressed data so as to obtain target decompressed data;
and under the condition that the processing instruction corresponding to the decompression instruction is triggered, performing task processing according to the target decompression data.
11. The method according to claim 10, wherein after sending the decompression request to the decompressor and before performing the task processing according to the target decompressed data, further comprising:
and executing the instructions located between the decompression instruction and the processing instruction in the instruction execution sequence.
12. The method according to claim 10, wherein the decompressor stores the target decompressed data into a temporary storage space of the processing unit, or wherein a buffer space is provided in the decompressor and the decompressor stores the target decompressed data into the buffer space;
the task processing according to the target decompressed data includes:
acquiring the target decompressed data from a temporary storage space;
executing the processing instruction according to the target decompression data;
or,
establishing a communication connection with the decompressor according to a connection request sent by the decompressor, wherein the connection request is a request sent under the condition that target decompressed data in the buffer space is in a readable state;
receiving the target decompressed data transmitted by the decompressor in a streaming data transmission mode;
and executing the processing instruction according to the target decompressed data.
13. The method of any of claims 10-12, wherein the sequence of instruction execution is configured to perform at least one of an image processing task, a speech processing task, a text processing task, and a video processing task.
14. An electronic device, comprising: at least one processing unit and at least one decompressor;
the processing unit is used for sending a decompression request to the decompressor when a decompression instruction in an instruction execution sequence is triggered, and performing task processing according to target decompressed data when a processing instruction corresponding to the decompression instruction is triggered;
the decompressor is used for, in response to the decompression request sent by the processing unit, acquiring specified compressed data corresponding to the decompression request and decompressing the specified compressed data to obtain the target decompressed data;
the position of the decompression instruction in the instruction execution sequence is determined according to a predicted decompression duration of the target data and the processing instruction, wherein the predicted decompression duration indicates the time required to decompress the specified compressed data corresponding to the target data.
15. The electronic device of claim 14, further comprising: at least one compiler;
the compiler is used for determining, from task data of a target task to be executed by the processing unit, target data meeting a compression condition and a predicted decompression duration of the target data; and generating the instruction execution sequence according to the predicted decompression duration and the processing instruction corresponding to the target data, wherein the instruction execution sequence at least comprises a decompression instruction corresponding to the target data.
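Claims 14-15 describe the compiler placing the decompression instruction according to the predicted decompression duration, so that enough work sits between it and the processing instruction to cover the decompression time. A hedged sketch of one way such placement could work; the cost model, instruction names, and helper are hypothetical, not taken from the patent:

```python
def place_decompress(instrs, proc_index, predicted_duration, costs):
    """Insert a 'DECOMPRESS' op into instrs so that the summed cost of the
    instructions between it and instrs[proc_index] is at least
    predicted_duration (or place it at the start if the task is too short)."""
    covered = 0
    pos = proc_index
    # Walk backwards from the processing instruction, accumulating the
    # estimated cost of intervening instructions until the predicted
    # decompression duration is covered.
    while pos > 0 and covered < predicted_duration:
        pos -= 1
        covered += costs[instrs[pos]]
    return instrs[:pos] + ["DECOMPRESS"] + instrs[pos:]

# Hypothetical per-instruction cost estimates (cycles).
costs = {"LOAD": 2, "MUL": 4, "ADD": 1, "PROCESS": 5}
seq = ["LOAD", "MUL", "ADD", "MUL", "PROCESS"]
out = place_decompress(seq, proc_index=4, predicted_duration=5, costs=costs)
# MUL (4) + ADD (1) cover the 5-cycle prediction, so DECOMPRESS lands two
# instructions ahead of PROCESS.
```

A longer predicted decompression duration pushes the decompression instruction earlier in the sequence, which is exactly the position-selection behavior the claims describe.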
CN202210973118.7A 2022-08-15 2022-08-15 Instruction generation method, task processing method and electronic equipment Pending CN115048143A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210973118.7A CN115048143A (en) 2022-08-15 2022-08-15 Instruction generation method, task processing method and electronic equipment

Publications (1)

Publication Number Publication Date
CN115048143A true CN115048143A (en) 2022-09-13

Family

ID=83167191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210973118.7A Pending CN115048143A (en) 2022-08-15 2022-08-15 Instruction generation method, task processing method and electronic equipment

Country Status (1)

Country Link
CN (1) CN115048143A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5291614A (en) * 1991-09-03 1994-03-01 International Business Machines Corporation Real-time, concurrent, multifunction digital signal processor subsystem for personal computers
CN103888777A (en) * 2012-12-20 2014-06-25 株式会社日立信息通信工程 Video image compression/decompression device
CN109033247A (en) * 2018-07-05 2018-12-18 Oppo(重庆)智能科技有限公司 A kind of management method of application program, device, storage medium and terminal
CN110798222A (en) * 2019-09-27 2020-02-14 北京浪潮数据技术有限公司 Data compression method and device
CN111400052A (en) * 2020-04-22 2020-07-10 Oppo广东移动通信有限公司 Decompression method, decompression device, electronic equipment and storage medium
US20210109788A1 (en) * 2019-04-11 2021-04-15 Huawei Technologies Co., Ltd. Task Processing Method and Apparatus, Terminal, and Computer Readable Storage Medium

Similar Documents

Publication Publication Date Title
US11398833B2 (en) Low-latency encoding using a bypass sub-stream and an entropy encoded sub-stream
US5289577A (en) Process-pipeline architecture for image/video processing
US11463102B2 (en) Data compression method, data decompression method, and related apparatus, electronic device, and system
US9058792B1 (en) Coalescing to avoid read-modify-write during compressed data operations
US7800519B2 (en) Method and apparatus for compressing and decompressing data
US20140215170A1 (en) Block Compression in a Key/Value Store
US7728742B2 (en) Method and apparatus for compressing and decompressing data
US9966971B2 (en) Character conversion
US20140049412A1 (en) Data compression utilizing longest common subsequence template
WO2023103336A1 (en) Video data transmission method, video data decoding method, and related apparatuses
US11301238B2 (en) Firmware updating method and firmware updating system
CN115048143A (en) Instruction generation method, task processing method and electronic equipment
WO2023083213A1 (en) Data decoding method and apparatus, electronic device and readable storage medium
US20190265914A1 (en) Method and apparatus for data compression and decompression using a standardized data storage and retrieval protocol
KR20100062358A (en) Apparatus and method for lossless coding and decoding image selectively
WO2021237513A1 (en) Data compression storage system and method, processor, and computer storage medium
CN116418348A (en) Data compression method, device, equipment and storage medium
US8823557B1 (en) Random extraction from compressed data
US10168909B1 (en) Compression hardware acceleration
US7733249B2 (en) Method and system of compressing and decompressing data
WO2024066547A1 (en) Data compression method, apparatus, computing device, and storage system
US20180041612A1 (en) System and method for out-of-stream order compression of multi-media tiles in a system on a chip
KR100308134B1 (en) Digital audio decoder and decoding method thereof
US11966597B1 (en) Multi-domain configurable data compressor/de-compressor
WO2021237518A1 (en) Data storage method and apparatus, processor and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination