CN110399124A - Code generation method, apparatus, device and readable storage medium - Google Patents


Info

Publication number
CN110399124A
Authority
CN
China
Prior art keywords
code
many
core
array
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910655665.9A
Other languages
Chinese (zh)
Other versions
CN110399124B (en)
Inventor
朱效民
赵雅倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201910655665.9A
Publication of CN110399124A
Application granted
Publication of CN110399124B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/447 Target code generation

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Stored Programmes (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

This application discloses a code generation method, comprising: obtaining target code that implements a Stencil computation; performing parameter passing for the target array processed by the target code, and generating parameter-passing code; allocating each target loop to a different core of a heterogeneous many-core processor according to the loop label of the target loop, and generating task-allocation code, where a target loop is a row or column of the target array; converting the target operation part of the target code into many-core sub-code; and combining the parameter-passing code, the task-allocation code and the many-core sub-code into many-core code that can run on the heterogeneous many-core processor. The application can convert complex Stencil computation code into many-core code that a heterogeneous many-core processor can run, thereby improving the efficiency and accuracy of many-core code generation. Correspondingly, the code generation apparatus, device and readable storage medium disclosed in this application have the same technical effects.

Description

Code generation method, apparatus, device and readable storage medium
Technical field
This application relates to the technical field of software development, and in particular to a code generation method, apparatus, device and readable storage medium.
Background
Stencil computation is a common computation pattern in the computing field. Its salient features are that it involves a large number of loop computations, is computation-intensive, and has high complexity. It has important applications in areas such as ocean numerical simulation and weather forecasting. Today, processors with a many-core architecture can significantly improve computational efficiency, but because of the heterogeneity of many-core processors, they cannot directly run code written to implement Stencil computation.
For a many-core processor to run code that implements Stencil computation, a technician must manually port the original code to many-core form. Because Stencil computation is loop-heavy and complex, manual conversion is inefficient. Manual conversion is also prone to omissions, so the accuracy of the resulting many-core code is hard to guarantee. Here, many-core code is code that can run on a heterogeneous many-core processor.
Therefore, how to improve the generation efficiency and accuracy of many-core code is a problem that those skilled in the art need to solve.
Summary of the invention
In view of this, the purpose of this application is to provide a code generation method, apparatus, device and readable storage medium, so as to improve the generation efficiency and accuracy of many-core code. The specific scheme is as follows:
In a first aspect, this application provides a code generation method, comprising:
obtaining target code that implements a Stencil computation;
performing parameter passing for the target array processed by the target code, and generating parameter-passing code;
allocating each target loop to a different core of a heterogeneous many-core processor according to the loop label of the target loop, and generating task-allocation code, where a target loop is a row or column of the target array;
converting the target operation part of the target code into many-core sub-code, where the many-core sub-code includes at least many-core computation code, memory management code, and data load and store-back code;
combining the parameter-passing code, the task-allocation code and the many-core sub-code into many-core code that can run on the heterogeneous many-core processor.
Preferably, performing parameter passing for the target array processed by the target code and generating parameter-passing code comprises:
creating a static array, and adding the name, number of dimensions, address, and the start and end indices of each dimension of the target array to the static array, to generate the parameter-passing code.
Preferably, converting the target operation part of the target code into many-core sub-code comprises:
converting the computation equations in the target operation part according to string conversion rules, to generate many-core computation code;
generating memory management code and data load and/or store-back code according to the arrays involved in the many-core computation code.
Preferably, generating memory management code and data load and/or store-back code according to the arrays involved in the many-core computation code comprises:
analyzing the arrays involved in the many-core computation code, and inserting each array's name and the column information corresponding to that name into a dynamic array;
generating memory management code according to the dynamic array;
generating data load and store-back code according to the computation equations in the many-core computation code.
Preferably, generating data load and store-back code according to the computation equations in the many-core computation code comprises:
determining, according to the computation equations in the many-core computation code, attribute information of the arrays involved in the many-core computation code;
generating data load and/or store-back code according to the attribute information.
Preferably, generating data load and/or store-back code according to the attribute information comprises:
determining the reusable arrays among the arrays involved in the many-core computation code, and adding a reusable mark to each reusable array;
generating data load and/or store-back code according to the reusable marks and the attribute information.
Preferably, after the parameter-passing code, the task-allocation code and the many-core sub-code are combined into many-core code that can run on the heterogeneous many-core processor, the method further comprises:
controlling the many-core code to run on the heterogeneous many-core processor, and returning the result of the run.
In a second aspect, this application provides a code generation apparatus, comprising:
an obtaining module, configured to obtain target code that implements a Stencil computation;
a parameter-passing module, configured to perform parameter passing for the target array processed by the target code and generate parameter-passing code;
a task-allocation module, configured to allocate each target loop to a different core of a heterogeneous many-core processor according to the loop label of the target loop and generate task-allocation code, where a target loop is a row or column of the target array;
a conversion module, configured to convert the target operation part of the target code into many-core sub-code, where the many-core sub-code includes at least many-core computation code, memory management code, and data load and store-back code;
a combination module, configured to combine the parameter-passing code, the task-allocation code and the many-core sub-code into many-core code that can run on the heterogeneous many-core processor.
In a third aspect, this application provides a code generation device, comprising:
a memory, configured to store a computer program;
a processor, configured to execute the computer program to implement the code generation method disclosed above.
In a fourth aspect, this application provides a readable storage medium for storing a computer program, where the computer program, when executed by a processor, implements the code generation method disclosed above.
As can be seen from the above scheme, this application provides a code generation method, comprising: obtaining target code that implements a Stencil computation; performing parameter passing for the target array processed by the target code, and generating parameter-passing code; allocating each target loop to a different core of a heterogeneous many-core processor according to the loop label of the target loop, and generating task-allocation code, where a target loop is a row or column of the target array; converting the target operation part of the target code into many-core sub-code, where the many-core sub-code includes at least many-core computation code, memory management code, and data load and store-back code; and combining the parameter-passing code, the task-allocation code and the many-core sub-code into many-core code that can run on the heterogeneous many-core processor.
It can be seen that, after obtaining the code that implements a Stencil computation, the application first performs parameter passing for the target array processed by the target code and generates parameter-passing code, whose role is to determine the information about the arrays that the current Stencil computation needs to process. It then allocates each target loop to a different core of the heterogeneous many-core processor according to the loop label of the target loop and generates task-allocation code, whose role is to hand the rows or columns of the target array to different cores for processing, for example assigning rows 1-10 to core 1. Next, it converts the operation code in the target code into many-core sub-code that the heterogeneous many-core processor can recognize. The role of the many-core sub-code is to let each core of the heterogeneous many-core processor know which operations it must perform, and to determine the memory and data those operations require; this is what the many-core computation code, the memory management code, and the data load and store-back code provide. Finally, the generated pieces of code are combined to obtain many-core code that can run on the heterogeneous many-core processor and that implements the Stencil computation. In this way, the application analyzes complex Stencil computation code to produce the corresponding code fragments, and these fragments can be combined into many-core code that the heterogeneous many-core processor can run, which improves the efficiency and accuracy of many-core code generation.
Correspondingly, the code generation apparatus, device and readable storage medium provided by this application have the same technical effects.
Brief description of the drawings
To explain the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of this application, and those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flowchart of a first code generation method disclosed in this application;
Fig. 2 is a flowchart of a second code generation method disclosed in this application;
Fig. 3 is a schematic diagram of a code generation framework disclosed in this application;
Fig. 4 is a schematic diagram of a code generation apparatus disclosed in this application;
Fig. 5 is a schematic diagram of a code generation device disclosed in this application.
Detailed description of the embodiments
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
At present, the prior art requires a technician to manually port the original code to many-core form, but because Stencil computation is loop-heavy and complex, manual conversion is inefficient. Manual conversion is also prone to omissions, so the accuracy of the resulting many-core code is hard to guarantee. For this reason, this application provides a code generation scheme that can improve the efficiency and accuracy of many-core code generation.
Referring to Fig. 1, an embodiment of this application discloses a first code generation method, comprising:
S101: obtain target code that implements a Stencil computation;
S102: perform parameter passing for the target array processed by the target code, and generate parameter-passing code;
In this embodiment, performing parameter passing for the array information of the target array processed by the target code and generating parameter-passing code comprises: creating a static array, and adding the name, number of dimensions, address, and the start and end indices of each dimension of the target array to the static array, to generate the parameter-passing code. The static array is mainly used for parameter passing, that is, for porting the target array processed by the target code to many-core form so that the heterogeneous many-core processor can identify the data to be processed.
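The static array described above can be pictured as a table of fixed array descriptors. The following is a minimal Python sketch, not the patent's implementation: the names `ArrayDescriptor`, `build_static_table` and `emit_param_struct`, the example addresses, and the C-like struct that is emitted are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ArrayDescriptor:
    """One entry of the static array: fixed facts about a target array."""
    name: str                      # array name, used later for lookup
    ndim: int                      # number of dimensions
    address: int                   # base address on the host side (made-up value)
    bounds: List[Tuple[int, int]]  # (start, end) index of each dimension


def build_static_table(arrays):
    """Index the descriptors by name so later steps can look them up."""
    table = {}
    for desc in arrays:
        if desc.ndim != len(desc.bounds):
            raise ValueError(f"{desc.name}: ndim does not match bounds")
        table[desc.name] = desc
    return table


def emit_param_struct(table):
    """Emit a C-like struct declaration that carries the array parameters.

    The struct layout is hypothetical; the source does not specify one.
    """
    lines = ["struct stencil_params {"]
    for name, desc in table.items():
        lines.append(f"    double *{name};")
        for dim, (lo, hi) in enumerate(desc.bounds):
            lines.append(f"    int {name}_lo{dim}, {name}_hi{dim};  /* {lo}..{hi} */")
    lines.append("};")
    return "\n".join(lines)


static_table = build_static_table([
    ArrayDescriptor("A", 2, 0x1000, [(0, 99), (0, 63)]),
    ArrayDescriptor("B", 2, 0x9000, [(0, 99), (0, 63)]),
])
param_code = emit_param_struct(static_table)
print(param_code)
```

Keying the table by name mirrors the statement that later steps retrieve arrays by name.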
S103: allocate each target loop to a different core of the heterogeneous many-core processor according to the loop label of the target loop, and generate task-allocation code; a target loop is a row or column of the target array;
The target array processed by the target code is generally a multidimensional array. During processing, each row or column can be regarded as one target loop, so there are multiple target loops, and each target loop has a loop label. Based on the loop labels, each target loop can therefore be allocated to a different core, so that different cores of the heterogeneous many-core processor process different data.
It should be noted that columns have column labels and rows have row labels, and both are loop labels in this embodiment. The task allocation process in this embodiment can be illustrated as follows. Suppose the target array is a two-dimensional array of 100 rows, numbered 0-99, where 0, 1, 2, ..., 99 are the loop labels of the rows. If tasks are allocated by row, and the heterogeneous many-core processor is known to have 10 cores, each of which can be assigned 10 rows of data, then the result of task allocation can be: rows 0-9 are allocated to core 1, rows 10-19 to core 2, and so on until the whole two-dimensional array has been allocated.
Each core processes its data with the same operations; only the object of processing, that is, the data, differs. Because there is no ordering constraint among the cores' processing of their data, the cores can process data in parallel, which improves processing efficiency.
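The row-wise allocation in the 100-row, 10-core example can be sketched as a contiguous block partition. This is an illustrative Python sketch; the function name `partition_rows` and the handling of a row count that does not divide evenly are assumptions, since the source only describes the evenly divisible case.

```python
def partition_rows(n_rows, n_cores):
    """Split row labels 0..n_rows-1 into contiguous blocks, one per core.

    Cores are numbered from 1, as in the text. If n_rows is not a multiple
    of n_cores, the first (n_rows % n_cores) cores get one extra row
    (an assumption; the source only covers the even case).
    """
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    base, extra = divmod(n_rows, n_cores)
    assignment = {}
    start = 0
    for core in range(1, n_cores + 1):
        count = base + (1 if core <= extra else 0)
        if count > 0:
            assignment[core] = (start, start + count - 1)  # inclusive range
        start += count
    return assignment


rows_per_core = partition_rows(100, 10)
for core, (first, last) in rows_per_core.items():
    print(f"core {core}: rows {first}-{last}")
```

Because the blocks do not overlap, each core can work on its rows independently of the others.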
S104: convert the target operation part of the target code into many-core sub-code, where the many-core sub-code includes at least many-core computation code, memory management code, and data load and store-back code;
Each piece of computation code in the target code must be converted into a string that the heterogeneous many-core processor can recognize, and in addition the memory and data it needs must be provided. Therefore, when converting the target operation part of the target code, the corresponding many-core computation code, memory management code, and data load and store-back code must be generated so that it can run. Here, many-core computation code is computation code that the heterogeneous many-core processor can recognize, and the memory management code and the data load and store-back code correspond to the many-core computation code. Memory management code includes memory allocation code and memory release code. Memory management here means managing the memory of the heterogeneous many-core processor.
It should be noted that the target operation part in this embodiment refers to any loop computation statement in the target code. If the target code implements a looped addition (for example C=A+B, where A and B are assignable parameters), the target operation code refers to the computation of any single addition, that is, C_part=A_part+B_part.
S105: combine the parameter-passing code, the task-allocation code and the many-core sub-code into many-core code that can run on the heterogeneous many-core processor.
At this point, each core of the heterogeneous many-core processor knows which operations it must perform and can determine the memory and data those operations require. Combining the parameter-passing code, the task-allocation code and the many-core sub-code therefore yields many-core code that can run on the heterogeneous many-core processor and that implements the Stencil computation. An example of a heterogeneous many-core processor is the SW26010.
It should be noted that these pieces of code must be combined in the order in which the data is processed. For example, the memory allocation code must come before the many-core computation code runs, and the memory release code must come after the many-core computation code runs.
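The ordering constraint can be sketched as a fixed assembly order applied to the generated fragments. In this Python sketch the fragment labels in `FRAGMENT_ORDER` and the C-like placeholder statements are hypothetical; the source only states that allocation precedes computation and release follows it.

```python
# Order in which generated fragments are stitched together. The fragment
# names are hypothetical labels chosen for this sketch.
FRAGMENT_ORDER = [
    "param_passing",
    "task_allocation",
    "memory_alloc",
    "data_load",
    "compute",
    "data_store_back",
    "memory_free",
]


def assemble(fragments):
    """Concatenate code fragments in data-processing order."""
    missing = [name for name in FRAGMENT_ORDER if name not in fragments]
    if missing:
        raise KeyError(f"missing fragments: {missing}")
    return "\n".join(fragments[name] for name in FRAGMENT_ORDER)


# Fragments may be produced in any order; assemble() fixes the final order.
fragments = {
    "compute": "c[i] = a[i] + b[i];",
    "memory_free": "release_buffers();",
    "memory_alloc": "allocate_buffers();",
    "data_load": "load(a); load(b);",
    "data_store_back": "store_back(c);",
    "task_allocation": "get_my_rows(&first, &last);",
    "param_passing": "read_params(&p);",
}
many_core_code = assemble(fragments)
print(many_core_code)
```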
It can be understood that the heterogeneous many-core processor is an accelerator: it can generally only perform computation and cannot perform logical operations (such as data communication). Therefore, after the heterogeneous many-core processor runs the many-core code, the result must be returned to the CPU so that the CPU can carry out subsequent operations. That is, after the parameter-passing code, the task-allocation code and the many-core sub-code are combined into many-core code that can run on the heterogeneous many-core processor, the method further includes: controlling the many-core code to run on the heterogeneous many-core processor, and returning the result of the run.
It can be seen that, after obtaining the code that implements a Stencil computation, this embodiment first performs parameter passing for the target array processed by the target code and generates parameter-passing code, whose role is to determine the information about the arrays that the current Stencil computation needs to process. It then allocates each target loop to a different core of the heterogeneous many-core processor according to the loop label of the target loop and generates task-allocation code, whose role is to hand the rows or columns of the target array to different cores for processing, for example assigning rows 1-10 to core 1. Next, it converts the operation part of the target code into many-core sub-code that the heterogeneous many-core processor can recognize. The role of the many-core sub-code is to let each core of the heterogeneous many-core processor know which operations it must perform, and to determine the memory and data those operations require; this is what the many-core computation code, the memory management code, and the data load and store-back code provide. Finally, the generated pieces of code are combined to obtain many-core code that can run on the heterogeneous many-core processor and that implements the Stencil computation. In this way, the application analyzes complex Stencil computation code to produce the corresponding code fragments, and these fragments can be combined into many-core code that the heterogeneous many-core processor can run, which improves the efficiency and accuracy of many-core code generation.
Referring to Fig. 2, an embodiment of this application discloses a second code generation method, comprising:
S201: obtain target code that implements a Stencil computation;
S202: perform parameter passing for the target array processed by the target code, and generate parameter-passing code;
S203: allocate each target loop to a different core of the heterogeneous many-core processor according to the loop label of the target loop, and generate task-allocation code; a target loop is a row or column of the target array;
S204: convert the computation equations in the target operation part according to string conversion rules, and generate many-core computation code;
S205: generate memory management code and data load and/or store-back code according to the arrays involved in the many-core computation code;
S206: combine the parameter-passing code, the task-allocation code, the many-core computation code, the memory management code, and the data load and/or store-back code into many-core code that can run on the heterogeneous many-core processor.
In this embodiment, generating memory management code and data load and/or store-back code according to the arrays involved in the many-core computation code comprises: analyzing the arrays involved in the many-core computation code, and inserting each array's name and the column information corresponding to that name into a dynamic array; generating memory management code according to the dynamic array; and generating data load and store-back code according to the computation equations in the many-core computation code.
The dynamic array is the counterpart of the static array. The information in the static array is fixed: the array name, the number of dimensions, and the start and end coordinates of each dimension are all determined. The dynamic array, by contrast, depends on the computation, and its information is tied to each computation statement. For the current computation partition (that is, the corresponding row or column number), the dynamic information specifies which rows or columns of the current array take part in the computation (within one piece of computation target code, several adjacent rows or columns may take part); whether, according to the assignment information of the equation, the corresponding row or column of data needs to be loaded or copied out; and whether, according to the adjacent row or column numbers involved in the computation, there is reusable data. This information depends on the specific computation target code and changes dynamically, hence the name dynamic array.
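One part of this dynamic information, namely which adjacent rows of each array take part in a statement, can be sketched as follows. The Python function `collect_row_offsets`, the representation of accesses as (name, offset) pairs, and the example statement are assumptions for illustration only.

```python
from collections import defaultdict


def collect_row_offsets(accesses):
    """Group the row offsets touched by one computation statement by array name.

    `accesses` is a list of (array_name, row_offset) pairs, where the offset is
    relative to the row currently assigned to the core.
    """
    offsets = defaultdict(set)
    for name, offset in accesses:
        offsets[name].add(offset)
    return {name: sorted(values) for name, values in offsets.items()}


# Hypothetical statement: out(i) = u(i-1) + u(i) + u(i+1)
dynamic_info = collect_row_offsets([
    ("out", 0), ("u", -1), ("u", 0), ("u", 1),
])
print(dynamic_info)
```

Unlike the static descriptors, this record has to be rebuilt for every computation statement, which is why the text calls it dynamic.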
In this embodiment, generating data load and store-back code according to the computation equations in the many-core computation code comprises: determining, according to the computation equations in the many-core computation code, attribute information of the arrays involved in the many-core computation code; and generating data load and/or store-back code according to the attribute information.
Specifically, each array in the dynamic array has attribute information that indicates whether the array needs to be loaded and whether it needs to be stored back. Loading means transferring the array from the conventional main memory on the CPU side to the heterogeneous many-core processor. Storing back means transferring the array from the heterogeneous many-core processor to the conventional main memory on the CPU side for storage. Data load and/or store-back code can therefore be generated according to the attribute information, because arrays read on the right-hand side of an equation generally need to be loaded, and the array assigned on the left-hand side generally needs to be stored back.
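The derivation of load and store-back attributes from an equation can be sketched in Python as below. The sketch assumes the usual data-flow reading of an assignment (arrays read on the right-hand side are loaded; the array written on the left-hand side is stored back) and a hypothetical convention that an array reference is an identifier followed by `(` or `[`; neither the regular expression nor the function name comes from the source.

```python
import re

# Hypothetical convention for this sketch: an array reference is an
# identifier immediately followed by "(" or "[".
ARRAY_REF = re.compile(r"([A-Za-z_]\w*)\s*[\(\[]")


def classify_arrays(equation):
    """Return {array_name: {"load": bool, "store_back": bool}} for one equation."""
    lhs, rhs = equation.split("=", 1)
    attrs = {}
    for name in ARRAY_REF.findall(rhs):   # read on the right: must be loaded
        attrs.setdefault(name, {"load": False, "store_back": False})["load"] = True
    for name in ARRAY_REF.findall(lhs):   # written on the left: must be stored back
        attrs.setdefault(name, {"load": False, "store_back": False})["store_back"] = True
    return attrs


attrs = classify_arrays("c(i) = a(i) + b(i-1) * 2.0")
print(attrs)
```

An array that appears on both sides, as in an in-place update, ends up with both attributes set.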
In this embodiment, generating data load and/or store-back code according to the attribute information comprises: determining the reusable arrays among the arrays involved in the many-core computation code, and adding a reusable mark to each reusable array; and generating data load and/or store-back code according to the reusable marks and the attribute information, so as to reduce unnecessary data loads.
It should be noted that, because different pieces of many-core computation code may involve the same array, data reuse can be achieved on this basis. For example, suppose the first piece of many-core computation code computes A+B and the second computes B+C, where B is a reusable array. Computing A+B requires loading A and B from the CPU side, and computing B+C requires loading B and C from the CPU side; since both use B, B needs to be loaded only once. That is, A and B are loaded when computing A+B; after the computation, A is deleted while B is retained, so computing B+C only requires loading C. B is thereby reused. A, B and C above are usually adjacent rows or columns of data.
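The A+B then B+C example can be traced with a small load planner. This Python sketch is illustrative only: `plan_loads` and its look-ahead of exactly one step are assumptions, not the reuse analysis defined by the source.

```python
def plan_loads(steps):
    """For each step, load only arrays not already resident, then drop the
    arrays the next step does not need.

    `steps` is a list of lists of array names used by consecutive many-core
    computations. Returns the list of arrays loaded at each step.
    """
    resident = set()
    loads = []
    for index, needed in enumerate(steps):
        to_load = [name for name in needed if name not in resident]
        loads.append(to_load)
        resident.update(to_load)
        upcoming = set(steps[index + 1]) if index + 1 < len(steps) else set()
        resident &= upcoming  # keep only reusable arrays, e.g. B
    return loads


load_plan = plan_loads([["A", "B"], ["B", "C"]])
print(load_plan)  # B is loaded in the first step and reused by the second
```

With reuse, the two computations need three loads in total rather than four.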
It should be noted that the other implementation steps in this embodiment are the same as or similar to those of the foregoing embodiment, and are not repeated here.
Therefore, after obtaining the code that implements a Stencil computation, this embodiment first performs parameter passing for the target array processed by the target code and generates parameter-passing code, whose role is to determine the information about the arrays that the current Stencil computation needs to process. It then allocates each target loop to a different core of the heterogeneous many-core processor according to the loop label of the target loop and generates task-allocation code, whose role is to hand the rows or columns of the target array to different cores for processing, for example assigning rows 1-10 to core 1. Next, it converts the operation part of the target code into many-core sub-code that the heterogeneous many-core processor can recognize. The role of the many-core sub-code is to let each core of the heterogeneous many-core processor know which operations it must perform, and to determine the memory and data those operations require; this is what the many-core computation code, the memory management code, and the data load and store-back code provide. Finally, the generated pieces of code are combined to obtain many-core code that can run on the heterogeneous many-core processor and that implements the Stencil computation. In this way, the application analyzes complex Stencil computation code to produce the corresponding code fragments, and these fragments can be combined into many-core code that the heterogeneous many-core processor can run, which improves the efficiency and accuracy of many-core code generation.
Following code generating framework can be constructed according to core concept provided by the present application, specifically refers to Fig. 3.In Fig. 3 institute In the code generating framework shown, comprising: importation and output section are in two sub-sections;Wherein, importation includes: array dimension Module, circulation subscript module and calculating mode module;Output par, c includes: parameter transfer module, task division module (on i.e. State the task distribution referred to), Memory Allocation and recycling module and computing module, wherein the computing module in output par, c wraps again Include: data are loaded into unit, Data Computation Unit and data storage cell.The mapping relations of importation and output par, c refer to Fig. 3.
Specifically, array dimension module is used for area for describing the array that code to be processed is related to, circulation subscript module Divide the circulation subscript of each target circulation, calculates the calculating mode that mode module is used to describe each target operation part.Parameter The array many-core that code to be processed can be related to by transfer module, enables isomery many-core processor to identify to be treated Data;Task division module is for dividing each target circulation to different core;Memory Allocation and recycling module are used to be each The distribution of many-core calculation code and recycling memory;Computing module can be with many-core calculation code, and provides memory sum number for its operation According to acquisition many-core calculation code and data are loaded into and/or are stored back to code.The data that computing module includes are loaded into unit, data meter It calculates unit and data storage cell is used to generate corresponding data for each many-core calculation code and is loaded into and/or is stored back to code.When So, Memory Allocation and recycling module can also be built in computing module.
With this framework, it is only necessary to organize the information to be filled in (the information that depends on the input part) according to the mapping in Fig. 3 and then determine the fixed part of the code in order to obtain the corresponding many-core code.
Automatic code generation with the framework shown in Fig. 3 proceeds as follows:
A static array is built from the array dimension module of the input part. For each target array involved in the code to be processed, the static array holds its name, number of dimensions, address, and the start and end index of each dimension, so that later steps can look arrays up by name. The static array can be stored in structured form to simplify later management and retrieval.
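The static array described above can be pictured as a small name-indexed table. The following is a minimal sketch only: the class name `ArrayInfo`, its field names, the array `zeta`, and the bound names `ImaxS`, `JminS`, and `JmaxS` are hypothetical and do not come from the patent text; only `Drhs` and `IminS` appear in the description.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ArrayInfo:
    """One record of the static array table (field names are hypothetical)."""
    name: str
    address: str                   # symbolic address, kept as text
    bounds: List[Tuple[str, str]]  # (start index, end index) per dimension

    @property
    def ndim(self) -> int:
        # The number of dimensions follows from the per-dimension bounds.
        return len(self.bounds)


def build_static_table(arrays: List[ArrayInfo]) -> Dict[str, ArrayInfo]:
    """Index the records by name so that later steps can retrieve them."""
    table: Dict[str, ArrayInfo] = {}
    for info in arrays:
        if info.name in table:
            raise ValueError(f"duplicate array name: {info.name}")
        table[info.name] = info
    return table


# Illustrative content; every name except Drhs and IminS is an assumption.
static_table = build_static_table([
    ArrayInfo("Drhs", "&Drhs[0]", [("IminS", "ImaxS"), ("JminS", "JmaxS")]),
    ArrayInfo("zeta", "&zeta[0]", [("IminS", "ImaxS"), ("JminS", "JmaxS")]),
])
```

Keeping the table keyed by name mirrors the stated goal of retrieving an array by its name in later steps.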
Task division needs no intermediate information. A mapping is established between the loop indices of each target loop and the task division; the fixed code in the framework completes the task division from this mapping and generates the task-division code. The iteration index within each task must also be filled into the framework. The iteration index within a task is the iteration marker that a core relies on when it processes multiple target loops.
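As an illustration of the task division, the sketch below splits a row range into contiguous blocks, one per core. The patent does not state how the rows are partitioned, so the even block split and the function name `split_rows` are assumptions.

```python
from typing import List, Tuple


def split_rows(first: int, last: int, n_cores: int) -> List[Tuple[int, int]]:
    """Split the inclusive row range [first, last] into contiguous blocks.

    Returns one (start, end) pair per core. A core that receives no rows
    gets an empty range in which end < start.
    """
    total = last - first + 1
    base, extra = divmod(total, n_cores)
    ranges = []
    start = first
    for core in range(n_cores):
        # The first `extra` cores take one additional row each.
        size = base + (1 if core < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


blocks = split_rows(1, 40, 4)
```

With 40 rows and 4 cores, the first entry of `blocks` is `(1, 10)`, which corresponds to the example of assigning rows 1-10 to core No. 1 (list index 0 here).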
The target operation parts in the target code are converted into many-core subcode one by one, as follows:
(a) String processing is applied to a target operation part to separate the elements that take part in the computation. These elements are mainly constants, variables, operators, and arrays (including their subscript information). Constants, variables, and operators need no special treatment; the arrays are analyzed and processed as follows:
The name, number of dimensions, and address of the array, together with the start and end index of each dimension, are added to a dynamic array. If the name already exists in the dynamic array, the remaining information is added at the corresponding storage location; if the name does not exist, it is created and the remaining information is added. The column information that the current target operation part needs to access, i.e. the column information of the current dynamic array entry, is also marked. This is necessary because a computation generally uses not only elements that are adjacent in storage but also elements that are adjacent in space yet not adjacent in storage. For example, if the code to be processed is Fortran code and the heterogeneous many-core processor runs C code, the position of a given element changes, because a column in the Fortran code corresponds to a row in the C code.
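Step (a) can be sketched as a scan over the equation text that records which columns each array touches. This is a simplified illustration, not the patented parser: it stores only column offsets relative to `j` (the patent's dynamic array also holds dimensions, address, and index bounds), it assumes two-dimensional references whose first subscript is a plain identifier, and the names `ARRAY_REF` and `collect_columns` are hypothetical.

```python
import re
from typing import Dict, Set

# Matches references such as "Drhs(i,j+1)" or "zeta(i, j)".
ARRAY_REF = re.compile(
    r"([A-Za-z_]\w*)\(\s*(\w+)\s*,\s*j\s*(?:([+-])\s*(\d+))?\s*\)"
)


def collect_columns(equation: str,
                    dynamic: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """Record, per array name, the column offsets (relative to j) it uses."""
    for name, _row, sign, digits in ARRAY_REF.findall(equation):
        # An absent "+N" / "-N" part means the reference is to column j itself.
        offset = int(digits) * (1 if sign == "+" else -1) if digits else 0
        # Create the entry on first sight, otherwise extend the existing one.
        dynamic.setdefault(name, set()).add(offset)
    return dynamic


dynamic_table = collect_columns(
    "zeta(i,j) = Drhs(i,j+1) - Drhs(i,j-1) + 0.5*Drhs(i,j)", {}
)
```

The `setdefault` call reflects the rule in the text: an existing name is extended in place, and a missing name is created first.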
(b) The calculation equations in the target operation part are converted into many-core calculation code. The conversion works as follows: the name of each array in the equation is kept, and the suffix "_slave" is appended to show that the array is used on the many-core side. The column information of the array is encoded with J as the base, so that J+1 becomes JA1, where A stands for "add" and the 1 is converted to the string "1". In this way, each array referenced by a calculation equation in the target operation part is represented in the many-core calculation code column by column (taking Fortran as the example, where storage is contiguous along a column).
In addition, the start index of each dimension must be determined from the array name and subtracted in the C code. For example, "Drhs(i, j+1)" is converted to "Drhs_JA1_slave[i-IminS]", where "IminS" is the start index of the array in the I dimension (i.e. the row dimension).
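The string conversion rule of step (b) can be sketched with a regular-expression substitution. The sketch covers only the cases the description names, column `j` and column `j+N`; the patent does not say how a negative offset such as `j-1` is encoded, so such references are left unchanged here. Writing a plain `j` column as `_J_` is an inference from "J as the base", and the function names are hypothetical.

```python
import re

# Matches "name(row, j)" and "name(row, j+N)"; "j-N" is deliberately not matched.
REF = re.compile(r"([A-Za-z_]\w*)\s*\(\s*(\w+)\s*,\s*j\s*(?:\+\s*(\d+))?\s*\)")


def to_slave_ref(match: "re.Match[str]", row_start: str = "IminS") -> str:
    """Rewrite one array reference into its many-core (slave) form."""
    name, row, plus = match.group(1), match.group(2), match.group(3)
    column = f"JA{plus}" if plus else "J"   # "A" stands for "add"
    # Subtract the start index of the row dimension for zero-based C indexing.
    return f"{name}_{column}_slave[{row}-{row_start}]"


def convert_equation(equation: str) -> str:
    """Apply the rewrite to every matching reference in an equation."""
    return REF.sub(to_slave_ref, equation)


converted = convert_equation("Drhs(i, j+1)")
```

Here `converted` equals `Drhs_JA1_slave[i-IminS]`, the form given in the description. A fuller version would look up the start index per array in the static array instead of assuming `IminS` for every reference.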
(c) The attribute information of the arrays involved in the many-core calculation code is determined from its calculation equations, i.e. whether each array on either side of an equation must be loaded or stored back. For a given equation, the arrays on the right-hand side usually need to be loaded, unless an array has already been marked as stored back in an earlier computation (its values were changed there, so it does not need to be loaded again), and the array on the left-hand side usually needs to be stored back. Therefore the arrays on the right-hand side are marked as load and the array on the left-hand side is marked as store-back.
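The marking rule of step (c) can be sketched as a single pass over the equations in program order. This is an illustrative simplification: any identifier followed by "(" is treated as an array, so an intrinsic function call would be misclassified, and the names `classify_arrays`, `tmp`, `a`, `b`, and `c` are hypothetical.

```python
import re
from typing import Dict, List, Set

NAME_BEFORE_PAREN = re.compile(r"([A-Za-z_]\w*)\s*\(")


def classify_arrays(equations: List[str]) -> Dict[str, Set[str]]:
    """Mark each array as 'load' and/or 'store', in equation order."""
    attrs: Dict[str, Set[str]] = {}
    for equation in equations:
        lhs, rhs = equation.split("=", 1)
        for name in NAME_BEFORE_PAREN.findall(rhs):
            marks = attrs.setdefault(name, set())
            # An array written by an earlier equation is already on the
            # core, so it is not loaded again.
            if "store" not in marks:
                marks.add("load")
        for name in NAME_BEFORE_PAREN.findall(lhs):
            attrs.setdefault(name, set()).add("store")
    return attrs


attributes = classify_arrays([
    "tmp(i,j) = a(i,j) + b(i,j+1)",
    "c(i,j) = tmp(i,j) * a(i,j)",
])
```

In this example `tmp` is only marked for store-back, because the second equation reads a value that the first equation has just produced.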
Once a target operation part has been processed according to (a), (b), and (c), the corresponding many-core calculation code has been generated. It can be held in a temporary variable until the final merge.
Further, memory-management code can be generated from the dynamic array. It comprises memory-allocation code and memory-release code; that is, the memory each piece of many-core calculation code needs at run time is determined so that it can be allocated and released. Specifically, the column information for each name is looked up in the dynamic array from (a), storage is set up according to that column information, and memory is allocated, which yields the memory-allocation code. The matching memory-release code is generated for the point after the current many-core calculation code has run. Note that a target operation part and its many-core calculation code perform the same computation and therefore involve the same arrays, so the memory-management code can be generated from the corresponding dynamic array.
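A possible shape for the generated memory-management code is sketched below. The emitted lines use the standard C `malloc` and `free` purely as placeholders; the patent does not name the allocation routine of the target processor, and the `double` element type, the `n_rows` size expression, and the function name `memory_code` are assumptions.

```python
from typing import Dict, List, Tuple


def memory_code(dynamic: Dict[str, List[str]],
                row_count: str = "n_rows") -> Tuple[List[str], List[str]]:
    """Return (allocation lines, release lines) as C source text.

    `dynamic` maps an array name to its column tags, e.g. ["J", "JA1"].
    """
    alloc: List[str] = []
    release: List[str] = []
    for name in sorted(dynamic):
        for column in dynamic[name]:
            buf = f"{name}_{column}_slave"
            # One buffer per (array, column) pair, sized to one column.
            alloc.append(f"double *{buf} = malloc({row_count} * sizeof(double));")
            release.append(f"free({buf});")
    # Buffers are released in the reverse order of allocation.
    return alloc, release[::-1]


alloc_lines, release_lines = memory_code({"Drhs": ["J", "JA1"]})
```

Generating both lists from the same table keeps every allocation paired with exactly one release.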
From the attribute information in (c) it is known whether each array must be loaded or stored back, so the data load and/or store-back code can be generated; the resulting code is held in a temporary variable until the final merge. Arrays can be reused in the load step. For example, if the first piece of many-core calculation code computes 1+2 and the second computes 2+3, where 1, 2, and 3 are column indices, then the column data for index 2 is a reusable array. Within one computation, the load statements for the column data of every index other than the largest are placed outside the loop and held in a preload-statement temporary variable, while the load statement for the column data of the largest index is placed inside the loop, executed on each iteration, and held in an iterative-load-statement temporary variable. Here, "inside the loop" means the inner iteration loop of a piece of many-core calculation code, which usually processes one contiguous row or column of data (in C a row is contiguous, whereas in Fortran a column is contiguous); "outside the loop" means the code outside that iteration loop.
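The effect of the reuse scheme can be illustrated by simulating what the generated code does at run time, rather than the generator itself. In this sketch a computation uses columns `j` and `j+1`: column `j` is preloaded once outside the loop, and only the largest-index column `j+1` is loaded in each iteration. The function `sweep_with_reuse`, the `load_column` callback, and the addition of the two columns are hypothetical stand-ins for the real data transfer and stencil.

```python
from typing import Callable, List, Tuple


def sweep_with_reuse(load_column: Callable[[int], List[float]],
                     j_start: int, j_end: int) -> Tuple[List[List[float]], int]:
    """Compute col[j] + col[j+1] for j in [j_start, j_end], counting loads."""
    loads = 0
    current = load_column(j_start)      # preload outside the loop
    loads += 1
    results = []
    for j in range(j_start, j_end + 1):
        nxt = load_column(j + 1)        # only the largest index is loaded here
        loads += 1
        results.append([a + b for a, b in zip(current, nxt)])
        current = nxt                   # reused as column j of the next iteration
    return results, loads


columns = {j: [float(j), 10.0 * j] for j in range(6)}
results, loads = sweep_with_reuse(lambda j: columns[j], 1, 4)
```

For four iterations the sketch performs five column loads, whereas loading both columns in every iteration would need eight.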
At this point the task-division code, many-core calculation code, memory-management code, and data load and/or store-back code are merged to obtain many-core code that can run on the heterogeneous many-core processor.
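The final merge can be pictured as concatenating the stored fragments in a fixed order. The order below is an assumption made for illustration and is simplified: in the scheme described above, some load statements sit inside the computation loop rather than before it. The fragment keys and the function name `assemble` are hypothetical.

```python
from typing import Dict, List

# Assumed, simplified ordering of the code fragments.
ORDER = [
    "parameter_passing", "task_division", "memory_alloc",
    "data_load", "compute", "data_store", "memory_release",
]


def assemble(fragments: Dict[str, List[str]]) -> str:
    """Join the temporary code fragments into one many-core source text."""
    missing = [key for key in ORDER if key not in fragments]
    if missing:
        raise KeyError(f"missing fragments: {missing}")
    lines: List[str] = []
    for key in ORDER:
        lines.append(f"/* {key} */")    # C-style marker for each section
        lines.extend(fragments[key])
    return "\n".join(lines)


demo = assemble({key: [f"{key}();"] for key in ORDER})
```

Checking for missing fragments before joining keeps an incomplete generation run from silently producing code that cannot work.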
This embodiment can therefore automatically generate, from source code, many-core code that runs on a heterogeneous many-core processor, which significantly reduces coding and debugging time and improves the efficiency and accuracy of many-core code generation.
To further illustrate the technical effect of the application, the following test was carried out on the same batch of data.
Specifically, for 50 loops in a commonly used ocean numerical model, manual implementation and debugging takes about 1.5 months. With the method of the invention, the many-core code is generated in a matter of minutes; together with the implementation of the other parts of the overall computational framework that are still required, work that originally took 1.5 months can be completed within one day.
Moreover, the many-core code generated by the invention is efficient. Table 1 shows the comparison: the automatically generated code performs almost the same as code hand-written and optimized by a senior programmer, and better than code hand-written by an ordinary programmer.
Table 1
Version                            | Computation time (s) | Speed-up
-----------------------------------|----------------------|---------
CPU version                        | 1154                 | 1
OpenACC version                    | 522                  | 2.21
The invention, without data reuse  | 305                  | 3.784
Junior programmer version          | 304                  | 3.796
Senior programmer version          | 299                  | 3.85
The invention, with data reuse     | 300                  | 3.847
A code generation device provided by an embodiment of the application is described below. The code generation device described below and the code generation method described above may be read with reference to each other.
Referring to Fig. 4, an embodiment of the application discloses a code generation device, comprising:
an acquisition module 401, configured to obtain target code that implements a Stencil computation;
a parameter passing module 402, configured to perform parameter passing for the target array processed by the target code and to generate parameter-passing code;
a task distribution module 403, configured to assign each target loop to a different core of the heterogeneous many-core processor according to the loop label of the target loop and to generate task-distribution code, where a target loop is a row or column of the target array;
a conversion module 404, configured to convert the target operation part of the target code into many-core subcode, the many-core subcode comprising at least many-core calculation code, memory-management code, and data load and store-back code;
a combination module 405, configured to combine the parameter-passing code, the task-distribution code, and the many-core subcode into many-core code that can run on the heterogeneous many-core processor.
In a specific embodiment, the parameter passing module is specifically configured to:
create a static array, add the name, number of dimensions, and address of the target array and the start and end index of each dimension to the static array, and generate the parameter-passing code.
In a specific embodiment, the conversion module comprises:
a string conversion unit, configured to convert the calculation equations in the target operation part according to string conversion rules and to generate the many-core calculation code;
a data processing unit, configured to generate the memory-management code and the data load and/or store-back code from the arrays involved in the many-core calculation code.
In a specific embodiment, the data processing unit comprises:
an analysis subunit, configured to analyze the arrays involved in the many-core calculation code and to insert each array name and its corresponding column information into a dynamic array;
a first generation subunit, configured to generate the memory-management code from the dynamic array;
a second generation subunit, configured to generate the data load and store-back code from the calculation equations in the many-core calculation code.
In a specific embodiment, the second generation subunit is specifically configured to:
determine the attribute information of the arrays involved in the many-core calculation code from the calculation equations in the many-core calculation code;
generate the data load and/or store-back code from the attribute information.
In a specific embodiment, the second generation subunit is specifically configured to:
determine the reusable arrays among the arrays involved in the many-core calculation code and add a reusable mark to each of them;
generate the data load and/or store-back code from the reusable marks and the attribute information.
In a specific embodiment, the device further comprises:
a control module, configured to run the many-core code on the heterogeneous many-core processor and to return the result.
For the more specific working process of each module and unit in this embodiment, refer to the corresponding content disclosed in the preceding embodiments; it is not repeated here.
This embodiment thus provides a code generation device that can automatically generate, from source code, many-core code that runs on a heterogeneous many-core processor, which significantly reduces coding and debugging time and improves the efficiency and accuracy of many-core code generation.
Code generation equipment provided by an embodiment of the application is described below. The code generation equipment described below and the code generation method and device described above may be read with reference to each other.
Referring to Fig. 5, an embodiment of the application discloses code generation equipment, comprising:
a memory 501, configured to store a computer program;
a processor 502, configured to execute the computer program so as to implement the method disclosed in any of the above embodiments.
A readable storage medium provided by an embodiment of the application is described below. The readable storage medium described below and the code generation method, device, and equipment described above may be read with reference to each other.
A readable storage medium stores a computer program which, when executed by a processor, implements the code generation method disclosed in the preceding embodiments. For the specific steps of the method, refer to the corresponding content disclosed in the preceding embodiments; it is not repeated here.
The terms "first", "second", "third", "fourth", and so on (if present) in this application are used to distinguish similar objects and not to describe a particular order or sequence. It should be understood that data so designated are interchangeable where appropriate, so that the embodiments described here can be carried out in an order other than the one illustrated or described. In addition, the terms "comprise" and "have" and any variants of them are intended to cover non-exclusive inclusion; for example, a process, method, or device that contains a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to the process, method, or device.
It should be noted that descriptions involving "first", "second", and the like in this application serve descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with one another, but only insofar as a person of ordinary skill in the art can implement the combination; where a combination is contradictory or cannot be implemented, it should be deemed not to exist and falls outside the scope of protection claimed by this application.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and identical or similar parts of the embodiments may be referred to one another.
The steps of the methods or algorithms described in connection with the embodiments disclosed here may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of readable storage medium known in the art.
Specific examples are used here to explain the principle and implementation of the application; the description of the above embodiments only serves to help in understanding the method of the application and its core idea. At the same time, a person skilled in the art may, following the idea of the application, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the application.

Claims (10)

1. A code generation method, characterized by comprising:
obtaining target code that implements a Stencil computation;
performing parameter passing for the target array processed by the target code, and generating parameter-passing code;
assigning each target loop to a different core of a heterogeneous many-core processor according to the loop label of the target loop, and generating task-distribution code, wherein the target loop is a row or column of the target array;
converting the target operation part of the target code into many-core subcode, the many-core subcode comprising at least many-core calculation code, memory-management code, and data load and/or store-back code;
combining the parameter-passing code, the task-distribution code, and the many-core subcode into many-core code that can run on the heterogeneous many-core processor.
2. The code generation method according to claim 1, characterized in that performing parameter passing for the target array processed by the target code and generating parameter-passing code comprises:
creating a static array, adding the name, number of dimensions, and address of the target array and the start and end index of each dimension to the static array, and generating the parameter-passing code.
3. The code generation method according to claim 2, characterized in that converting the target operation part of the target code into many-core subcode comprises:
converting the calculation equations in the target operation part according to string conversion rules to generate the many-core calculation code;
generating the memory-management code and the data load and/or store-back code from the arrays involved in the many-core calculation code.
4. The code generation method according to claim 3, characterized in that generating the memory-management code and the data load and/or store-back code from the arrays involved in the many-core calculation code comprises:
analyzing the arrays involved in the many-core calculation code, and inserting each array name and its corresponding column information into a dynamic array;
generating the memory-management code from the dynamic array;
generating the data load and store-back code from the calculation equations in the many-core calculation code.
5. The code generation method according to claim 4, characterized in that generating the data load and store-back code from the calculation equations in the many-core calculation code comprises:
determining the attribute information of the arrays involved in the many-core calculation code from the calculation equations in the many-core calculation code;
generating the data load and/or store-back code from the attribute information.
6. The code generation method according to claim 5, characterized in that generating the data load and/or store-back code from the attribute information comprises:
determining the reusable arrays among the arrays involved in the many-core calculation code, and adding a reusable mark to each reusable array;
generating the data load and/or store-back code from the reusable marks and the attribute information.
7. The code generation method according to any one of claims 1 to 6, characterized in that, after combining the parameter-passing code, the task-distribution code, and the many-core subcode into many-core code that can run on the heterogeneous many-core processor, the method further comprises:
running the many-core code on the heterogeneous many-core processor, and returning the result.
8. A code generation device, characterized by comprising:
an acquisition module, configured to obtain target code that implements a Stencil computation;
a parameter passing module, configured to perform parameter passing for the target array processed by the target code and to generate parameter-passing code;
a task distribution module, configured to assign each target loop to a different core of a heterogeneous many-core processor according to the loop label of the target loop and to generate task-distribution code, wherein the target loop is a row or column of the target array;
a conversion module, configured to convert the target operation part of the target code into many-core subcode, the many-core subcode comprising at least many-core calculation code, memory-management code, and data load and store-back code;
a combination module, configured to combine the parameter-passing code, the task-distribution code, and the many-core subcode into many-core code that can run on the heterogeneous many-core processor.
9. Code generation equipment, characterized by comprising:
a memory, configured to store a computer program;
a processor, configured to execute the computer program so as to implement the code generation method according to any one of claims 1 to 7.
10. A readable storage medium, characterized in that it stores a computer program, wherein the computer program, when executed by a processor, implements the code generation method according to any one of claims 1 to 7.
CN201910655665.9A 2019-07-19 2019-07-19 Code generation method, device, equipment and readable storage medium Active CN110399124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910655665.9A CN110399124B (en) 2019-07-19 2019-07-19 Code generation method, device, equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910655665.9A CN110399124B (en) 2019-07-19 2019-07-19 Code generation method, device, equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN110399124A true CN110399124A (en) 2019-11-01
CN110399124B CN110399124B (en) 2022-04-22

Family

ID=68324714

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910655665.9A Active CN110399124B (en) 2019-07-19 2019-07-19 Code generation method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN110399124B (en)

Also Published As

Publication number Publication date
CN110399124B (en) 2022-04-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant