WO2022249236A1

WO2022249236A1 - Software design assistance system, software design assistance method, and software design assistance program

Info

Publication number: WO2022249236A1
Application number: PCT/JP2021/019604
Authority: WO
Inventors: 潤矢吹; 知彦東山; 正勝外山
Original assignee: 三菱電機株式会社
Priority date: 2021-05-24
Filing date: 2021-05-24
Publication date: 2022-12-01
Also published as: JPWO2022249236A1; TW202246971A; JP7333889B2

Abstract

According to the present invention, a structure analysis unit (212) analyzes a source code (111) including a plurality of processing blocks to generate structure data (112) on the source code. A false dependence detection unit (313) detects false dependence on the basis of the structure data. A structure data correction unit (314), by correcting the structure data, generates structure data on the source code as corrected structure data (124) when the detected false dependence is eliminated. A parallelization unit (214) parallelizes the plurality of processing blocks on the basis of the corrected structure data. A result output unit (215) outputs the parallelization result.

Description

Software design support system, software design support method and software design support program

This disclosure relates to technology for supporting parallelization of software.

Parallel execution using multi-core processors is promising for performance improvement. Therefore, an automatic parallelization technology that analyzes the source code and automatically outputs the source code for parallel execution is expected.
As open source tools for analyzing source code, compiler platforms such as Clang as a compiler front end and LLVM as a compiler back end are known.

Patent Document 1 describes an invention that extracts synchronization-required dependencies from ignore information, which indicates ignorable data dependencies, and thereby improves the parallelization effect while avoiding simultaneous execution of two processes that have a synchronization-required dependency.

Patent Document 1: Japanese Patent No. 6558310

Source code that has existed for a long time, called legacy code, was not designed to be parallelized. In addition, existing parallelizing compilers analyze source code on the assumption that it has been implemented correctly. In practice, however, data dependencies between processes that are essentially independent are introduced into the source code at implementation time. Such data dependencies hinder parallelization.

The purpose of this disclosure is to enable software to be parallelized while taking into account data dependencies between processes that are essentially independent.

The software design support system of the present disclosure includes:
a structural analysis unit that analyzes a source code including a plurality of processing blocks to generate structural data of the source code;
a false dependency detection unit that detects, as false dependencies, data dependencies whose accesses do not affect each other, based on the structural data;
a structural data correction unit that corrects the structural data to generate, as corrected structural data, the structural data that the source code would have if the detected false dependencies were eliminated;
a parallelization unit that parallelizes the plurality of processing blocks based on the corrected structural data; and
a result output unit that outputs a parallelization result that is obtained by the parallelization and indicates a schedule of the plurality of processing blocks.

According to the present disclosure, software can be parallelized while taking into account data dependencies between essentially independent processes (false dependencies).

FIG. 1 is a configuration diagram of the software design support system 100 according to Embodiment 1.
FIG. 2 is a configuration diagram of the parallelization device 200 according to Embodiment 1.
FIG. 3 is a configuration diagram of the modification candidate extraction device 300 according to Embodiment 1.
FIG. 4 is a functional configuration diagram of the software design support system 100 according to Embodiment 1.
FIG. 5 is a flowchart of the software design support method (parallelization) according to Embodiment 1.
FIG. 6 is a diagram showing an example of the source code 111 according to Embodiment 1.
FIG. 7 is a diagram showing an example of the structural data 112 according to Embodiment 1.
FIG. 8 is a diagram showing an example of the parallelization result 113 according to Embodiment 1.
FIG. 9 is a diagram showing an example of schedules before and after parallelization according to Embodiment 1.
FIG. 10 is a flowchart of the software design support method (false dependency) according to Embodiment 1.
FIG. 11 is a diagram showing an example of the false dependency list 123 according to Embodiment 1.
FIG. 12 is a flowchart of step S120 in Embodiment 1.
FIG. 13 is a diagram showing an example of the structural data 124 according to Embodiment 1.
FIG. 14 is a flowchart of step S130 in Embodiment 1.
FIG. 15 is a flowchart of the software design support method (modification candidate) according to Embodiment 1.
FIG. 16 is a diagram showing an example of the modification candidate data 125 according to Embodiment 1.
FIG. 17 is a flowchart of step S160 in Embodiment 1.
FIG. 18 is a supplementary diagram of the WAR dependency in Embodiment 1.
FIG. 19 is a diagram showing an example of schedules before and after parallelization in the conventional technology.
FIG. 20 is a diagram showing an example of schedules before and after parallelization according to Embodiment 1.
FIG. 21 is a diagram showing an example of schedules before and after parallelization according to Embodiment 1.
FIG. 22 is a configuration diagram of the false dependency correction device 400 according to Embodiment 2.
FIG. 23 is a functional configuration diagram of the software design support system 100 according to Embodiment 2.
FIG. 24 is a flowchart of the software design support method (improvement) according to Embodiment 2.
FIG. 25 is a hardware configuration diagram of the parallelization device 200 according to the embodiments.
FIG. 26 is a hardware configuration diagram of the modification candidate extraction device 300 according to the embodiments.
FIG. 27 is a hardware configuration diagram of the false dependency correction device 400 according to the embodiments.

In the embodiments and drawings, the same or corresponding elements are denoted by the same reference numerals. Descriptions of elements having the same reference numerals as elements already described are omitted or simplified as appropriate. Arrows in the figures mainly indicate the flow of data or the flow of processing.

Embodiment 1.
A software design support system 100 will be described with reference to FIGS. 1 to 23.

*** Configuration description ***
The configuration of the software design support system 100 will be described based on FIG. 1.
The software design support system 100 includes a parallelization device 200 and a modification candidate extraction device 300. However, the software design support system 100 may be realized by one device or by three or more devices. Data may also be passed between the devices manually.

The configuration of the parallelization device 200 will be described based on FIG. 2.
The parallelization device 200 is a computer having hardware such as a processor 201, a memory 202, an auxiliary storage device 203, a communication device 204 and an input/output interface 205. These pieces of hardware are connected to each other via signal lines.

A processor 201 is an IC that performs arithmetic processing and controls other hardware. For example, processor 201 is a CPU, DSP or GPU.
IC is an abbreviation for Integrated Circuit.
CPU is an abbreviation for Central Processing Unit.
DSP is an abbreviation for Digital Signal Processor.
GPU is an abbreviation for Graphics Processing Unit.

Memory 202 is a volatile or non-volatile storage device. Memory 202 is also referred to as main storage or main memory. For example, memory 202 is RAM. The data stored in memory 202 is saved in auxiliary storage device 203 as needed.
RAM is an abbreviation for Random Access Memory.

Auxiliary storage device 203 is a non-volatile storage device. For example, the auxiliary storage device 203 is ROM, HDD or flash memory. Data stored in the auxiliary storage device 203 is loaded into the memory 202 as required.
ROM is an abbreviation for Read Only Memory.
HDD is an abbreviation for Hard Disk Drive.

The communication device 204 is a receiver and a transmitter. For example, the communication device 204 is a communication chip or a NIC. Communication of the parallelization device 200 is performed using the communication device 204.
NIC is an abbreviation for Network Interface Card.

The input/output interface 205 is a port to which an input device and an output device are connected. For example, the input/output interface 205 is a USB terminal, the input device is a keyboard and mouse, and the output device is a display. Input/output of the parallelization device 200 is performed using the input/output interface 205.
USB is an abbreviation for Universal Serial Bus.

The parallelization device 200 includes elements such as a source code reception unit 211, a structure analysis unit 212, a correction request unit 213, a parallelization unit 214 and a result output unit 215. These elements are implemented in software.

The auxiliary storage device 203 stores a parallelization program for causing the computer to function as the source code reception unit 211, the structure analysis unit 212, the correction request unit 213, the parallelization unit 214 and the result output unit 215. The parallelization program is loaded into the memory 202 and executed by the processor 201.
The auxiliary storage device 203 further stores an OS. At least part of the OS is loaded into the memory 202 and executed by the processor 201.
The processor 201 executes the parallelization program while executing the OS.
OS is an abbreviation for Operating System.

Input/output data of the parallelization program are stored in the storage unit 290.
The memory 202 functions as the storage unit 290. However, a storage device such as the auxiliary storage device 203, a register within the processor 201 or a cache memory within the processor 201 may function as the storage unit 290 instead of or together with the memory 202.

The parallelization device 200 may include multiple processors that substitute for the processor 201.

The configuration of the modification candidate extraction device 300 will be described based on FIG. 3.
The modification candidate extraction device 300 is a computer having hardware such as a processor 301, a memory 302, an auxiliary storage device 303, a communication device 304 and an input/output interface 305. These pieces of hardware are connected to each other via signal lines.

A processor 301 is an IC that performs arithmetic processing and controls other hardware. For example, processor 301 is a CPU, DSP or GPU.
Memory 302 is a volatile or non-volatile storage device. Memory 302 is also referred to as main storage or main memory. For example, memory 302 is RAM. The data stored in the memory 302 is saved in the auxiliary storage device 303 as required.
Auxiliary storage device 303 is a non-volatile storage device. For example, the auxiliary storage device 303 is ROM, HDD or flash memory. Data stored in the auxiliary storage device 303 is loaded into the memory 302 as required.
The communication device 304 is a receiver and a transmitter. For example, the communication device 304 is a communication chip or a NIC. Communication of the modification candidate extraction device 300 is performed using the communication device 304.
The input/output interface 305 is a port to which an input device and an output device are connected. For example, the input/output interface 305 is a USB terminal, the input device is a keyboard and mouse, and the output device is a display. Input/output of the modification candidate extraction device 300 is performed using the input/output interface 305.

The modification candidate extraction device 300 includes elements such as a constraint reception unit 311, a definition acquisition unit 312, a false dependency detection unit 313, a structure data correction unit 314, a modification candidate extraction unit 315 and a modification candidate output unit 316. These elements are implemented in software.

The auxiliary storage device 303 stores a modification candidate extraction program for causing the computer to function as the constraint reception unit 311, the definition acquisition unit 312, the false dependency detection unit 313, the structure data correction unit 314, the modification candidate extraction unit 315 and the modification candidate output unit 316. The modification candidate extraction program is loaded into the memory 302 and executed by the processor 301.
The auxiliary storage device 303 further stores an OS. At least part of the OS is loaded into the memory 302 and executed by the processor 301.
The processor 301 executes the modification candidate extraction program while executing the OS.

The input/output data of the modification candidate extraction program are stored in the storage unit 390.
The memory 302 functions as the storage unit 390. However, a storage device such as the auxiliary storage device 303, a register within the processor 301 or a cache memory within the processor 301 may function as the storage unit 390 instead of or together with the memory 302.

The modification candidate extraction device 300 may include multiple processors that substitute for the processor 301.

FIG. 4 shows the functional configuration of the software design support system 100. Each arrow in the figure indicates the flow of data or the flow of processing.
The operation of each element and the contents of each data in the software design support system 100 will be described later.

***Description of operation***
The operation procedure of the software design support system 100 corresponds to the software design support method. Further, the operation procedure of the software design support system 100 corresponds to the procedure of processing by the software design support program.
The software design support program includes a parallelization program and a modification candidate extraction program. The software design support program can be recorded (stored) in a computer-readable manner in a non-volatile recording medium such as an optical disc or flash memory.

The software design support method (parallelization) will be described based on FIG. 5.
The software design support method (parallelization) is a method of parallelizing a plurality of processing blocks in the source code 111 and is executed by the parallelization device 200.
A processing block is the smallest unit of processing that is executed in parallel. A specific example of a processing block is a function.

In step S101, the source code reception unit 211 receives the source code 111.
For example, a user inputs the source code 111 to the parallelization device 200, and the source code reception unit 211 receives the input source code 111.
The source code 111 includes multiple processing blocks.
A data dependency called a false dependency exists between at least one pair of the processing blocks.
A data dependency is a relationship in which data access of one processing block affects data access of another processing block.
A false dependency is a spurious data dependency. In other words, a false dependency means that two processing blocks access data with the same name but their data accesses do not affect each other. For example, a false dependency is built into the source code 111 during implementation even though there is essentially no data dependency.
A true data dependency is called a "true dependency".

A specific example of the source code 111 is shown in FIG. 6.
The source code 111 includes functions (processing blocks) such as func1( ), func2( ), and func3( ). The file name of the source code 111 is "mysample1.c".
In the source code 111, the pair of func1() and func2() and the pair of func2() and func3() each have a WAR dependency. WAR is an abbreviation for Write After Read.
A WAR dependency is a specific example of a false dependency, and means a relationship in which data is written after the same data has been read.
In the source code 111, the pair of func2() and func3() also has a RAW dependency. RAW is an abbreviation for Read After Write.
A RAW dependency is a specific example of a true dependency, and means a relationship in which data is read after the same data has been written.
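
The two dependency types above can be illustrated with a minimal Python sketch. The function name `classify_dependency` and the string labels are assumptions made for illustration and do not appear in the embodiment; the access types "r" and "w" follow the notation used for the access information.

```python
from typing import Optional


def classify_dependency(earlier: str, later: str) -> Optional[str]:
    """Classify the dependency between two accesses to the same data.

    `earlier` and `later` are access types: "r" (read) or "w" (write).
    The name and the return labels are hypothetical, for illustration only.
    """
    if earlier == "r" and later == "w":
        return "WAR"  # write after read: a false dependency
    if earlier == "w" and later == "r":
        return "RAW"  # read after write: a true dependency
    return None  # other combinations are not covered by this sketch


print(classify_dependency("r", "w"))
print(classify_dependency("w", "r"))
```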

Returning to FIG. 5, the description continues from step S102.
In step S102, the structure analysis unit 212 analyzes the source code 111 and generates structural data 112 of the source code 111. For example, the structure analysis unit 212 receives the source code 111 and executes an existing analysis tool. A specific example of an existing analysis tool is LLVM, which analyzes programs through an intermediate representation (IR).
Structural data 112 includes order information 112A and access information 112B.
The order information 112A is information that identifies the execution order of multiple processing blocks in the source code 111. Conditional branching and repetition are taken into account in specifying the execution order.
The access information 112B is information specifying data access in each processing block.

FIG. 7 shows structural data 112 of source code 111 (see FIG. 6).
The order information 112A indicates calling relationships between functions. The order of execution of a plurality of functions is specified by the calling relationships between functions. Specifically, each line of the order information 112A indicates "caller function name, callee function name, call line number". The call line number is the number of the line in the source code 111 where the caller function calls the callee function.
Each line of the access information 112B indicates "variable name, function name, access type, line number, file name". The access type indicates the type of data access such as read (r) and write (w).
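
As a minimal sketch of how the two record formats above could be read, the following Python code parses comma-separated lines into records. The class names, function names, and most sample values (including the caller `main` and the call line numbers) are hypothetical; only the record "a, func2, w, 15, mysample1.c" follows the description of the source code 111 in this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CallRecord:
    caller: str      # caller function name
    callee: str      # callee function name
    call_line: int   # line where the caller calls the callee


@dataclass(frozen=True)
class AccessRecord:
    variable: str     # variable name
    function: str     # function (processing block) performing the access
    access_type: str  # "r" (read) or "w" (write)
    line: int         # line number of the access
    file: str         # file name


def _fields(line: str) -> List[str]:
    # Split one comma-separated record and trim surrounding spaces.
    return [field.strip() for field in line.split(",")]


def parse_order_info(text: str) -> List[CallRecord]:
    records = []
    for line in text.strip().splitlines():
        caller, callee, call_line = _fields(line)
        records.append(CallRecord(caller, callee, int(call_line)))
    return records


def parse_access_info(text: str) -> List[AccessRecord]:
    records = []
    for line in text.strip().splitlines():
        variable, function, access_type, line_no, file = _fields(line)
        records.append(
            AccessRecord(variable, function, access_type, int(line_no), file)
        )
    return records


# Hypothetical sample data (assumed values, not taken from FIG. 7).
ORDER_INFO = """
main, func1, 30
main, func2, 31
main, func3, 32
"""
ACCESS_INFO = """
a, func1, r, 8, mysample1.c
a, func2, w, 15, mysample1.c
"""

calls = parse_order_info(ORDER_INFO)
accesses = parse_access_info(ACCESS_INFO)
print(calls[0])
print(accesses[1])
```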

Returning to FIG. 5, the description continues from step S103.
In step S103, the correction request unit 213 requests the modification candidate extraction device 300 to correct the structural data 112 of the source code 111.
Specifically, the correction request unit 213 transfers the structural data 112 of the source code 111 to the modification candidate extraction device 300.
Through this correction, the structural data 112 is updated to structural data 124, which is the structural data that the source code 111 would have if the false dependencies were eliminated.

In step S104, the correction request unit 213 receives the structural data 124 from the modification candidate extraction device 300.
The structural data 124 is the corrected structural data 112. The format of the structural data 124 is the same as the format of the structural data 112.

In step S105, the parallelization unit 214 parallelizes the plurality of processing blocks in the source code 111 based on the structural data 124.
For example, the parallelization unit 214 executes an existing parallelization tool with the source code 111 and the structural data 124 as input.

The parallelization unit 214 then generates the parallelization result 113.
The parallelization result 113 is data indicating the allocation of the multiple processing blocks to the cores of a multi-core processor and the schedule of the multiple processing blocks.

FIG. 8 shows the parallelization result 113 of the source code 111 (see FIG. 6).
The parallelization result 113 indicates the block name, allocated core, start time and end time for each processing block. An assigned core is a processor core to which a processing block is assigned. The start time and end time specify the schedule of the processing block. A processing block is executed from the start time to the end time.

FIG. 9 shows execution timings of three processing blocks before and after parallelization of the source code 111 (see FIG. 6).
Before parallelization, three processing blocks (func1, func2, func3) are executed in order. There is a false dependency between func1 and func2. Also, there is a true dependency between func2 and func3.
After parallelization, false dependence is eliminated, and one processing block (func1) and two processing blocks (func2, func3) are executed in parallel by two processor cores (core0, core1).
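
The record format of the parallelization result 113 and the before/after schedules can be sketched as follows. This is an illustration only: the class name, helper names, and all time values are assumptions, since the actual contents of FIG. 8 and FIG. 9 are not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ScheduleEntry:
    block: str  # block name
    core: str   # allocated core
    start: int  # start time (arbitrary time unit, assumed)
    end: int    # end time


def blocks_per_core(result: List[ScheduleEntry]) -> Dict[str, List[str]]:
    """Group block names by allocated core, ordered by start time."""
    per_core = defaultdict(list)
    for entry in sorted(result, key=lambda e: e.start):
        per_core[entry.core].append(entry.block)
    return dict(per_core)


def total_time(result: List[ScheduleEntry]) -> int:
    """Time from the earliest start to the latest end."""
    return max(e.end for e in result) - min(e.start for e in result)


# Hypothetical schedules: sequential execution on one core before
# parallelization, and func1 running in parallel with func2 -> func3 after.
before = [
    ScheduleEntry("func1", "core0", 0, 10),
    ScheduleEntry("func2", "core0", 10, 20),
    ScheduleEntry("func3", "core0", 20, 30),
]
after = [
    ScheduleEntry("func1", "core0", 0, 10),
    ScheduleEntry("func2", "core1", 0, 10),
    ScheduleEntry("func3", "core1", 10, 20),
]

print(blocks_per_core(after))
print(total_time(before), total_time(after))
```

In this assumed schedule, func3 still starts only after func2 ends, which reflects the true dependency between func2 and func3.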

Returning to FIG. 5, step S106 will be described.
In step S106, the result output unit 215 outputs the parallelization result 113.
For example, the result output unit 215 displays the parallelization result 113 on the display. The result output unit 215 also transfers the parallelization result 113 to the modification candidate extraction device 300.

The software design support method (false dependency) will be described with reference to FIG. 10.
The software design support method (false dependency) is a method of correcting the structural data 112 of the source code 111, in which false dependencies exist, into the structural data 124 that the source code 111 would have if the false dependencies were eliminated, and is executed by the modification candidate extraction device 300.

In step S111, the constraint reception unit 311 receives the constraint data 121.
For example, the user inputs the constraint data 121 to the modification candidate extraction device 300, and the constraint reception unit 311 receives the input constraint data 121.
The constraint data 121 is data that designates false dependencies that must not be eliminated.

In step S112, the definition acquisition unit 312 acquires the definition data 122.
For example, the definition data 122 is stored in the storage unit 390 in advance, and the definition acquisition unit 312 acquires the definition data 122 from the storage unit 390.
The definition data 122 indicates the definition of a false dependency.

In step S113, the false dependency detection unit 313 receives the structural data 112 from the parallelization device 200.

In step S120, the false dependency detection unit 313 detects one or more false dependencies existing in the source code 111 by analyzing the structural data 112.
At this time, false dependencies that meet the definition indicated by the definition data 122 are detected, excluding the false dependencies specified by the constraint data 121.

Then, the false dependency detection unit 313 generates the false dependency list 123.
The false dependency list 123 is data indicating information that identifies the one or more detected false dependencies.

FIG. 11 shows the false dependency list 123 of the source code 111 (see FIG. 6).
The false dependency list 123 indicates multiple false dependencies (WAR dependencies) existing in the source code 111.
Specifically, the false dependency list 123 indicates each processing block set as a header, and the write access information below the header.
For example, the header "func1-func2" indicates a set of a processing block that performs the read access (func1) and a processing block that performs the write access (func2). The first line under the header "func1-func2" indicates that there is a WAR dependency between func1 and func2 for variable a, and that the write access to variable a is executed on line 15 of mysample1.c.
In FIG. 11, the description regarding the variable b is omitted.

Based on FIG. 12, the procedure of step S120 will be described.
In step S121, the false dependency detection unit 313 selects one piece of data from the access information 112B. The selected data is called "selected data".

In step S122, the false dependency detection unit 313 selects one piece of access information for the selected data from the access information 112B. The selected access information is called "selected access information". The data access indicated by the selected access information is called "selected access".

In step S123, the false dependency detection unit 313 refers to the order information 112A and identifies the processing block that is executed next after the processing block executing the selected access.
Next, the false dependency detection unit 313 extracts, from the access information 112B, access information indicating access to the selected data by the next processing block. The extracted access information is called "next access information". The data access indicated by the next access information is called "next access".
Then, the false dependency detection unit 313 compares the selected access information with the next access information, and determines whether the selected access and the next access constitute a false dependency.
If the access type of the selected access information is read (r) and the access type of the next access information is write (w), the selected access and the next access constitute a false dependency (WAR dependency).
If the selected access and the next access constitute a false dependency, the process proceeds to step S124.
If the selected access and the next access do not constitute a false dependency, the process proceeds to step S125.

In step S124, the false dependency detection unit 313 registers information identifying the false dependency between the selected access and the next access in the false dependency list 123.
Specifically, if the processing block set (header) is not yet described in the false dependency list 123, the false dependency detection unit 313 writes the processing block set in the false dependency list 123. The false dependency detection unit 313 also writes the next access information (write access information) in the false dependency list 123.

In step S125, the false dependence detection unit 313 determines whether there is unselected access information for the selected data.
If there is unselected access information for the selected data, the process proceeds to step S122.
If there is no unselected access information for selected data, the process proceeds to step S126.

In step S126, the false dependence detection unit 313 determines whether there is unselected data.
If there is unselected data, the process proceeds to step S121.
If there are no unselected data, the process ends.
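
The loop of steps S121 to S126 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the embodiment's implementation: the execution order is assumed to be already available as a flat list of block names (the order information 112A would have to be resolved into such a list), the constraint data 121 is modeled as a hypothetical set of (header, variable) pairs, and all sample line numbers other than line 15 are invented.

```python
from collections import defaultdict

# An access record is (variable, function, access type, line, file).


def detect_false_dependencies(execution_order, accesses, excluded=frozenset()):
    """Return {"reader-writer": [write access records]} for WAR dependencies.

    `excluded` is a hypothetical stand-in for the constraint data:
    a set of (header, variable) pairs that must not be reported.
    """
    # Map each block to the block executed next (assumed linear order).
    next_block = dict(zip(execution_order, execution_order[1:]))

    by_variable = defaultdict(list)
    for access in accesses:
        by_variable[access[0]].append(access)

    false_deps = defaultdict(list)
    for variable, var_accesses in by_variable.items():    # S121, S126
        for selected in var_accesses:                      # S122, S125
            following = next_block.get(selected[1])        # S123
            if following is None:
                continue
            for nxt in var_accesses:
                if nxt[1] != following:
                    continue
                # WAR: selected access is a read, next access is a write.
                if selected[2] == "r" and nxt[2] == "w":
                    header = f"{selected[1]}-{following}"
                    if (header, variable) in excluded:
                        continue
                    if nxt not in false_deps[header]:
                        false_deps[header].append(nxt)     # S124
    return dict(false_deps)


# Hypothetical access information for mysample1.c.
order = ["func1", "func2", "func3"]
sample_accesses = [
    ("a", "func1", "r", 8, "mysample1.c"),
    ("a", "func2", "w", 15, "mysample1.c"),
    ("c", "func2", "r", 16, "mysample1.c"),
    ("a", "func3", "r", 22, "mysample1.c"),
    ("c", "func3", "w", 23, "mysample1.c"),
]

false_dependency_list = detect_false_dependencies(order, sample_accesses)
for header, writes in false_dependency_list.items():
    print(header, writes)
```

With this sample data, the write to a by func2 followed by the read of a by func3 is a RAW (true) dependency and is therefore not registered.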

Returning to FIG. 10, the description continues from step S130.
In step S130, the structural data correction unit 314 generates the structural data 124 by correcting the structural data 112 based on the false dependency list 123.
The structural data 124 is the corrected structural data 112.

FIG. 13 shows structural data 124 obtained by modifying the structural data 112 (see FIG. 7).
Structural data 124 includes order information 124A and access information 124B.
Order information 124A is the same as order information 112A (see FIG. 7).
Access information 124B is modified access information 112B (see FIG. 7).
WAR dependencies are resolved by renaming. In other words, WAR dependency is resolved by changing the data name (variable name).
In the access information 124B, the variables a and c are renamed in the access information of func2 and func3.
In FIG. 13, the description of the variable b is omitted.

Based on FIG. 14, the procedure of step S130 will be described.
In step S131, the structural data correction unit 314 selects a false dependency from the false dependency list 123.

In step S132, the structural data correction unit 314 extracts access information corresponding to the selected false dependency from the structural data 112, and corrects the extracted access information so as to eliminate the selected false dependency.
Specifically, the structural data correction unit 314 extracts, from the structural data 112, the write access information corresponding to the write access of the selected false dependency (WAR dependency), and renames the data name in the extracted access information. For example, if the data name ends with a number, the structural data correction unit 314 increments the number by one. If the data name does not end with a number, the structural data correction unit 314 appends a number (for example, "1") to the end of the data name.

In step S133, the structural data correction unit 314 determines whether there is an unselected false dependency.
If there are unselected false dependencies, the process proceeds to step S131.
If there are no unselected false dependencies, processing ends.
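
The renaming rule of step S132 can be sketched as follows. The function names and sample records are hypothetical; the sketch renames only the write access records listed in the false dependency list, as described above, and does not check whether the new name collides with an existing variable.

```python
import re

# An access record is (variable, function, access type, line, file).


def rename(name: str) -> str:
    """Increment a trailing number, or append "1" if there is none."""
    match = re.search(r"\d+$", name)
    if match:
        # Note: zero padding such as "x09" is not preserved by this sketch.
        return name[: match.start()] + str(int(match.group()) + 1)
    return name + "1"


def correct_access_info(accesses, false_dependency_list):
    """Rename the data name in every write access listed as a false dependency."""
    to_rename = {
        write for writes in false_dependency_list.values() for write in writes
    }
    return [
        (rename(access[0]),) + access[1:] if access in to_rename else access
        for access in accesses
    ]


# Hypothetical input (assumed values).
example_accesses = [
    ("a", "func1", "r", 8, "mysample1.c"),
    ("a", "func2", "w", 15, "mysample1.c"),
]
example_list = {"func1-func2": [("a", "func2", "w", 15, "mysample1.c")]}

print(rename("a"), rename("a1"), rename("val9"))
print(correct_access_info(example_accesses, example_list))
```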

Returning to FIG. 10, step S141 will be described.
In step S141, the structural data correction unit 314 transfers the structural data 124 to the parallelization device 200.

The software design support method (modification candidate) will be described based on FIG. 15.
The software design support method (modification candidate) is a method of extracting modification candidates from the false dependency list 123 and is executed by the modification candidate extraction device 300.
Modification candidates are false dependencies that must be eliminated by modifying the source code in order to execute the multiple processing blocks according to the schedule shown in the parallelization result 113.

In step S151, the modification candidate extraction unit 315 receives the parallelization result 113 from the parallelization device 200.

In step S160, the modification candidate extraction unit 315 extracts the false dependencies that are modification candidates from the false dependency list 123 based on the schedule of the plurality of processing blocks shown in the parallelization result 113.

Then, the modification candidate extraction unit 315 generates modification candidate data 125.
The modification candidate data 125 is data indicating information about the false dependencies that are modification candidates.

FIG. 16 shows a specific example of the modification candidate data 125.
The modification candidate data 125 indicates each false dependency (WAR dependency) that is a modification candidate and the method for eliminating it (renaming).

Based on FIG. 17, the procedure of step S160 will be described.
In step S161, the modification candidate extraction unit 315 selects a processing block set from the parallelization result 113. A processing block set is a pair of processing blocks.
In the processing block set, the processing block with the earlier start time is called the "preceding block", and the processing block with the later start time is called the "following block".

In step S162, the modification candidate extraction unit 315 refers to the parallelization result 113 and determines whether the succeeding block starts before the preceding block ends.
Specifically, the modification candidate extraction unit 315 compares the end time of the preceding block with the start time of the succeeding block. Then, the modification candidate extraction unit 315 determines whether the start time of the subsequent block is earlier than the end time of the preceding block.
If the succeeding block begins before the preceding block ends, processing proceeds to step S163.
If the succeeding block does not begin before the preceding block ends, processing proceeds to step S164.

In step S163, the modification candidate extraction unit 315 extracts the false dependencies of the pair of the preceding block and the following block from the false dependency list 123.
Then, the modification candidate extraction unit 315 registers information on the extracted false dependencies in the modification candidate data 125.

In step S164, the modification candidate extraction unit 315 determines whether there is an unselected block set.
If there is an unselected block set, the process proceeds to step S161.
If there is no unselected block set, the process ends.
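
Steps S161 to S164 can be sketched as a pairwise overlap check over the schedule. The function name, the tuple layout, and the sample values are assumptions for illustration. Looking up the header in both orders is also an assumption of this sketch, made so that two blocks with the same start time are still matched against a "reader-writer" header.

```python
from itertools import combinations

# A schedule entry is (block name, allocated core, start time, end time).


def extract_modification_candidates(schedule, false_dependency_list):
    """Keep only false dependencies between blocks whose executions overlap."""
    candidates = {}
    for x, y in combinations(schedule, 2):                      # S161, S164
        preceding, following = sorted((x, y), key=lambda e: e[2])
        # S162: does the following block start before the preceding block ends?
        if following[2] < preceding[3]:
            for header in (
                f"{preceding[0]}-{following[0]}",
                f"{following[0]}-{preceding[0]}",
            ):
                if header in false_dependency_list:             # S163
                    candidates[header] = false_dependency_list[header]
    return candidates


# Hypothetical parallelization result and false dependency list.
parallel_schedule = [
    ("func1", "core0", 0, 10),
    ("func2", "core1", 0, 10),
    ("func3", "core1", 10, 20),
]
detected = {
    "func1-func2": [("a", "func2", "w", 15, "mysample1.c")],
    "func2-func3": [("c", "func3", "w", 23, "mysample1.c")],
}

print(extract_modification_candidates(parallel_schedule, detected))
```

In this assumed schedule, only func1 and func2 overlap, so only the false dependency "func1-func2" is reported as a modification candidate; func2 and func3 still run one after the other, so their false dependency does not need to be modified.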

Returning to FIG. 15, step S171 will be described.
In step S171, the modification candidate output unit 316 outputs the modification candidate data 125.
For example, the modification candidate output unit 316 displays the modification candidate data 125 on the display.

A supplementary explanation of the WAR dependency is given based on FIG. 18.
WAR dependencies are often seen in programs that reuse variables, such as [original code].
In [original code], a WAR dependency exists between the read of the variable val in process (A) and the write of the variable val in process (B).
By renaming the variable val in at least one of process (A) and process (B) to an alias variable val2, as in [parallelization corrected code], the WAR dependency in [original code] is eliminated.
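
The effect of the renaming can be illustrated with a small Python sketch. The code of FIG. 18 is not reproduced here, so the two functions below are hypothetical stand-ins for [original code] and [parallelization corrected code]; only the names val and val2 come from the description.

```python
def original(x, y):
    val = x + 1   # write val
    a = val * 2   # process (A): read val
    val = y + 1   # process (B): write val, WAR against the read in (A)
    b = val * 3
    return a, b


def corrected(x, y):
    val = x + 1
    a = val * 2    # process (A): read val
    val2 = y + 1   # process (B): write the alias val2, no longer touches val
    b = val2 * 3
    return a, b


# Both versions compute the same values; after renaming, process (B)
# does not access val, so it no longer has to wait for process (A).
print(original(1, 2), corrected(1, 2))
```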

*** Effect of Embodiment 1 ***
According to idea (1) and idea (2), the first embodiment can eliminate false dependencies and improve parallelization performance.
Idea (1) is to extract and resolve false dependencies that exist in the source code. The software design support system 100 extracts S/W structure information of the source code using a S/W structure analysis tool such as a compiler front end, and extracts false dependencies based on the S/W structure information. Next, the software design support system 100 corrects the S/W structure information so as to eliminate the extracted false dependencies, and performs parallelization based on the corrected S/W structure information. Then, the software design support system 100 outputs the optimum parallelization result for the case where the false dependencies are eliminated. "S/W" means software.
Idea (2) is to extract the necessary and sufficient false dependencies that must be resolved for the parallel execution indicated by the parallelization result, and to output them as false dependency modification candidates. The software design support system 100 extracts the necessary and sufficient false dependencies that must be resolved for the parallel execution indicated by the parallelization result of idea (1), and presents them as false dependency modification candidates. Whether a false dependency can be resolved is difficult to judge by analyzing only the source code under analysis; the designer has to decide. For this reason, false dependency modification candidates are presented. Moreover, it is not always necessary to eliminate all false dependencies in order to perform the parallel execution indicated by the parallelization result of idea (1). The modification candidates are therefore narrowed down to the necessary and sufficient false dependencies that should be eliminated, which reduces the number of judgments imposed on the designer.

FIG. 19 shows the schedule before and after parallelization according to the prior art.
In the prior art, a false dependency is treated as a true dependency. Therefore, even when parallelization is performed with fine granularity, only a parallelization result such as that shown in FIG. 19 is obtained. That is, since there is a false dependency between block A and block B, execution of block B starts only after the processing in block A related to the false dependency has finished. Note that fine-grained parallelization means parallelization that uses small program blocks such as functions or instruction groups as processing units, rather than large program blocks such as tasks.
FIG. 20 shows schedules before and after parallelization according to the first embodiment.
In the first embodiment, idea (1) yields a parallelization result as shown in FIG. 20. That is, since there is no true dependency between block A and block B, block A and block B are executed in parallel. Also, according to idea (2), the false dependencies that need to be resolved for the parallel execution indicated by the parallelization result are output as false dependency modification candidates. In the case of FIG. 20, the false dependency between block A and block B must be eliminated to realize the parallel execution indicated by the parallelization result. Therefore, the false dependency between block A and block B is presented as a false dependency modification candidate. The designer checks the parallelization result and the false dependency modification candidates, and modifies the source code to eliminate each false dependency that can be resolved without any problem. This yields source code from which the false dependencies have been eliminated. Then, by inputting that source code to a conventional parallelizing compiler, a parallelization result as shown in FIG. 20 is realized. In addition, the designer can designate a false dependency whose resolution would cause a problem as a parallel execution constraint; by analyzing the false dependencies again and outputting the parallelization result, a parallelization result and false dependency modification candidates that respect the constraint are obtained.

FIG. 21 shows the schedule before and after parallelization according to the first embodiment. A false dependency exists in each of the pair of block A and block B and the pair of block B and block C.
In the first embodiment, idea (1) yields a parallelization result as shown in FIG. 21. In other words, optimal scheduling is realized in which block A is assigned to one processor core, block B and block C are assigned to the other processor core, and block B and block C are executed in parallel with block A. Also, according to idea (2), the false dependencies that need to be resolved for the parallel execution indicated by the parallelization result are output as false dependency modification candidates. In the case of FIG. 21, the false dependency between block A and block B must be eliminated to realize the parallel execution indicated by the parallelization result, whereas the false dependency between block B and block C need not be resolved. Therefore, only the false dependency between block A and block B is presented as a false dependency modification candidate. In other words, although two false dependencies are eliminated to obtain the optimal parallelization result, only one of them has to be modified to realize the parallel execution indicated by that result, so one false dependency is presented as a false dependency modification candidate.
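
The situation of FIG. 21 can be reproduced with a small scheduling sketch. The scheduling algorithm itself is not specified in this passage, so the greedy list scheduler below, the function name list_schedule, the two-core setting, and the block durations are all assumptions made for illustration.

```python
def list_schedule(blocks, dependencies, cores=2):
    """Greedy list scheduling (illustrative assumption).

    blocks: list of (name, duration) in original program order.
    dependencies: set of (preceding, succeeding) pairs that must be honored.
    Returns {name: (core, start, end)}.
    """
    core_free = [0] * cores
    schedule = {}
    for name, duration in blocks:
        # A block may start only after every honored predecessor has ended.
        ready = max(
            (schedule[p][2] for p, s in dependencies if s == name), default=0
        )
        core = min(range(cores), key=lambda c: core_free[c])
        start = max(core_free[core], ready)
        schedule[name] = (core, start, start + duration)
        core_free[core] = start + duration
    return schedule


def makespan(schedule):
    return max(end for _, _, end in schedule.values())


blocks = [("A", 10), ("B", 4), ("C", 4)]         # hypothetical durations
false_deps = {("A", "B"), ("B", "C")}

conventional = list_schedule(blocks, false_deps)  # false treated as true
idea_1 = list_schedule(blocks, set())             # all false dependencies removed
only_ab_fixed = list_schedule(blocks, {("B", "C")})  # only A-B resolved
```

Under these assumptions the conventional schedule is fully serial, while removing both false dependencies places block A on one core and blocks B and C on the other. Keeping the false dependency between block B and block C yields the same schedule, which is why only the pair of block A and block B needs to be presented as a modification candidate.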

With conventional technology, parallelism hidden by false dependencies could not be extracted. In contrast, according to the first embodiment, idea (1) eliminates false dependencies, so more parallelism can be extracted. With idea (1) alone, however, the designer would have to judge manually, for every automatically resolved false dependency, whether modifying the source code causes a problem. According to idea (2), the first embodiment extracts only the necessary and sufficient false dependencies that must be resolved to perform the parallel execution indicated by the parallelization result. This reduces the number of manual judgments as to whether modifying the source code causes a problem.

*** Example of Embodiment 1 ***
Specific examples of false dependencies other than the WAR dependency will be described.
***Example 1***
A WAW dependency is a specific example of a false dependency and means a relationship in which data is written and the same data is then written again. WAW is an abbreviation for Write After Write.
The false dependency detection unit 313 extracts WAW dependencies as one kind of false dependency (step S120). Specifically, when the access type of the selected access information is write (w) and the access type of the next access information is write (w), the false dependency detection unit 313 detects a false dependency (WAW dependency) (step S123).
The structural data correction unit 314 corrects the access information of the structural data 112 so as to eliminate the WAW dependency (step S132).
Like a WAR dependency, a WAW dependency can be resolved by renaming (duplicating) at least one variable.
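
The comparison of consecutive access types can be sketched as follows. The actual format of the access information is not shown in this passage, so the tuple layout (block, variable, access type) and the rule that only accesses from different blocks are reported are assumptions made for this example.

```python
def detect_false_dependencies(accesses):
    """Scan accesses in program order and report WAR and WAW pairs.

    accesses: list of (block, variable, access_type), where access_type is
    "r" (read) or "w" (write). The layout is a hypothetical simplification.
    """
    last_access = {}  # variable -> (block, access_type) of the previous access
    found = []
    for block, variable, access_type in accesses:
        previous = last_access.get(variable)
        if previous is not None and previous[0] != block and access_type == "w":
            # read then write -> WAR, write then write -> WAW
            kind = "WAR" if previous[1] == "r" else "WAW"
            found.append((previous[0], block, variable, kind))
        # A write followed by a read is a true (RAW) dependency: not reported.
        last_access[variable] = (block, access_type)
    return found


accesses = [
    ("A", "val", "w"),
    ("A", "val", "r"),
    ("B", "val", "w"),   # write after A's read  -> WAR
    ("C", "tmp", "w"),
    ("D", "tmp", "w"),   # write after C's write -> WAW
    ("D", "tmp", "r"),
    ("E", "tmp", "r"),   # read after D's write  -> true dependency, ignored
]
detected = detect_false_dependencies(accesses)
```

The sketch reports one WAR dependency on val between blocks A and B and one WAW dependency on tmp between blocks C and D, and leaves the read in block E alone because it is a true dependency.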

***Example 2***
A scope variable dependency is a specific example of a false dependency and means access to multiple variables that have the same name but different scopes.
A variable whose scope is specified is called a "scope variable". A specific example of a scope variable is a static variable.
A scope variable dependency does not exist as a dependency in the source code, but is erroneously analyzed as a dependency during parallelization.
The scope of a variable can be limited to a function (or a file), so a function (or file) can define a variable with the same name as a variable defined in another function (or file). During parallelization, these variables may be erroneously analyzed as the same variable. Even in such cases, correcting the structural data 112 prevents the erroneous analysis during parallelization.
The false dependency detection unit 313 extracts scope variable dependency as one of false dependencies (step S120).
The structural data correction unit 314 corrects the access information of the structural data 112 so as to eliminate the scope variable dependence (step S132).
Scope variable dependencies are resolved by renaming (duplicating) at least one variable, similar to WAR dependencies.
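
One way to picture the correction is to make each variable name unique per scope in the access information, so that an analysis keyed on names no longer merges unrelated variables. The dictionary layout of an access entry, the "scope::name" naming scheme, and the helper names below are assumptions for this sketch, not the format used by the structural data 112.

```python
from collections import defaultdict


def names_shared_between_blocks(accesses):
    """Return the variable names that more than one block accesses."""
    blocks_by_name = defaultdict(set)
    for access in accesses:
        blocks_by_name[access["name"]].add(access["block"])
    return {name for name, blocks in blocks_by_name.items() if len(blocks) > 1}


def qualify_scoped_variables(accesses):
    """Rename each scope variable to a name that is unique to its scope."""
    corrected = []
    for access in accesses:
        fixed = dict(access)
        if access["scope"] is not None:
            fixed["name"] = f"{access['scope']}::{access['name']}"
        corrected.append(fixed)
    return corrected


accesses = [
    # Two unrelated function-local static variables that share a name.
    {"block": "A", "scope": "func_a", "name": "count", "type": "w"},
    {"block": "B", "scope": "func_b", "name": "count", "type": "w"},
    # A genuinely shared global variable.
    {"block": "A", "scope": None, "name": "total", "type": "w"},
    {"block": "B", "scope": None, "name": "total", "type": "r"},
]
before = names_shared_between_blocks(accesses)
after = names_shared_between_blocks(qualify_scoped_variables(accesses))
```

Before the correction both count and total appear to be shared by blocks A and B; after it only total remains, so the apparent dependency on count disappears while the real one is preserved.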

***Example 3***
An external declaration dependency is a specific example of a false dependency and means access to multiple variables that have the same name and are declared in mutually different external files. A specific example of an external file is a library. For example, in C, a variable defined in an external file is declared with extern.
An external declaration dependency does not exist as a dependency in the source code, but is erroneously analyzed as a dependency during parallelization.
If variables with the same name are used in different files and each variable is declared in a different external file, these variables may be erroneously analyzed as the same variable during parallelization. Even in such cases, correcting the structural data 112 prevents the erroneous analysis during parallelization.
The structural data correction unit 314 corrects the access information of the structural data 112 so as to eliminate the external declaration dependency (step S132).
An external declaration dependency is resolved by renaming (duplicating) at least one variable, like a WAR dependency.

Embodiment 2.
A mode for obtaining a parallelization result in which the false dependencies that should be eliminated have been eliminated will be described with reference to FIGS. 22 to 24, focusing on the differences from the first embodiment.

*** Configuration description ***
The software design support system 100 further comprises a false dependency correction device 400.

Based on FIG. 22, the configuration of the false dependency correction device 400 will be described.
The false dependency correction device 400 is a computer comprising hardware such as a processor 401, a memory 402, an auxiliary storage device 403, a communication device 404, and an input/output interface 405. These pieces of hardware are connected to each other via signal lines.

A processor 401 is an IC that performs arithmetic processing and controls other hardware. For example, processor 401 is a CPU, DSP or GPU.
Memory 402 is a volatile or non-volatile storage device. Memory 402 is also referred to as main storage or main memory. For example, memory 402 is RAM. The data stored in memory 402 is saved in auxiliary storage device 403 as needed.
Auxiliary storage device 403 is a non-volatile storage device. For example, the auxiliary storage device 403 is ROM, HDD or flash memory. Data stored in the auxiliary storage device 403 is loaded into the memory 402 as required.
The communication device 404 is a receiver and transmitter. For example, the communication device 404 is a communication chip or a NIC. Communication of the false dependency correction device 400 is performed using the communication device 404.
The input/output interface 405 is a port to which an input device and an output device are connected. For example, the input/output interface 405 is a USB terminal, the input device is a keyboard and mouse, and the output device is a display. Input/output to and from the false dependency correction device 400 is performed using the input/output interface 405.

The false dependency correction device 400 comprises elements such as a false dependency reception unit 411, a false dependency analysis unit 412, and a structural data re-correction unit 413. These elements are implemented in software.

The auxiliary storage device 403 stores a false dependency correction program for causing the computer to function as the false dependency reception unit 411, the false dependency analysis unit 412, and the structural data re-correction unit 413. The false dependency correction program is loaded into the memory 402 and executed by the processor 401.
The auxiliary storage device 403 further stores an OS. At least part of the OS is loaded into the memory 402 and executed by the processor 401.
The processor 401 executes the false dependency correction program while executing the OS.

The input/output data of the false dependency correction program are stored in the storage unit 490.
The memory 402 functions as the storage unit 490. However, a storage device such as the auxiliary storage device 403, a register within the processor 401, or a cache memory within the processor 401 may function as the storage unit 490 instead of or together with the memory 402.

The false dependency correction device 400 may include multiple processors that substitute for the processor 401.

The false dependency correction program is included in the software design support program.

FIG. 23 shows the functional configuration of the software design support system 100. Illustration of the functional configuration of the modification candidate extraction device 300 is omitted.
The operation of each element and the contents of each data in the software design support system 100 will be described later.

***Description of operation***
Based on FIG. 24, the software design support method (modification) will be described.
The software design support method (modification) is a method of obtaining a parallelization result for the case where the false dependencies that should be resolved are resolved, and is executed by the false dependency correction device 400 and the parallelization device 200.

In step S181, the false dependency reception unit 411 receives the false dependency data 131.
The false dependency data 131 corresponds to the modification candidate data 125 of the first embodiment. Specifically, the false dependency data 131 indicates, among the false dependencies shown in the modification candidate data 125, each false dependency that the user has judged may be modified, together with the method of eliminating it.

In step S182, the false dependency analysis unit 412 receives the structural data 112 of the first embodiment.

In step S183, the false dependency analysis unit 412 generates the false dependency list 132 by analyzing the false dependency data 131 and the structure data 112.
The method of generating the false dependency list 132 is the same as the method of generating the false dependency list 123 in the first embodiment. The format of the false dependency list 132 is the same as the format of the false dependency list 123 in the first embodiment.
However, in the false dependency list 132, identifying information is registered only for the false dependencies indicated in the false dependency data 131.

In step S184, the structural data re-correction unit 413 generates the structural data 133 by correcting the structural data 112 based on the false dependency list 132.
The structural data 133 is the structural data 112 after re-correction.
The method of generating the structural data 133 is the same as the method of generating the structural data 124 in the first embodiment.
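
Steps S183 and S184 can be sketched as a filter followed by a renaming pass. The data layouts, the function name recorrect_structure_data, the alias naming scheme, and the choice of renaming as the elimination method are all assumptions made for this example; the sketch also renames only the accesses inside the succeeding block, which is a simplification.

```python
def recorrect_structure_data(access_info, false_dependency_list, approved):
    """Resolve only the false dependencies that the user approved.

    access_info: list of {"block", "name", "type"} dicts (hypothetical layout).
    false_dependency_list: list of (preceding, succeeding, variable) tuples.
    approved: set of tuples taken from the user's false dependency data.
    """
    # S183: keep only the detected false dependencies that were approved.
    approved_list = [dep for dep in false_dependency_list if dep in approved]

    # S184: copy the access information and rename the variable in the
    # succeeding block of each approved false dependency.
    recorrected = [dict(access) for access in access_info]
    for _preceding, succeeding, variable in approved_list:
        for access in recorrected:
            if access["block"] == succeeding and access["name"] == variable:
                access["name"] = f"{variable}_{succeeding}"
    return approved_list, recorrected


access_info = [
    {"block": "A", "name": "val", "type": "r"},
    {"block": "B", "name": "val", "type": "w"},
    {"block": "B", "name": "out", "type": "w"},
    {"block": "C", "name": "out", "type": "w"},
]
detected = [("A", "B", "val"), ("B", "C", "out")]
approved = {("A", "B", "val")}  # the user declined to touch "out"

approved_list, recorrected = recorrect_structure_data(
    access_info, detected, approved
)
renamed = [(a["block"], a["name"]) for a in recorrected]
```

Here only the write to val in block B is renamed; the dependency on out between blocks B and C stays in place, so a parallelizer working from the re-corrected data would still keep those two blocks in order.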

In step S185, the structural data re-correction unit 413 transfers the structural data 133 to the parallelization device 200.
The correction request unit 213 receives the structural data 133 from the false dependency correction device 400.

In step S186, the parallelization unit 214 performs parallelization based on the structure data 133 and generates the parallelization result 114.
Step S186 corresponds to step S105 of the first embodiment.

In step S187, the result output unit 215 outputs the parallelization result 114.
Step S187 corresponds to step S106 of the first embodiment.

*** Effect of Embodiment 2 ***
The second embodiment can obtain the parallelization result 114 when only the false dependencies indicated in the modification candidate data 125 of the first embodiment are resolved.
In Embodiment 1, the parallelization result 113 is obtained when all the detected false dependencies are resolved without the user making a decision as to whether or not the false dependencies can be resolved.
On the other hand, in the second embodiment, it is possible to obtain the parallelization result 114 in the case where the false dependency determined by the user to be resolvable is resolved.

*** Supplement to the embodiment ***
Based on FIG. 25, the hardware configuration of the parallelization device 200 will be described.
The parallelization device 200 includes a processing circuit 209.
The processing circuit 209 is hardware that implements the source code reception unit 211, the structure analysis unit 212, the correction request unit 213, the parallelization unit 214, and the result output unit 215.
The processing circuit 209 may be dedicated hardware, or may be the processor 201 that executes a program stored in the memory 202.

If the processing circuit 209 is dedicated hardware, the processing circuit 209 is, for example, a single circuit, multiple circuits, a programmed processor, a parallel programmed processor, an ASIC, an FPGA, or a combination thereof.
ASIC is an abbreviation for Application Specific Integrated Circuit.
FPGA is an abbreviation for Field Programmable Gate Array.

The parallelization device 200 may include a plurality of processing circuits that substitute for the processing circuit 209.

In the processing circuit 209, some functions may be implemented by dedicated hardware, and the remaining functions may be implemented by software or firmware.

In this way, the functions of the parallelization device 200 can be realized by hardware, software, firmware, or a combination thereof.

Based on FIG. 26, the hardware configuration of the modification candidate extraction device 300 will be described.
The modification candidate extraction device 300 includes a processing circuit 309.
The processing circuit 309 is hardware that implements the constraint reception unit 311, the definition acquisition unit 312, the false dependency detection unit 313, the structure data correction unit 314, the modification candidate extraction unit 315, and the modification candidate output unit 316.
The processing circuit 309 may be dedicated hardware, or may be the processor 301 that executes a program stored in the memory 302.

If the processing circuit 309 is dedicated hardware, the processing circuit 309 is, for example, a single circuit, multiple circuits, a programmed processor, a parallel programmed processor, an ASIC, an FPGA, or a combination thereof.

The modification candidate extraction device 300 may include a plurality of processing circuits that substitute for the processing circuit 309.

In the processing circuit 309, some functions may be implemented by dedicated hardware, and the remaining functions may be implemented by software or firmware.

In this way, the functions of the modification candidate extraction device 300 can be realized by hardware, software, firmware, or a combination thereof.

Based on FIG. 27, the hardware configuration of the false dependency correction device 400 will be described.
The false dependency correction device 400 includes a processing circuit 409.
The processing circuit 409 is hardware that implements the false dependency reception unit 411, the false dependency analysis unit 412, and the structural data re-correction unit 413.
The processing circuit 409 may be dedicated hardware, or may be the processor 401 that executes a program stored in the memory 402.

If the processing circuit 409 is dedicated hardware, the processing circuit 409 is, for example, a single circuit, multiple circuits, a programmed processor, a parallel programmed processor, an ASIC, an FPGA, or a combination thereof.

The false dependency correction device 400 may include multiple processing circuits that substitute for the processing circuit 409.

In the processing circuit 409, some functions may be implemented by dedicated hardware, and the remaining functions may be implemented by software or firmware.

In this way, the functions of the false dependency correction device 400 can be realized by hardware, software, firmware, or a combination thereof.

Each embodiment is an example of a preferred form and is not intended to limit the technical scope of the present disclosure. Each embodiment may be implemented partially or in combination with other embodiments. The procedures described using flowcharts and the like may be changed as appropriate.

The "unit" that is an element of the software design support system 100 may be read as "processing", "process", "circuit" or "circuitry".

100 software design support system, 101 network, 111 source code, 112 structure data, 112A order information, 112B access information, 113 parallelization result, 121 constraint data, 122 definition data, 123 false dependency list, 124 structure data, 124A order information, 124B access information, 125 modification candidate data, 200 parallelization device, 201 processor, 202 memory, 203 auxiliary storage device, 204 communication device, 205 input/output interface, 209 processing circuit, 211 source code reception unit, 212 structure analysis unit, 213 correction request unit, 214 parallelization unit, 215 result output unit, 290 storage unit, 300 modification candidate extraction device, 301 processor, 302 memory, 303 auxiliary storage device, 304 communication device, 305 input/output interface, 309 processing circuit, 311 constraint reception unit, 312 definition acquisition unit, 313 false dependency detection unit, 314 structure data correction unit, 315 modification candidate extraction unit, 316 modification candidate output unit, 390 storage unit, 400 false dependency correction device, 401 processor, 402 memory, 403 auxiliary storage device, 404 communication device, 405 input/output interface, 409 processing circuit, 411 false dependency reception unit, 412 false dependency analysis unit, 413 structural data re-correction unit.

Claims

1. A software design support system comprising:
a structural analysis unit that analyzes source code including a plurality of processing blocks to generate structural data of the source code;
a false dependency detection unit that detects, as false dependencies and based on the structural data, data dependencies that do not affect each other;
a structural data correction unit that, by correcting the structural data, generates as corrected structural data the structural data of the source code for the case where the detected false dependencies are eliminated;
a parallelization unit that parallelizes the plurality of processing blocks based on the corrected structural data; and
a result output unit that outputs a parallelization result obtained by the parallelization and indicating a schedule of the plurality of processing blocks.
2. The software design support system according to claim 1, further comprising a constraint reception unit that accepts constraint data specifying false dependencies that must not be resolved,
wherein the false dependency detection unit detects false dependencies other than the false dependencies indicated in the constraint data.
3. The software design support system according to claim 1 or 2, wherein the false dependency detection unit detects, as the false dependency, at least one of: a WAR dependency, in which data is read and the same data is then written; a WAW dependency, in which data is written and the same data is then written again; a scope variable dependency, meaning access to multiple variables that have different scopes and the same name; and an external declaration dependency, meaning access to multiple variables that have the same name and are declared in mutually different external files.
4. The software design support system according to any one of claims 1 to 3, further comprising:
a modification candidate extraction unit that extracts, from a list of the detected false dependencies and as modification candidates, false dependencies that require modification in order to execute the plurality of processing blocks according to the schedule indicated in the parallelization result; and
a modification candidate output unit that outputs modification candidate data indicating information about the false dependencies that are the modification candidates.
5. The software design support system according to claim 4, wherein the modification candidate extraction unit extracts the false dependency of a pair of a preceding block and a succeeding block as a modification candidate when the succeeding block starts before the preceding block ends.
6. The software design support system according to claim 4, further comprising a structural data re-correction unit that, by re-correcting the structural data, generates as re-corrected structural data the structural data of the source code for the case where the false dependencies that are the modification candidates are eliminated,
wherein the parallelization unit parallelizes the plurality of processing blocks based on the re-corrected structural data, and
the result output unit outputs a parallelization result obtained by the parallelization and indicating a schedule of the plurality of processing blocks.
7. A software design support method comprising:
analyzing source code including a plurality of processing blocks to generate structural data of the source code;
detecting, as false dependencies and based on the structural data, data dependencies that exist in the source code and do not affect each other;
generating, as corrected structural data and by correcting the structural data, the structural data of the source code for the case where the detected false dependencies are eliminated;
parallelizing the plurality of processing blocks based on the corrected structural data; and
outputting a parallelization result obtained by the parallelization and indicating a schedule of the plurality of processing blocks.
8. A software design support program that causes a computer to execute:
a structural analysis process of analyzing source code including a plurality of processing blocks to generate structural data of the source code;
a false dependency detection process of detecting, as false dependencies and based on the structural data, data dependencies that do not affect each other;
a structural data correction process of generating, as corrected structural data and by correcting the structural data, the structural data of the source code for the case where the detected false dependencies are eliminated;
a parallelization process of parallelizing the plurality of processing blocks based on the corrected structural data; and
a result output process of outputting a parallelization result obtained by the parallelization and indicating a schedule of the plurality of processing blocks.