WO2021171897A1 - Program parallelization method, program parallelization device, and electronic control device

Info

Publication number
WO2021171897A1
Authority
WO
WIPO (PCT)
Prior art keywords
program
unit
conversion
core processor
parallelization
Application number
PCT/JP2021/003111
Other languages
French (fr)
Japanese (ja)
Inventor
泰輔 植田
茂規 早瀬
一 芹沢
Original Assignee
日立Astemo株式会社

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code

Definitions

  • The present invention relates to a program parallelization method, a program parallelization device, and an electronic control device.
  • Patent Document 1 discloses a parallel compilation method for generating, from a sequential program written to be processed by a single-core processor, a parallel program parallelized so that it can be processed by a multi-core processor. The method comprises a classification procedure, an allocation procedure, and a generation procedure.
  • The classification procedure classifies the processing groups constituting the sequential program into sequential processing, which operates sequentially on a single core of the multi-core processor, and parallel processing, which operates in parallel on a plurality of cores of the multi-core processor.
  • The allocation procedure executes a non-uniform allocation process that non-uniformly allocates the processes classified as parallel processing to the plurality of cores.
  • The generation procedure generates the parallel program based on the classification result of the classification procedure and the allocation result of the allocation procedure.
  • With the technology described in Patent Document 1, there is room for improvement in the performance of the generated program.
  • The program parallelization method is executed by a computer and generates a multi-core processor program from a single-core processor program. It includes a pre-conversion step that converts the description of the single-core processor program for parallelization processing, a parallelization step that parallelizes the pre-converted single-core processor program, and an inverse conversion step that, in the parallelized single-core processor program, executes the inverse conversion of the pre-conversion on the regions that were not parallelized.
  • The program parallelizing device executes the above program parallelization method.
  • The electronic control device includes a storage unit that stores the multi-core processor program created by using the above program parallelization method, and a multi-core processor that executes the multi-core processor program stored in the storage unit.
  • According to the present invention, the performance of the program can be improved.
  • A "program" describes a process that can be interpreted and executed by a computer.
  • The use of a compiled programming language is assumed, but the method can also be applied to an interpreted programming language.
  • "Program" is used here in the same sense as "source code" that humans can read and write, but a "program" may also be "binary code", which is difficult for humans to understand directly and easy for computers to process.
  • FIG. 1 is an overall configuration diagram of an in-vehicle system 1 including a program parallelizing device 111 and an autonomous travel control device 2 that uses a program output by the program parallelizing device 111.
  • The configuration and operation of the program parallelizing device 111 will be described in detail later.
  • The in-vehicle system 1 is mounted on the vehicle 100. It includes a camera information acquisition unit 101 that acquires the external conditions of the vehicle 100 with a camera, a radar information acquisition unit 102 that acquires the external conditions of the vehicle 100 with a radar, a laser information acquisition unit 103 that acquires the external conditions of the vehicle 100 with a laser, and a vehicle position information acquisition unit 104 that detects the position of the vehicle 100 using a satellite navigation system receiver, for example, a GPS (Global Positioning System) receiver.
  • The in-vehicle system 1 further includes an automatic driving setting unit 105 for setting the automatic driving of the vehicle 100, and a wireless communication unit 106 for updating the information of the in-vehicle system 1 by OTA (Over-The-Air).
  • The wireless communication unit 106 is connected to the program parallelizing device 111 via, for example, a wireless network.
  • The program parallelizing device 111 executes the program parallelization process described later.
  • The in-vehicle system 1 further includes the autonomous travel control device 2, which is an electronic control device, an auxiliary control unit 107, a brake control unit 108, an engine control unit 109, and a power steering control unit 110.
  • Each of the autonomous travel control device 2, the auxiliary control unit 107, the brake control unit 108, the engine control unit 109, and the power steering control unit 110 is, for example, an ECU (Electronic Control Unit).
  • These units, including the brake control unit 108, the engine control unit 109, and the power steering control unit 110, are connected so that they can communicate with each other over an in-vehicle network such as CAN (Controller Area Network) or Ethernet (registered trademark).
  • The camera information acquisition unit 101, the radar information acquisition unit 102, the laser information acquisition unit 103, and the vehicle position information acquisition unit 104 each transmit the information received from their sensors or the like to the autonomous travel control device 2.
  • The automatic driving setting unit 105 transmits setting information for automatic driving, such as the destination, the route, and the traveling speed, to the autonomous travel control device 2.
  • The autonomous travel control device 2 performs processing for automatic driving control and, based on the processing result, outputs control commands to the brake control unit 108, the engine control unit 109, and the power steering control unit 110.
  • The auxiliary control unit 107 performs the same control as the autonomous travel control device 2 in an auxiliary capacity.
  • The brake control unit 108 controls the braking force of the vehicle 100.
  • The engine control unit 109 controls the driving force of the vehicle 100.
  • The power steering control unit 110 controls the steering of the vehicle 100.
  • When the autonomous travel control device 2 receives an automatic driving setting request from the automatic driving setting unit 105, it calculates the trajectory along which the vehicle 100 is to move based on the external information from the camera information acquisition unit 101, the radar information acquisition unit 102, the laser information acquisition unit 103, the vehicle position information acquisition unit 104, and the like. The autonomous travel control device 2 then outputs control commands for braking force, driving force, steering, and the like to the brake control unit 108, the engine control unit 109, and the power steering control unit 110 so that the vehicle 100 moves along the calculated trajectory.
  • The brake control unit 108, the engine control unit 109, and the power steering control unit 110 receive the control commands from the autonomous travel control device 2 and each output operation signals to their respective actuators (not shown).
  • FIG. 2 is a hardware configuration diagram of the program parallelizing device 111 and the autonomous travel control device 2. Since the two share a common hardware configuration, the hardware configuration of the program parallelizing device 111 is described here as a representative.
  • The program parallelizing device 111 includes a CPU 251, a ROM 252, a RAM 253, a flash memory 254, and a communication interface 256.
  • The flash memory 254 is a non-volatile storage area.
  • The CPU 251 realizes the functions described later by loading a program stored in at least one of the ROM 252 and the flash memory 254 into the RAM 253 and executing it.
  • The CPU 251, the ROM 252, the RAM 253, the flash memory 254, and the communication interface 256 that constitute the program parallelizing device 111 may each be configured as a plurality of devices. Conversely, the CPU 251, the ROM 252, the RAM 253, and the flash memory 254 may be integrated into a single device, such as an SoC (System on Chip).
  • The program stored in the flash memory 254 of the autonomous travel control device 2 may be a program received from the program parallelizing device 111.
  • The communication interface 256 of the autonomous travel control device 2 is an interface for communicating using a predetermined protocol such as CAN.
  • The communication interface 256 of the program parallelizing device 111 is, for example, a wireless communication module for communicating with the wireless communication unit 106 of the vehicle 100.
  • The autonomous travel control device 2 may be composed of one ECU or of a plurality of ECUs.
  • FIG. 3 is a functional configuration diagram of the autonomous travel control device 2.
  • The autonomous travel control device 2 has a first communication interface 211-1, a second communication interface 201-2, a multi-core processor 202 including a first core 203-1 to an Nth core 203-N (N is any natural number of two or more), and a storage unit 204.
  • The first communication interface 211-1 and the second communication interface 201-2 are collectively referred to as the "communication interface 201".
  • The first core 203-1 to the Nth core 203-N are collectively referred to as the "core 203".
  • The communication interface 201 is realized by the communication interface 256 of FIG. 2.
  • The multi-core processor 202 is realized by the CPU 251, and the cores 203 are the multiple cores included in the CPU 251.
  • The multi-core processor 202 may be realized by an SoC.
  • The storage unit 204 may be configured inside the CPU 251, or may be realized by the ROM 252, the RAM 253, or the flash memory 254.
  • The storage unit 204 may be regarded as a general term for the storage areas needed when the multi-core processor 202 loads a program stored in the ROM 252 into the RAM 253 or the like and executes it.
  • The autonomous travel control device 2 is connected via the first communication interface 211-1 to the camera information acquisition unit 101, the radar information acquisition unit 102, the laser information acquisition unit 103, the vehicle position information acquisition unit 104, the automatic driving setting unit 105, and the wireless communication unit 106 of FIG. 1, and via the second communication interface 201-2 to the auxiliary control unit 107, the brake control unit 108, the engine control unit 109, and the power steering control unit 110.
  • The autonomous travel control device 2 executes the processing for automatic driving control in the multi-core processor 202.
  • The multi-core processor 202 acquires, through the first communication interface 211-1, sensor information from the camera information acquisition unit 101, the radar information acquisition unit 102, the laser information acquisition unit 103, and the vehicle position information acquisition unit 104, and automatic driving setting information from the automatic driving setting unit 105. It uses the acquired information to execute surrounding-recognition processing and trajectory calculation processing and, based on the results, outputs control commands such as braking force and driving force from the second communication interface 201-2.
  • The program with which the autonomous travel control device 2 executes this processing is created in the program parallelizing device 111.
  • The autonomous travel control device 2 may acquire the program from the wireless communication unit 106 and store it in the storage unit 204.
  • FIG. 4 is a functional configuration diagram of the program parallelizing device 111.
  • The program parallelizing device 111 includes a determination unit 31, a pre-processing unit 32, a parallelization unit 33, a post-processing unit 34, an integration unit 35, a compiler 39, and a device storage unit 37.
  • The determination unit 31, the pre-processing unit 32, the parallelization unit 33, the post-processing unit 34, the integration unit 35, and the compiler 39 are realized by the CPU 251 executing a program stored in the ROM 252.
  • The device storage unit 37 stores the original program 51, the target program 52, the preprocessed program 53, the converted program 54, the inverse conversion addition program 55, the non-target program 56, and the integrated program 57.
  • The device storage unit 37 is a concept that includes the RAM 253 and the flash memory 254 shown in FIG. 2; it may be either of the two or a combination of both.
  • The programs indicated by reference numerals 51 to 57 are listed only for the sake of explanation, and they need not all exist in the device storage unit 37 at the same time.
  • The original program 51 is program code created in advance, for example by a programmer or by an automatic source code generation tool.
  • The original program 51 does not include the explicit parallelization commands described later. That is, since the original program 51 is not specialized for processing by a multi-core processor, it can also be called a "single-core processor program".
  • The target program 52 is the part of the original program 51 that the determination unit 31 determines to be a target of parallel processing.
  • The non-target program 56 is the part of the original program 51 that the determination unit 31 determines not to be a target of parallel processing. That is, the target program 52 and the non-target program 56 together form the original program 51.
  • The preprocessed program 53 is the target program 52 after pre-processing by the pre-processing unit 32.
  • The converted program 54 is the preprocessed program 53 after parallelization by the parallelization unit 33.
  • The inverse conversion addition program 55 is the program output by the post-processing unit 34. It is the converted program 54 in which the portions that the parallelization unit 33 did not parallelize have been inversely converted. Details will be described later.
  • The integrated program 57 is the program that combines the inverse conversion addition program 55 and the non-target program 56.
  • The non-target program 56 is obtained by removing the target program 52 from the original program 51, and the target program 52 is converted through the various processes into the inverse conversion addition program 55, which is suited to parallel processing. The integrated program 57 can therefore be regarded as the original program 51 optimized for parallel processing.
  • The determination unit 31 takes the original program 51 as its processing target and outputs the target program 52 and the non-target program 56.
  • The determination unit 31 reads the original program 51, outputs the portion determined to be a target of parallel processing as the target program 52, and outputs the portion determined not to be a target of parallel processing as the non-target program 56.
  • The determination unit 31 determines as the target of parallel processing, that is, as the target program 52, a part that satisfies both improvement possibility and processability.
  • Improvement possibility means that the processing speed is highly likely to be improved by parallelization.
  • Processability means that the parallelization unit 33 can perform the parallelization.
  • Processability covers both the case where the parallelization unit 33 can parallelize the code as it is, without any special processing, and the case where the parallelization unit 33 can process the code after the description conversion processing by the pre-processing unit 32. That is, processability is denied when the parallelization unit 33 cannot parallelize the code as it is and the description conversion processing by the pre-processing unit 32 is also impossible.
  • The determination unit 31 profiles the original program 51 and determines that functions whose processing speed is highly likely to be improved by parallelization, and whose description can be converted so that the parallelization unit 33 can process them, are targets of parallel processing. The determination unit 31 is realized by using, for example, a profiler.
  • The determination unit 31 may determine the targets of parallelization by taking into account the measured execution time of each function and the dependencies between functions.
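  • The two criteria above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the determination unit 31: the `FunctionProfile` record, the `MIN_TIME_SHARE` threshold, and the idea of using a share of execution time as the measure of improvement possibility are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FunctionProfile:
    """Hypothetical per-function profiling result."""
    name: str
    time_share: float           # fraction of total execution time
    parallelizable_as_is: bool  # the parallelization unit can process it unchanged
    convertible: bool           # description conversion would make it processable


# Hypothetical threshold: smaller functions are assumed not worth parallelizing.
MIN_TIME_SHARE = 0.10


def has_improvement_possibility(profile):
    return profile.time_share >= MIN_TIME_SHARE


def has_processability(profile):
    # Processable either as it is or after description conversion.
    return profile.parallelizable_as_is or profile.convertible


def split_targets(profiles):
    """Return (target names, non-target names), preserving input order."""
    targets, non_targets = [], []
    for profile in profiles:
        if has_improvement_possibility(profile) and has_processability(profile):
            targets.append(profile.name)
        else:
            non_targets.append(profile.name)
    return targets, non_targets
```

  • A function must pass both checks to become part of the target program; everything else stays in the non-target program, so the two lists together always cover the whole input.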
  • The pre-processing unit 32 takes the target program 52 as its processing target and outputs the preprocessed program 53.
  • The pre-processing unit 32 rewrites the target program 52 so that it satisfies the known processing restrictions of the parallelization unit 33. This rewriting is also called "description conversion processing".
  • The processing restrictions of the parallelization unit 33 are, for example, restrictions on variable types and restrictions on function calls.
  • For example, if the parallelization unit 33 has a variable-type restriction that the double-precision floating-point type "double" cannot be used, the pre-processing unit 32 rewrites it to the single-precision floating-point type "float". Likewise, if the parallelization unit 33 has a function-call restriction that it cannot handle recursive calls, the pre-processing unit 32 rewrites a recursive function into a non-recursive function.
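  • As a small illustration of the second kind of rewriting, the sketch below shows a recursive function and a non-recursive equivalent. It is written in Python rather than in the C of the original program, and the digit-sum function is a hypothetical example chosen only because both forms are short.

```python
def sum_digits_recursive(n):
    """Digit sum of a non-negative integer, written with a recursive call."""
    if n < 10:
        return n
    return n % 10 + sum_digits_recursive(n // 10)


def sum_digits_iterative(n):
    """Same result without recursion: the call stack is replaced by a loop."""
    total = 0
    while n >= 10:
        total += n % 10
        n //= 10
    return total + n
```

  • Both functions return the same value for every non-negative integer, which is the property a description conversion has to preserve.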
  • The pre-processing unit 32 records the contents of the pre-processing so that the post-processing unit 34 can perform the inverse conversion processing.
  • This record may be written as a comment with a specific format in the preprocessed program 53, embedded as a specific character string in the preprocessed program 53, or written to an intermediate processing file (not shown).
  • A record as a comment in the preprocessed program 53 is, for example, "// # preconv32 # double value1 >> float value1". The "# preconv32 #" at the beginning of this comment indicates that it was written by the pre-processing unit 32, and the rest indicates that "double value1" was rewritten to "float value1".
  • The method of embedding a specific character string in the preprocessed program 53 is, for example, to write "typedef float _preconv32_double_float" and "_preconv32_double_float value1" in the preprocessed program 53. The "_preconv32_" at the beginning of the new name defined by "typedef" indicates that the rewriting was performed by the pre-processing unit 32, and "double_float" indicates that the "double" type was changed to the "float" type.
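  • The comment-style record can be sketched as below. This is a simplified illustration, assuming that every "double" declaration has the plain form `double name`; the function names `preconvert_line` and `preconvert` and the regular expression are hypothetical, and only the record format is taken from the text above.

```python
import re

RECORD_PREFIX = "// # preconv32 #"
# Matches simple declarations such as "double value1" (a deliberate simplification;
# it would also match a function whose return type is double).
DOUBLE_DECL = re.compile(r"\bdouble\s+([A-Za-z_]\w*)")


def preconvert_line(line):
    """Rewrite 'double x' to 'float x'; return (record comments, rewritten line)."""
    records = [
        f"{RECORD_PREFIX} double {name} >> float {name}"
        for name in DOUBLE_DECL.findall(line)
    ]
    return records, DOUBLE_DECL.sub(r"float \1", line)


def preconvert(source):
    """Apply the rewrite to C source text, placing each record above its line."""
    out = []
    for line in source.splitlines():
        records, new_line = preconvert_line(line)
        indent = line[: len(line) - len(line.lstrip())]
        out.extend(indent + record for record in records)
        out.append(new_line)
    return "\n".join(out)
```

  • For the input line `double value1 = 0.5;` the sketch emits the record comment quoted above followed by `float value1 = 0.5;`, so a later stage can find both the change and its origin.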
  • The parallelization unit 33 takes the preprocessed program 53 as its processing target and outputs the converted program 54.
  • The parallelization unit 33 is a known parallelization tool that converts source code into source code. That is, the parallelization unit 33 is not a compiler but a program that rewrites source code.
  • The parallelization unit 33 rewrites the preprocessed program 53 to add explicit parallelization commands for the compiler 39.
  • For example, a specific parallel processing command is inserted after a specific character string such as "#pragma parallel".
  • The parallelization unit 33 does not delete at least the comments written by the pre-processing unit 32, but leaves them as they are in the converted program 54.
  • If the characteristics of the comments written by the pre-processing unit 32 are specified to the parallelization unit 33 in advance, it may automatically select and keep only those comments and delete all other comments. Alternatively, an operator may use an operation option to set the parallelization unit 33 to an operation mode in which comments are not deleted.
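  • The selective handling of comments can be sketched as follows. This is an assumption-laden illustration, not the behavior of any particular parallelization tool: it handles only full-line `//` comments, and the function name `drop_comments_except_records` is hypothetical.

```python
RECORD_PREFIX = "// # preconv32 #"


def drop_comments_except_records(source):
    """Remove full-line '//' comments but keep the pre-processing records."""
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith("//")
        is_record = stripped.startswith(RECORD_PREFIX)
        if is_comment and not is_record:
            continue  # ordinary comment: dropped
        kept.append(line)
    return "\n".join(kept)
```

  • Because the record prefix is fixed in advance, the filter needs no knowledge of what each record means; it only has to recognize the prefix.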
  • The post-processing unit 34 takes the converted program 54 as its processing target and outputs the inverse conversion addition program 55.
  • The post-processing unit 34 reads the converted program 54 and identifies the parts that the parallelization unit 33 did not rewrite, for example, functions that the parallelization unit 33 did not rewrite. For example, it identifies a function that is not immediately preceded by a specific character string such as "#pragma parallel". The post-processing unit 34 then rewrites the parts that the parallelization unit 33 did not rewrite back to their state before the pre-processing by the pre-processing unit 32. In other words, the pre-processing unit 32 converts the program into a format suitable for parallelization, and the post-processing unit 34 inversely converts it into the original format.
  • Because the pre-processing unit 32 records the contents of its processing as described above, the post-processing unit 34 performs the inverse conversion by referring to that record.
  • For example, if the converted program 54 contains the comment "// # preconv32 # double value1 >> float value1" and no specific character string such as "#pragma parallel" appears immediately before the function containing the declaration of "float value1", the post-processing unit 34 rewrites "float value1" to "double value1".
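  • The inverse conversion described above can be sketched as follows. This is a minimal illustration under several assumptions that are not in the original: a function starts on an unindented line ending in "{" and ends on an unindented "}", each record comment sits inside the function it describes and before the declaration, and the record comment is removed once it has been applied. The names `restore_function` and `inverse_convert` are hypothetical.

```python
import re

PRAGMA = "#pragma parallel"
# Parses records of the form "// # preconv32 # double value1 >> float value1".
RECORD = re.compile(r"//\s*#\s*preconv32\s*#\s*(\w+)\s+(\w+)\s*>>\s*(\w+)\s+(\w+)")


def restore_function(lines):
    """Undo every recorded conversion inside one non-parallelized function."""
    restored, records = [], []
    for line in lines:
        match = RECORD.search(line)
        if match:
            records.append(match.groups())
            continue  # assumption: the record is dropped once consumed
        for old_type, name, new_type, _ in records:
            pattern = rf"\b{re.escape(new_type)}\s+{re.escape(name)}\b"
            line = re.sub(pattern, f"{old_type} {name}", line)
        restored.append(line)
    return restored


def inverse_convert(source):
    """Restore functions that are not immediately preceded by the pragma."""
    out, function, parallelized, previous = [], None, False, ""
    for line in source.splitlines():
        if function is None:
            if line.rstrip().endswith("{") and not line[:1].isspace():
                function = [line]
                parallelized = previous.strip().startswith(PRAGMA)
            else:
                out.append(line)
            previous = line
        else:
            function.append(line)
            if line.startswith("}"):
                out.extend(function if parallelized else restore_function(function))
                function, previous = None, line
    if function is not None:
        out.extend(function)  # unterminated function: left untouched
    return "\n".join(out)
```

  • A function preceded by "#pragma parallel" passes through unchanged, while a function without it gets "float value1" rewritten back to "double value1".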
  • The integration unit 35 takes the inverse conversion addition program 55 and the non-target program 56 as its processing targets and outputs the integrated program 57. That is, the integration unit 35 creates the integrated program 57 by combining the contents of the inverse conversion addition program 55 with the contents of the non-target program 56.
  • FIG. 5 is a flowchart showing the processing of the program parallelizing device 111.
  • The program parallelizing device 111 executes the operations shown in this flowchart.
  • The determination unit 31 executes steps S401 to S404 described below, the pre-processing unit 32 executes step S405, the parallelization unit 33 executes step S406, the post-processing unit 34 executes step S407, the integration unit 35 executes step S408, and the compiler 39 executes step S409.
  • The method by which the program parallelizing device 111 creates the integrated program 57 from the original program 51 is referred to as the "program parallelization method".
  • In step S401, the program parallelizing device 111 profiles the original program 51. Specifically, it extracts the parts that can be sped up by changing from sequential processing to parallel processing.
  • In step S402, the program parallelizing device 111 evaluates the above-mentioned improvement possibility based on the result extracted in step S401, that is, whether the speed could be increased by parallel processing. Specifically, when the determination unit 31 determines that at least a part of the original program 51 is likely to be sped up by parallel processing, the process proceeds to step S403. When the determination unit 31 determines that no part of the original program 51 can be sped up by parallel processing, the process proceeds to step S409.
  • In step S403, the determination unit 31 of the program parallelizing device 111 evaluates the above-mentioned processability, in other words, whether description conversion can be performed on the program expected to be sped up in step S402.
  • The description conversion performed here is the conversion of the program that is needed to carry out program parallelization; it converts the description into one that the parallelization unit 33, described above, can analyze. This generally requires adjusting the types used in the program and the bit sizes processed. Specific examples include converting double precision to single precision and, depending on the data model adopted by the OS (Operating System), converting a type with a long bit size such as long long to a type with a shorter bit size such as int.
  • In step S403, when the determination unit 31 determines that the program expected to be sped up can be description-converted, the process proceeds to step S404. When the determination unit 31 determines that none of the programs expected to be sped up can be description-converted, that is, that the parallelization unit 33 cannot analyze them, the process proceeds to step S409.
  • In step S404, the determination unit 31 of the program parallelizing device 111 extracts the program that can be description-converted and is expected to be sped up by parallel processing. Specifically, the determination unit 31 saves the portion to be parallelized for the multi-core processor as the target program 52, and saves the portion to be used as it is for the single-core processor as the non-target program 56.
  • In step S405, the pre-processing unit 32 of the program parallelizing device 111 executes the description conversion of the extracted program. Specifically, it applies the description conversion to the target program 52 saved in step S404 and outputs the preprocessed program 53.
  • In step S406, the parallelization unit 33 of the program parallelizing device 111 parallelizes the preprocessed program 53 output in step S405 and outputs the converted program 54.
  • That is, a parallel program for a multi-core processor, namely the converted program 54, is created from a sequential program for a single-core processor, namely the preprocessed program 53.
  • In step S407, the post-processing unit 34 of the program parallelizing device 111 performs the description inverse conversion on the parts of the program that the parallelization unit 33 did not parallelize in step S406.
  • The description conversion executed by the pre-processing unit 32 in step S405 is inversely converted so that these parts return to the original program description that was input at the start of processing.
  • In step S408, the integration unit 35 of the program parallelizing device 111 generates the integrated program 57 by combining the inverse conversion addition program 55, which underwent the description inverse conversion in step S407, with the non-target program 56, which was excluded from description conversion in step S404.
  • In step S409, the compiler 39 compiles the integrated program 57 to generate the binary code 59, and the process shown in FIG. 5 ends. However, when the process reaches step S409 directly from step S402 or step S403, the compiler 39 compiles the original program 51 instead of the integrated program 57. This concludes the description of FIG. 5.
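  • The control flow of steps S401 to S409 can be sketched as follows. This is only an outline under stated assumptions: the program is modeled as a hypothetical dictionary from function name to source text, and every stage (`profile`, `can_convert`, `preprocess`, `parallelize`, `postprocess`, `compile_`) is a caller-supplied placeholder rather than an actual tool.

```python
def run_flow(original, profile, can_convert, preprocess,
             parallelize, postprocess, compile_):
    """Outline of steps S401 to S409 over a {function name: source} dictionary."""
    candidates = profile(original)                        # S401: profiling
    if not candidates:                                    # S402: nothing to speed up
        return compile_(original)                         # S409 on the original program
    convertible = [name for name in candidates
                   if can_convert(original[name])]        # S403: processability
    if not convertible:
        return compile_(original)                         # S409 on the original program
    target = {name: original[name] for name in convertible}    # S404
    non_target = {name: src for name, src in original.items()
                  if name not in target}
    preprocessed = preprocess(target)                     # S405: description conversion
    converted = parallelize(preprocessed)                 # S406: parallelization
    restored = postprocess(converted)                     # S407: inverse conversion
    integrated = {**restored, **non_target}               # S408: integration
    return compile_(integrated)                           # S409: compilation
```

  • The two early returns correspond to the branches from steps S402 and S403 directly to step S409, where the original program is compiled unchanged.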
  • FIG. 6 is a diagram showing the relationship between the programs and each process in the program parallelizing device 111 up to the creation of the integrated program 57. It is described together with the flowchart shown in FIG. 5.
  • The original program 51 is the sequential program for a single-core processor that is input to the program parallelizing device 111.
  • The notation "C" indicates that the program is written in, for example, the C language.
  • The original program 51 is divided into the target program 52 and the non-target program 56 in the parallel extraction step S501.
  • The parallel extraction step S501 corresponds to steps S401 to S404 shown in FIG. 5.
  • The target program 52 is the program extracted in step S404 as the portion to be parallelized for the multi-core processor.
  • The non-target program 56 is the program not extracted in step S404, that is, the program obtained by removing the target program 52 from the original program 51. If it is determined in step S402 or step S403 that the process proceeds to step S409, the target program 52 may be regarded as an empty set in the parallel extraction step S501.
  • The target program 52 is converted into the preprocessed program 53 in the description conversion step S502.
  • The description conversion step S502 corresponds to step S405 shown in FIG. 5.
  • The preprocessed program 53 is converted into the converted program 54 in the parallel processing step S503.
  • The parallel processing step S503 corresponds to step S406 shown in FIG. 5.
  • The converted program 54 is converted into the inverse conversion addition program 55 in the description inverse conversion step S504.
  • The description inverse conversion step S504 corresponds to step S407 shown in FIG. 5.
  • The inverse conversion addition program 55 and the non-target program 56 are combined into the integrated program 57 in the joining step S505.
  • The joining step S505 corresponds to step S408 shown in FIG. 5.
  • FIG. 7 is a diagram showing an outline of changes in the program in the program parallelizing device 111.
  • specific source code names are described inside each program, and the horizontal length shown shows the processing time when it is assumed that the source code is compiled and executed as it is.
  • processing time when it is assumed that the source code X is compiled and executed as it is is abbreviated as "processing time of the source code X”.
  • FIG. 7 an example will be described in which a multi-core processor 202 having N of 3 and having three cores 203 is used.
  • the original program 51 shown in FIG. 7 is a sequential program that sequentially executes the processes described in the source codes A to C.
  • the original program 51 can be said to be a sequential program in which the processes A to C are sequentially executed.
  • Source code A and source code B were determined to be both capable of being sped up and capable of description conversion, and became the target program 52; the remaining source code C became the non-target program 56. Since the source codes A and B included in the target program 52 are unchanged from the original program 51, their processing time, shown as the horizontal length in the drawing, does not change.
  • The pre-processed program 53 is configured, by the description conversion step S502 shown in FIG. 6, as a program that sequentially executes the processes described in the source codes A1 and B1.
  • The source code A1 is obtained by applying description conversion to the source code A so that the parallelization unit 33 can process it.
  • Likewise, the source code B1 is obtained by applying description conversion to the source code B. In FIG. 7, hatching indicates source code to which description conversion has been applied, such as the source codes A1 and B1.
  • the pre-processed program 53 has a longer processing time than the target program 52.
  • The reason is as follows. When description conversion adjusts the types used in the program and the bit sizes to be processed to the range that the parallelization unit 33 can analyze, measures must be taken so that the same processing can still be executed under those restricted conditions. A binary obtained by compiling the description-converted source code as it is is therefore generally expected to have disadvantages such as a longer processing time.
  • In the converted program 54, the process described in the source code A1 is divided into three by the parallelization processing step S503 shown in FIG. 6 and executed in parallel as A2-1, A2-2, and A2-3, after which the sequential process described in the source code B2 is executed. That is, the example in FIG. 7 shows that the parallelization unit 33 wrote a parallelized description for the source code A1 but not for the source code B1. Because the processing of the source code A1 is split three ways into processes A2-1 to A2-3 in the converted program 54, the execution time, indicated by the horizontal length in the figure, is shorter not only than that of the source code A1 but also than that of the source code A. Even with the overhead of description conversion and parallelization, the benefit of parallelization is large enough to shorten the processing time.
  • The source code B1, for which the parallelization unit 33 wrote no parallelized description, is also rewritten by the parallelization unit 33 into the source code B2. This means that the parallelization unit 33 rewrote the description so that it is easier to compile.
  • the processing time of the source code B1 and the processing time of the source code B2 are substantially the same.
  • The inverse conversion addition program 55 is configured, by the description inverse conversion step S504 shown in FIG. 6, as a program that sequentially executes the three-way parallel program (processes A2-1 to A2-3) and the sequential program of process B.
  • The processing time of the inverse conversion addition program 55 is shorter than that of the converted program 54.
  • the processing time can be shortened by corresponding to.
  • The integrated program 57 is obtained by joining the inverse conversion addition program 55 and the non-target program 56 in the joining step S505 shown in FIG. 6.
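The relationships between the bar lengths in FIG. 7 can be checked with a small arithmetic sketch. Every number below is a hypothetical value chosen only to reproduce the orderings stated in the text; none of them comes from the patent. The overhead term and the assumption that the inverse conversion brings process B back to its original processing time are also illustrative assumptions.

```python
# Hypothetical processing times in arbitrary units (not taken from the patent).
A, B, C = 9.0, 4.0, 3.0      # source codes A, B, C of the original program 51
A1, B1 = 12.0, 5.0           # after description conversion: slower than A and B
N_CORES = 3                  # three cores 203, as in the FIG. 7 example
PARALLEL_OVERHEAD = 1.0      # assumed synchronization cost between cores

A2 = A1 / N_CORES + PARALLEL_OVERHEAD  # A2-1 to A2-3 run side by side
B2 = B1                                # B2 takes about as long as B1
B_RESTORED = B                         # assumed: inverse conversion restores B's time

times = {
    "target program 52": A + B,
    "pre-processed program 53": A1 + B1,
    "converted program 54": A2 + B2,
    "inverse conversion addition program 55": A2 + B_RESTORED,
}
# Joining appends the untouched non-target program (source code C).
times["integrated program 57"] = times["inverse conversion addition program 55"] + C

for name, t in times.items():
    print(f"{name}: {t}")
```

With these values the pre-processed program is slower than the target program, while the converted program and the inverse conversion addition program are progressively faster, which matches the ordering described for FIG. 7.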
  • FIG. 8 is a sequence diagram showing information transmission from the program parallelizing device 111 to the autonomous travel control device 2.
  • When the autonomous travel control device 2 detects an abnormality in the multi-core processor 202, it notifies the program parallelizing device 111, which is installed on the cloud or the like, and receives new program information over the wireless network by OTA.
  • When the autonomous travel control device 2 detects a failure of the multi-core processor 202 (S701), it transfers the detection information to the wireless communication unit 106 of the in-vehicle system 1 (S702).
  • the wireless communication unit 106 transfers the received detection information to the program parallelizing device 111 via the wireless network (S703).
  • The program parallelizing device 111 that has received the detection information reconfigures the program for the multi-core processor 202 (S704). Specifically, for example, based on the detection information, it performs the program parallelization process for the number of cores 203 that remain usable and are unaffected by the failure.
  • the program parallelizing device 111 transfers the information of the reconstructed program to the wireless communication unit 106 (S705).
  • the wireless communication unit 106 transfers the received program information to the autonomous travel control device 2 (S706).
  • the autonomous travel control device 2 may operate with a new program (S707) according to the update timing and the method in the in-vehicle system 1 to complete the process.
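The S701 to S707 exchange can be sketched as a single function that records each hop. This is a minimal illustration under stated assumptions: the function name `ota_update`, the shape of the detection information, and the `parallel_width` field are all hypothetical, and the real devices communicate over an in-vehicle network and a wireless network rather than through function calls.

```python
def ota_update(total_cores, failed_cores, log):
    """Walk through steps S701 to S707 and return the reconfigured program info."""
    # S701: the autonomous travel control device 2 detects a failure.
    detection = {"failed": sorted(failed_cores)}  # hypothetical message format
    log.append("S701 failure detected in multi-core processor 202")
    # S702, S703: detection information travels to the parallelizing device 111.
    log.append("S702 detection info -> wireless communication unit 106")
    log.append("S703 detection info -> program parallelizing device 111")
    # S704: re-parallelize for the cores that remain usable.
    usable = total_cores - len(detection["failed"])
    if usable < 1:
        raise ValueError("no usable core remains")
    program = {"parallel_width": usable}  # hypothetical program description
    log.append(f"S704 program reconfigured for {usable} cores")
    # S705, S706: the new program information travels back to the vehicle.
    log.append("S705 program info -> wireless communication unit 106")
    log.append("S706 program info -> autonomous travel control device 2")
    # S707: the control device switches to the new program.
    log.append("S707 operating with new program")
    return program


log = []
program = ota_update(total_cores=3, failed_cores={2}, log=log)
print(program)
print("\n".join(log))
```

In this sketch a three-core processor with one failed core is given a program parallelized two ways, which is one plausible reading of reconfiguring "for the number of usable cores".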
  • the processing time in program parallelization can be shortened by combining the description inverse conversion. Therefore, it is possible to improve the performance of the multi-core processor operated by the parallel program created through the program parallelization. Further, according to this embodiment, the program can be reconfigured according to the state of the multi-core processor that executes the program.
  • the program parallelizing device 111 executes a program parallelizing method for generating an integrated program 57, which is a program for a multi-core processor, from an original program 51, which is a program for a single-core processor.
  • The program parallelization method includes: a pre-conversion step (step S405 in FIG. 5) of executing pre-conversion for program parallelization processing on the single-core processor program; a parallelization processing step (step S406 in FIG. 5) in which the parallelization unit 33 performs parallelization processing on the pre-processed program 53, which is the pre-converted single-core processor program; and an inverse conversion step (step S407 in FIG. 5) of executing the inverse conversion of the pre-conversion on a region of the parallelization-processed single-core processor program that was not parallelized, for example the source code B2 in the example shown in FIG. 7. Therefore, the disadvantages of the description conversion can be eliminated by inversely converting the description of the parts of the program that were not parallelized by the parallelization unit 33.
  • The disadvantages here are, for example, that calculation accuracy is reduced because the number of bits of a variable is reduced so that the parallelization unit 33 can process it, and that calculation time is extended because the code is rewritten to avoid recursive expressions. That is, the program parallelizing device 111 can improve the performance of the output program by having the post-processing unit 34 perform the inverse conversion.
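The accuracy disadvantage mentioned above can be observed directly. The following sketch, which is an illustration rather than anything taken from the patent, rounds a double precision value to single precision with the standard `struct` module and measures the error that narrowing introduces.

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (double precision) to single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 0.1                      # not exactly representable in binary
narrowed = to_single(value)      # value after the bit size is reduced
error = abs(narrowed - value)    # accuracy lost by the narrowing

print(f"double: {value!r}")
print(f"single: {narrowed!r}")
print(f"error:  {error:.3e}")
```

Values such as 0.5 that are exactly representable in both formats survive the round trip unchanged, so the loss depends on the data being processed.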
  • The program parallelization method includes an extraction step (steps S402 and S403 in FIG. 5) of extracting, from the single-core processor program, the target program 52, which is a first program area that is expected to be sped up by the program parallelization processing and on which the pre-conversion for the program parallelization processing can be performed.
  • In the pre-conversion step, the pre-conversion is executed on the target program 52, which is the first program area. Therefore, the processing load in steps S404 to S407 can be reduced by limiting the target of the pre-conversion to a part of the original program 51 rather than the whole.
  • The program parallelization method includes a joining step (step S408 in FIG. 5) of joining the non-target program 56, which is a second program area other than the target program 52, with the inverse conversion addition program 55, which is the first program area on which the pre-conversion, the parallelization processing, and the inverse conversion have been executed, to obtain the integrated program 57. Therefore, by separating out in advance the programs unsuited to parallelization processing and joining them after the parallelization processing, the entire original program 51 can be processed collectively by the compiler 39.
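The overall flow of extraction, pre-conversion, parallelization, inverse conversion, and joining can be sketched with a toy data model. The `Region` class, its flags, and the helper functions are hypothetical stand-ins: the real units operate on source code, and this sketch only tracks which steps have been applied to which region. Processing the regions in place and keeping their original order is a simplification of separating the target and non-target programs and joining them afterwards.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Region:
    name: str
    is_target: bool              # outcome of the extraction step (S402, S403)
    parallelizable: bool         # whether the stand-in parallelizer can split it
    pre_converted: bool = False
    parallelized: bool = False


def pre_convert(region):
    # Pre-conversion step (S405): mark the description as converted.
    return replace(region, pre_converted=True)


def parallelize(region):
    # Parallelization processing step (S406): only some regions get split.
    return replace(region, parallelized=region.parallelizable)


def inverse_convert(region):
    # Inverse conversion step (S407): undo pre-conversion where no split happened.
    if region.parallelized:
        return region
    return replace(region, pre_converted=False)


def parallelize_program(original):
    integrated = []
    for region in original:
        if region.is_target:
            region = inverse_convert(parallelize(pre_convert(region)))
        integrated.append(region)  # joining step (S408)
    return integrated


original = [
    Region("A", is_target=True, parallelizable=True),
    Region("B", is_target=True, parallelizable=False),
    Region("C", is_target=False, parallelizable=False),
]
for region in parallelize_program(original):
    print(region)
```

Region A ends up pre-converted and parallelized, region B is returned to its unconverted description, and region C passes through untouched, mirroring source codes A, B, and C in the FIG. 7 example.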
  • the pre-conversion process includes a process of changing the description of a large bit size type to a small bit size type, for example, a process of changing a double precision type to a single precision type.
  • the inverse transformation process includes a process of changing the description of a small bit size type to a large bit size type, for example, a process of changing a single precision type to a double precision type. Therefore, the parallelization process can be executed by using the parallelization unit 33 in which the bit size of the variable that can be processed is limited.
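As an illustration of this pair of conversions, the sketch below rewrites type keywords in a C-like source string. It is a deliberately naive assumption about how such a rewrite could look: the function names are hypothetical, and a real tool would need to record which declarations it changed, since this inverse would also widen variables that were single precision to begin with.

```python
import re


def pre_convert(source: str) -> str:
    """Narrow double precision declarations to single precision."""
    return re.sub(r"\bdouble\b", "float", source)


def inverse_convert(source: str) -> str:
    """Widen single precision declarations back to double precision."""
    return re.sub(r"\bfloat\b", "double", source)


original = "double speed = 1.0; double_buffer_t buf;"
narrowed = pre_convert(original)
restored = inverse_convert(narrowed)

print(narrowed)
print(restored)
```

The word-boundary anchors keep identifiers such as `double_buffer_t` intact, so only the standalone type keyword is rewritten.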
  • the program parallelizing device 111 executes the above-mentioned program parallelizing method. Therefore, the program parallelizing device 111 can output the integrated program 57, which is a program in which the disadvantages due to the description conversion of the non-parallelized portion are eliminated.
  • the autonomous travel control device 2 includes a multi-core processor 202 and a storage unit 204.
  • The storage unit 204 holds a binary code 59 obtained by compiling the integrated program 57, which is a program for a multi-core processor created by the above-mentioned program parallelization method. Therefore, the autonomous travel control device 2 can execute the binary code 59, in which some parts are parallelized by the parallelization unit 33 and, for the parts that were not parallelized, the disadvantages of description conversion have been eliminated by the post-processing unit 34.
  • In the embodiment described above, the program parallelizing device 111 performs the description inverse conversion only on the source code B2, which was not parallelized by the parallelization unit 33 in FIG. 7.
  • However, the post-processing unit 34 may also perform the description inverse conversion on the source code A2 parallelized by the parallelization unit 33, as long as synchronization between the cores can be ensured without breaking the dependencies in the parallel processing order.
  • In the embodiment described above, the program parallelizing device 111 includes the compiler 39.
  • However, the program parallelizing device 111 need not include the compiler 39 and may instead use a compiler included in another device connected via a network or the like.
  • The program parallelizing device 111 need not include the discriminating unit 31, in which case the pre-processing unit 32 may process the entire original program 51. In this case, since the non-target program 56 does not exist, the program parallelizing device 111 also need not include the integration unit 35.
  • the program created by the program parallelizing device 111 may be stored in the ROM 252 of the autonomous travel control device 2 in advance.
  • the configuration of the functional block is only an example.
  • Several functional configurations shown as separate functional blocks may be integrally configured, or the configuration represented by one functional block diagram may be divided into two or more functions. Further, a part of the functions possessed by each functional block may be provided in the other functional blocks.
  • control lines and information lines indicate those that are considered necessary for explanation, and do not necessarily indicate all the control lines and information lines necessary for implementation. In practice, it can be considered that almost all configurations are interconnected.


Abstract

This program parallelization method for generating a multi-core processor program from a single-core processor program includes: a pre-conversion step of executing pre-conversion for program parallelization processing of the single-core processor program; a parallelization processing step of subjecting the pre-converted single-core processor program to parallelization processing; and a reverse conversion step of executing reverse conversion of the pre-conversion in a region of the single-core processor program that was subjected to the parallelization processing but was not parallelized.

Description

Program parallelization method, program parallelization device, and electronic control device
The present invention relates to a program parallelization method, a program parallelization device, and an electronic control device.
Technology development aimed at automated driving of vehicles is underway. Automated driving must recognize the surroundings and control the vehicle in place of the driver, which requires an enormous amount of information processing. To cope with this growing processing load, the use of multi-core processors is being studied, and appropriate processing is required that takes into account the implementation complexity that comes with an increasing number of cores. In this context, expectations are high for program parallelization, which automatically creates a parallel program for a multi-core processor from a sequential program for a single-core processor, and the use of automatic parallelization tools for this purpose is also being studied.
Patent Document 1 discloses a parallelizing compilation method for generating, from a sequential program written to be processable by a single-core processor, a parallel program parallelized to be processable by a multi-core processor, the method comprising: a classification procedure of classifying the group of processes constituting the sequential program into sequential processes that operate sequentially on a single core of the multi-core processor and parallel processes that operate in parallel on a plurality of cores of the multi-core processor; an allocation procedure of executing a non-uniform allocation process that allocates the processes classified as parallel processes by the classification procedure non-uniformly to the plurality of cores; and a generation procedure of generating the parallel program based on the classification result of the classification procedure and the allocation result of the allocation procedure.
Japanese Patent Application Laid-Open No. 2016-218503
With the technology described in Patent Document 1, there is room for improvement in the performance of the generated program.
The program parallelization method according to the first aspect of the present invention is a program parallelization method, executed by a computer, for generating a multi-core processor program from a single-core processor program, and includes: a pre-conversion step of executing pre-conversion for program parallelization processing on the single-core processor program; a parallelization processing step of performing parallelization processing on the pre-converted single-core processor program; and an inverse conversion step of executing the inverse conversion of the pre-conversion on a region of the parallelization-processed single-core processor program that was not parallelized.
The program parallelizing device according to the second aspect of the present invention executes the above-described program parallelization method.
The electronic control device according to the third aspect of the present invention includes a storage unit that stores the multi-core processor program created using the above-described program parallelization method, and a multi-core processor that executes the multi-core processor program stored in the storage unit.
According to the present invention, the performance of a program can be improved.
FIG. 1 is an overall configuration diagram of the program parallelizing device and the in-vehicle system.
FIG. 2 is a hardware configuration diagram of the program parallelizing device and the autonomous travel control device.
FIG. 3 is a functional configuration diagram of the autonomous travel control device.
FIG. 4 is a functional configuration diagram of the program parallelizing device.
FIG. 5 is a flowchart showing the processing of the program parallelizing device.
FIG. 6 is a diagram showing the relationship between each process and the programs.
FIG. 7 is a diagram showing an outline of changes in the program.
FIG. 8 is a sequence diagram showing information transmission from the program parallelizing device to the autonomous travel control device.
-First Embodiment-
Hereinafter, a first embodiment of the program parallelizing device will be described with reference to FIGS. 1 to 8. In the present embodiment, a "program" is a description, interpretable by a computer, of processing to be executed by the computer. The present embodiment assumes a compiled programming language, but it is also applicable to interpreted programming languages. In the present embodiment, "program" is used in the same sense as "source code", which humans can read and write, but a "program" may also be "binary code", which is difficult for humans to understand directly and easy for computers to understand.
<System configuration>
FIG. 1 is an overall configuration diagram of the program parallelizing device 111 and of an in-vehicle system 1 that includes an autonomous travel control device 2 using the program output by the program parallelizing device 111. The configuration and operation of the program parallelizing device 111 will be described in detail later.
The in-vehicle system 1 is mounted on a vehicle 100 and includes a camera information acquisition unit 101 that acquires the external conditions of the vehicle 100 with a camera, a radar information acquisition unit 102 that acquires the external conditions of the vehicle 100 with a radar, a laser information acquisition unit 103 that acquires the external conditions of the vehicle 100 with a laser, and an own vehicle position information acquisition unit 104 that detects the position of the vehicle 100 using a receiver of a satellite navigation system such as GPS (Global Positioning System). The in-vehicle system 1 further includes an automatic driving setting unit 105 for setting automatic driving of the vehicle 100, and a wireless communication unit 106 for updating information in the in-vehicle system 1 by OTA (Over-The-Air). The wireless communication unit 106 is connected to the program parallelizing device 111 via, for example, a wireless network. The program parallelizing device 111 executes the program parallelization process described later.
The in-vehicle system 1 further includes the autonomous travel control device 2, which is an electronic control device, an auxiliary control unit 107, a brake control unit 108, an engine control unit 109, and a power steering control unit 110. Each of the autonomous travel control device 2, the auxiliary control unit 107, the brake control unit 108, the engine control unit 109, and the power steering control unit 110 is, for example, an ECU (Electronic Control Unit).
The camera information acquisition unit 101, radar information acquisition unit 102, laser information acquisition unit 103, own vehicle position information acquisition unit 104, automatic driving setting unit 105, wireless communication unit 106, autonomous travel control device 2, auxiliary control unit 107, brake control unit 108, engine control unit 109, and power steering control unit 110 are connected so that they can communicate with one another over an in-vehicle network such as CAN (Controller Area Network) or Ethernet (registered trademark).
The camera information acquisition unit 101, radar information acquisition unit 102, laser information acquisition unit 103, and own vehicle position information acquisition unit 104 each transmit the information they receive from sensors and the like to the autonomous travel control device 2. The automatic driving setting unit 105 transmits setting information for automatic driving, such as the destination, route, and traveling speed, to the autonomous travel control device 2.
The autonomous travel control device 2 performs processing for automatic driving control and, based on the processing result, outputs control commands to the brake control unit 108, the engine control unit 109, and the power steering control unit 110. The auxiliary control unit 107 performs the same control as the autonomous travel control device 2 as a backup. The brake control unit 108 controls the braking force of the vehicle 100. The engine control unit 109 controls the driving force of the vehicle 100. The power steering control unit 110 controls the steering of the vehicle 100.
When the autonomous travel control device 2 receives an automatic driving setting request from the automatic driving setting unit 105, it calculates the trajectory along which the vehicle 100 is to move, based on information about the outside world from the camera information acquisition unit 101, radar information acquisition unit 102, laser information acquisition unit 103, own vehicle position information acquisition unit 104, and the like. The autonomous travel control device 2 then outputs control commands for braking force, driving force, steering, and so on to the brake control unit 108, the engine control unit 109, and the power steering control unit 110 so that the vehicle 100 moves along the calculated trajectory. On receiving the control commands from the autonomous travel control device 2, the brake control unit 108, the engine control unit 109, and the power steering control unit 110 each output operation signals to the actuators they control (not shown).
<Hardware configuration>
FIG. 2 is a hardware configuration diagram of the program parallelizing device 111 and the autonomous travel control device 2. Since the two share the same hardware configuration, the hardware configuration of the program parallelizing device 111 is described here as a representative. The program parallelizing device 111 includes a CPU 251, a ROM 252, a RAM 253, a flash memory 254, and a communication interface 256. The flash memory 254 is a non-volatile storage area. The CPU 251 realizes the functions described later by loading a program stored in at least one of the ROM 252 and the flash memory 254 into the RAM 253 and executing it.
The CPU 251, ROM 252, RAM 253, flash memory 254, and communication interface 256 that constitute the program parallelizing device 111 may be configured as a plurality of devices. Alternatively, the CPU 251, ROM 252, RAM 253, and flash memory 254 may be configured as a single device combining a plurality of hardware components, as in an SoC (System on Chip).
The program stored in the flash memory 254 of the autonomous travel control device 2 may be a program received from the program parallelizing device 111. The communication interface 256 of the autonomous travel control device 2 is an interface that communicates using a predetermined protocol such as CAN. The communication interface 256 of the program parallelizing device 111 is, for example, a wireless communication module for communicating with the wireless communication unit 106 of the vehicle 100. The autonomous travel control device 2 may consist of a single ECU (Electronic Control Unit) or of a plurality of ECUs.
<Functional configuration of the autonomous travel control device>
FIG. 3 is a functional configuration diagram of the autonomous travel control device 2. The autonomous travel control device 2 includes a first communication interface 201-1, a second communication interface 201-2, a multi-core processor 202 having a first core 203-1 to an N-th core 203-N (N is an arbitrary natural number of 2 or more), and a storage unit 204. Hereinafter, the first communication interface 201-1 and the second communication interface 201-2 are collectively referred to as the "communication interface 201", and the first core 203-1 to the N-th core 203-N are collectively referred to as the "core 203".
The communication interface 201 is realized by the communication interface 256 of FIG. 2. The multi-core processor 202 is realized by the CPU 251, and the core 203 consists of the multiple cores of the CPU 251. The multi-core processor 202 may be realized by an SoC. The storage unit 204 may be configured inside the CPU 251, or may be realized by the ROM 252, the RAM 253, or the flash memory 254. The storage unit 204 may also be regarded as a general term for the storage areas that are needed, for example, when the multi-core processor 202 loads a program stored in the ROM 252 into the RAM 253 or the like and executes it.
The autonomous travel control device 2 is connected, via the first communication interface 201-1, to the camera information acquisition unit 101, radar information acquisition unit 102, laser information acquisition unit 103, own vehicle position information acquisition unit 104, automatic driving setting unit 105, and wireless communication unit 106 of FIG. 1, and, via the second communication interface 201-2, to the auxiliary control unit 107, brake control unit 108, engine control unit 109, and power steering control unit 110.
The autonomous travel control device 2 executes the processing for automatic driving control on the multi-core processor 202. The multi-core processor 202 acquires, through the first communication interface 201-1, sensor information from the camera information acquisition unit 101, radar information acquisition unit 102, laser information acquisition unit 103, and own vehicle position information acquisition unit 104, as well as automatic driving setting information from the automatic driving setting unit 105. Using the acquired information, it executes recognition of the surroundings and trajectory calculation, and, based on the calculation results, outputs control commands such as braking force and driving force from the second communication interface 201-2.
The program with which the autonomous travel control device 2 executes its processing is created in the program parallelizing device 111. The autonomous travel control device 2 may acquire program information from the wireless communication unit 106 and store it in the storage unit 204.
 <プログラム並列化装置の機能構成>
 図4は、プログラム並列化装置111の機能構成図である。プログラム並列化装置111は、判別部31と、事前処理部32と、並列化部33と、事後処理部34と、統合部35と、コンパイラ39と、装置記憶部37とを備える。判別部31、事前処理部32、並列化部33、事後処理部34、統合部35、およびコンパイラ39は、CPU251がROM252に格納されるプログラムを実行することにより実現される。
<Functional configuration of program parallelizer>
FIG. 4 is a functional configuration diagram of the program parallelizing device 111. The program parallelization device 111 includes a determination unit 31, a pre-processing unit 32, a parallelization unit 33, a post-processing unit 34, an integration unit 35, a compiler 39, and a device storage unit 37. The determination unit 31, the pre-processing unit 32, the parallelization unit 33, the post-processing unit 34, the integration unit 35, and the compiler 39 are realized by the CPU 251 executing a program stored in the ROM 252.
The device storage unit 37 stores the original program 51, the target program 52, the preprocessed program 53, the converted program 54, the inverse conversion addition program 55, the non-target program 56, and the integrated program 57. The device storage unit 37 is a concept that covers the RAM 253 and the flash memory 254 shown in FIG. 2; it may be realized by either one or by the two in combination. The programs denoted by reference numerals 51 to 57 are listed together only for the sake of explanation, and they need not all exist in the device storage unit 37 at the same time.
The original program 51 is program code created in advance, for example by a programmer or by an automatic source code generation tool. The original program 51 contains none of the explicit parallelization directives described later. Because the original program 51 is not specialized for processing on a multi-core processor, it can also be called a "single-core processor program".
The target program 52 is the part of the original program 51 that the determination unit 31 has determined to be a target of parallel processing. The non-target program 56 is the part of the original program 51 that the determination unit 31 has determined not to be a target of parallel processing. In other words, the target program 52 and the non-target program 56 together make up the original program 51.
The preprocessed program 53 is the target program 52 after pre-processing by the pre-processing unit 32. The converted program 54 is the preprocessed program 53 after parallelization processing by the parallelization unit 33. The inverse conversion addition program 55 is the program output by the post-processing unit 34; it is the converted program 54 in which the portions that the parallelization unit 33 did not parallelize have been inversely converted. Details are described later.
The integrated program 57 is the combination of the inverse conversion addition program 55 and the non-target program 56. The non-target program 56 is the original program 51 with the target program 52 removed, and the target program 52 has been converted through the various processes into the inverse conversion addition program 55, which is suited to parallel processing. The integrated program 57 can therefore be regarded as the original program 51 optimized for parallel processing.
The determination unit 31 takes the original program 51 as its input and outputs the target program 52 and the non-target program 56. It reads the original program 51, outputs the portions it determines to be targets of parallel processing as the target program 52, and outputs the portions it determines not to be targets as the non-target program 56. The determination unit 31 determines a portion to be a target of parallel processing when that portion satisfies both improvement possibility and processability. Improvement possibility means that parallelization is likely to increase processing speed. Processability means that the parallelization unit 33 is able to perform the parallelization processing.
Cases in which the parallelization unit 33 can perform the parallelization processing include both the case where the parallelization unit 33 can process the code as it is, without any special treatment, and the case where it becomes able to process the code after the pre-processing unit 32 applies description conversion processing. Processability is therefore denied only when the parallelization unit 33 cannot parallelize the code as it is and the pre-processing unit 32 cannot apply description conversion processing either.
The determination unit 31 profiles the original program 51 and determines as targets of parallel processing those functions for which parallelization is likely to increase processing speed and for which description conversion into a form that the parallelization unit 33 can process is possible. The determination unit 31 is realized using, for example, a profiler. It may decide the targets of parallelization by taking into account the measured execution time of each function and the dependencies among functions.
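The two-part test applied by the determination unit 31 can be sketched in C as follows. This is only an illustration under stated assumptions: the `FunctionProfile` record, its fields, and the run-time threshold are hypothetical and do not appear in the source, which does not say how improvement possibility is quantified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function profile record; every field is an assumption. */
typedef struct {
    const char *name;             /* function name */
    double time_share;            /* fraction of total run time, 0.0 to 1.0 */
    bool directly_parallelizable; /* the tool accepts the code as written */
    bool convertible;             /* description conversion can make it acceptable */
} FunctionProfile;

/* Improvement possibility: the function is hot enough to be worth parallelizing. */
static bool has_improvement_possibility(const FunctionProfile *f, double threshold) {
    return f->time_share >= threshold;
}

/* Processability: usable as is, or usable after description conversion. */
static bool has_processability(const FunctionProfile *f) {
    return f->directly_parallelizable || f->convertible;
}

/* Count the functions that would go into the target program;
   all remaining functions would go into the non-target program. */
size_t count_targets(const FunctionProfile *funcs, size_t n, double threshold) {
    assert(funcs != NULL || n == 0);
    size_t targets = 0;
    for (size_t i = 0; i < n; ++i) {
        if (has_improvement_possibility(&funcs[i], threshold) &&
            has_processability(&funcs[i])) {
            ++targets;
        }
    }
    return targets;
}
```

A function is counted as a target only when both conditions hold, so a hot function that can be neither parallelized directly nor converted stays in the non-target program.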
The pre-processing unit 32 takes the target program 52 as its input and outputs the preprocessed program 53. It rewrites the target program 52 to conform to the known processing restrictions of the parallelization unit 33. This rewriting is also called "description conversion processing". The processing restrictions of the parallelization unit 33 are, for example, restrictions on variable types and on function calls. For example, if the parallelization unit 33 cannot handle the double-precision floating-point type "double", the pre-processing unit 32 rewrites such variables to the single-precision floating-point type "float". Likewise, if the parallelization unit 33 cannot handle recursive calls, the pre-processing unit 32 rewrites recursive functions as non-recursive functions.
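As an illustration of the second kind of description conversion, the following C sketch shows a recursive function next to a non-recursive rewrite that returns the same result. The function names and the input bound are assumptions chosen for the example; the source does not give a concrete function.

```c
#include <assert.h>

/* Original form: recursive sum of 1..n. A parallelization tool that does not
   support recursive calls could not analyze this function. */
unsigned long sum_recursive(unsigned int n) {
    assert(n <= 10000u); /* keep the recursion depth small in this sketch */
    return n == 0 ? 0UL : n + sum_recursive(n - 1);
}

/* Converted form: the same result computed with a loop instead of recursion. */
unsigned long sum_iterative(unsigned int n) {
    assert(n <= 10000u);
    unsigned long total = 0;
    for (unsigned int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

Here the rewrite happens to be cheap, but in general a non-recursive form may need an explicit stack or extra bookkeeping, which is one reason a converted function can run slower than the original.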
The pre-processing unit 32 records the content of the pre-processing so that the post-processing unit 34 can perform the inverse conversion processing. The record may be written into the preprocessed program 53 as a comment with a specific format, embedded in the preprocessed program 53 as a specific character string, or written out to an intermediate processing file (not shown). An example of a record written as a comment in the preprocessed program 53 is "// #preconv32# double value1 >> float value1". In this comment, the leading "#preconv32#" indicates that the entry was written by the pre-processing unit 32, and the remainder indicates that "double value1" was rewritten to "float value1".
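A record comment of this form could be read back with a small parser such as the C sketch below. The `PreconvRecord` structure and the function name are hypothetical, and the sketch assumes single-word type names separated by spaces exactly as in the example comment; multi-word types such as "long long" would need a different format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record of one description conversion. */
typedef struct {
    char orig_type[32]; /* type before conversion, e.g. "double" */
    char conv_type[32]; /* type after conversion, e.g. "float" */
    char name[32];      /* variable name, e.g. "value1" */
} PreconvRecord;

/* Parse "// #preconv32# <orig_type> <name> >> <conv_type> <name>".
   Returns true only if the line has this shape and both names match. */
bool parse_preconv_comment(const char *line, PreconvRecord *rec) {
    assert(line != NULL && rec != NULL);
    char conv_name[32];
    int n = sscanf(line, " // #preconv32# %31s %31s >> %31s %31s",
                   rec->orig_type, rec->name, rec->conv_type, conv_name);
    return n == 4 && strcmp(rec->name, conv_name) == 0;
}
```

Lines that are not record comments fail to match the literal prefix, so ordinary code and ordinary comments are rejected.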
One way to embed a specific character string in the preprocessed program 53 is, for example, to write "typedef float _preconv32_double_float" and "_preconv32_double_float value1" in the preprocessed program 53. Placing "_preconv32_" at the beginning of the new name defined by "typedef" indicates that the change was made by the pre-processing unit 32, and "double_float" indicates that the "double" type was changed to the "float" type.
The parallelization unit 33 takes the preprocessed program 53 as its input and outputs the converted program 54. The parallelization unit 33 is a known parallelization tool that converts source code into source code. That is, the parallelization unit 33 is not a compiler but a program that rewrites source code. It rewrites the preprocessed program 53 so as to give the compiler 39 explicit directives regarding parallelization. At each location rewritten by the parallelization unit 33, a specific character string such as "#pragma parallel" is inserted, followed by the concrete parallel processing directive.
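The text uses "#pragma parallel" only as a placeholder for the inserted marker. As a concrete illustration of what source code carrying an explicit parallelization directive can look like, the following C sketch uses an OpenMP directive instead; OpenMP is substituted here as an assumption and is not named in the source. A compiler without OpenMP support enabled ignores the directive and runs the loop sequentially.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array. The directive asks an OpenMP-aware compiler to split the
   loop iterations across cores and combine the partial sums. */
float sum_array(const float *data, size_t n) {
    assert(data != NULL || n == 0);
    float total = 0.0f;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; ++i) {
        total += data[i];
    }
    return total;
}
```

The loop body itself is unchanged; only the directive line is added, which matches the description of the parallelization unit 33 as a source-to-source tool rather than a compiler.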
The parallelization unit 33 leaves at least the comments written by the pre-processing unit 32 in the converted program 54 without deleting them. The parallelization unit 33 may be given the characteristics of the comments written by the pre-processing unit 32 in advance, so that it automatically selects and keeps only those comments and deletes all others. Alternatively, an operator may use an operation option to put the parallelization unit 33 into an operation mode in which comments are not deleted.
The post-processing unit 34 takes the converted program 54 as its input and outputs the inverse conversion addition program 55. It reads the converted program 54 and identifies the locations that the parallelization unit 33 has not rewritten, for example functions that the parallelization unit 33 has not rewritten. For instance, it identifies functions that are not immediately preceded by a specific character string such as "#pragma parallel". The post-processing unit 34 then rewrites those locations back to the state they were in before the pre-processing by the pre-processing unit 32. In other words, the pre-processing unit 32 converts the program into a form suitable for parallelization, and the post-processing unit 34 inversely converts it back to the original form.
The post-processing unit 34 performs the inverse conversion by referring to the record of processing content that the pre-processing unit 32 keeps as described above. For example, if the converted program 54 contains the comment "// #preconv32# double value1 >> float value1" and the function containing the declaration of "float value1" is not immediately preceded by a specific character string such as "#pragma parallel", the post-processing unit 34 rewrites "float value1" to "double value1".
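The inverse conversion rule in this example can be sketched in C as below. The function name, its parameters, and its return codes are hypothetical; the sketch handles only the first occurrence of the converted declaration and treats the line before the function as a plain string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy `body` to `out`, restoring `converted` to `original` when the function
   was not parallelized. Returns 1 if the declaration was reverted, 0 if the
   text was kept as is (function parallelized, or declaration absent), and -1
   if `out` is too small. */
int revert_unparallelized(const char *line_before, const char *body,
                          const char *converted, const char *original,
                          char *out, size_t out_size) {
    assert(line_before && body && converted && original && out);
    const char *hit = strstr(body, converted);
    int parallelized = strstr(line_before, "#pragma parallel") != NULL;
    int n;
    if (parallelized || hit == NULL) {
        /* Keep the converted form: the parallelized code depends on it. */
        n = snprintf(out, out_size, "%s", body);
        return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
    }
    /* Prefix before the hit, then the original declaration, then the rest. */
    n = snprintf(out, out_size, "%.*s%s%s",
                 (int)(hit - body), body, original, hit + strlen(converted));
    return (n >= 0 && (size_t)n < out_size) ? 1 : -1;
}
```

For instance, calling it with an empty preceding line, the body "float value1 = 0;", and the pair "float value1" / "double value1" writes "double value1 = 0;" to the output buffer, while the same call with "#pragma parallel" as the preceding line leaves the body unchanged.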
The integration unit 35 takes the inverse conversion addition program 55 and the non-target program 56 as its inputs and outputs the integrated program 57. That is, the integration unit 35 creates the integrated program 57 by combining the contents of the inverse conversion addition program 55 with the contents of the non-target program 56.
<Operation flowchart>
FIG. 5 is a flowchart showing the processing of the program parallelization device 111. The program parallelization device 111 executes the operation shown in the flowchart when, for example, the original program 51 is saved in the storage unit or a command to execute the processing is received from outside. Steps S401 to S404 described below are executed by the determination unit 31, step S405 by the pre-processing unit 32, step S406 by the parallelization unit 33, step S407 by the post-processing unit 34, step S408 by the integration unit 35, and step S409 by the compiler 39. In the following, the method by which the program parallelization device 111 creates the integrated program 57 from the original program 51 is called the "program parallelization method".
In step S401, the program parallelization device 111 profiles the original program 51. Specifically, it extracts the locations that can be sped up by changing from sequential processing to parallel processing.
In step S402, the program parallelization device 111 evaluates the improvement possibility described above based on the result extracted in step S401, that is, it determines whether a speedup from parallel processing can be expected. Specifically, if the determination unit 31 determines that at least part of the original program 51 can be expected to run faster with parallel processing, the process proceeds to step S403. If it determines that no part of the original program 51 can be expected to run faster with parallel processing, the process proceeds to step S409.
In step S403, the determination unit 31 of the program parallelization device 111 evaluates the processability described above for the program for which a speedup was expected in step S402, in other words, it determines whether description conversion is possible. The description conversion here is the program conversion needed to carry out program parallelization: it converts the code into descriptions within the range that the parallelization unit 33, described later, can analyze. Adjusting the types used in the program and the bit sizes to be processed is commonly required. A concrete example is converting a type with a large bit size, such as long long, to a type with a smaller bit size, such as int, in accordance with the data model adopted by the OS (operating system), in order to reduce double precision to single precision.
If the determination unit 31 determines in step S403 that the program for which a speedup was expected can undergo description conversion, the process proceeds to step S404. If it determines that none of the programs for which a speedup was expected can undergo description conversion, that is, that the parallelization unit 33 cannot analyze them, the process proceeds to step S409. In step S404, the determination unit 31 of the program parallelization device 111 extracts the program for which a speedup from parallel processing can be expected and description conversion is possible. Specifically, the determination unit 31 saves the portion on which program parallelization for the multi-core processor is to be performed as the target program 52, and saves the portion to be used as it is for a single-core processor as the non-target program 56.
In step S405, the pre-processing unit 32 of the program parallelization device 111 performs description conversion on the extracted program. Specifically, it applies description conversion to the target program 52 saved in step S404 and outputs the preprocessed program 53. In step S406, the parallelization unit 33 of the program parallelization device 111 parallelizes the preprocessed program 53 output in step S405 and outputs the converted program 54. In other words, in step S406 the converted program 54, a parallel program for a multi-core processor, is created from the preprocessed program 53, a sequential program for a single-core processor.
In step S407, the post-processing unit 34 of the program parallelization device 111 performs description inverse conversion on the program that was not parallelized by the parallelization unit 33 in step S406. For the non-parallelized program, it reverses the description conversion performed by the pre-processing unit 32 in step S405 so that the description returns to that of the original program input at the start of processing.
In step S408, the integration unit 35 of the program parallelization device 111 combines the inverse conversion addition program 55 produced by the description inverse conversion in step S407 with the non-target program 56, on which no description conversion was performed in step S404, to generate the integrated program 57. In the subsequent step S409, the compiler 39 compiles the integrated program 57 to generate the binary code 59, and the processing shown in FIG. 5 ends. However, when step S409 is reached from step S402 or step S403, the compiler 39 compiles the original program 51 instead of the integrated program 57. This concludes the description of FIG. 5.
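The branching in FIG. 5 determines which program the compiler 39 receives in step S409. A minimal C sketch of that decision follows; the enum and function names are hypothetical, and the two boolean inputs stand in for the outcomes of steps S402 and S403.

```c
#include <assert.h>
#include <stdbool.h>

/* Which program step S409 compiles (names are illustrative only). */
typedef enum {
    COMPILE_ORIGINAL_51,  /* original program 51, unchanged */
    COMPILE_INTEGRATED_57 /* integrated program 57 */
} CompileInput;

CompileInput select_compile_input(bool speedup_expected, bool conversion_possible) {
    if (!speedup_expected) {
        return COMPILE_ORIGINAL_51; /* S402: no speedup expected, go to S409 */
    }
    if (!conversion_possible) {
        return COMPILE_ORIGINAL_51; /* S403: conversion impossible, go to S409 */
    }
    /* S404 to S408: extract, convert, parallelize, inversely convert, combine. */
    return COMPILE_INTEGRATED_57;
}
```

Only when both checks pass does the pipeline of steps S404 to S408 run; in every other case the original program is compiled as it is.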
<Relationship between each process and the programs>
FIG. 6 is a diagram showing the relationship between the programs and each process up to the creation of the integrated program 57 in the program parallelization device 111. It is described together with the flowchart shown in FIG. 5. The original program 51 is a sequential program for a single-core processor that is input to the program parallelization device 111. The label "C" indicates that the program is written in, for example, the C language.
In the parallelization extraction step S501, the original program 51 is divided into the target program 52 and the non-target program 56. The parallelization extraction step S501 corresponds to steps S401 to S404 shown in FIG. 5. The target program 52 is the program extracted in step S404 as the portion on which program parallelization for the multi-core processor is to be performed. The non-target program 56 is the program that was not extracted in step S404, that is, the original program 51 with the target program 52 removed. When it is determined in step S402 or step S403 that the process proceeds to step S409, this may be regarded as the case where the target program 52 is an empty set in the parallelization extraction step S501.
Next, the target program 52 is converted into the preprocessed program 53 in the description conversion step S502, which corresponds to step S405 shown in FIG. 5. The preprocessed program 53 is then converted into the converted program 54 in the parallelization processing step S503, which corresponds to step S406 shown in FIG. 5.
Next, the converted program 54 is converted into the inverse conversion addition program 55 in the description inverse conversion step S504, which corresponds to step S407 shown in FIG. 5. The inverse conversion addition program 55 and the non-target program 56 are then combined into the integrated program 57 in the combining step S505, which corresponds to step S408 shown in FIG. 5.
<Outline of program changes>
FIG. 7 is a diagram outlining how the program changes in the program parallelization device 111. In FIG. 7, the names of concrete pieces of source code are written inside each program, and the horizontal length in the figure represents the processing time that would result if that source code were compiled and executed as it is. In the description of FIG. 7, "the processing time that would result if source code X were compiled and executed as it is" is abbreviated to "the processing time of source code X". FIG. 7 is further described using an example in which N is 3, that is, the multi-core processor 202 has three cores 203.
The original program 51 shown in FIG. 7 is a sequential program that executes the processing written in source codes A to C in order. It can also be described as a sequential program that executes processes A to C in order. In the parallelization extraction step S501, source code A and source code B were determined to be likely to run faster and to be capable of description conversion, and became the target program 52; the remaining source code C became the non-target program 56. Source codes A and B in the target program 52 are unchanged from the original program 51, so their processing times, shown as horizontal lengths in the figure, are also unchanged.
Through the description conversion step S502 shown in FIG. 6, the preprocessed program 53 is configured as a program that executes the processing written in source codes A1 and B1 in order. Source code A1 is source code A with description conversion applied so that the parallelization unit 33 can process it. Source code B1 is source code B with description conversion applied. In FIG. 7, hatched source code such as A1 and B1 indicates that description conversion has been applied.
Here, the preprocessed program 53 has a longer processing time than the target program 52, for the following reason. When description conversion adjusts the types used in the program or the bit sizes to be processed to fit the range that the parallelization unit 33 can analyze, the same processing has to be carried out under those restricted conditions. A binary obtained by compiling the description-converted source code as it is is therefore generally considered to have drawbacks such as a longer processing time.
In the converted program 54, as a result of the parallelization processing step S503 shown in FIG. 6, the processing written in source code A1 is parallelized into three parts that are executed in parallel as A2-1, A2-2, and A2-3, after which the sequential processing written in source code B2 is executed. That is, in the example of FIG. 7 the parallelization unit 33 added parallelization directives to source code A1 but did not parallelize source code B1. Because the processing of source code A1 is parallelized into three parts, processes A2-1 to A2-3, in the converted program 54, the execution time shown by the horizontal length in the figure is shorter not only than that of source code A1 but also than that of source code A. Even with the overhead of description conversion and parallelization, the benefit of parallelization is large enough to shorten the processing time.
Source code B1, to which the parallelization unit 33 added no parallelization directives, has nevertheless been rewritten by the parallelization unit 33 into source code B2. This means that the parallelization unit 33 rewrote it into a description that is easier to compile. The processing time of source code B1 and that of source code B2 are substantially the same.
Through the description inverse conversion step S504 shown in FIG. 6, the inverse conversion addition program 55 is configured as a program that executes, in order, the parallel program parallelized into three parts (processes A2-1 to A2-3) and the sequential program of process B.
Here, the inverse conversion addition program 55 has a shorter processing time than the converted program 54. The processing time is shortened by applying description inverse conversion to source code B2, which the parallelization unit 33 did not parallelize, and returning it to the same processing as before description conversion, for example by restoring the adjusted types and the bit sizes to be processed. The integrated program 57 is the result of combining the inverse conversion addition program 55 and the non-target program 56 in the combining step S505 shown in FIG. 6.
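The relationship between the processing times in FIG. 7 can be illustrated with a simple timing model in C. The model is an assumption made for this sketch and is not taken from the source: description conversion multiplies a code's processing time by a constant factor, parallelization divides it evenly by the number of cores, and parallelization overhead is ignored. All names and numbers are hypothetical.

```c
#include <assert.h>

/* Hypothetical timing model for source codes A, B, and C of FIG. 7. */
typedef struct {
    double t_a, t_b, t_c;     /* processing times of A, B, C as written */
    double conversion_factor; /* slowdown from description conversion, >= 1 */
    int cores;                /* number of cores used for A */
} TimingModel;

/* Original program 51: A, B, C executed sequentially. */
double time_original(const TimingModel *m) {
    return m->t_a + m->t_b + m->t_c;
}

/* Without inverse conversion: A1 in parallel, then B still in converted form, then C. */
double time_without_inverse(const TimingModel *m) {
    assert(m->cores > 0 && m->conversion_factor >= 1.0);
    return m->t_a * m->conversion_factor / m->cores
         + m->t_b * m->conversion_factor + m->t_c;
}

/* Integrated program 57: A1 in parallel, then B restored to its original form, then C. */
double time_integrated(const TimingModel *m) {
    assert(m->cores > 0 && m->conversion_factor >= 1.0);
    return m->t_a * m->conversion_factor / m->cores + m->t_b + m->t_c;
}
```

With the illustrative values t_a = 6, t_b = 2, t_c = 1, a conversion factor of 1.5, and 3 cores, the model gives 9 for the original program, 7 without inverse conversion, and 6 for the integrated program, which mirrors the ordering of the bar lengths described for FIG. 7.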
<Example of information reception sequence>
FIG. 8 is a sequence diagram showing the autonomous travel control device 2 receiving information from the program parallelization device 111. FIG. 8 shows an example in which, when the autonomous travel control device 2 detects an abnormality in the multi-core processor 202, it notifies the program parallelization device 111, which is installed in the cloud or elsewhere, and receives new program information over a wireless network by OTA (over-the-air) update.
First, when the autonomous travel control device 2 detects a failure of the multi-core processor 202 (S701), it transfers the detection information to the wireless communication unit 106 of the in-vehicle system 1 (S702). Next, the wireless communication unit 106 transfers the received detection information to the program parallelization device 111 via the wireless network (S703).
On receiving the detection information, the program parallelization device 111 reconfigures the program for the multi-core processor 202 (S704). Specifically, for example, based on the detection information it performs program parallelization processing according to the number of cores 203 that are unaffected by the failure and remain usable. Next, the program parallelization device 111 transfers the information of the reconfigured program to the wireless communication unit 106 (S705). The wireless communication unit 106 then transfers the received program information to the autonomous travel control device 2 (S706). After that, the autonomous travel control device 2 may start operating with the new program in accordance with the update timing and method of the in-vehicle system 1 (S707), completing the process.
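The reconfiguration in S704 depends on how many cores 203 remain usable. The C sketch below shows one way to derive that number, assuming, purely for illustration, that the detection information carries a bitmask in which bit i is set when core i is faulty; the source does not specify the format of the detection information.

```c
#include <assert.h>

/* Count the cores that are not flagged as faulty.
   Bit i of fault_mask set means core i is unusable (an assumed encoding). */
unsigned usable_cores(unsigned total_cores, unsigned fault_mask) {
    assert(total_cores <= 32u); /* keep the shift within a 32-bit mask */
    unsigned usable = 0;
    for (unsigned i = 0; i < total_cores; ++i) {
        if ((fault_mask & (1u << i)) == 0) {
            ++usable;
        }
    }
    return usable;
}
```

For the three-core example of FIG. 7, a fault on the second core alone leaves two usable cores, so the parallelization processing would be rerun with two cores as its target.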
According to this embodiment, description conversion tailored to the automatic conversion tool is not only performed but also combined with inverse description conversion, which can shorten the processing time in program parallelization. This in turn can improve the performance of a multi-core processor that runs the parallel program created through program parallelization. Further, according to this example, the program can be reconfigured according to the state of the multi-core processor that executes it.
According to the first embodiment described above, the following effects are obtained.
(1) The program parallelizing device 111 executes a program parallelization method that generates the integrated program 57, which is a program for a multi-core processor, from the original program 51, which is a program for a single-core processor. The program parallelization method includes: a pre-conversion step (step S405 in FIG. 5) of executing pre-conversion of the single-core processor program for program parallelization processing; a parallelization processing step (step S406 in FIG. 5) in which the parallelization unit 33 parallelizes the pre-processed program 53, which is the pre-converted single-core processor program; and an inverse conversion step (step S407 in FIG. 5) of executing the inverse of the pre-conversion on a region of the parallelized single-core processor program that was not parallelized, for example the source code B2 in the example shown in FIG. 7. Inversely converting the description of the program portions that the parallelization unit 33 did not parallelize therefore eliminates the disadvantages of the description conversion. These disadvantages include, for example, reduced calculation accuracy caused by reducing the number of bits of variables so that the parallelization unit 33 can process them, and longer computation time caused by rewriting code to avoid recursive expressions. That is, the program parallelizing device 111 can improve the performance of the program it outputs because the post-processing unit 34 performs the inverse conversion.
(2) The program parallelization method includes an extraction step (steps S402 and S403 in FIG. 5) of extracting, from the single-core processor program, the target program 52, which is a first program region that is expected to be sped up by program parallelization processing and that can be pre-converted for that processing. In the pre-conversion step, the pre-conversion is executed on the target program 52, which is the first program region. Limiting the target of the pre-conversion to a part of the original program 51 rather than the whole therefore reduces the processing load in steps S404 to S407.
(3) The program parallelization method includes a joining step (step S408 in FIG. 5) of joining the non-target program 56, which is a second program region of the original program 51 other than the target program 52, with the inverse-conversion-added program 55, which is the first program on which the pre-conversion, the parallelization processing, and the inverse conversion have been executed, to obtain the integrated program 57. Separating in advance the program portions that are unsuited to parallelization and joining them after the parallelization processing therefore allows the compiler 39 to process the whole of the original program 51 together.
(4) The pre-conversion process (step S405 in FIG. 5) includes a process of rewriting a type with a large bit size as a type with a small bit size, for example changing a double-precision type to a single-precision type. The inverse conversion process (step S407 in FIG. 5) includes a process of rewriting a type with a small bit size as a type with a large bit size, for example changing a single-precision type to a double-precision type. The parallelization processing can therefore be executed using a parallelization unit 33 that limits the bit size of the variables it can process.
(5) The program parallelizing device 111 executes the program parallelization method described above. The program parallelizing device 111 can therefore output the integrated program 57, a program in which the disadvantages of the description conversion have been eliminated for the non-parallelized portions.
(6) The autonomous travel control device 2 includes the multi-core processor 202 and the storage unit 204. The storage unit 204 holds the binary code 59 obtained by compiling the integrated program 57, which is the multi-core processor program created by the program parallelization method described above. The autonomous travel control device 2 can therefore execute binary code 59 that has been parallelized by the parallelization unit 33 and in which, for the non-parallelized portions, the post-processing unit 34 has eliminated the disadvantages of the description conversion.
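The flow of steps S405 to S407 described in effects (1) and (4) can be sketched as follows. This is a toy illustration in Python, not the tool described here: the `Region` class, the loop-detection rule standing in for the parallelization unit 33, and the `/* parallelized */` marker are all assumptions, and the C fragments are invented examples. The sketch also assumes that the original source contains no `float` of its own, so that a plain textual replacement can undo the pre-conversion.

```python
import re
from dataclasses import dataclass


@dataclass
class Region:
    """One program region: its source text and whether it was parallelized."""
    source: str
    parallelized: bool = False


def pre_convert(region):
    """S405: rewrite double as float so the assumed tool can accept the code."""
    return Region(re.sub(r"\bdouble\b", "float", region.source))


def parallelize(region):
    """S406 stand-in: pretend that only regions containing a loop are parallelized."""
    if re.search(r"\bfor\b", region.source):
        return Region("/* parallelized */\n" + region.source, True)
    return Region(region.source, False)


def inverse_convert(region):
    """S407: restore double only in regions that were not parallelized."""
    if region.parallelized:
        return region
    return Region(re.sub(r"\bfloat\b", "double", region.source), False)


def run_pipeline(sources):
    """Apply pre-conversion, parallelization, and inverse conversion in order."""
    return [inverse_convert(parallelize(pre_convert(Region(s)))) for s in sources]


a = "for (i = 0; i < n; i++) { double y = x[i] * k; out[i] = y; }"
b = "double gain = lookup(speed);"
result = run_pipeline([a, b])
```

In this sketch the loop region keeps `float` because it was parallelized, while the scalar region is restored to `double`, which mirrors the treatment of source codes A2 and B2 in FIG. 7.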
(Modification 1)
In the embodiment described above, the program parallelizing device 111 performed the inverse description conversion only on the source code B2, which was not parallelized by the parallelization unit 33 in FIG. 7. However, the post-processing unit 34 may also perform the inverse description conversion on the source code A2, which was parallelized by the parallelization unit 33, as long as this does not break the ordering dependencies of the parallel processing and synchronization between the cores can be ensured.
(Modification 2)
In the embodiment described above, the program parallelizing device 111 was configured to include the compiler 39. However, the program parallelizing device 111 need not include the compiler 39 and may instead use a compiler included in another device connected via a network or the like.
(Modification 3)
The program parallelizing device 111 need not include the discrimination unit 31, and the pre-processing unit 32 may process the whole of the original program 51. In this case, the non-target program 56 does not exist, so the program parallelizing device 111 need not include the integration unit 35 either.
(Modification 4)
The program created by the program parallelizing device 111 may be stored in advance in the ROM 252 of the autonomous travel control device 2.
In each of the embodiments and modifications described above, the configuration of the functional blocks is only an example. Several functional configurations shown as separate functional blocks may be configured as one, and a configuration represented by one functional block may be divided into two or more functions. Further, some of the functions of each functional block may be provided in another functional block.
The present invention is not limited to the examples and modifications described above and includes various modifications and equivalent configurations within the spirit of the appended claims. For example, the examples and modifications described above have been explained in detail to make the present invention easy to understand, and the present invention is not necessarily limited to configurations that include every element described. In addition, the control lines and information lines shown are those considered necessary for the explanation and do not necessarily represent all the control lines and information lines required in an implementation. In practice, almost all of the components may be considered to be interconnected.
The embodiments and modifications described above may be combined with one another. Although various embodiments and modifications have been described above, the present invention is not limited to them. Other aspects conceivable within the scope of the technical idea of the present invention are also included within the scope of the present invention.
1 ... In-vehicle system
2 ... Autonomous travel control device
31 ... Discrimination unit
32 ... Pre-processing unit
33 ... Parallelization unit
34 ... Post-processing unit
35 ... Integration unit
39 ... Compiler
51 ... Original program
52 ... Target program
53 ... Pre-processed program
54 ... Converted program
55 ... Inverse-conversion-added program
56 ... Non-target program
57 ... Integrated program
59 ... Binary code
111 ... Program parallelizing device
202 ... Multi-core processor
204 ... Storage unit

Claims (6)

  1.  A program parallelization method, executed by a computer, for generating a program for a multi-core processor from a program for a single-core processor, the method comprising:
     a pre-conversion step of executing pre-conversion of the single-core processor program for program parallelization processing;
     a parallelization processing step of parallelizing the pre-converted single-core processor program; and
     an inverse conversion step of executing an inverse conversion of the pre-conversion on a region of the parallelized single-core processor program that was not parallelized.
  2.  The program parallelization method according to claim 1, further comprising:
     an extraction step of extracting, from the single-core processor program, a first program region that is expected to be sped up by the program parallelization processing and for which the pre-conversion for the program parallelization processing is possible,
     wherein the pre-conversion step executes the pre-conversion on the first program region.
  3.  The program parallelization method according to claim 2, further comprising:
     a joining step of joining a second program region of the single-core processor program other than the first program region with a first program on which the pre-conversion, the parallelization processing, and the inverse conversion have been executed.
  4.  The program parallelization method according to claim 1,
     wherein the pre-conversion includes a process of rewriting a type with a large bit size as a type with a small bit size, and
     the inverse conversion includes a process of rewriting a type with a small bit size as a type with a large bit size.
  5.  A program parallelizing device that executes the program parallelization method according to claim 1.
  6.  An electronic control device comprising:
     a storage unit that stores the program for the multi-core processor created using the program parallelization method according to claim 1; and
     a multi-core processor that executes the program for the multi-core processor stored in the storage unit.
PCT/JP2021/003111 2020-02-26 2021-01-28 Program parallelization method, program parallelization device, and electronic control device WO2021171897A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020030512A JP7323478B2 (en) 2020-02-26 2020-02-26 Program parallelization method, program parallelization device, electronic control device
JP2020-030512 2020-02-26

Publications (1)

Publication Number Publication Date
WO2021171897A1 true WO2021171897A1 (en) 2021-09-02

Family

ID=77490092

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/003111 WO2021171897A1 (en) 2020-02-26 2021-01-28 Program parallelization method, program parallelization device, and electronic control device

Country Status (2)

Country Link
JP (1) JP7323478B2 (en)
WO (1) WO2021171897A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07261992A (en) * 1994-03-22 1995-10-13 Matsushita Electric Ind Co Ltd Program converting and editing method and program converting and editing device
JPH11282814A (en) * 1998-03-31 1999-10-15 Nec Corp Program paralleling method and device, and recording medium recording paralleling program
JP2004157893A (en) * 2002-11-08 2004-06-03 Hitachi Ltd Method for forming pseudomeasure array into aligned array
JP2012014526A (en) * 2010-07-01 2012-01-19 Hitachi Ltd Structure conversion apparatus for program code, and code structure conversion program

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2939114A1 (en) * 2012-12-26 2015-11-04 Huawei Technologies Co., Ltd. Processing method for a multicore processor and multicore processor
JP7261992B2 (en) 2017-10-31 2023-04-21 パナソニックIpマネジメント株式会社 COMMUNICATION DEVICE, COMMUNICATION SYSTEM, EQUIPMENT CONTROL SYSTEM, COMMUNICATION CONTROL METHOD AND PROGRAM

Also Published As

Publication number Publication date
JP7323478B2 (en) 2023-08-08
JP2021135664A (en) 2021-09-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21759886

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21759886

Country of ref document: EP

Kind code of ref document: A1