EP2122464A1 - Computer implemented translation method - Google Patents

Computer implemented translation method

Info

Publication number
EP2122464A1
EP2122464A1 EP08724024A EP08724024A EP2122464A1 EP 2122464 A1 EP2122464 A1 EP 2122464A1 EP 08724024 A EP08724024 A EP 08724024A EP 08724024 A EP08724024 A EP 08724024A EP 2122464 A1 EP2122464 A1 EP 2122464A1
Authority
EP
European Patent Office
Prior art keywords
class
source code
programming language
expression
program structure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP08724024A
Other languages
German (de)
English (en)
Other versions
EP2122464A4 (fr
Inventor
Stephen Ming Ko Cheng
Alex Potanin
Christopher Michael Andreae
Simon Marsh David Robinson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Innaworks Development Ltd
INNAWORKS Dev Ltd
Original Assignee
Innaworks Development Ltd
INNAWORKS Dev Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Innaworks Development Ltd
Publication of EP2122464A1
Publication of EP2122464A4

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/76 Adapting program code to run in a different environment; Porting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/51 Source to source
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis

Definitions

  • This invention relates to the field of translating source code associated with one programming language to a second source code associated with a second programming language.
  • the invention relates to the porting of an application written in Java or C# to C++ or C.
  • the present invention relates to software development and porting for mobile devices and embedded devices, where Java, C#, C and C++ are the programming languages.
  • Mobile devices have become ubiquitous over the last few years. Mobile devices are now increasingly powerful, and most are capable of executing software applications.
  • Java Micro Edition is a very popular software development platform for mobile devices. According to some estimates, more than 60% of mobile devices worldwide are capable of executing software applications written for the Java Micro Edition platform.
  • One variant of Java is the programming language used to write applications for the Java Micro Edition platform.
  • the primary programming languages for the software development platforms BREW, Symbian, Microsoft Mobile, Microsoft CE and Palm OS are C and C++. Although it is possible to develop for these platforms with other programming languages, they will be referred to in this application collectively as C/C++ based software development platforms.
  • porting: Essentially, one development team develops the application for one particular software development platform. After the application is completed, it is translated to, or otherwise modified for, the other software development platforms. The translation or porting process can be outsourced to a porting specialist company, which may be operating from a location with a lower cost base. Although this approach is typically more cost effective than parallel development, there is a significant increase in turn-around time, as well as a reduction of control over the quality of the ported application.
  • JVM bundling: Another approach is known as "JVM bundling". Essentially it involves bundling a Java virtual machine with the Java Micro Edition version of an application, such that it can run on one of the C/C++ based mobile development platforms. This approach has a number of major disadvantages, including relatively poor performance, the high cost of licensing the Java virtual machine, high memory use and a large download footprint, as well as the difficulty of leveraging the special capabilities of the target mobile development platforms.
  • JCVM converts Java class files to C. However, this can result in the structure of the original source code being easily lost. Also, the JCVM generated source code is hard to understand compared to human written C++ code. In addition, comments are no longer available as they are not placed in the Java class files. Further, class hierarchy is lost as C does not directly support object oriented programming concepts.
  • Java2cpp is an automated Java source code to C++ source code translator.
  • Java2cpp is based on pre-processor technologies.
  • Java2cpp is not capable of accurately translating some Java constructs and expressions common in Java source code. For example the try-catch-finally construct in the Java source code will result in the same construct in the C++ source code, although finally is not supported by C++. Due to the different order of evaluation rules in C++, and the inability in java2cpp to make necessary adjustments, expressions in the C++ source code may be evaluated differently from the original Java source code.
  • Java2cpp output requires significant human effort to post-process after each translation attempt. The correction process is costly, time-consuming and negates the advantages of automated porting.
  • Java and C# languages from the perspective of computer language analysis.
  • the two languages share many common features, syntax, constructs and philosophy.
  • methods and systems that facilitate translation from Java to C++ or C can also be applicable to translations from C# to C++ or C.
  • the invention provides a computer implemented method for automatically translating a first source code associated with a first programming language to a second source code associated with a second programming language wherein the first and second source codes are associated with the same functionality, the method comprising the steps of: parsing the first source code to form a program structure representation comprising a plurality of program structure elements associated with the first programming language, analysing the program structure elements, wherein the analysis includes the step of searching for at least one program structure element that has no direct associated representation that produces the same result in the second programming language, and transforming the program structure representation into the second source code based on said analysis.
  • the method may further comprise the steps of detecting at least one program structure element during the analysis step, and transforming the detected program structure element into a transformed program structure element that can be represented in the second programming language.
  • the first programming language may be a programming language from the group comprising: Java; Java Micro Edition; C#; a language derived from Java; a language derived from C#
  • the second programming language is a programming language from the group comprising: C; C++; a language derived from C; a language derived from C++
  • the second source code may be for a target platform from the group comprising: BREW; Symbian; Windows CE.
  • program structure representation may comprise an abstract syntax tree constructed from the first source code.
  • a separate abstract syntax tree may be constructed for a single class.
  • program structure representation may comprise class hierarchy information constructed from the first source code.
  • the second programming language may be a programming language from the group comprising: C; C++; a language derived from C; a language derived from C++, and the method may further comprise the steps of: compiling the second source code into a target object code, and linking the target object code with a first set of run-time libraries associated with the second programming language, wherein the first set of run-time libraries provide at least some of the capabilities of a second set of runtime libraries associated with the first programming language.
  • the method may further comprise the steps of: analysing the program structure elements to identify expressions containing sub-expressions where the direct associated representation of the expression in the first programming language requires the sub-expressions to be executed in a specific order, but the direct associated representation of the expression in the second programming language does not, and converting an identified expression such that in the direct associated representation in the second programming language of the converted expression, the sub-expressions are executed in the specific order.
  • sub-expressions may be required to be operated on in the order from left to right.
  • the expression may be a binary operator.
  • the sub-expressions may be an argument list.
  • the argument list may form part of a method or constructor invocation.
  • the expression may comprise a first set of sub-expressions, and the expression is expressible in both the first and second programming language as one of the group comprising: language-defined operator; language-defined function; application-defined function, the method further comprising the steps of: extracting a first set of sub-expressions from the expression, and creating a new expression comprising the extracted subexpressions such that the direct associated representation in the second programming language of the new expression produces the same result when executed as the execution of the direct associated representation of the original expression in the first programming language.
  • the method may further comprise the step of using a temporary variable to store a result of one of the first set of sub-expressions.
  • the method may further comprise the steps of: combining into the new expression, using the C sequence operator, one or more assignments to a temporary variable storing the result of a sub-expression of the first set in the required order of execution, and transforming the original expression with the sub-expression replaced by its corresponding temporary variable.
  • the method may further comprise the step of: analysing the subexpressions to determine if they are sensitive to the order in which they are evaluated and, upon a positive determination, creating the new expression.
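The temporary-variable rewrite described above can be sketched as follows. This is a minimal illustration, not the translator's actual output: `next()`, `combine()` and the temporaries `t1` and `t2` are hypothetical names. Java evaluates the arguments of `combine(next(), next())` from left to right, whereas C++ leaves the order of argument evaluation unspecified; the C/C++ comma (sequence) operator evaluates its operands from left to right, so it can be used to restore the Java order.

```cpp
// Hypothetical side-effecting helper: returns 1, 2, 3, ... on successive calls.
static int counter = 0;
int next() { return ++counter; }

// Hypothetical application-defined function whose result depends on argument order.
int combine(int first, int second) { return first * 10 + second; }

// Direct translation: C++ does not fix the evaluation order of the two
// arguments, so the result may differ between compilers.
int directTranslation() { return combine(next(), next()); }

// Order-corrected translation: each sub-expression is assigned to a temporary,
// and the comma operator sequences the assignments left to right before the call.
int orderedTranslation() {
    int t1, t2;
    return (t1 = next(), t2 = next(), combine(t1, t2));
}
```

Only `orderedTranslation()` is guaranteed to pass the results of the first and second `next()` calls as the first and second arguments respectively.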
  • the method may further comprise the steps of: analysing the program structure representation to find a constructor method, wherein the constructor method is associated with a first class and a first set of parameters, creating a new method in the first class that has equivalent parameters to the first set of parameters, moving the logic embodied in the constructor method into the newly created method, and replacing an expression that instantiates the first class using the constructor and a set of arguments with an expression that instantiates the first class with a constructor and invokes the newly created method on the instantiated result with the set of arguments.
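A minimal sketch of the constructor conversion, assuming hypothetical classes `Shape` and `Circle` and a hypothetical method name `init`. One reason such a conversion can matter: a virtual call made from inside a C++ constructor does not dispatch to a subclass override, whereas the same call in a Java constructor does. Moving the constructor logic into an ordinary method restores the Java behaviour.

```cpp
#include <memory>
#include <string>

class Shape {
public:
    Shape() = default;              // trivial C++ constructor, carries no logic
    virtual ~Shape() = default;

    // Holds the logic of the original Java constructor Shape(int size).
    Shape* init(int size) {
        size_ = size;
        label_ = describe();        // virtual dispatch reaches the subclass here
        return this;
    }

    virtual std::string describe() const { return "shape"; }
    const std::string& label() const { return label_; }
    int size() const { return size_; }

private:
    int size_ = 0;
    std::string label_;
};

class Circle : public Shape {
public:
    std::string describe() const override { return "circle"; }
};

// Java "new Circle(3)" becomes: instantiate, then invoke the extracted method.
std::unique_ptr<Shape> makeCircle(int size) {
    std::unique_ptr<Shape> c(new Circle());
    c->init(size);
    return c;
}
```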
  • the method may further comprise the step of: analysing the program structure representation to find an interface, wherein a class implements the interface, super-classes of the class do not implement the interface, the interface declares a method of a method signature, and the class does not define a method of the method signature, and there exists a super-class of the class that does define a method of the method signature.
  • the method may further comprise the step of: adding to the class a method with the method signature the behaviour of which is to invoke the method of the method signature in the super-class.
  • the method may further comprise the steps of: determining if the class is an abstract class, and, upon a positive determination, adding to a concrete subclass of the class a method with the method signature the behaviour of which is to invoke the method of the method signature in the super-class.
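The interface case above can be illustrated with a short sketch, assuming hypothetical names `Task`, `Worker` and `Job`. In Java, `class Job extends Worker implements Task` compiles without a body for `run()` because the inherited `Worker.run()` satisfies the interface. In C++ the two `run()` declarations belong to unrelated base classes, so a forwarding method has to be added.

```cpp
// Java interface translated to an abstract class.
class Task {
public:
    virtual ~Task() = default;
    virtual int run() = 0;
};

// Super-class that defines run() but does not implement Task.
class Worker {
public:
    virtual ~Worker() = default;
    virtual int run() { return 42; }
};

// Without the forwarding method below, Task::run() would stay pure virtual
// and Job could not be instantiated.
class Job : public Worker, public Task {
public:
    int run() override { return Worker::run(); }
};

// Calls through the interface type only.
int runThroughInterface(Task& t) { return t.run(); }
```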
  • the method may further comprise the steps of: analysing the program structure representation to find a nested class, extracting the nested class from an enclosing class to a non-nested class, and associating the extracted nested class with the previously enclosing class.
  • the extracted nested class may be associated with the previously enclosing class by marking each class as a friend of the other.
  • the method may further comprise the steps of: analysing the program structure representation to find an inner class associated with the first source code, modifying the inner class by adding a field referring to the previously enclosing class, and adding additional parameters to constructor methods of the inner class denoting the outer class.
  • the inner class may be a local inner class or anonymous inner class
  • the method may further comprise the step of adding extra construction parameters and fields to the inner class denoting the final local variables of the enclosing method.
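A minimal sketch of inner class extraction, assuming a hypothetical Java class `Counter` with an inner class `Incrementer` that captures a final local variable `step`. The extracted class receives the enclosing instance and the captured variable as extra constructor parameters, and the two classes are marked as friends of each other so that access to private members is preserved.

```cpp
class Counter;  // former enclosing class

// Former inner class Counter.Incrementer, extracted to a top-level class.
class Counter_Incrementer {
public:
    // `outer` denotes the enclosing instance; `step` stands in for a
    // captured final local variable of the enclosing method.
    Counter_Incrementer(Counter* outer, int step) : outer_(outer), step_(step) {}
    void apply();

private:
    Counter* outer_;
    int step_;
    friend class Counter;
};

class Counter {
public:
    int value() const { return value_; }
    // Java: new Incrementer().apply(), with `step` captured implicitly.
    void bump(int step) { Counter_Incrementer(this, step).apply(); }

private:
    int value_ = 0;
    friend class Counter_Incrementer;  // extracted class may touch private state
};

void Counter_Incrementer::apply() { outer_->value_ += step_; }
```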
  • the method may further comprise the steps of: analysing the program structure representation to find an array initializer, and, upon finding one, transforming the array initializer to a form suitable for representation in the second source code.
  • the method may further comprise the steps of: creating a method that creates an array, initializes the contents of the created array using parameters to the method corresponding to the elements contained in the array initializer, and returns the created array, and replacing the array initializer with an invocation of the method, the arguments of which are the original elements contained in the array initializer.
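As an illustration, a Java initializer such as `int[] a = {1, 2, 3};` could be replaced by a call to a generated helper. The sketch below assumes a hypothetical helper name `makeArray3` and uses `std::vector<int>` as a stand-in for whatever array type the runtime library provides.

```cpp
#include <vector>

// Hypothetical generated helper for a three-element array initializer.
// It creates the array, fills it from its parameters, and returns it.
std::vector<int> makeArray3(int e0, int e1, int e2) {
    std::vector<int> result(3);
    result[0] = e0;
    result[1] = e1;
    result[2] = e2;
    return result;
}
```

The initializer `{1, 2, 3}` then becomes the expression `makeArray3(1, 2, 3)`, which is valid wherever an expression is allowed.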
  • the method may further comprise the steps of: analysing the program structure representation to identify the use of any non-primitive arrays of any dimension associated with the first source code, and replacing references to any non-primitive array types associated with the first source code with references to a class representing more than one non-primitive array types, wherein the class is associated with the second source code.
  • an instance of the class may contain information pertaining to an element type and dimension of the array it represents.
  • the method may further comprise the step of: modifying the signature of methods with one or more parameter types or return type which is a non-primitive array type, resulting, after the replacement of references, in a signature that is based on the original declared element type and dimension of each of the non-primitive array type parameter or return types in order to eliminate or reduce the possibility of name conflicts.
  • the method may further comprise the step of: replacing creations of, reads from, writes to, or type test and cast operations on instances of non-primitive array types associated with the first source code with expressions performing an equivalent operation on the non-primitive array class associated with the second source code.
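The single array class and the signature modification can be sketched as below. All names (`Object`, `RefArray`, `print_String_1`, `print_Integer_1`) are hypothetical, and the exact-match type test is a simplification that ignores Java's array covariance.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the root class of translated reference types.
struct Object {
    virtual ~Object() = default;
};

// One class representing every non-primitive array type; each instance
// records the element type and dimension of the array it represents.
class RefArray {
public:
    RefArray(std::string elementType, int dimension, std::size_t length)
        : elementType_(std::move(elementType)),
          dimension_(dimension),
          elements_(length, nullptr) {}

    Object* get(std::size_t i) const { return elements_.at(i); }     // array read
    void set(std::size_t i, Object* v) { elements_.at(i) = v; }      // array write
    std::size_t length() const { return elements_.size(); }

    // Simplified equivalent of a Java type test such as "x instanceof String[]".
    bool isArrayOf(const std::string& elementType, int dimension) const {
        return elementType_ == elementType && dimension_ == dimension;
    }

private:
    std::string elementType_;
    int dimension_;
    std::vector<Object*> elements_;
};

// Java overloads print(String[]) and print(Integer[]) would collide once both
// parameters become RefArray*, so the declared element type and dimension are
// folded into the method name.
std::string print_String_1(RefArray*) { return "String[]"; }
std::string print_Integer_1(RefArray*) { return "Integer[]"; }
```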
  • the method may further comprise the steps of: analysing the program structure representation to find any static initialization component associated with the first source code, modifying the static initialization component to create a representation suitable for the second programming language, and invoking the modified static initialization component.
  • the method may further comprise the steps of: analysing the program structure representation to find any static initialization component for a class associated with the first source code, modifying the class by adding a method to the class, the method having the same function as the static initialization component, removing the static initialisation component, and finding a location involving use of static fields of the class, invocation of the static methods of the class or an instantiation of the class.
  • the method may further comprise the steps of: inserting instructions immediately before the location to determine whether the class has completed static initialisation, and if static initialisation has not been completed, invoking the added method, and registering that the class has completed static initialisation.
  • the method may further comprise the step of: determining if the static initialization component has any effect that would result in different behaviour of the program if it were evaluated at a point in program execution other than the first encounter of one of the locations of claim 34, and, upon a positive determination, causing the static initialization component to be evaluated at a different time.
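A minimal sketch of the static initialization conversion, assuming a hypothetical class `Config` whose Java static initializer sets a field `limit`. The initializer body is moved into a method, and a check is inserted before each use of a static member. The sketch is single-threaded; thread safety is outside its scope.

```cpp
class Config {
public:
    static int limit() {
        ensureInitialized();      // check inserted before the static field use
        return limit_;
    }

    // Instrumentation for this example only: how often the initializer ran.
    static int initCount() { return initCount_; }

private:
    static void ensureInitialized() {
        if (!initialized_) {
            staticInit();         // invoke the added method
            initialized_ = true;  // register that initialization has completed
        }
    }

    // Holds the logic of the original Java "static { ... }" block.
    static void staticInit() {
        limit_ = 10 * 2;
        ++initCount_;
    }

    static bool initialized_;
    static int limit_;
    static int initCount_;
};

bool Config::initialized_ = false;
int Config::limit_ = 0;
int Config::initCount_ = 0;
```

As in Java, the initializer runs on first use of the class rather than at program start, and it runs only once.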
  • the method may further comprise the steps of: analysing the program structure representation to find any instance initialization component associated with the first source code, modifying the instance initialization component to create a representation suitable for the second programming language, and invoking the modified instance initialization component.
  • the method may further comprise the steps of: analysing the program structure representation to find any instance initialization component for a class associated with the first source code, modifying the class by adding a method to the class, the method having the same function as the instance initialization component, removing the instance initialization component, and inserting an invocation of the method at the beginning of a constructor.
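The instance initializer case can be sketched in the same way, assuming a hypothetical class `Session` whose Java source contains an instance initializer block `{ retries = 3; }`. The block becomes a method, and every constructor begins by invoking it.

```cpp
class Session {
public:
    Session() { instanceInit(); }

    explicit Session(int id) {
        instanceInit();   // inserted at the beginning of the constructor
        id_ = id;
    }

    int retries() const { return retries_; }
    int id() const { return id_; }

private:
    // Holds the logic of the original Java instance initializer block.
    void instanceInit() { retries_ = 3; }

    int retries_ = 0;
    int id_ = 0;
};
```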
  • the method may further comprise the steps of: analysing the program structure representation to find class hierarchies containing original classes associated with the first source code, and, if found, modifying the original classes to merge classes together in order to reduce the number of classes associated with the second source code.
  • the method may further comprise the steps of: determining if the original classes can be merged to form a second source code that has substantially the same functionality as the first source code, and upon a positive determination, modifying the program structure representation to merge the original classes to form a new single class by moving the class elements, and modifying any references to the original classes such that they refer to the new single class.
  • the original classes may be merged such that a first original class is merged into a second original class.
  • the original classes may be merged such that first and second original classes are merged into a new class.
  • the method may further comprise the steps of: determining if the original classes to be merged include a class and its direct super-class, and the direct super-class has only one subclass and is non-instantiated, and, upon a positive determination, merging the super-class and class, and replacing references to the class and the super-class with reference to the merged class.
  • the method may further comprise the steps of: determining if the original classes to be merged include a class and an interface that the class directly implements, wherein the interface is directly implemented by the class or its subclasses, but not directly implemented by any other classes, and the interface is not extended by any other interfaces, and, upon a positive determination, merging the interface with the class, replacing references to the interface with references to the class, and removing the implementation of the interface from any subclass that implements the interface.
  • the method may further comprise the steps of: determining if the original classes to be merged include a first class and a second class, wherein the first class is a direct subclass of a root class of the class hierarchy, the second class is not an interface, and the first class has no non-static fields, no non-static methods and no subclasses, further determining by static analysis if a class initializer associated with the first class has no side-effects, or can be performed such that it would not result in different program behaviour if it were evaluated in a different order with respect to the class initializer associated with the second class, and, upon positive determinations, merging the first and second classes, and replacing references to the first class and the second class with references to the merged first and second classes.
  • the first set of run-time libraries may include an implementation of automatic garbage collector.
  • the first set of run-time libraries may include a co-operative thread scheduler.
  • the present invention provides a computer implemented method for automatically translating an exception functionality in a first source code associated with a first programming language to an equivalent exception functionality in a second source code associated with a second programming language wherein the first and second source codes are associated with the same functionality, the method comprising the steps of: analysing a program structure representation of a first source code in order to find a program structure element that is associated with an exception functionality, determining if the analysis step has found an exception functionality, and, upon a positive determination, converting the exception functionality to a suitably equivalent exception functionality in the second source code.
  • the order in the second source code of any components of the converted exception functionality may be the same as the order in the first source code of the equivalent components of the exception functionality.
  • the elements of the exception functionality may be contiguous in the first source code
  • the elements of the converted exception functionality in the second source code may be contiguous in the second source code
  • the first programming language may be Java and the exception functionality in the first source code may be a try/catch/finally statement.
  • the method may further comprise the steps of: determining if there exists an occurrence of control flow which would exit a try region and cause a finally region to be executed in the first programming language, and, upon a positive determination, using in the second source code one or more means of storage to record the type of control flow, including a continue, break or return expression or an exception, by which the try region was exited, executing instead the finally region, and subsequently using the stored information to provide equivalent functionality of control flow in the second source code as the functionality when the finally block exits in the first source code.
  • the method may further comprise the steps of: saving the original control flow immediately before an expression establishing the original control flow by means of at least one of the functions in a group consisting of: setjmp() in the C programming language; getcontext() in the POSIX API for the C programming language; a function producing substantially the same effect as setjmp() or getcontext(); and resuming the original control flow after the finally region is executed to return to the expression establishing the original control flow by means of at least one of the functions in a group consisting of: longjmp() in the C programming language; setcontext() in the POSIX API for the C programming language; a function producing substantially the same effect as longjmp() or setcontext().
  • the means of storage may include one of a field or a local variable.
  • the method may further comprise the step of: converting the try/catch/finally statement to a mechanism in the second source code using a method to store the current state of the program and a method to restore the state.
  • the method may further comprise the step of: converting the try/catch/finally statement to a mechanism in the second source code using one of the group consisting of: setjmp() in the C programming language; longjmp() in the C programming language; setcontext() in the POSIX API for the C programming language; getcontext() in the POSIX API for the C programming language.
  • the method may further comprise the step of: defining any local variables modified inside the try block in the first source code as volatile local variables in the second source code.
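The idea of recording how the try region was exited, running the finally region, and then resuming the recorded control flow can be sketched as follows. Note the substitution: this sketch uses native C++ exceptions and `std::exception_ptr` rather than the setjmp()/longjmp() mechanism named above, and the function `doubled`, the counter `finallyRuns` and the `Exit` enumeration are hypothetical. It corresponds to a Java method whose try block either throws or returns `x * 2`, and whose finally block increments a counter.

```cpp
#include <exception>
#include <stdexcept>

int finallyRuns = 0;  // side effect of the finally region, for illustration

int doubled(int x) {
    // Local storage recording the type of control flow that left the try region.
    enum Exit { FellThrough, Returned, Threw };
    Exit how = FellThrough;
    int returnValue = 0;
    std::exception_ptr pending;

    try {
        if (x < 0) throw std::invalid_argument("negative");
        returnValue = x * 2;   // original: return x * 2;
        how = Returned;
    } catch (...) {
        pending = std::current_exception();
        how = Threw;
    }

    ++finallyRuns;             // body of the original finally region

    // Resume the control flow that was recorded before the finally region ran.
    if (how == Threw) std::rethrow_exception(pending);
    if (how == Returned) return returnValue;
    return 0;                  // fall-through exit; not reached in this example
}
```

The finally body runs exactly once on both the normal and the exceptional path, and the caller still observes the original return value or exception.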
  • the method may further comprise the steps of: determining if, for a method of a method signature in a first class, a method invocation of that signature on an object reference whose declared type is the type of the first class could result in polymorphic method dispatch to any method other than the method, and, upon a negative determination, translating the method to a translated method in the second source code that is not marked as virtual.
  • the determination step may further comprise: determining whether the method is not private, not abstract, and there exists no non-private method of the method signature in any class or interface that is a supertype or subtype of the first class.
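A short sketch of the outcome of this determination, assuming hypothetical classes `Account` and `PremiumAccount`. Every non-private, non-static Java method is virtual by default; in the translated C++ only methods that can actually be dispatched polymorphically need to keep the `virtual` keyword.

```cpp
class Account {
public:
    explicit Account(int id) : id_(id) {}
    virtual ~Account() = default;

    // Overridden in a subtype, so polymorphic dispatch is possible:
    // the translated method stays virtual.
    virtual int fee() const { return 5; }

    // No supertype or subtype declares another id() with this signature,
    // so a call can only reach this method and `virtual` is omitted.
    int id() const { return id_; }

private:
    int id_;
};

class PremiumAccount : public Account {
public:
    explicit PremiumAccount(int id) : Account(id) {}
    int fee() const override { return 0; }
};

// Dispatches on the dynamic type of the argument.
int feeFor(const Account& a) { return a.fee(); }
```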
  • the current invention provides a means to automatically translate an application written in a first programming language, such as Java, to a second programming language, such as C/C++, with essentially no post-processing required.
  • Figure 1 is a perspective view of a computing system for implementing the preferred method
  • Figure 2A shows a first portion of a flow diagram of the process associated with the computer implemented method according to a preferred embodiment
  • Figure 2B shows a second portion of a flow diagram of the process associated with the computer implemented method according to a preferred embodiment
  • the computer implemented method is executed on a system that includes a computer 101 with a microprocessor 103, memory 105 and a power supply 107 to provide power to the respective elements of the computer 101. Attached to the computer are input and output devices, such as a keyboard 109 and display monitor 111, which are connected to the computer via interfaces (115, 117).
  • the method is implemented by the microprocessor 103 executing a computer program 113 residing in the memory 105. Alternatively, the program may reside in an external memory device.
  • the computer implemented method for translating source code intended for use in one language to a second language includes the following processes.
  • the classes are defined in Java source code.
  • compiler front end semantic and syntactic analysis is performed. This produces, at step 205, an abstract representation of syntax (AST), annotated with type and symbolic information.
  • AST: abstract representation of syntax
  • explicit constructors are created.
  • nested and inner class extraction is performed.
  • the AST has no implicit constructors, and nested and inner classes have been refactored into top-level classes with fields representing salient components of their outer class, marked as a mutual friend of the ex-outer class.
  • the conversion of static synchronised methods is performed.
  • the conversion of static initializers is performed.
  • the conversion of instance initializers is performed.
  • the AST has had initializer components of a class moved into methods, and checks inserted to explicitly invoke those methods at appropriate points.
  • string concatenation is converted into StringBuffer.
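As an illustration of this step, a Java expression such as `name + " has " + count + " items"` can be rewritten as a chain of `append` calls. The `StringBuffer` class below is a minimal hypothetical stand-in for the Java-like class a runtime library would provide, not the library's actual interface.

```cpp
#include <string>

// Minimal stand-in for a Java-like StringBuffer in the runtime library.
class StringBuffer {
public:
    StringBuffer& append(const std::string& s) { data_ += s; return *this; }
    StringBuffer& append(int n) { data_ += std::to_string(n); return *this; }
    std::string toString() const { return data_; }

private:
    std::string data_;
};

// Java: return name + " has " + count + " items";
std::string describe(const std::string& name, int count) {
    return StringBuffer()
        .append(name)
        .append(" has ")
        .append(count)
        .append(" items")
        .toString();
}
```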
  • Class merging is carried out at step 223.
  • an AST is provided in which uninstantiated classes with a single subclass have been merged with that subclass. The procedure then moves to figure 2B. Referring to figure 2B, the method continues from step 229 with the following processes.
  • the step to correct inheritance of a method defined in an interface is performed, such that, at step 233, the AST includes "trampoline" dispatch methods inserted into interface multiple inheritance points.
  • array initializers are converted to methods.
  • the step to convert constructors is performed.
  • the AST includes constructors implemented as regular methods.
  • expression order correction is performed.
  • the AST includes predictable expression evaluation side-effects.
  • array type signature modification is performed.
  • the AST is exported in C++ format.
  • array access conversion is performed.
  • try/catch/finally conversion is performed.
  • synchronisation primitive conversion is performed. This results in the final C++ source code at step 255, which is forwarded to a compiler 257.
  • a runtime library 259 is accessed by the compiler.
  • Object code is created at step 261, and linked at step 263 to provide executable binary code for a mobile device at step 265.
  • the original source code is parsed and a program structure representation is produced in the form of an abstract syntax tree (AST).
  • the AST includes a number of original language program structure elements that are associated with the original programming language.
  • the AST is also capable of representing program structure elements that are associated with a target programming language. It will be understood that program structure representations other than an AST may be utilised.
  • the AST is analysed by a program in order to modify any program structure elements that require modification in order to produce a target program in the target programming language, such that the target program operates in the same desired manner as the original program.
  • the program structure representation is analysed to find specific program structure elements that fall into a defined group.
  • the group consists of program structure elements that have no direct associated representation in the second programming language. That is, a direct associated representation is a straightforward and direct mapping from the AST to the source.
  • the original programming language may provide a specific functionality that the target programming language does not provide, such that there is no direct associated representation of the program structure element for that functionality in the target programming language. For example, when using the program structure element in association with the target programming language, the target programming language may produce a different result, such as a different program state, to the result produced by the original programming language for the same program structure element.
  • the program structure elements may be analysed to identify expressions containing sub-expressions where the direct associated representation of the expression in the first programming language requires the subexpressions to be executed in a specific order, but the direct associated representation of the expression in the second programming language does not. Therefore, conversion of the identified expression is required such that in the direct associated representation in the second programming language of the converted expression, the sub-expressions are executed in the specific order. Different methods of conversion are provided depending on the type of program structure elements that require conversion.
  • The AST is exported in the target programming language format.
  • A conversion is made from Java to C++.
  • A design for a Java to C++ translator is as follows.
  • The translator has three stages:
  • AST: Abstract Syntax Tree.
  • This AST model must be capable of representing the Java language, those features in the C++ language which have a direct analogue in Java, and several C++ language features that are not present in Java, such as sequencing expressions, explicit pointer and reference use, and non-virtual method calls.
  • As the AST is read in, the program is type-checked, and the tree is annotated with type and symbolic (that is, the program entity referred to by a given identifier) information. Further, class hierarchy information is generated. Comments in the source code are also included as metadata in the AST.
  • The overall task of translation is to transform those sections of the initial parse of the Java program AST that are not representable with the same semantics in C++ into an AST representation of valid C++ code, and then to output the AST as C++ source.
  • e. String concatenation operations are converted to uses of the StringBuffer class as described in the Java specification (JLS 15.18.1.2).
  • f. Class merging is performed.
  • g. "Trampoline" dispatch methods are inserted into interface multiple inheritance points.
  • h. Array initializers are converted to methods.
  • i. The bodies of constructors are extracted to separate virtual methods to permit virtual dispatch.
  • j. Expressions in the AST are modified to strictly enforce the left-to-right sub-expression evaluation order used by Java.
  • k. Array type signature modification. Type signatures of methods involving array typed arguments are modified to prevent name conflict when arrays are converted to use a single class.
  • l. Devirtualisation optimisation is performed.
  • The AST is then output in C++ source format.
  • A header file and a source file named after the class are created.
  • The header file is initialized with #include directives for the runtime library and for the header file of each class which is statically referenced by code in the interface of C.
  • The source file is initialized with a #include directive for the header, and for the header file of each class which is statically referenced by code in the body of C.
  • A C++ class declaration is created for the class C, defined as extending the superclass and interfaces of C, and output into the header file. For each method and field in C, a C++ declaration for that method or field is added to the class declaration in the header file.
  • A method definition is created in the source file to match the corresponding declaration in the header file if the method is not pure virtual.
  • The AST structure of the body of the method is traversed to produce a C++ representation, which is output as the body of the method definition in the source file. Comments in the AST are also included at the translated equivalents of their position in the original source code.
  • Most remaining AST constructs in method bodies have either a direct representation in C++, or a simple direct translation to a construct with a direct representation in C++ which may be performed during output. The following more complex translations are also performed during this source output phase: a. Try/catch/finally are transformed. b. Object array creation and access modification.
  • The resulting C++ source is compiled against a runtime library which provides the API expected by the translated code, including an automatic garbage collector and co-operative threading and synchronization support. This compiled code is finally linked to produce a binary which can be used on the target device.
  • This step is a process of normalisation in the AST. For each class C in the AST, if C declares no constructor methods, then a default constructor must be created. Create a public constructor method M for C with no parameters. Add as the only statement in M an explicit super-constructor invocation statement with no arguments. If C declares constructor methods, then implicit super-constructor invocations in those methods must be made explicit. For every constructor method M in C, if the first statement in the body of M is not a constructor invocation, then add as the first statement in M an explicit super-constructor invocation statement with no arguments.
  • The first group consists of static nested classes, wherein an outer class textually encompasses a class that is declared as static.
  • A nested class has access to the private static members of the enclosing class.
  • Many C++ compilers do not support nested classes, or support them in a way that is different from Java.
  • The second group consists of inner classes, which are non-static nested classes.
  • The Java programming language supports a feature known as 'Inner Classes' (Java Language Specification (3rd edition), §8.1.3), which has no direct analogue in C++.
  • An inner class is a class whose definition is nested within the body of, or within a method of, another 'enclosing' class, which violates the expectations for normal classes by:
  • The enclosing instance is the qualifying expression.
  • The enclosing instance is the corresponding enclosing instance argument of the calling constructor.
  • An explicit non-virtual call is legitimate in C++ (for example, var.TypeName::method(args)), but not in Java, and must be supported by the AST abstraction.
  • Method-local or anonymous inner classes may access final local variables that are declared in the enclosing method of the enclosing class. If I is a method-local or anonymous inner class in method O.m, make its use of final local variables explicit: for each final local variable declared in m that is used in I, follow the procedure described in step 1 to add an instance field and constructor parameters to I to store that variable, and alter uses of that variable in the AST of O to refer to the new field.
  • 5. If I is an anonymous inner class, then create a non-conflicting top-level name for I.
  • 6. Remove I from O, making it a non-nested class with a valid top-level name, and update AST nodes which refer to the type explicitly (such as new instance creation) to reflect the new location of the type.
  • 7. Mark I and O as 'friend classes' of one another. This construct is valid in C++ code but not in Java, and must be supported by the combined AST abstraction.
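The outcome of the inner class conversion can be pictured with a small C++ sketch. It is only an illustration under stated assumptions: the names Outer, Outer_Inner, enclosing_ and makeInner are hypothetical and do not come from the patent, and the real translator works on the AST rather than on hand-written source. The sketch shows the three ingredients named in the steps: a top-level class, an explicit enclosing-instance field passed through the constructor, and mutual friendship.

```cpp
// Hypothetical result of flattening a Java inner class `Outer.Inner`.
class Outer;  // forward declaration so the flattened class can hold a pointer

class Outer_Inner {
    friend class Outer;      // mutual friendship replaces Java's implicit access
    Outer* enclosing_;       // explicit enclosing instance (was the implicit Outer.this)
public:
    explicit Outer_Inner(Outer* enclosing) : enclosing_(enclosing) {}
    int readSecret() const;  // defined below, once Outer is a complete type
};

class Outer {
    friend class Outer_Inner;
    int secret_;             // private, yet reachable from the former inner class
public:
    explicit Outer(int secret) : secret_(secret) {}
    // Was `this.new Inner()` in Java: the enclosing instance is now an argument.
    Outer_Inner makeInner() { return Outer_Inner(this); }
};

inline int Outer_Inner::readSecret() const { return enclosing_->secret_; }
```

Constructing `Outer o(42)` and calling `o.makeInner().readSecret()` reads the private field through the explicit enclosing pointer, which is the access the friend declarations are there to permit.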
  • Examples are given as a pseudo-Java textual rendering of the AST, using C++ style to represent constructs that are not valid in Java. This includes use of the keyword friend to mark a class as a C++ friend of another, placed in the pseudo-Java source immediately after the definition of the class name.
  • The term 'static initialisation component' used in this description is to be understood to mean the initializer expression of a static field, or a static initialization block.
  • The Java programming language includes the concept of "Static Initialisation" (Java Language Specification (3rd edition), §12.4).
  • JLS 3e §12.4.1: T is initialized by executing its static initializer blocks and the initializer expressions of its static variables in textual order.
  • The C++ language has no equivalent construct to static initializer blocks, and static variable initializers are executed in an implementation-defined order before the main() function. It is therefore necessary to convert static initializers in a Java program before they can be accurately represented by C++.
  • Let INIT_SIG be the method signature "public static void static_initializer()".
  • For each element E of T in textual order do: If E is a static block, then: Remove E from T; Append E to L; Remove static modifier from E;
  • If T declares a method IM with signature INIT_SIG and IM ≠ M, do: Create a new statement S containing a method invocation of IM on T;
  • Let T be the type declaring K; If T declares a method IM with signature INIT_SIG then do:
  • Code size may be reduced and performance improved by calling this initializer once at a point in the program prior to the first static access, rather than testing at each static access whether static initialisation has occurred and evaluating it if it has not.
  • One example of a method for determining whether a static initializer may be evaluated out of sequence is as follows: If a class K declares no static blocks, and for all static field initializers in K, the initializing expression does not invoke any method or constructor or refer to any field or variable that is not a static field of the class K, and the static initializers of all superclasses of K may be evaluated out of sequence according to this method, then the static initializer of K may be safely evaluated out of sequence.
  • For each element E of T in textual order do: If E is a static block, then: Remove E from T; Append E to L; Remove static modifier from E;
  • For each element E of L, append E to the body of IM; Add an invocation of the method IM to the special method which is called by the runtime environment at program initialization.
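The variant in which static initialisers are gathered into a static_initializer() method and invoked once at program initialisation might look roughly like this in C++. This is a minimal sketch, not the translator's actual output: the classes Config and App, the init_log vector and the function runtime_program_init are hypothetical stand-ins, and only the method name static_initializer comes from the text.

```cpp
#include <string>
#include <vector>

// Records the order in which initialisers run (illustrative only).
static std::vector<std::string> init_log;

class Config {
public:
    static int base;
    static int derived;
    // Former static field initialisers and static blocks, kept in textual order.
    static void static_initializer() {
        base = 10;
        init_log.push_back("Config.base");
        derived = base * 2;
        init_log.push_back("Config.derived");
    }
};
int Config::base = 0;     // constant-initialised, so no ordering hazard in C++
int Config::derived = 0;

class App {
public:
    static int limit;
    static void static_initializer() {
        limit = Config::derived + 1;
        init_log.push_back("App.limit");
    }
};
int App::limit = 0;

// Hypothetical stand-in for the special method the runtime calls at start-up.
void runtime_program_init() {
    Config::static_initializer();
    App::static_initializer();
}
```

Because every dynamic initialiser now runs inside an ordinary function, the order is fixed by the call sequence in runtime_program_init() rather than by the C++ implementation.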
  • Examples are given as a pseudo-Java textual rendering of the AST, using C++ style to represent constructs that are not valid in Java.
  • The term 'instance initialisation component' used in this description is to be understood to mean the initializer of an instance field, or a non-static initialization block.
  • The Java programming language allows many forms of initialization of an object instance.
  • Instance variables may be declared with initializer expressions, and classes may specify instance initializer blocks (JLS §8.6). These initializers are executed in textual order during object construction immediately after the invocation of the super-constructor.
  • The C++ language has no equivalent construct to these forms of initialization. It is therefore necessary to convert these initializers in a Java program before they can be accurately represented by C++.
  • Let INIT_SIG be the method signature "private void instance_init()".
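As an illustration of the instance initialisation conversion, the following C++ sketch moves field initialisers and an initialiser block into a private instance_init() method that each constructor calls straight after the super-constructor. The classes Base and Counter and their fields are hypothetical; only the name instance_init is taken from the text.

```cpp
class Base {
public:
    int baseValue;
    Base() : baseValue(1) {}
};

class Counter : public Base {
    int start;
    int current;
    // Former field initialisers and instance initialiser blocks, in textual order.
    void instance_init() {
        start = baseValue + 4;  // was: int start = baseValue + 4;
        current = start;        // was: { current = start; }
    }
public:
    // Each constructor runs the super-constructor, then instance_init(),
    // then its own body, mirroring the Java construction sequence.
    Counter() : Base() { instance_init(); }
    explicit Counter(int offset) : Base() { instance_init(); current += offset; }
    int value() const { return current; }
};
```

Calling instance_init() from the constructor body rather than using C++ member initialisers keeps the textual order of the Java source and lets the initialisers see fields already set by the base class.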
  • Java's String concatenation operation is supported by explicitly converting String concatenation operations to uses of the StringBuffer class, as suggested by the Java Language Specification (§15.18.1.2). This conversion may be performed by replacing each sequence S of String concatenation operations s1 + s2 + s3 + ... + sn in the AST with the AST representation of "(new StringBuffer(s1).append(s2).append(s3) ... .append(sn).toString())".
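On the C++ side, the rewritten concatenation chain could be rendered along these lines. The StringBuffer class below is a deliberately tiny, hypothetical stand-in for whatever the runtime library provides (the patent does not show that class), and describe() is an invented example method.

```cpp
#include <string>

// Minimal stand-in for java.lang.StringBuffer (hypothetical, not the real runtime class).
class StringBuffer {
    std::string data_;
public:
    explicit StringBuffer(const std::string& s) : data_(s) {}
    StringBuffer& append(const std::string& s) { data_ += s; return *this; }
    StringBuffer& append(int v) { data_ += std::to_string(v); return *this; }
    std::string toString() const { return data_; }
};

// Java source: "x=" + x + ", y=" + y
// After conversion: new StringBuffer("x=").append(x).append(", y=").append(y).toString()
std::string describe(int x, int y) {
    return StringBuffer("x=").append(x).append(", y=").append(y).toString();
}
```

Each append returns a reference to the same buffer, so the chain builds one string in place; `describe(1, 2)` yields "x=1, y=2".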
  • Programs in Java typically have deep class hierarchies.
  • Deep class hierarchies result in large polymorphic method lookup tables (vtables), which adversely affect program size.
  • vtables: polymorphic method lookup tables.
  • If SP has precisely one subclass, SB, and there exists no new instantiation in the AST instantiating SP, then do:
  • If M is a method whose signature conflicts with that of a method in SB, then:
  • Rename M by adding a prefix "super$" to its name.
  • If V is a super method or constructor invocation within SB, then convert V to a this invocation. Remove M from SP;
  • If C contains only static fields and methods, is never instantiated, and has no subclasses, then: Identify a target class T in the AST where the static initializer of C does not conflict with the static initializer of T. For each class element E in T in reverse textual order, do: If E is a field F, then do:
  • A simple means of determining whether a pair of static initializers would conflict is to use the procedure described above in relation to static initialization conversion to determine if a static initializer may be evaluated out of sequence. If either initializer satisfies this procedure, then the pair will not conflict.
  • Source in examples is given as a pseudo-Java textual rendering of the AST, using C++ style to represent constructs that are not valid in Java. In the following situations, a trampoline method matching
  • Append E to the arguments of INV; Create an assignment expression statement A to index I in the array declared by LV of the variable declared by P;
  • Assignment expressions are a special case, as it is necessary to preserve the assignability of the l-value, or assignable program entity. This may be done by using an explicit pointer or reference, or by decomposing the left-hand side. In the latter method, we recognise that there are two distinct and separate ways a conflict can occur in an assignment expression: first, if the left-hand side expression of the assignment is a field access expression (a.b), there may be a conflict between the left-hand part of the field access expression and the right-hand side of the assignment, which requires the left-hand side to be extracted and pre-evaluated; second, the variable being assigned may itself be modified in the right-hand side, which requires the right-hand side to be extracted and pre-evaluated. As evaluation of a variable itself has no side-effect, it is always unnecessary to extract the variable access component from the left-hand side.
  • An expression including a write to an array element conflicts with any read or write of any array element.
  • An expression including a write to a field or variable conflicts with any read or write of that field or variable.
  • The evaluation of c() could affect the evaluation of b(), and additionally could write to the field obtained by b().a.
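The evaluation-order hazard can be demonstrated with a short C++ sketch. The functions b() and c() here are simplified stand-ins that return plain integers, and trace, counter, direct() and converted() are hypothetical names. The converted form uses one statement per sub-expression; a translator could equally use sequencing expressions, which the text lists among the C++ features the AST must represent.

```cpp
#include <vector>

static std::vector<char> trace;  // records the order in which calls happen
static int counter = 0;

int b() { trace.push_back('b'); return ++counter; }                    // side effect on counter
int c() { trace.push_back('c'); counter *= 10; return counter; }       // conflicts with b()

// Direct rendering: C++ leaves the operand evaluation order of '-' unspecified,
// so b() and c() may run in either order and the result may differ from Java.
int direct() { return b() - c(); }

// Converted form: each conflicting sub-expression is pre-evaluated into a
// temporary, one statement at a time, which pins Java's left-to-right order.
int converted() {
    int t1 = b();
    int t2 = c();
    return t1 - t2;
}
```

With counter starting at 0, converted() always calls b() first (returning 1) and then c() (returning 10), giving -9 as Java would; direct() carries no such guarantee.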
  • Arrays in the Java programming language are required to do more than their counterparts in C++. While a C++ array is little more than a contiguous block of memory, a Java array must provide element type and bounds checking, and be of a type extending Object with covariant subtyping with respect to the element type. It would be possible to implement arrays with these features in C++ by creating array classes as C++ classes on demand for each Object array type used in the translated program. However, this method would result in significant code-size increase due to the many additional classes that would be required. This part of the application is directed towards representing all Java Object array types using a single C++ class.
  • A method x(String[] y) must be differentiable from a method with the same name, x(List[] y).
  • A procedure to enable this differentiation is to modify the names of methods with object-array parameters with unique strings representing the types of their arguments. This can be done by appending '$' and a hexadecimal representation of the CRC32 hash of the concatenation of the fully qualified Java type names of all object-array-typed parameters to the method name. This mangling may be done at any stage of translation.
  • Java method signature: void arrayArgument(String[] strings); void arrayArgument(Integer[] integers); void manyArrayArguments(String[] s, String[][] ss,
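The mangling scheme can be sketched in C++ as follows. The CRC-32 routine is the standard reflected variant with polynomial 0xEDB88320; everything else is an assumption made for illustration, in particular the function names crc32 and mangle, the way type names are spelled (for example "java.lang.String[]"), and the zero-padded lowercase hex formatting, none of which the text specifies.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Standard CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit.
std::uint32_t crc32(const std::string& data) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the low bit was set.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Appends '$' and the hex CRC32 of the concatenated object-array parameter
// type names to the method name (formatting details are assumptions).
std::string mangle(const std::string& method,
                   const std::vector<std::string>& arrayParamTypes) {
    std::string joined;
    for (const std::string& t : arrayParamTypes) joined += t;
    char hex[9];
    std::snprintf(hex, sizeof hex, "%08x", static_cast<unsigned>(crc32(joined)));
    return method + "$" + hex;
}
```

Two overloads such as arrayArgument(String[]) and arrayArgument(Integer[]) then receive different suffixes, so they stay distinct even after every object array is represented by one C++ class.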
  • In C++, polymorphic method dispatch may be enabled by the programmer on a method-by-method basis, using the 'virtual' keyword. As polymorphic method dispatch has both code size and runtime overhead, it is desirable not to use polymorphic method dispatch for those methods for which it can be guaranteed to be unused.
  • If M is private, then M is non-virtual. Otherwise, if M is abstract, then M is virtual. Otherwise, if there exists a non-private method of the same signature as M in a class or interface C where C is a subtype of the class declaring M, then M is virtual.
  • M is virtual.
  • Java:
        class A {
            private void ameth() { }
            abstract void anabsmeth();
            public void apublicmeth() { }
            public void anothermeth() { }
        }
        class B extends A {
            public void apublicmeth() { }
            public void moremeth() { }
            public void evenmoremeth() { }
        }
        class C extends B {
            public void moremeth() { }
            public void lastmeth() { }
        }
  • C++ headers:
        class A {
        private:
            void ameth();
        public:
            virtual void anabsmeth() = 0;
            virtual void apublicmeth();
            void anothermeth();
        };
        class B : public A {
        public:
            virtual void apublicmeth();
            virtual void moremeth();
            void evenmoremeth();
        };
        class C : public B {
            virtual void moremeth();
            void lastmeth();
        };
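The decision rule can be tried out on the A/B/C example with a small C++ model. This is a sketch under assumptions: the types Method, Class and Program and all function names are hypothetical, interfaces are ignored, and the final check (a method is also virtual when it overrides a non-private method in a supertype) is an assumption inferred from the example, where B's apublicmeth is virtual; the surviving text states only the private, abstract and subtype cases.

```cpp
#include <map>
#include <string>
#include <vector>

struct Method { std::string signature; bool isPrivate; bool isAbstract; };
struct Class  { std::string name; std::string super; std::vector<Method> methods; };
using Program = std::map<std::string, Class>;

// True if `sub` is a strict subtype of `sup` (single inheritance only in this sketch).
bool isSubtype(const Program& p, const std::string& sub, const std::string& sup) {
    auto it = p.find(sub);
    while (it != p.end() && !it->second.super.empty()) {
        if (it->second.super == sup) return true;
        it = p.find(it->second.super);
    }
    return false;
}

// Looks up a method declared directly in class `c`; returns nullptr if absent.
const Method* findMethod(const Class& c, const std::string& sig) {
    for (const Method& m : c.methods)
        if (m.signature == sig) return &m;
    return nullptr;
}

bool isVirtual(const Program& p, const std::string& owner, const std::string& sig) {
    const Method* m = findMethod(p.at(owner), sig);
    if (m == nullptr || m->isPrivate) return false;  // private: non-virtual
    if (m->isAbstract) return true;                  // abstract: virtual
    for (const auto& entry : p) {
        const Method* other = findMethod(entry.second, sig);
        if (other == nullptr || other->isPrivate) continue;
        if (isSubtype(p, entry.first, owner)) return true;  // overridden in a subtype
        if (isSubtype(p, owner, entry.first)) return true;  // assumption: overrides a supertype method
    }
    return false;                                    // provably never dispatched polymorphically
}

// The class hierarchy from the example above.
Program exampleProgram() {
    Program p;
    p["A"] = Class{"A", "", {{"ameth()", true, false}, {"anabsmeth()", false, true},
                             {"apublicmeth()", false, false}, {"anothermeth()", false, false}}};
    p["B"] = Class{"B", "A", {{"apublicmeth()", false, false}, {"moremeth()", false, false},
                              {"evenmoremeth()", false, false}}};
    p["C"] = Class{"C", "B", {{"moremeth()", false, false}, {"lastmeth()", false, false}}};
    return p;
}
```

Run over exampleProgram(), the rule marks anabsmeth, apublicmeth and moremeth as virtual and leaves ameth, anothermeth, evenmoremeth and lastmeth non-virtual, which matches the C++ headers in the example.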
  • Object Array Conversion: As explained above in the section dealing with array type signature modification, arrays in the Java programming language are required to do more than their counterparts in C++. While a C++ array is little more than a contiguous block of memory, a Java array must provide element type and bounds checking, and be of a type extending Object with covariant subtyping with respect to the element type.
  • The C++ representation of an object array must include the following information:
  • Runtime type identifier of the innermost element type
  • Number of inner array dimensions before innermost type
  • Type and bounds checking may be done on store, instanceof and cast operations.
  • The C++ object array class is created with these fields, and methods for array creation, access, update and type checking.
  • Type test: X instanceof becomes INSTANCEOF_ARRAYTYPE(X, T, 2)
  • The methods create, get and set on the JavaObjectArray type are equivalent to the Java array creation, access and assignment operations.
  • The arguments to create are length, runtime type id of element type, and number of inner array dimensions before elements.
  • The macros CAST and ARRAYCAST reproduce the functionality of the Java runtime-checked cast operation.
  • The arguments to ARRAYCAST are element type, dimension and expression.
  • The macro INSTANCEOF_ARRAYTYPE reproduces the functionality of the Java runtime type test operator instanceof for array types.
  • The arguments to INSTANCEOF_ARRAYTYPE are expression, element type, and dimension. Two or more dimensional arrays are created using convenience methods that recursively use the JavaObjectArray::create method to create their element types, for example ObjectArray2dCreate(TypeID, elt_dim, first_dim, second_dim).
  • JavaObjectArray *x = JavaObjectArray::create(3,
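A single array class carrying the two pieces of runtime information listed above might be sketched in C++ like this. Only the names JavaObjectArray, create, get and set and the argument order of create come from the text; JavaObject, isArrayOf, length and the internal fields are hypothetical. The sketch also simplifies in three labelled ways: it reports bounds errors with standard C++ exceptions for brevity (the described translator models Java exceptions with setjmp/longjmp), it frees memory manually instead of relying on the garbage collector, and its type test is an exact match that ignores covariant subtyping.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical root of the translated object hierarchy (stands in for java.lang.Object).
struct JavaObject { virtual ~JavaObject() {} };

// One C++ class represents every Java object-array type.
class JavaObjectArray : public JavaObject {
    int elementTypeId_;                // runtime type id of the innermost element type
    int innerDims_;                    // inner array dimensions before the innermost type
    std::vector<JavaObject*> slots_;   // elements, initially null as in Java
    JavaObjectArray(int length, int typeId, int dims)
        : elementTypeId_(typeId), innerDims_(dims), slots_(length, nullptr) {}
    void check(int i) const {
        if (i < 0 || i >= length()) throw std::out_of_range("array index");
    }
public:
    // Arguments: length, runtime type id of element type, inner dimensions.
    static JavaObjectArray* create(int length, int typeId, int dims) {
        if (length < 0) throw std::length_error("negative array size");
        return new JavaObjectArray(length, typeId, dims);
    }
    int length() const { return static_cast<int>(slots_.size()); }
    JavaObject* get(int i) const { check(i); return slots_[i]; }
    void set(int i, JavaObject* v) { check(i); slots_[i] = v; }
    // Simplified type test: exact match only, no covariance.
    bool isArrayOf(int typeId, int dims) const {
        return typeId == elementTypeId_ && dims == innerDims_;
    }
};
```

Because the element type and dimension count are data rather than part of the C++ type, String[] and Integer[][] share this one class, which is the code-size saving the section is aiming for.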
  • the Java programming language provides a try/catch/finally exception model: A try statement executes a block. If a value is thrown and the try statement has one or more catch clauses that can catch it, then control will be transferred to the first such catch clause. If the try statement has a finally clause, then another block of code is executed, no matter whether the try block completes normally or abruptly, and no matter whether a catch clause is first given control.
  • In C++, exception support is compiler-dependent, and finally is not part of the C++ language. It is thus necessary to provide a mechanism to model the semantics of Java exceptions and the finally construct in C++.
  • Java's exception support is simulated by using C's setjmp/longjmp mechanism to jump from a throw to an enclosing catch, and finally is supported within non-exception control flow by modification of control structures in methods that include try blocks to enable evaluation of finally blocks on break, continue, and return.
  • This code is preferably substituted for the Java constructs during the C++ output phase of AST processing.
  • setjmp/longjmp can be substituted by an equivalent pair of functions that saves the execution state of the program, and restores the execution state of the program.
  • setjmp/longjmp can be substituted by getcontext/setcontext as defined in the POSIX API.
  • Exceptions are modelled using setjmp/longjmp to return to enclosing try blocks on the stack.
  • The point in the program is stored using setjmp; in the example method, it is saved on a stack of try locations.
  • Control flow enters a do { ... } while(false) loop.
  • If setjmp has returned directly (a zero return value), the try block is executed, and a break is used to escape the loop. Otherwise control has returned to the point of the setjmp via a longjmp at an exception throw, and the value returned represents the particular exception thrown.
  • The catch clauses are then considered: if the exception matches a particular catch clause, then it is recorded that the exception has been caught, and a break is used to escape the loop. If no catch blocks match the exception, then a flag is set indicating that the exception must be rethrown after executing the finally block, and the loop exits.
  • When control flow exits the try, whether normally or via a caught or uncaught exception, the saved location is removed from the stack, and the finally block is evaluated. After evaluation, if the rethrow flag is set, then the exception is rethrown using longjmp. Even if a finally block is not declared, this surrounding code must still be included.
  • push_new_try_location_jump_buffer() creates a new jump buffer suitable for use with setjmp() and pushes it onto a global stack.
  • The topmost element of this stack is accessible via the global pointer current_exception_jump_buffer.
  • The top buffer is removed from the global stack with pop_try_location_jump_buffer().
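The try/catch/finally scheme can be sketched in C++ with the helper names given above. Everything else is an assumption for illustration: the fixed-size try_stack, the integer exception codes, pending_exception, throw_exception, checked_divide and safe_divide are all hypothetical. One deliberate deviation: the text has the setjmp return value identify the exception, whereas this sketch records the code in a global before jumping, so that setjmp is only ever used directly in a comparison, which is the usage the C standard guarantees. The code avoids objects with non-trivial destructors between setjmp and longjmp, since jumping over them is undefined in C++.

```cpp
#include <csetjmp>

// Hypothetical runtime support: a fixed-size global stack of try locations
// (no overflow check in this sketch).
struct TryLocation { std::jmp_buf buf; };
static TryLocation try_stack[16];
static int try_depth = 0;
static std::jmp_buf* current_exception_jump_buffer = nullptr;
static int pending_exception = 0;  // code of the exception in flight

void push_new_try_location_jump_buffer() {
    current_exception_jump_buffer = &try_stack[try_depth++].buf;
}
void pop_try_location_jump_buffer() {
    --try_depth;
    current_exception_jump_buffer =
        try_depth > 0 ? &try_stack[try_depth - 1].buf : nullptr;
}
// Throws to the innermost enclosing try; requires that one exists.
void throw_exception(int code) {
    pending_exception = code;
    std::longjmp(*current_exception_jump_buffer, 1);
}

const int ARITHMETIC_EXCEPTION = 1;
static int finally_runs = 0;

int checked_divide(int a, int b) {
    if (b == 0) throw_exception(ARITHMETIC_EXCEPTION);
    return a / b;
}

// Java: try { r = a / b; } catch (ArithmeticException e) { r = -1; }
//       finally { finally_runs++; }
int safe_divide(int a, int b) {
    volatile int result = 0;   // volatile: must survive a longjmp intact
    volatile int rethrow = 0;
    push_new_try_location_jump_buffer();
    do {
        if (setjmp(*current_exception_jump_buffer) == 0) {
            result = checked_divide(a, b);   // try block
            break;
        }
        if (pending_exception == ARITHMETIC_EXCEPTION) {
            result = -1;                     // matching catch clause
            break;
        }
        rethrow = pending_exception;         // no catch matched
    } while (false);
    pop_try_location_jump_buffer();          // leave the try, however it ended
    finally_runs = finally_runs + 1;         // finally block
    if (rethrow != 0) throw_exception(rethrow);
    return result;
}
```

`safe_divide(6, 3)` takes the normal path and `safe_divide(1, 0)` takes the catch path; in both cases the try location is popped and the finally code runs exactly once before the function returns.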
  • The current invention is equally applicable to the development of embedded software, where a Java virtual machine may not be available.
  • Java is a highly productive language as it eliminates classes of common programming mistakes such as dangling pointers.
  • A software developer can develop in Java, and then translate to C or C++, which are the dominant computer languages for embedded software development.
  • Java Micro Edition and C# share many common language features, constructs, syntax, and philosophy. Through applying the methods described above, a software developer is able to develop in C#, and then translate to C or C++. The majority of the methods described in the embodiment are equally applicable if the programming language is originally in C# rather than Java.
  • Objective C may be used as a target language in the methods described herein.
  • The program structure representation can be representative of the program in source code or any other suitable format.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)
  • Stored Programmes (AREA)

Abstract

The invention relates to a computer-implemented translation method for automatically translating first source code associated with a first programming language into second source code associated with a second programming language, the first and second source code being associated with the same functionality. The method comprises the steps of: parsing the first source code to form a program structure representation comprising a plurality of program structure elements associated with the first programming language; analysing the program structure elements, the analysis including the step of searching for at least one program structure element that has no direct associated representation producing the same result in the second programming language; and transforming the program structure representation into second source code on the basis of the analysis.
EP08724024A 2007-03-05 2008-02-26 Computer-implemented translation method Withdrawn EP2122464A4 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NZ55361407 2007-03-05
PCT/NZ2008/000034 WO2008108665A1 (fr) 2007-03-05 2008-02-26 Computer-implemented translation method

Publications (2)

Publication Number Publication Date
EP2122464A1 true EP2122464A1 (fr) 2009-11-25
EP2122464A4 EP2122464A4 (fr) 2010-06-30

Family

ID=39738458

Family Applications (1)

Application Number Title Priority Date Filing Date
EP08724024A Withdrawn EP2122464A4 (fr) 2007-03-05 2008-02-26 Computer-implemented translation method

Country Status (2)

Country Link
EP (1) EP2122464A4 (fr)
WO (1) WO2008108665A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
LU92071B1 (en) * 2012-09-12 2014-03-13 Univ Luxembourg Computer-implemented method for computer program translation
US9459848B1 (en) * 2015-05-29 2016-10-04 International Business Machines Corporation Obtaining correct compile results by absorbing mismatches between data types representations
EP3712763A1 (fr) * 2019-03-21 2020-09-23 Siemens Aktiengesellschaft Computer-implemented method for migrating a software development environment from a computer into a hardware component for an automation installation
US20220357934A1 (en) * 2021-05-05 2022-11-10 Michael Ling Methods, devices, and media for two-pass source code transformation

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6516461B1 (en) * 2000-01-24 2003-02-04 Secretary Of Agency Of Industrial Science & Technology Source code translating method, recording medium containing source code translator program, and source code translator device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5768564A (en) * 1994-10-07 1998-06-16 Tandem Computers Incorporated Method and apparatus for translating source code from one high-level computer language to another
CA2266291C (fr) * 1998-09-03 2004-06-29 Brian J. Sullivan Method and apparatus for COBOL to Java translation
US7770158B2 (en) * 2004-10-14 2010-08-03 Bea Systems, Inc. Source code translator

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of WO2008108665A1 *

Also Published As

Publication number Publication date
EP2122464A4 (fr) 2010-06-30
WO2008108665A1 (fr) 2008-09-12

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20091001

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR

DAX Request for extension of the european patent (deleted)
RIN1 Information on inventor provided before grant (corrected)

Inventor name: ANDREAE, CHRISTOPHER, MICHAEL

Inventor name: ROBINSON, SIMON, MARSH, DAVID

Inventor name: CHENG, STEPHEN, MING, KO

Inventor name: POTANIN, ALEX

A4 Supplementary search report drawn up and despatched

Effective date: 20100601

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20101229