WO2023061452A1 - Language interoperability method, apparatus, storage medium, and program product - Google Patents

Language interoperability method, apparatus, storage medium, and program product

Info

Publication number
WO2023061452A1
Authority
WO
WIPO (PCT)
Prior art keywords
language
languages
code
interoperability
abstract representation
Prior art date
Application number
PCT/CN2022/125164
Other languages
English (en)
French (fr)
Inventor
轩加振
袁健
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP22880388.8A (EP4361796A1)
Publication of WO2023061452A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/30 Creation or generation of source code

Definitions

  • The present application relates to the field of computer programming, and in particular to a language interoperability method, apparatus, storage medium, and program product.
  • A programming language is usually designed for the development needs of a specific field or industry.
  • For example, Java is mostly used in IT fields such as enterprise software development, Android mobile development, and big data and cloud computing; Python is commonly used in graphics processing, scientific computing, web programming, multimedia applications and engine development, machine learning, and artificial intelligence. In other words, each programming language excels in a different area, so an appropriate programming language has to be selected for each development environment to meet its requirements.
  • The language interoperability method of the present application provides language interoperability for programming languages while reducing the cost and operational difficulty of interoperability and improving its scalability.
  • An embodiment of the present application provides a language interoperability method. The method includes: acquiring first language code and code in multiple second languages; generating, from the multiple second language codes, a unified abstract representation of the interoperability boundary information of the multiple second languages, where the unified abstract representation is a binary encoding of that interoperability boundary information, and the interoperability boundary information identifies those constituent elements of the multiple second languages that are allowed to access or be used by, and to be accessed or used from, the first language; and compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code. When the binary code of the first language code is executed, the constituent elements in the first language code and the constituent elements of any one of the multiple second languages can access or use each other.
  • According to the embodiment of the present application, a unified abstract representation of the interoperability boundary information of the multiple second languages can be generated from the multiple second language codes. Because the unified abstract representation is a binary encoding of that boundary information, it captures those constituent elements of the multiple second languages that may be accessed or used together with the first language, and the first language code can be compiled against it to obtain the binary code of the first language code.
  • When the binary code of the first language code is executed, the constituent elements in the first language code and the constituent elements in the multiple second language codes can access or use each other; that is, the first language is interoperable with the multiple second languages.
  • Development cost is relatively low: the unified abstract representation is derived from the interoperability boundary information of the multiple second languages alone, does not involve the internal methods of the constituent elements, and does not require parsing the full grammar of each second language.
  • Maintenance cost is lower: updating the internal methods of constituent elements does not change the interoperability boundary information of the multiple second languages, so the unified abstract representation is unaffected and needs no maintenance.
  • Extension is easier: adding a second language may add constituent elements, but it affects neither the existing constituent elements nor the existing content of the unified abstract representation, which facilitates extending interoperability between the first language and further second languages.
  • Operation is simpler: the developer only needs to write the first language code and start the language interoperability method, which reduces the developer's workload and the operational difficulty of language interoperability.
  • Generating the unified abstract representation of the interoperability boundary information of the multiple second languages from the multiple second language codes includes: identifying the interoperability boundary information of the multiple second languages from the multiple second language codes; and generating the unified abstract representation from that interoperability boundary information.
  • In this way, the unified abstract representation is obtained directly from the interoperability boundary information, which avoids analyzing large parts of the grammar of the multiple second languages and improves the efficiency of the language interoperability method of the embodiment of the present application.
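The two steps above (identify the boundary information, then encode it) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `BoundaryElement` record, the dictionary shape of the input declarations, and the use of JSON bytes as a stand-in for the binary format, which this text does not specify. The point it shows is that function bodies never enter the representation, so changing a body leaves the output unchanged.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BoundaryElement:
    """One element visible across the language boundary (hypothetical model)."""
    kind: str       # e.g. "function", "class", "interface"
    name: str
    signature: str  # parameter and return types only; never the body


def identify_boundary(declarations):
    """Step 1: keep kind, name, and signature; ignore everything else."""
    return [BoundaryElement(d["kind"], d["name"], d["signature"])
            for d in declarations]


def build_unified_representation(declarations_by_language):
    """Step 2: merge the boundary information and encode it as bytes."""
    owners = {}
    for language, declarations in declarations_by_language.items():
        for element in identify_boundary(declarations):
            owners.setdefault(element, set()).add(language)
    records = [dict(asdict(element), languages=sorted(languages))
               for element, languages in owners.items()]
    records.sort(key=lambda r: (r["kind"], r["name"], r["signature"]))
    # JSON bytes stand in for the real binary format, which is not specified here.
    return json.dumps(records, sort_keys=True).encode("utf-8")


def example_inputs(c_body):
    """Hypothetical declarations; only the body of the C function varies."""
    return {
        "c": [{"kind": "function", "name": "sqrt",
               "signature": "(double) -> double", "body": c_body}],
        "java": [{"kind": "class", "name": "ArrayList",
                  "signature": "<E>", "body": "..."}],
    }
```

Calling `build_unified_representation(example_inputs(...))` with two different bodies for the C function yields identical bytes, which mirrors the maintenance argument made earlier in the text.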
  • The interoperability boundary information of the multiple second languages includes at least one repeated constituent element and at least one unique constituent element. A repeated constituent element is one that appears more than once among the constituent elements included in the interoperability boundary information of the multiple second languages; a unique constituent element is one that appears only once among them.
  • In this way, the number of constituent elements in the interoperability boundary information can be reduced, which also reduces the memory occupied by the unified abstract representation and the complexity of subsequently compiling the first language code according to the unified abstract representation.
  • The interoperability boundary information of the multiple second languages includes a common part and a unique part. Each constituent element in the common part corresponds to at least two of the multiple second languages; each constituent element in the unique part corresponds to exactly one of the multiple second languages.
  • Together, the common part and the unique part accurately represent the characteristics of the multiple second languages.
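As a rough illustration of the common part and the unique part, the following Python sketch partitions boundary elements by how many second languages they come from. The element names and the idea that elements from different languages can be compared through a shared vocabulary are assumptions made for this example only.

```python
from collections import defaultdict


def partition_boundary(elements_by_language):
    """Split boundary elements into a common part and a unique part."""
    owners = defaultdict(set)
    for language, elements in elements_by_language.items():
        for element in elements:
            owners[element].add(language)
    # Common part: elements that correspond to at least two second languages.
    common = {e: sorted(langs) for e, langs in owners.items() if len(langs) >= 2}
    # Unique part: elements that correspond to exactly one second language.
    unique = {e: next(iter(langs)) for e, langs in owners.items() if len(langs) == 1}
    return common, unique


# Hypothetical boundary elements, named by an assumed shared vocabulary.
EXAMPLE = {
    "c": ["int32", "float64", "pointer"],
    "java": ["int32", "float64", "class"],
    "python": ["class", "list"],
}
common_part, unique_part = partition_boundary(EXAMPLE)
```

In this example eight listed entries collapse to five distinct elements, three in the common part and two in the unique part, because each repeated element is stored once.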
  • Compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: obtaining, according to the differences between the unified abstract representation and the interoperability boundary information of the first language, a processing means for processing the unified abstract representation and the semantics of the first language, where the interoperability boundary information of the first language is determined from the first language code; and applying the processing means when compiling the first language code to obtain and output the binary code of the first language code.
  • The semantics of the first language describe the behavior the computer performs when executing programs written in the first language, such as logical operations and reading and writing data.
  • The processing means for processing the unified abstract representation and the semantics of the first language may be a means of integrating the two.
  • For example, a constituent element of the unified abstract representation and a constituent element of the first language are "fused" into a single constituent element, with the constituent element's memory used as its name.
  • As another example, a mark is added to a constituent element of the first language to indicate that its parameters are implemented in the same way as those of the corresponding constituent element of the unified abstract representation, so that the semantics of the latter are "fused" into the first language.
  • In this way, the processing of the unified abstract representation and the semantics of the first language can be completed, so that first language code that accesses or uses constituent elements of the second languages compiles successfully.
  • If the processing means is preset, the developer does not need to give instructions during compilation, which further reduces the difficulty of the developer's work.
  • If the processing means is chosen by the developer in real time, the first language code can be compiled with more flexibility in how the unified abstract representation and the semantics of the first language are processed.
  • The processing means includes mapping processing. In mapping processing, constituent elements in the first language code that have the same memory as constituent elements of the unified abstract representation but different names are compiled according to the data type of the corresponding memory in a mapping relationship. The mapping relationship indicates the correspondence between the constituent elements of the unified abstract representation, the constituent elements of the interoperability boundary information of the first language, and the data types of the different memories.
  • Compilation is concerned with the memory that the first language code requires, not with the names of constituent elements. It is therefore possible to determine in advance which constituent elements of the first language code have the same memory as constituent elements of the unified abstract representation but different names, and to establish the mapping relationship between those constituent elements and the different memories. When the first language code is compiled, the memory of each such constituent element can then be determined directly from the mapping relationship, which realizes the semantic processing of constituent elements in the first language code that have the same memory as the unified abstract representation but different names.
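A minimal Python sketch of such a mapping relationship follows. The type names (`Int32`, `int32_t`, and so on) and the layout keys are hypothetical; only the `struct` format codes are real. It shows differently named types being matched through a shared memory layout and then lowered to bytes using that layout.

```python
import struct

# Memory layouts, keyed by an assumed layout name; "<" selects standard sizes.
LAYOUTS = {"i32": "<i", "i64": "<q", "f64": "<d"}

# Hypothetical names on each side of the boundary, each tied to a layout.
UNIFIED_TO_LAYOUT = {"int32_t": "i32", "int64_t": "i64", "double": "f64"}
FIRST_LANGUAGE_TO_LAYOUT = {"Int32": "i32", "Int64": "i64", "Float64": "f64"}


def build_mapping(unified, first_language):
    """Pair names that differ but share the same memory layout."""
    unified_by_layout = {layout: name for name, layout in unified.items()}
    return {
        first_name: (unified_by_layout[layout], layout)
        for first_name, layout in first_language.items()
        if layout in unified_by_layout
    }


def lower_value(first_language_type, value, mapping):
    """Encode a value using the layout recorded in the mapping."""
    _, layout = mapping[first_language_type]
    return struct.pack(LAYOUTS[layout], value)


MAPPING = build_mapping(UNIFIED_TO_LAYOUT, FIRST_LANGUAGE_TO_LAYOUT)
```

Because the mapping is computed once up front, a lookup such as `MAPPING["Int32"]` is all that is needed at compile time to decide how a value of that type is laid out.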
  • When a constituent element variable of the unified abstract representation may be a null pointer and a constituent element variable of the first language may not, the processing means includes a first runtime conversion process: runtime conversion code checks the value at run time and, when the value is null, throws an exception for the constituent element variable currently being compiled.
  • A constituent element variable is an instance variable of a constituent element. Whether the value is null at run time corresponds to whether the constituent element variable currently being compiled is a null pointer, so the runtime conversion code effectively detects a null pointer before the variable is passed into the first language code. Throwing an exception at that point prevents the null value from being passed into the first language and so preserves the safety of the first language.
  • Alternatively, when a constituent element variable of the unified abstract representation may be a null pointer and a constituent element variable of the first language may not, the processing means includes a second runtime conversion process: when the runtime conversion code determines at run time that the value is null, it returns the empty value of an optional constituent element.
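The two runtime conversion strategies can be sketched in Python as follows, with `None` standing in for a null pointer from second-language code. The names `ForeignNullError`, `Option`, `convert_or_raise`, and `convert_to_option` are hypothetical and chosen only for this illustration; the text does not say how the runtime conversion code is actually generated.

```python
from dataclasses import dataclass


class ForeignNullError(Exception):
    """Raised when a null value arrives from second-language code."""


@dataclass(frozen=True)
class Option:
    """Minimal optional type: either holds a value or is empty."""
    is_some: bool
    value: object = None


def convert_or_raise(foreign_value):
    """First strategy: reject null at the boundary by raising an exception."""
    if foreign_value is None:
        raise ForeignNullError("null received at the language boundary")
    return foreign_value


def convert_to_option(foreign_value):
    """Second strategy: represent null as the empty case of an optional type."""
    if foreign_value is None:
        return Option(is_some=False)
    return Option(is_some=True, value=foreign_value)
```

The first strategy keeps non-nullable first-language types unchanged at the cost of a possible exception; the second pushes the null case into the type, so the caller has to check `is_some` before using the value.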
  • In an eighth possible implementation of the language interoperability method, compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: when the unified abstract representation and the interoperability boundary information of the first language contain constituent elements with the same name but different syntax, adding to the constituent element of the first language a mark corresponding to the constituent element of the unified abstract representation, and then obtaining and outputting the binary code of the first language code. The mark indicates that, in the first language code, the marked constituent element implements the syntax of the corresponding constituent element of the unified abstract representation when executed.
  • The interoperability boundary information of different second languages is not completely the same.
  • The interoperability boundary information of the multiple second languages can therefore include at least one unique constituent element, so that the unique part of the interoperability boundary information can be obtained, and from it the interoperability boundary information of the multiple second languages.
  • An embodiment of the present application provides a language interoperability apparatus. The apparatus includes a compiler configured to: acquire first language code and code in multiple second languages; generate, from the multiple second language codes, a unified abstract representation of the interoperability boundary information of the multiple second languages, where the unified abstract representation is a binary encoding of that interoperability boundary information, and the interoperability boundary information identifies those constituent elements of the multiple second languages that are allowed to access or be used by, and to be accessed or used from, the first language; and compile the first language code according to the unified abstract representation to obtain and output the binary code of the first language code. When the binary code of the first language code is executed, the constituent elements in the first language code and the constituent elements of any one of the multiple second languages can access or use each other.
  • Generating the unified abstract representation of the interoperability boundary information of the multiple second languages includes: identifying the interoperability boundary information of the multiple second languages from the multiple second language codes; and generating the unified abstract representation from that interoperability boundary information.
  • The interoperability boundary information of the multiple second languages includes at least one repeated constituent element and at least one unique constituent element. A repeated constituent element is one that appears more than once among the constituent elements included in the interoperability boundary information of the multiple second languages; a unique constituent element is one that appears only once among them.
  • The interoperability boundary information of the multiple second languages includes a common part and a unique part. Each constituent element in the common part corresponds to at least two of the multiple second languages; each constituent element in the unique part corresponds to exactly one of the multiple second languages.
  • Compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: obtaining, according to the differences between the unified abstract representation and the interoperability boundary information of the first language, a processing means for processing the unified abstract representation and the semantics of the first language, where the interoperability boundary information of the first language is determined from the first language code; and applying the processing means when compiling the first language code to obtain and output the binary code of the first language code.
  • The processing means includes mapping processing. In mapping processing, constituent elements in the first language code that have the same memory as constituent elements of the unified abstract representation but different names are compiled according to the data type of the corresponding memory in a mapping relationship. The mapping relationship indicates the correspondence between the constituent elements of the unified abstract representation, the constituent elements of the interoperability boundary information of the first language, and the data types of the different memories.
  • When a constituent element variable of the unified abstract representation may be a null pointer and a constituent element variable of the first language may not, the processing means includes a first runtime conversion process: runtime conversion code checks the value at run time and, when the value is null, throws an exception for the constituent element variable currently being compiled.
  • Alternatively, when a constituent element variable of the unified abstract representation may be a null pointer and a constituent element variable of the first language may not, the processing means includes a second runtime conversion process: when the runtime conversion code determines at run time that the value is null, it returns the empty value of an optional constituent element.
  • In an eighth possible implementation of the language interoperability apparatus, compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: when the unified abstract representation and the interoperability boundary information of the first language contain constituent elements with the same name but different syntax, adding to the constituent element of the first language a mark corresponding to the constituent element of the unified abstract representation, and then obtaining and outputting the binary code of the first language code. The mark indicates that, in the first language code, the marked constituent element implements the syntax of the corresponding constituent element of the unified abstract representation when executed.
  • The interoperability boundary information of different second languages is not completely the same.
  • An embodiment of the present application provides a language interoperability apparatus, including: a processor; and a memory for storing processor-executable instructions; where the processor is configured to execute the language interoperability method of the first aspect or of one or more of the possible implementations of the first aspect.
  • An embodiment of the present application provides a non-volatile computer-readable storage medium storing computer program instructions that, when executed by a processor, implement the language interoperability method of the first aspect or of one or more of the possible implementations of the first aspect.
  • An embodiment of the present application provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium bearing computer-readable code. When the computer-readable code runs in an electronic device, the processor in the electronic device executes the language interoperability method of the first aspect or of one or more of the possible implementations of the first aspect.
  • Fig. 1 shows an example of the interoperability implementation of Kotlin language and Java language in the prior art.
  • Fig. 2 shows an example of the interoperability implementation of Kotlin language and C language in the prior art.
  • FIG. 3 shows a schematic diagram of the GraalVM multilingual virtual machine architecture in the prior art.
  • Fig. 4 shows an exemplary application scenario of the language interoperability method according to the embodiment of the present application.
  • Fig. 5 shows an exemplary schematic diagram of a language interoperability method according to an embodiment of the present application.
  • Fig. 6 shows an example of generating a unified abstract representation of interoperable boundary information in multiple second languages according to an embodiment of the present application.
  • FIG. 7 shows an example of a binary format of a unified abstract representation according to an embodiment of the present application.
  • Fig. 8 shows an example of acquiring the binary code of the first language code and running the binary code of the first language code according to the embodiment of the present application.
  • Fig. 9 shows a schematic structural diagram of an exemplary language interoperability device according to an embodiment of the present application.
  • Fig. 10 shows an exemplary structural diagram of a language interoperability device according to an embodiment of the present application.
  • Constituent elements: the elements that make up programming language code.
  • The constituent elements may include at least one kind of code content, such as classes, interfaces, functions, and data formats. For example, they may include multiple data formats such as integer, floating-point, and Boolean types, and/or multiple classes, interfaces, functions, and so on.
  • Host language, target language: when language A is given the ability to interoperate with languages such as B, C, and D, that is, when classes written in language A can communicate directly with classes written in language B, C, or D, language A is called the host language and languages B, C, and D are called target languages.
  • Bytecode: usually refers to intermediate code that has been compiled but is independent of the machine code (code that the computer can execute directly) of the current environment, and that must be translated by an interpreter (or virtual machine) before it becomes machine code.
  • Bytecode is usually generated by a compiler.
  • A typical example is Java bytecode; languages such as Java and Kotlin usually support compilation to Java bytecode.
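To make the notion of bytecode concrete, the following Python sketch uses CPython's own compiler and the standard `dis` module. CPython bytecode is used here only as a readily inspectable analogue of Java bytecode, not because the text describes it; the exact instruction names vary between CPython versions, so none are listed.

```python
import dis

SOURCE = "def add(a, b):\n    return a + b\n"

# Compile source text to a code object, then run it to define the function.
module_code = compile(SOURCE, "<example>", "exec")
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# The raw bytecode is a bytes object that only the virtual machine understands.
raw_bytecode = add.__code__.co_code

# dis decodes the same bytes into named virtual-machine instructions.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
```

The bytes in `raw_bytecode` are not machine code for any CPU; the interpreter executes them one instruction at a time, which is the role the definition above assigns to an interpreter or virtual machine.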
  • Garbage collection: an automatic form of memory management in which the garbage collector attempts to reclaim memory that a program has allocated but no longer uses; such memory is called garbage because it is no longer referenced.
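A small Python sketch of this behavior, assuming the CPython implementation and its standard `gc` and `weakref` modules: two objects that reference each other become unreachable, and the collector reclaims them. The `Node` class and the function name are hypothetical.

```python
import gc
import weakref


class Node:
    """Simple object that can point at another Node."""
    def __init__(self):
        self.other = None


def cycle_is_reclaimed():
    first, second = Node(), Node()
    first.other, second.other = second, first   # build a reference cycle
    probe = weakref.ref(first)                  # observe without keeping alive
    del first, second                           # the cycle is now unreachable
    gc.collect()                                # ask the collector to run
    return probe() is None                      # True once memory is reclaimed
```

The weak reference is what makes the observation possible: it does not count as a reference to the object, so it returns `None` after the collector has reclaimed the cycle.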
  • Application binary interface (ABI): a set of rules followed by the compiler and linker, including calling conventions, name mangling, and so on.
  • A calling convention specifies how functions are translated to assembly and how they are called; name mangling describes how a function is exposed.
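As an illustration of name mangling, the Python sketch below reproduces a very small subset of the Itanium C++ ABI scheme used by GCC and Clang: free functions whose parameters are a handful of builtin types. It is a simplified assumption-laden sketch and does not handle namespaces, classes, pointers, templates, or substitutions.

```python
# One-letter codes that the Itanium C++ ABI assigns to some builtin types.
BUILTIN_CODES = {
    "void": "v", "bool": "b", "char": "c", "int": "i",
    "long": "l", "float": "f", "double": "d",
}


def mangle(function_name, parameter_types):
    """Mangle a free function: _Z, length-prefixed name, parameter codes."""
    encoded_parameters = "".join(BUILTIN_CODES[t] for t in parameter_types)
    if not encoded_parameters:
        encoded_parameters = "v"   # an empty parameter list is encoded as void
    return f"_Z{len(function_name)}{function_name}{encoded_parameters}"
```

For instance, `mangle("foo", ["int"])` yields `_Z3fooi`, the symbol a C++ compiler following this ABI emits for `void foo(int)`. An interoperability mechanism has to know such symbol names and the calling convention, but nothing about the function body.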
  • One language interoperability approach proposed in the prior art designs a specific interoperability mechanism between two specific programming languages so that the two languages can interoperate with each other.
  • The following takes interoperability between Kotlin and Java, and between Kotlin and C, as examples.
  • Kotlin is a statically typed programming language for modern multi-platform applications. It has a Java virtual machine (JVM) backend, a Native backend, a JavaScript backend, and other backends responsible for compilation optimization and target code generation. Different backends target different runtime environments and have different capabilities, so they cannot communicate with each other.
  • Kotlin on the JVM backend is interoperable with Java, and by additionally using the Java native interface (JNI) it can interoperate with C.
  • Fig. 1 shows an example of the interoperability implementation of Kotlin language and Java language in the prior art.
  • The developer programs in Kotlin to obtain a Kotlin source file (.kt). During programming, a Java method, such as a class or interface of the Java language, is called.
  • Step 1: the Kotlin compiler (kotlinc) parses the Java source file (.java) and the Kotlin source file and determines which Java methods the Kotlin source file calls. The Kotlin compiler then compiles the Kotlin source file to obtain a bytecode file (.class).
  • Step 2: the Java compiler (javac) compiles the Java source file together with the bytecode file generated by the Kotlin compiler to obtain a bytecode file (.class).
  • Step 3: the bytecode file generated by the Kotlin compiler and the bytecode file generated by the Java compiler are packaged together into a Java archive (.jar), which the Java virtual machine (JVM) runs.
  • In this way, Kotlin and Java achieve interoperability.
  • Fig. 2 shows an example of the interoperability implementation of Kotlin language and C language in the prior art.
  • Interoperability between Kotlin and C can be realized through the Java native interface (JNI).
  • The Java native interface is a standard Java virtual machine interface.
  • It can be used to create, inspect, and update Java objects, call Java methods, and so on. It serves as a bridge between the Java virtual machine and C/C++, so that the languages supported by the Java virtual machine gain interoperability with C/C++.
  • Developers need to use the application programming interface (API) provided by JNI in middle-layer code (.c or .cpp) to wrap the methods to be operated on through the Java native interface, so that the wrapped C/C++ methods can be called when the Kotlin code is executed.
  • Writing the code to obtain a Kotlin source file that calls the Java method is the first step for the developer in realizing interoperability between Kotlin and C. The developer then needs to complete the following in the development environment:
  • The second step is to compile the Kotlin source file into a bytecode file.
  • The third step is to generate a C header file from the bytecode file.
  • The fourth step is to write the Java native interface code according to the C header file.
  • Finally, the code obtained in the fourth step is linked with the library file to obtain an executable file.
  • JNI Java Native Access
  • Another language interoperability approach proposed in the prior art provides a common virtual machine that runs code in various languages, which gives language interoperability to any combination of code the virtual machine can run.
  • The following takes the GraalVM multilingual virtual machine as an example to describe how it realizes multilingual interoperability.
  • FIG. 3 shows a schematic diagram of the GraalVM multilingual virtual machine architecture in the prior art.
  • GraalVM is a cross-language virtual machine built on the Java HotSpot virtual machine that can serve as a running platform for multiple programming languages.
  • These programming languages include Java virtual machine-based languages such as Java, Scala, and Groovy; low-level virtual machine (LLVM)-based languages such as C and C++; and languages such as JavaScript, Ruby, and R.
  • GraalVM can mix these programming languages: different languages can use each other's interfaces and objects, and they can also use native library files that have already been written.
  • The Java HotSpot virtual machine sends compilation requests to the Graal compiler through the JVM compiler interface (JVMCI), and the Graal compiler responds to those requests through the same interface.
  • Intermediate code for a high-level operation (such as loading a Java field) is converted to intermediate code for low-level operations (such as reading the data at address + offset).
  • The intermediate code for the low-level operations is eventually translated into machine code.
  • the underlying layer of GraalVM is the Java HotSpot virtual machine, so it can directly run Java language, Scala language, groovy language and other Java virtual machine-based languages.
  • Non-JVM languages such as C, C++, JavaScript, Ruby, and R languages, can be implemented on the Java HotSpot virtual machine through the Truffle framework.
  • the Truffle framework is a Java-based language implementation framework. A Truffle-based language implementation needs to use Java to implement the language's lexical analysis and syntax analysis, as well as an interpretation executor for the abstract syntax tree (AST) produced by syntax analysis.
  • Truffle-based language implementation itself and the Truffle framework are implemented in Java, so they can run on any Java virtual machine JVM.
  • the corresponding interpreted executors can be obtained by using the Truffle framework.
  • the high-performance LLVM bytecode interpreter called Sulong can be obtained by using the Truffle framework.
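  • To give a sense of what an AST interpretation executor involves, the following is a toy sketch in Python. It does not use the Truffle API (Truffle-based implementations are written in Java); the node classes `Num`, `Add`, `Mul` and the `evaluate` function are hypothetical. A real language implementation would additionally need lexical and syntax analysis for the full grammar of the language.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: float


@dataclass
class Add:
    left: "Node"
    right: "Node"


@dataclass
class Mul:
    left: "Node"
    right: "Node"


Node = Union[Num, Add, Mul]


def evaluate(node: Node) -> float:
    """Interpret the tree by recursively evaluating each node."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Add):
        return evaluate(node.left) + evaluate(node.right)
    if isinstance(node, Mul):
        return evaluate(node.left) * evaluate(node.right)
    raise TypeError(f"unknown node: {node!r}")


# Tree for the expression 1 + 2 * 3.
result = evaluate(Add(Num(1), Mul(Num(2), Num(3))))
```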
  • the Java HotSpot virtual machine calls the interface provided by the Graal compiler to actively trigger just-in-time compilation of non-JVM languages, so that after just-in-time compilation the interpreted execution of the abstract syntax tree is replaced by execution of machine code.
  • First, any language that needs to be interoperable must use the Truffle framework to completely reimplement the language's lexical analysis, syntax analysis, and AST interpreter. In most interoperability scenarios, only the function names and calling conventions used during interoperation matter, and there is no need to interpret and execute the entire interoperability target language, so the development cost and operational difficulty are high.
  • Second, as an interoperable language itself evolves, changes to its grammatical features and other functions have a major impact on its interoperability. Even a grammatical change to a single symbol may break the interoperability mechanism. Maintainability is therefore poor, which further increases the maintenance cost and operational difficulty of interoperability.
  • an embodiment of the present application provides a language interoperability method, device, storage medium, and program product.
  • According to the language interoperability method of the embodiment of the present application, language interoperability can be provided for programming languages while the cost and operational difficulty of language interoperability are reduced, and the scalability of language interoperability of programming languages can be improved.
  • Fig. 4 shows an exemplary application scenario of the language interoperability method according to the embodiment of the present application.
  • the application scenario can be, for example, a software development scenario for the Hongmeng system.
  • Various programming languages may be used in software development, such as the Cangjie language, Java language, C language, and JS/TS language, and the code of these programming languages may, for example, be stored in a memory.
  • the language interoperability method in this embodiment of the present application may be executed by a processor to provide interoperability for a certain programming language stored in a memory.
  • the host language and the target language in the current application scenario may be determined first.
  • For example, the programming language most suitable for the current application scenario, such as the Cangjie language, can be selected as the first language (host language), and other languages, such as the Java language, C language, and JS/TS language, can be selected as the multiple second languages (target languages).
  • the processor can acquire the first language code (a source file or bytecode file of the first language, etc.) and multiple second language codes (source files or bytecode files of the multiple second languages, etc.) from the memory, and execute the language interoperability method of the embodiment of the present application.
  • First, a unified abstract representation of the interoperable boundary information of the multiple second languages is generated according to the multiple second language codes, where the unified abstract representation is the binary code of that interoperable boundary information. The interoperable boundary information of the multiple second languages may include the constituent elements of the multiple second languages that are allowed to access, or be used by, the first language.
  • Then the first language code is compiled to obtain the binary code of the first language code, such as bytecode.
  • In this way, the first language can interoperate with the multiple second language codes.
  • Fig. 5 shows an exemplary schematic diagram of a language interoperability method according to an embodiment of the present application.
  • the language interoperability method according to the embodiment of the present application includes steps S1-S3:
  • the unified abstract representation is a binary code of interoperable boundary information in multiple second languages.
  • the interoperable boundary information indicates the constituent elements that allow mutual access or use with the first language among the constituent elements of the plurality of second languages;
  • That the constituent elements in the first language code and the constituent elements in any second language among the multiple second languages access or use each other can mean, for example, that classes in the first language code and classes in any of the second languages access or use each other, that interfaces in the first language code and interfaces in any of the second languages access or use each other, and so on.
  • A unified abstract representation of the interoperable boundary information of the multiple second languages can be generated according to the multiple second language codes. Because the unified abstract representation is the binary code of that interoperable boundary information, it embodies those constituent elements of the multiple second languages that are allowed to access, or be used by, the first language, and the first language code can be compiled according to the unified abstract representation to obtain the binary code of the first language code.
  • When the binary code of the first language code is executed, the constituent elements in the first language code and the constituent elements in the multiple second language codes can access or use each other; that is, the first language is interoperable with the multiple second languages.
  • The unified abstract representation can be obtained from the interoperable boundary information of the multiple second languages alone. It does not involve the internal methods of the constituent elements and does not require parsing the full grammar of the multiple second languages, so the language interoperability method in the embodiment of the present application has a relatively low development cost.
  • Updates to the internal methods of the constituent elements of the multiple second languages do not affect their interoperability boundary information, and therefore do not affect the unified abstract representation, so the unified abstract representation does not need to be maintained and the method has a lower maintenance cost.
  • Adding a second language may add constituent elements, but this does not affect the original constituent elements or the original content of the unified abstract representation, which makes it easy to further extend interoperability between the first language and multiple second languages.
  • For developers, it is enough to write the first language code and start the language interoperability method, which reduces the developer's workload and the operational difficulty of language interoperability.
  • For example, the first language can be the Cangjie language, and the multiple second languages can be the C language, Java language, and JS/TS language. The multiple second language codes can then be C language code, Java language code, and JS/TS language code: the C language code can be a file with the suffix .c or .h, the Java language code can be a file with the suffix .java or .class, and the JS/TS language code can be a file with the suffix .js or .d.ts.
  • the first language may be different in different application scenarios.
  • For example, in one application scenario the first language may be the Java language, and in another the first language may be the JS language.
  • the multiple second languages in different application scenarios may also be different.
  • the embodiment of the present application does not limit specific types of the first language and multiple second languages.
  • Fig. 6 shows an example of generating a unified abstract representation of interoperable boundary information in multiple second languages according to an embodiment of the present application.
  • In step S2, generating a unified abstract representation of the interoperable boundary information of the multiple second languages according to the multiple second language codes includes:
  • the unified abstract representation is generated according to the interoperability boundary information of the plurality of second languages.
  • the unified abstract representation can be obtained directly from the interoperability boundary information, avoiding the analysis of a large number of grammars of multiple second languages, and can improve the efficiency of the language interoperability method according to the embodiment of the present application.
  • the interoperability boundary information of the C language identified according to the C language code may include multiple constituent elements such as function, char*, Struct, Pointer, and Primitive Types;
  • the interoperability boundary information of the Java language identified according to the Java language code may include multiple constituent elements such as interface and class;
  • the interoperability boundary information of the JS/TS language identified according to the JS/TS language code may include multiple constituent elements such as function, Primitive Types, and Class. Because the constituent elements of different second languages are not completely the same, the interoperability boundary information of different second languages is not completely the same either.
  • the interoperability boundary information of multiple second languages may include the above-mentioned interoperability boundary information of the C language, the interoperability boundary information of the Java language, and the interoperability boundary information of the JS/TS language.
  • The interoperable boundary information of each second language can include at least one unique constituent element, so the unique parts of the interoperable boundary information of the multiple second languages can be obtained, and the interoperability boundary information of the multiple second languages can then be obtained.
  • The interoperability boundary information of the multiple second languages may also include more content related to the constituent elements of the multiple second languages, as long as it contains the necessary information about the constituent elements (for example, the name and calling convention) and does not contain their internal information (such as the internals of a function or the execution logic inside a method of a class). This application does not limit the specific constituent elements of the interoperability boundary information.
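  • As a rough illustration of keeping only names and calling conventions while discarding internals, the sketch below pulls function signatures out of a C snippet. It is a toy under stated assumptions, not the method of the embodiment: `extract_boundary` and the `SIGNATURE` pattern are hypothetical, and the regular expression handles only simple one-line declarations (it is not a C parser).

```python
import re

# Toy pattern: return type, function name, parameter list, then ';' or '{'.
SIGNATURE = re.compile(
    r"^\s*([A-Za-z_][\w \t\*]*?)\s*([A-Za-z_]\w*)\s*\(([^)]*)\)\s*[;{]",
    re.MULTILINE,
)

C_CODE = """
int add(int a, int b);
const char* name(void);
static int helper(int x) { return x * 2; }
"""


def extract_boundary(source: str) -> list:
    """Return name, return type, and parameters; function bodies are ignored."""
    boundary = []
    for match in SIGNATURE.finditer(source):
        returns, name, params = match.groups()
        boundary.append({
            "name": name,
            "returns": returns.strip(),
            "params": [p.strip() for p in params.split(",") if p.strip()],
        })
    return boundary


boundary = extract_boundary(C_CODE)
```

  • Note that the body of `helper` never appears in the result, so a change inside that body would leave the extracted boundary information unchanged.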
  • the following introduces an exemplary method of generating a unified abstract representation according to the interoperable boundary information of multiple second languages according to the embodiment of the present application with reference to FIG. 6 and FIG. 7 .
  • the interoperable boundary information of the plurality of second languages includes at least one repeated constituent element and at least one unique constituent element, where the at least one repeated constituent element is a constituent element that appears repeatedly among the constituent elements included in the interoperable boundary information of the plurality of second languages, and the at least one unique constituent element is a constituent element that appears only once among them.
  • In this way, the number of constituent elements in the interoperable boundary information can be reduced, the memory space occupied by the unified abstract representation of the interoperable boundary information can be reduced, and the complexity of subsequently compiling the first language code according to the unified abstract representation can be reduced.
  • generating the unified abstract representation according to the interoperability boundary information of the multiple second languages includes:
  • The interoperability boundary information of each second language may include multiple constituent elements, and among the constituent elements included in the interoperability boundary information of the multiple second languages there may be constituent elements that appear repeatedly and constituent elements that appear only once. The interoperability boundary information of the multiple second languages may therefore include at least one repeated constituent element and at least one unique constituent element.
  • For example, the repeated constituent elements may include the Primitive Types data format (C language), Primitive Types data format (JS/TS language), function (C language), function (JS/TS language), class (Java language), and class (JS/TS language), and the unique constituent elements may include the char* data format (C language), Struct (C language), Pointer (C language), and interface (Java language). The at least one repeated constituent element included in the interoperable boundary information of the multiple second languages may then be at least one of the Primitive Types data format, function, and class, and the at least one unique constituent element may be at least one of the char* data format, Struct, Pointer, and interface.
  • the interoperability boundary information of the multiple second languages includes a common part and a unique part, where each constituent element in the common part corresponds to at least two second languages among the multiple second languages, and each constituent element in the unique part corresponds to one second language among the plurality of second languages.
  • the common part may be the same constituent elements included in the interoperable boundary information of at least two second languages in the multiple second languages.
  • For example, the common part may be obtained by taking each repeated constituent element of the at least one repeated constituent element as one constituent element, so that each constituent element in the common part corresponds to at least two second languages among the plurality of second languages.
  • For example, the at least one repeated constituent element may include the Primitive Types data format (C language), Primitive Types data format (JS/TS language), function (C language), function (JS/TS language), class (Java language), and class (JS/TS language). In this case, the constituent elements included in the common part may include the Primitive Types data format, function, and class, where the Primitive Types data format and function each correspond to two second languages, namely the C language and the JS/TS language, and class corresponds to two second languages, namely the Java language and the JS/TS language.
  • The unique part can, for example, include the at least one unique constituent element; that is, each constituent element in the unique part corresponds to one second language among the multiple second languages.
  • For example, the at least one unique constituent element can comprise the char* data format (C language), Struct (C language), Pointer (C language), and interface (Java language). In this case, the constituent elements included in the unique part may include the char* data format, Struct, Pointer, and interface, where the char* data format, Struct, and Pointer each correspond to one second language, the C language, and interface corresponds to one second language, the Java language.
  • Because each constituent element in the common part corresponds to at least two second languages among the plurality of second languages, and each constituent element in the unique part corresponds to exactly one of them, the common part and the unique part can accurately represent the characteristics of the multiple second languages.
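  • The split into a common part and a unique part can be sketched as follows, using the constituent elements listed above for the C, Java, and JS/TS languages. The function name `partition` and the choice to compare element names case-insensitively (so that Java `class` and JS/TS `Class` count as the same element) are assumptions made for this illustration.

```python
from collections import defaultdict

# Interoperability boundary information per second language, as listed in the text.
BOUNDARY_INFO = {
    "C": ["function", "char*", "Struct", "Pointer", "Primitive Types"],
    "Java": ["interface", "class"],
    "JS/TS": ["function", "Primitive Types", "Class"],
}


def partition(boundary_info: dict) -> tuple:
    """Split elements into a common part (2+ languages) and a unique part (1 language)."""
    languages_by_element = defaultdict(list)
    for language, elements in boundary_info.items():
        for element in elements:
            # Assumption: names are matched case-insensitively.
            languages_by_element[element.lower()].append(language)
    common = {e: langs for e, langs in languages_by_element.items() if len(langs) >= 2}
    unique = {e: langs for e, langs in languages_by_element.items() if len(langs) == 1}
    return common, unique


common, unique = partition(BOUNDARY_INFO)
# Each element is stored once in the unified representation.
unified_elements = list(common) + list(unique)
```

  • Ten listed elements collapse to seven entries here, which is the reduction in element count that the text attributes to merging repeated constituent elements.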
  • The unified abstract representation may include the Primitive Types data format, function, class, char* data format, Struct, Pointer, and interface.
  • The unified abstract representation can be stored in a binary format, for example using a struct-like structure.
  • The constituent elements can include class, interface, function, variable, parameter, field, and so on, and the corresponding data types can include i8, i16, i32, i64, u8, u16, u32, u64, f16, f32, f64, char, bool, function, array, class, interface, generics, and so on.
  • FIG. 7 shows an example of a binary format of a unified abstract representation according to an embodiment of the present application. Among them, the relevant information of the constituent elements can be filled in the Decl part, and the relevant information of the data type can be filled in the Type part.
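  • A binary layout with a Type part and a Decl part could look like the sketch below. The actual layout of FIG. 7 is not reproduced in the text, so everything here is an assumption for illustration: the magic value `UAR0`, the field widths, and the `encode` and `decode` functions are hypothetical. Only the lists of element kinds and data types come from the description.

```python
import struct

DECL_KINDS = ["class", "interface", "function", "variable", "parameter", "field"]
TYPE_KINDS = ["i8", "i16", "i32", "i64", "u8", "u16", "u32", "u64", "f16", "f32",
              "f64", "char", "bool", "function", "array", "class", "interface",
              "generics"]

HEADER = "<4sHH"  # magic, number of types, number of declarations
DECL = "<BHH"     # declaration kind, index into the Type part, name length


def encode(types: list, decls: list) -> bytes:
    """Pack the Type part, then the Decl part, after a small header."""
    out = bytearray(struct.pack(HEADER, b"UAR0", len(types), len(decls)))
    for type_name in types:
        out += struct.pack("<B", TYPE_KINDS.index(type_name))
    for kind, name, type_index in decls:
        raw = name.encode("utf-8")
        out += struct.pack(DECL, DECL_KINDS.index(kind), type_index, len(raw)) + raw
    return bytes(out)


def decode(blob: bytes) -> tuple:
    """Inverse of encode: read the header, the Type part, then the Decl part."""
    magic, n_types, n_decls = struct.unpack_from(HEADER, blob, 0)
    if magic != b"UAR0":
        raise ValueError("not a unified abstract representation blob")
    pos = struct.calcsize(HEADER)
    types = []
    for _ in range(n_types):
        (type_id,) = struct.unpack_from("<B", blob, pos)
        pos += 1
        types.append(TYPE_KINDS[type_id])
    decls = []
    for _ in range(n_decls):
        kind, type_index, length = struct.unpack_from(DECL, blob, pos)
        pos += struct.calcsize(DECL)
        decls.append((DECL_KINDS[kind], blob[pos:pos + length].decode("utf-8"), type_index))
        pos += length
    return types, decls


types = ["i32", "function"]
decls = [("function", "add", 1), ("parameter", "a", 0)]
blob = encode(types, decls)
```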
  • The interoperable boundary information that can be used to generate a unified abstract representation is not limited to that of the above three second languages. When there are more types of second languages, the interoperable boundary information of more second languages can be included, and the unified abstract representation for them can be generated in the same way as in the above example based on the C language, Java language, and JS/TS language. The language interoperability method according to the embodiment of the present application therefore has good scalability and low development cost.
  • Because the unified abstract representation is obtained only by processing the interoperable boundary information of the multiple second languages, the evolution of grammatical features and other functions outside that interoperable boundary information does not affect the unified abstract representation, which reduces maintenance costs.
  • An exemplary implementation of step S3 in the embodiment of the present application is introduced below.
  • In step S3, compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes:
  • a processing means for processing the unified abstract representation and the semantics of the first language is obtained, where the interoperable boundary information of the first language is determined according to the first language code;
  • the processing means is used when compiling the first language code to obtain and output the binary code of the first language code.
  • the semantics of the first language describe the behaviors performed by the computer when executing programs written in the first language, such as logical operations, reading and writing data, and so on.
  • the processing means for processing the unified abstract representation and the semantics of the first language may be a means for integrating the unified abstract representation and the semantics of the first language.
  • For example, a constituent element of the unified abstract representation and a constituent element of the first language are "fused" into the same constituent element, with the constituent element's memory used as the constituent element name.
  • Another example is to add a mark to the constituent elements of the first language, indicating that the parameters in the constituent elements are implemented in the same way as the constituent elements of the unified abstract representation, so that the semantics of the constituent elements of the unified abstract representation are "fused" into the first language .
  • In this way, the processing of the unified abstract representation and the semantics of the first language can be completed, so that first language code that accesses or uses constituent elements of the second languages can be compiled successfully.
  • If the processing means is preset, the developer does not need to give instructions during compilation, which further reduces the difficulty of the developer's work.
  • If the processing means is determined by the developer in real time, the flexibility of processing the unified abstract representation and the semantics of the first language when compiling the first language code is improved.
  • the interoperability boundary information of the first language can be determined according to the first language code, and can include the constituent elements in the first language code.
  • the interoperability boundary information of the first language may include Class, Interface, and function.
  • the processing means may also be different.
  • the following describes different processing means and exemplary implementations thereof.
  • the processing means includes mapping processing, where the mapping processing compiles constituent elements in the first language code that have the same memory as constituent elements of the unified abstract representation but different names as the corresponding memory data type according to a mapping relationship, and the mapping relationship indicates the correspondence between the constituent elements of the unified abstract representation, the constituent elements of the interoperable boundary information of the first language, and different memory data types.
  • With mapping processing, during compilation the constituent elements in the first language can be mapped to the corresponding memory data types according to the preset mapping relationship before being compiled. Table 1 shows an example of the mapping relationship.
| Constituent elements in the first language code | Constituent elements in a unified abstract representation | Mapped data type |
| --- | --- | --- |
| unit | Void | u1 |
| Int8 | int8_t | i8 |
| Uint8 | uint8_t | u8 |
| Int16 | int16_t | i16 |
| UInt16 | uint16_t | u16 |
| Int32 | int32_t | i32 |
| Uint32 | uint32_t | u32 |
| Int64 | int64_t | i64 |
| UInt64 | uint64_t | u64 |
| Float16 | / | f16 |
| Float32 | Float | f32 |
| Float64 | Double | f64 |
  • the mapping relationship indicates the corresponding relationship between the constituent elements of the unified abstract representation, the constituent elements of the interoperable boundary information of the first language (such as the constituent elements in the first language code) and the data types of different memories, and the mapped data A type can represent the memory footprint of a constituent element.
  • When the first language code is compiled, what matters is the memory its constituent elements require, not their names. Therefore, the constituent elements of the first language code that have the same memory as constituent elements of the unified abstract representation but different names can be identified in advance, and the mapping relationship between these constituent elements and different memory data types can be determined from their memory. When the first language code is compiled, the memory of a constituent element can then be determined directly from the mapping relationship, which realizes semantic processing of constituent elements in the first language code that have the same memory as the unified abstract representation but different names.
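  • The mapping relationship of Table 1 can be expressed as a simple lookup, sketched below. The row values are taken from Table 1 as written; the names `TABLE_1`, `FIRST_TO_TYPE`, `UNIFIED_TO_TYPE`, and `same_memory` are hypothetical, and representing the "/" entry as `None` is an assumption of this sketch.

```python
# Rows of Table 1: (first-language element, unified-representation element, mapped type).
# The "/" entry (no counterpart in the unified representation) is stored as None.
TABLE_1 = [
    ("unit", "Void", "u1"),
    ("Int8", "int8_t", "i8"),
    ("Uint8", "uint8_t", "u8"),
    ("Int16", "int16_t", "i16"),
    ("UInt16", "uint16_t", "u16"),
    ("Int32", "int32_t", "i32"),
    ("Uint32", "uint32_t", "u32"),
    ("Int64", "int64_t", "i64"),
    ("UInt64", "uint64_t", "u64"),
    ("Float16", None, "f16"),
    ("Float32", "Float", "f32"),
    ("Float64", "Double", "f64"),
]

FIRST_TO_TYPE = {first: mapped for first, _, mapped in TABLE_1}
UNIFIED_TO_TYPE = {unified: mapped for _, unified, mapped in TABLE_1
                   if unified is not None}


def same_memory(first_name: str, unified_name: str) -> bool:
    """True when two differently named elements map to the same memory data type."""
    return FIRST_TO_TYPE[first_name] == UNIFIED_TO_TYPE[unified_name]
```

  • For instance, `same_memory("Float64", "Double")` holds because both names map to `f64`, so the compiler can treat them as the same memory data type regardless of the differing names.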
  • An instance variable (hereinafter referred to as a variable) of any constituent element of the Java language may be a null pointer, that is, have the value null, indicating that there is no such object. Directly accessing the members of a null pointer, or calling the properties and methods of its members, causes a null pointer exception (NPE) to be thrown at runtime.
  • When a Java language variable used by the first language is passed into the first language and that variable happens to be null, the null value is passed into the runtime of the first language. A null pointer exception can then occur in the runtime of the first language, destroying the security of the first language.
  • Runtime conversion code can be used to check at runtime whether a value is null. Multiple processing means may use runtime conversion code, but the runtime conversion code is used differently in different processing means.
  • the attributes of the multiple second languages and the first language can be judged first, and which processing method to use can be determined according to the different situations of the attributes of the multiple second languages and the first language, combined with the specific requirements of the application scenario.
  • When the component variables of the unified abstract representation include a null pointer and the component variables of the first language do not include a null pointer, the processing means includes a first runtime conversion process, where the first runtime conversion process throws the currently compiled component variable as an abnormal value when the runtime conversion code determines at runtime that the value is null.
  • A component variable refers to an instance variable of a constituent element. Whether the value checked at runtime is null corresponds to whether the currently compiled component variable is a null pointer, so using the runtime conversion code to detect a null value at runtime indirectly determines that the currently compiled component variable is a null pointer. The safety of the variable is thus determined before it is passed into the first language code. By throwing the currently compiled component variable as an abnormal value when a null value is detected, the exception is not passed into the first language, which ensures the security of the first language.
  • With the first runtime conversion process, when variables of the multiple second languages are assigned to variables of the first language during compilation of the first language code, the runtime conversion code is used to check at runtime whether the value is null. If it is null, a null pointer exception (NPE) can be thrown directly, that is, the currently compiled component variable is thrown as an abnormal value. In this case a null pointer exception is still thrown, but it only appears in the interoperable boundary information part of the first language and is not propagated further into the first language, so the security of the first language can be guaranteed.
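  • A minimal sketch of this kind of boundary check follows, with Python's `None` standing in for a null pointer. The exception class `BoundaryNullError` and the function `from_second_language` are hypothetical names, and the sketch only shows where the check sits, not how a compiler would generate it.

```python
class BoundaryNullError(Exception):
    """Raised at the interoperability boundary when a null value arrives."""


def from_second_language(value):
    """Runtime conversion code: reject null before it enters first-language code."""
    if value is None:
        raise BoundaryNullError("null value stopped at the interoperability boundary")
    return value


def first_language_logic(text: str) -> int:
    # First-language code can assume it never receives a null value.
    return len(text)


length = first_language_logic(from_second_language("hello"))
```

  • The error is raised inside `from_second_language`, so `first_language_logic` is never entered with a null argument.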
  • When the component variables of the unified abstract representation include a null pointer and the component variables of the first language do not include a null pointer, the processing means includes a second runtime conversion process, where the second runtime conversion process returns the null value of an optional constituent element when the runtime conversion code determines at runtime that the value is null.
  • With the second runtime conversion process, when variables of the multiple second languages are assigned to variables of the first language during compilation of the first language code, the runtime conversion code is used to check at runtime whether the value is null. If it is null, the null value (None) of the optional constituent element is returned; otherwise, the specific value in the optional constituent element is returned. In this case, no null pointer exception is thrown, so the safety of the first language is guaranteed.
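  • The idea of returning an optional value instead of throwing can be sketched as below. The `Some` and `Nothing` classes and the `to_option` function are hypothetical stand-ins for the optional constituent element of the first language, which the text does not specify further.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Some(Generic[T]):
    """Optional element holding a specific value."""
    value: T


@dataclass(frozen=True)
class Nothing:
    """Optional element holding the null value."""


Option = Union[Some, Nothing]


def to_option(value) -> Option:
    """Runtime conversion code: wrap a possibly-null value, never raise."""
    return Nothing() if value is None else Some(value)


def describe(option: Option) -> str:
    # First-language code must handle both cases explicitly.
    if isinstance(option, Some):
        return f"value: {option.value}"
    return "no value"


present = describe(to_option("hello"))
absent = describe(to_option(None))
```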
  • The first language itself may also lack this security.
  • When it is determined that the first language has a nullable attribute, that is, when the component variables of the first language include null pointers, the first language itself does not have high security. In that case, regardless of whether the multiple second languages have a nullable attribute (that is, whether the component variables of the unified abstract representation include null pointers), the security of the first language is not further compromised, so one can choose not to use runtime conversion code when compiling the first language code.
  • the determination method of the processing means of the first runtime conversion processing and the processing means of the second runtime conversion processing may be fixed and pre-set, or may be determined in real time by the developer. This application is not limited to this.
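  • The choice among these processing means can be summarized as a small decision function. This is a sketch under assumptions: the function name `choose_conversion`, the string labels, and the `prefer_option` flag (standing in for a preset or developer-chosen preference) are hypothetical, and the case where neither side is nullable is not discussed in the text, so returning "none" for it is an assumption.

```python
def choose_conversion(unified_nullable: bool, first_nullable: bool,
                      prefer_option: bool = False) -> str:
    """Pick a processing means from the nullable attributes of both sides."""
    if first_nullable:
        # The first language already admits null, so no runtime conversion is needed.
        return "none"
    if not unified_nullable:
        # Assumption: nothing to guard against when no null value can arrive.
        return "none"
    # Null may arrive and the first language cannot hold it:
    # either throw at the boundary (first) or return an optional value (second).
    return "second" if prefer_option else "first"
```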
  • compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: when the unified abstract representation and the interoperable boundary information of the first language contain constituent elements with the same name but different syntax, adding, to the constituent element of the first language, a mark corresponding to the constituent element of the unified abstract representation, and obtaining and outputting the binary code of the first language code; the mark indicates that, when the marked constituent element in the first language code is executed, the syntax of the constituent element of the unified abstract representation is used.
  • That is, grammatical difference mark processing can be performed at compile time to obtain and output the binary code.
  • With grammatical difference mark processing, when the first language code is compiled, a constituent element is compiled according to the syntax of the second language corresponding to the mark it carries. In other words, when a marked constituent element is executed, the syntax of the corresponding constituent element in the unified abstract representation is used.
  • For example, when the second language is the Java language, generics in the Java language are implemented by generic erasure, so the semantic processing of generics in the second language and in the first language can conflict significantly.
  • The generic constituent elements in the unified abstract representation can be marked (the mark is stored in the attributes attribute). To call a generic class, interface, or the like of the Java language from the first language, explicit marking syntax (such as annotations or macros) can be added in the first language; that is, a mark corresponding to the constituent element of the unified abstract representation is added to the constituent element of the first language to distinguish whether a generic class belongs to the first language or the second language. For example, marking a generic constituent element with @java indicates that the generics in this constituent element are implemented using generic erasure.
  • the syntax-difference mark is described below with an example of first language code.
  • @java marks the constituent element class B in the first language.
  • class B inherits a generic type from Java and overrides the generic function foo.
  • the generic parameter T can be handled by the generic erasure of the Java language, and the @java mark distinguishes this from the first language's own handling of generic functions. That is, a mark (such as @java) corresponding to the constituent element of the unified abstract representation is added to the constituent element of the first language, so that when the marked constituent element is executed, the syntax (such as generic erasure) of the constituent element of the unified abstract representation is implemented.
  • the syntax-difference mark can be used to distinguish all kinds of large semantic differences; the generics of a second language (the Java language) and of the first language are used here only as an example. Those skilled in the art should understand that the syntax-difference mark also applies to constituent elements of other second languages, such as the C language and the JS/TS language, and of the first language.
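The effect of a mark such as @java can be illustrated with a toy model. The following is a minimal sketch, not the compiler described in the application: `GenericDecl` and `lower_generic` are hypothetical names, and the sketch assumes that unmarked generics of the first language are specialized per instantiation, which the text above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenericDecl:
    """A generic class declaration as seen by a hypothetical first-language compiler."""
    name: str
    type_params: tuple[str, ...]
    marks: frozenset[str] = frozenset()


def lower_generic(decl: GenericDecl, type_args: tuple[str, ...]) -> str:
    """Return the name of the concrete class the toy compiler would emit."""
    if len(type_args) != len(decl.type_params):
        raise ValueError("wrong number of type arguments")
    if "java" in decl.marks:
        # Marked element: follow the second language's syntax (generic erasure),
        # so every instantiation shares one emitted class.
        return decl.name
    # Unmarked element: assumed here to be specialized per instantiation.
    return f"{decl.name}<{', '.join(type_args)}>"
```

In this model, `lower_generic(GenericDecl("B", ("T",), frozenset({"java"})), ("Int",))` yields the erased name `B`, while an unmarked declaration keeps its type argument in the emitted name.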
  • the binary code of the first language code (such as bytecode) compiled from the first language code and the binary code of the second language code compiled from the multiple kinds of second language code can be input to the virtual machine and run together, which realizes interoperability between the first language and the multiple second languages.
  • Fig. 8 shows an example of acquiring the binary code of the first language code and running the binary code of the first language code according to the embodiment of the present application. Wherein, if the second language includes a relatively basic language such as C language, the library file of the corresponding language can be directly used when the virtual machine is running.
  • the first language can generally be given basic language interoperability at an implementation cost of 3 person-years, compared with an average of more than 40 person-years in the prior art, so the language interoperability method of the embodiments of the present application has a low cost and a small workload; one unified abstract representation represents the interoperability boundary information of multiple second languages, which greatly reduces maintenance cost; changes to the features of a second language have minimal impact on the unified abstract representation, and focusing only on interoperability boundary information also improves the stability of the solution.
  • the language interoperability method can still be used; the interoperability boundary information can be extended at any time as the capabilities of the first language grow, and only the unified abstract representation needs to be extended at the feature level. For example, adding array interoperability for the C language does not affect the existing interoperability functions of other parts; the steps for users are reduced to two, namely writing code in the first language and issuing the compilation instruction, and the processing means used when compiling the first language code are concise and clear, which gives users a better experience.
  • Fig. 9 shows a schematic structural diagram of an exemplary language interoperability device according to an embodiment of the present application.
  • an embodiment of the present application provides a language interoperability device, the device including a compiler 90 configured to: acquire first language code and multiple kinds of second language code; generate, according to the multiple kinds of second language code, a unified abstract representation of the interoperability boundary information of multiple second languages, where the unified abstract representation is the binary code of the interoperability boundary information of the multiple second languages, and the interoperability boundary information of the multiple second languages indicates the constituent elements of the multiple second languages that allow mutual access or use with the first language; and compile the first language code according to the unified abstract representation to obtain and output the binary code of the first language code, where the binary code of the first language code, when executed, enables the constituent elements in the first language code and the constituent elements of any of the multiple second languages to access or use each other.
  • generating, according to the multiple kinds of second language code, a unified abstract representation of the interoperability boundary information of multiple second languages includes: identifying the interoperability boundary information of the multiple second languages according to the multiple kinds of second language code; and generating the unified abstract representation according to the interoperability boundary information of the multiple second languages.
  • the interoperability boundary information of the multiple second languages includes at least one repeated constituent element and at least one unique constituent element; the at least one repeated constituent element is a constituent element that appears repeatedly among the constituent elements included in the interoperability boundary information of the multiple second languages, and the at least one unique constituent element is a constituent element that appears only once among them.
  • the interoperability boundary information of the multiple second languages includes a common part and a unique part; each constituent element in the common part corresponds to at least two of the multiple second languages, and each constituent element in the unique part corresponds to exactly one of the multiple second languages.
  • compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: obtaining, according to the differences between the unified abstract representation and the interoperability boundary information of the first language, processing means for handling the unified abstract representation and the semantics of the first language, where the interoperability boundary information of the first language is determined according to the first language code; and using the processing means when compiling the first language code, to obtain and output the binary code of the first language code.
  • the processing means include mapping processing, where the mapping processing is: for constituent elements in the first language code that have the same memory as, but different names from, those of the unified abstract representation, compiling them according to the data type of the corresponding memory in a mapping relationship; the mapping relationship indicates the correspondence between the constituent elements of the unified abstract representation, the constituent elements of the interoperability boundary information of the first language, and data types of different memory.
  • the constituent element variables of the unified abstract representation include a null pointer and the constituent element variables of the first language do not include a null pointer; the processing means include first runtime conversion processing, where the first runtime conversion processing is: when runtime conversion code determines that the runtime is null, throwing the currently compiled constituent element variable as an exception value.
  • the constituent element variables of the unified abstract representation include a null pointer and the constituent element variables of the first language do not include a null pointer; the processing means include second runtime conversion processing, where the second runtime conversion processing is: when runtime conversion code determines that the runtime is null, returning the null value of an optional constituent element.
  • compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: when constituent elements with the same name but different syntax exist in the unified abstract representation and the interoperability boundary information of the first language, adding, to the constituent element of the first language, a mark corresponding to the constituent element of the unified abstract representation, and obtaining and outputting the binary code of the first language code; the mark indicates that, when the marked constituent element in the first language code is executed, the syntax of the constituent element of the unified abstract representation is implemented.
  • the interoperability boundary information of different second languages is not completely the same.
  • An embodiment of the present application provides a language interoperability device, including: a processor and a memory for storing instructions executable by the processor; wherein the processor is configured to implement the above method when executing the instructions.
  • An embodiment of the present application provides a non-volatile computer-readable storage medium, on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the foregoing method is realized.
  • An embodiment of the present application provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in an electronic device, a processor in the electronic device executes the above method.
  • Fig. 10 shows an exemplary structural diagram of a language interoperability device according to an embodiment of the present application.
  • the language interoperability device may include at least one of a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), an augmented reality (AR) device, a virtual reality (VR) device, an artificial intelligence (AI) device, a wearable device, a vehicle-mounted device, a smart home device, a smart city device, and a server device.
  • the embodiment of the present application does not specifically limit the specific type of the language interoperability device.
  • the language interoperability device may include a processor 110 and a memory 121 . It can be understood that the structure shown in the embodiment of the present application does not constitute a specific limitation on the language interoperability device. In other embodiments of the present application, the language interoperability device may include more or fewer components than shown in the illustration, or combine certain components, or separate certain components, or arrange different components. The illustrated components can be realized in hardware, software or a combination of software and hardware.
  • the processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.
  • the processor 110 can generate an operation control signal according to the instruction operation code and the timing signal, and complete the control of obtaining and executing the instruction.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 may be a cache memory.
  • the memory may store instructions or data that the processor 110 has recently used or uses frequently. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory, which avoids repeated access, reduces the waiting time of the processor 110, and thus improves the efficiency of the system.
  • the memory 121 may be used to store computer-executable program code including instructions.
  • the memory 121 may include an area for storing programs and an area for storing data.
  • the stored program area can store an operating system, an application program (such as a processing means) required by at least one function, and the like.
  • the storage data area can store data created during the use of the language interoperability device (such as unified abstract representation) and the like.
  • the memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
  • the processor 110 executes various functional methods of the language interoperability device or the above-mentioned language interoperability methods by executing the instructions stored in the memory 121 and/or the instructions stored in the memory provided in the processor.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • A non-exhaustive list of computer-readable storage media includes: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random-access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital video disc (DVD), memory sticks, floppy disks, mechanically encoded devices such as punched cards or raised structures in grooves with instructions stored thereon, and any suitable combination of the foregoing.
  • Computer readable program instructions or codes described herein may be downloaded from a computer readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, local area network, wide area network, and/or wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device .
  • Computer program instructions for performing the operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, via the Internet using an Internet service provider).
  • electronic circuits, such as programmable logic circuits, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA), can execute computer-readable program instructions, thereby implementing various aspects of the present application.
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium; these instructions cause a computer, a programmable data processing apparatus and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions includes an article of manufacture comprising instructions that implement various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions that includes one or more executable instructions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with hardware (such as circuits or an application-specific integrated circuit (ASIC)), or with a combination of hardware and software, such as firmware.

Abstract

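A language interoperability method, device, storage medium and program product. The method includes: acquiring first language code and multiple kinds of second language code (S1); generating, according to the multiple kinds of second language code, a unified abstract representation of the interoperability boundary information of multiple second languages (S2), where the unified abstract representation is the binary code of the interoperability boundary information of the multiple second languages, and the interoperability boundary information of the multiple second languages indicates the constituent elements of the multiple second languages that allow mutual access or use with the first language; and compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code (S3). The method can provide a programming language with language interoperability while reducing the cost and operational difficulty of implementing language interoperability and improving the extensibility of the programming language's interoperability.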

Description

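Language interoperability method, device, storage medium and program product
This application claims priority to Chinese Patent Application No. 202111200966.6, entitled "Language interoperability method, device, storage medium and program product", filed with the China Patent Office on October 14, 2021, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of computer programming, and in particular to a language interoperability method, device, storage medium and program product.
Background
With the development of computer technology, the variety of programming languages keeps growing. As a development tool, a programming language is usually designed for the development needs of a specific field or industry. For example, the Java language is mostly used in IT areas such as enterprise software development, Android mobile development, and big data and cloud computing; the Python language is commonly used in graphics processing, scientific computing, web programming, multimedia applications and engine development, machine learning, artificial intelligence, and other fields. In other words, each programming language excels in a different area. Therefore, an appropriate programming language needs to be selected for each development environment to meet the development requirements.
When one programming language cannot meet the development requirements of a specific environment, or when other programming languages have advantages over it for part of those requirements, that language can be combined with other programming languages to meet the requirements together. The concept of language interoperability was proposed for this purpose. Language interoperability is the ability of different programming languages to interoperate as parts of the same system. Existing language interoperability methods do give programming languages this ability, but their shortcomings are obvious: first, development and maintenance costs are high; second, they are difficult to extend to interoperability among multiple languages; third, using a programming language for interoperation requires multiple steps and is complicated to operate, which degrades the developer experience.
In view of this, how to provide a programming language with language interoperability while reducing the cost and operational difficulty of implementing it, and improving the extensibility of that interoperability, has become a research hotspot in this field.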
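Summary
In view of this, a language interoperability method, device, storage medium and program product are proposed. The language interoperability method according to the embodiments of the present application can provide a programming language with language interoperability while reducing the cost and operational difficulty of implementing language interoperability and improving the extensibility of the programming language's interoperability.
In a first aspect, an embodiment of the present application provides a language interoperability method, the method including: acquiring first language code and multiple kinds of second language code; generating, according to the multiple kinds of second language code, a unified abstract representation of the interoperability boundary information of multiple second languages, where the unified abstract representation is the binary code of the interoperability boundary information of the multiple second languages, and the interoperability boundary information of the multiple second languages indicates the constituent elements of the multiple second languages that allow mutual access or use with the first language; and compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code, where the binary code of the first language code, when executed, enables the constituent elements in the first language code and the constituent elements of any of the multiple second languages to access or use each other.
According to the language interoperability method of the embodiments of the present application, by acquiring multiple kinds of second language code, a unified abstract representation of the interoperability boundary information of the multiple second languages can be generated from that code. The unified abstract representation is the binary code of the interoperability boundary information of the multiple second languages, so it reflects the constituent elements of the multiple second languages that allow mutual access or use with the first language. The first language code can be compiled according to the unified abstract representation to obtain the binary code of the first language code. When executed, this binary code enables the constituent elements in the first language code and those in the multiple kinds of second language code to access or use each other; that is, the first language gains interoperability with the multiple second languages. The unified abstract representation can be obtained from the interoperability boundary information of the multiple second languages; it does not involve the internal methods of the constituent elements and does not require parsing the full syntax of the multiple second languages, so the method has a low development cost. Updates to the internal methods of the constituent elements of the languages do not affect the interoperability boundary information of the multiple second languages, and therefore do not affect the unified abstract representation, which does not need to be maintained as a result, so the method has a low maintenance cost. Adding a second language may add constituent elements but does not affect the existing ones, and therefore does not affect the existing content of the unified abstract representation, which makes it easy to further extend the interoperability between the first language and the multiple second languages. A developer only needs to write the first language code and start the language interoperability method, which reduces the developer's workload and the operational difficulty of the method. In summary, the language interoperability method according to the embodiments of the present application can provide a programming language with language interoperability while reducing the cost and operational difficulty of implementing language interoperability and improving the extensibility of the programming language's interoperability.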
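According to the first aspect, in a first possible implementation of the language interoperability method, generating, according to the multiple kinds of second language code, the unified abstract representation of the interoperability boundary information of the multiple second languages includes: identifying the interoperability boundary information of the multiple second languages according to the multiple kinds of second language code; and generating the unified abstract representation according to the interoperability boundary information of the multiple second languages.
In this way, the unified abstract representation can be obtained directly by processing the interoperability boundary information, which avoids analyzing large amounts of syntax of the multiple second languages and improves the efficiency of the language interoperability method according to the embodiments of the present application.
According to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the language interoperability method, the interoperability boundary information of the multiple second languages includes at least one repeated constituent element and at least one unique constituent element; the at least one repeated constituent element is a constituent element that appears repeatedly among the constituent elements included in the interoperability boundary information of the multiple second languages, and the at least one unique constituent element is a constituent element that appears only once among them.
In this way, the number of constituent elements in the interoperability boundary information can be reduced, so that the unified abstract representation of the interoperability boundary information occupies less memory and the subsequent compilation of the first language code according to the unified abstract representation is less complex.
According to the first aspect or the first possible implementation of the first aspect, in a third possible implementation of the language interoperability method, the interoperability boundary information of the multiple second languages includes a common part and a unique part; each constituent element in the common part corresponds to at least two of the multiple second languages, and each constituent element in the unique part corresponds to exactly one of the multiple second languages.
In this way, each constituent element in the common part corresponds to at least two of the multiple second languages and each constituent element in the unique part corresponds to exactly one of them, so the common part and the unique part can accurately characterize the features of the multiple second languages.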
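The split into a common part and a unique part can be sketched with plain set operations. This is a minimal illustration under stated assumptions: the function name `partition_boundary`, the input shape (a mapping from language name to the set of element names it exposes), and the sample element names are all hypothetical and are not taken from the application.

```python
from collections import defaultdict


def partition_boundary(
    boundary: dict[str, set[str]],
) -> tuple[dict[str, set[str]], dict[str, str]]:
    """Split constituent elements into a common part and a unique part.

    `boundary` maps a second-language name to the element names it exposes.
    """
    owners: dict[str, set[str]] = defaultdict(set)
    for language, elements in boundary.items():
        for element in elements:
            owners[element].add(language)
    # Common part: elements that correspond to at least two second languages.
    common = {e: langs for e, langs in owners.items() if len(langs) >= 2}
    # Unique part: elements that correspond to exactly one second language.
    unique = {e: next(iter(langs)) for e, langs in owners.items() if len(langs) == 1}
    return common, unique
```

Storing each repeated element once, together with the set of languages it belongs to, is one way to obtain the reduction in the number of constituent elements that the text describes.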
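According to the first aspect or any of the above possible implementations of the first aspect, in a fourth possible implementation of the language interoperability method, compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: obtaining, according to the differences between the unified abstract representation and the interoperability boundary information of the first language, processing means for handling the unified abstract representation and the semantics of the first language, where the interoperability boundary information of the first language is determined according to the first language code; and using the processing means when compiling the first language code, to obtain and output the binary code of the first language code.
The semantics of the first language describe the behavior exhibited when a computer executes a program written in the first language, such as logical operations and reading and writing data. The processing means for handling the unified abstract representation and the semantics of the first language may be a means of fusing the two. For example, a constituent element of the unified abstract representation and a constituent element of the first language may be "fused" into a single constituent element that takes the constituent element's memory as its name. As another example, a mark may be added to a constituent element of the first language to indicate that the parameters in that element are implemented in the manner of the constituent element of the unified abstract representation, thereby "fusing" the semantics of the constituent element of the unified abstract representation into the first language.
In this way, the handling of the unified abstract representation and the semantics of the first language can be completed while the first language code is compiled, so that first language code that accesses or uses constituent elements of a second language compiles successfully. If the processing means are preset, the developer does not need to give further instructions during compilation, which further reduces the difficulty of the developer's work. If the processing means are determined by the developer in real time, the handling of the unified abstract representation and the semantics of the first language during compilation becomes more flexible.
According to the fourth possible implementation of the first aspect, in a fifth possible implementation of the language interoperability method, the processing means include mapping processing, where the mapping processing is: for constituent elements in the first language code that have the same memory as, but different names from, those of the unified abstract representation, compiling them according to the data type of the corresponding memory in a mapping relationship; the mapping relationship indicates the correspondence between the constituent elements of the unified abstract representation, the constituent elements of the interoperability boundary information of the first language, and data types of different memory.
For a constituent element in the first language code, compilation only needs to consider the memory it requires, not its name. Therefore, the constituent elements of the first language code that have the same memory as, but different names from, those of the unified abstract representation can be collected in advance, and a mapping from constituent elements to different memory can be determined from the memory of these elements. When the first language code is compiled, the memory of a constituent element can then be determined directly from the mapping relationship, which realizes the semantic handling of constituent elements in the first language code that have the same memory as, but different names from, those of the unified abstract representation.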
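The mapping relationship can be pictured as a lookup table keyed by element name. The sketch below is illustrative only: the type spellings (`int`, `Int32`, and so on), the `MemoryType` tuple, and the helper names are assumptions chosen for the example, not names defined by the application.

```python
from typing import NamedTuple


class MemoryType(NamedTuple):
    kind: str  # "int" or "float" in this toy table
    bits: int


# Hypothetical mapping relationship: names used by the unified abstract
# representation and names used by the first language both resolve to a
# memory data type.
MAPPING: dict[str, MemoryType] = {
    "int": MemoryType("int", 32),        # second-language spelling (assumed)
    "Int32": MemoryType("int", 32),      # first-language spelling (assumed)
    "long": MemoryType("int", 64),
    "Int64": MemoryType("int", 64),
    "double": MemoryType("float", 64),
    "Float64": MemoryType("float", 64),
}


def same_memory(name_a: str, name_b: str) -> bool:
    """True when two differently named elements share one memory data type."""
    return MAPPING[name_a] == MAPPING[name_b]


def lower_type(name: str) -> MemoryType:
    """Compile-time lookup: the element is compiled as its mapped memory type."""
    return MAPPING[name]
```

With such a table, a first-language `Int32` and a second-language `int` compile to the same 32-bit integer layout even though their names differ.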
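According to the fourth or fifth possible implementation of the first aspect, in a sixth possible implementation of the language interoperability method, the constituent element variables of the unified abstract representation include a null pointer and the constituent element variables of the first language do not include a null pointer; the processing means include first runtime conversion processing, where the first runtime conversion processing is: when runtime conversion code determines that the runtime is null, throwing the currently compiled constituent element variable as an exception value.
A constituent element variable is an instance variable of a constituent element. Because whether the runtime is null during compilation is associated with whether the variable of the currently compiled constituent element is a null pointer, using runtime conversion code to determine that the runtime is null indirectly determines that the currently compiled constituent element variable is a null pointer. The safety of the first language after the variable is passed into first language code is thus established before the variable enters that code. By throwing the currently compiled constituent element variable as an exception value when the runtime is determined to be null, the anomaly is not passed into the first language, which guarantees the safety of the first language.
According to any one of the fourth to sixth possible implementations of the first aspect, in a seventh possible implementation of the language interoperability method, the constituent element variables of the unified abstract representation include a null pointer and the constituent element variables of the first language do not include a null pointer; the processing means include second runtime conversion processing, where the second runtime conversion processing is: when runtime conversion code determines that the runtime is null, returning the null value of an optional constituent element.
Because whether the runtime is null during compilation is associated with whether the variable of the currently compiled constituent element is a null pointer, using runtime conversion code to determine that the runtime is null indirectly determines that the currently compiled constituent element variable is a null pointer. The safety of the first language after the variable is passed into first language code is thus established before the variable enters that code. By returning the null value of an optional constituent element when the runtime is determined to be null, no exception is produced when the first language code is compiled, which guarantees the safety of the first language.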
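The two runtime conversion strategies can be contrasted in a few lines. This is a hedged sketch rather than the conversion code of the application: Python's `None` stands in for a null pointer arriving from a second language, and `NullFromForeignCode`, `Some`, `Nothing`, `convert_or_throw` and `convert_to_option` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar, Union

T = TypeVar("T")


class NullFromForeignCode(Exception):
    """Raised when a foreign value is null but the target type is non-nullable."""


@dataclass(frozen=True)
class Some(Generic[T]):
    """Optional element that holds a value."""
    value: T


@dataclass(frozen=True)
class Nothing:
    """The null value of an optional element."""


def convert_or_throw(value: Optional[T]) -> T:
    """First strategy: reject a null before it enters first-language code."""
    if value is None:
        raise NullFromForeignCode("foreign value is null")
    return value


def convert_to_option(value: Optional[T]) -> Union[Some[T], Nothing]:
    """Second strategy: wrap the value so that null becomes an explicit empty option."""
    return Nothing() if value is None else Some(value)
```

Both conversions run at the boundary, so code inside the first language never observes a raw null: it either sees an exception or an explicit optional value.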
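According to the first aspect or any one of the first to third possible implementations of the first aspect, in an eighth possible implementation of the language interoperability method, compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: when constituent elements with the same name but different syntax exist in the unified abstract representation and the interoperability boundary information of the first language, adding, to the constituent element of the first language, a mark corresponding to the constituent element of the unified abstract representation, and obtaining and outputting the binary code of the first language code; the mark indicates that, when the marked constituent element in the first language code is executed, the syntax of the constituent element of the unified abstract representation is implemented.
In this way, the syntax that the currently compiled constituent element should use can be indicated at compile time, which avoids being unable to choose when a constituent element of the same name corresponds to multiple syntaxes of the first language and the second language. This extends the interoperability supported by the language interoperability method of the embodiments of the present application.
According to the first aspect, in a ninth possible implementation of the language interoperability method, the interoperability boundary information of different second languages is not completely identical.
In this way, the interoperability boundary information of the multiple second languages can include at least one unique constituent element, so that the unique part of the interoperability boundary information of the multiple second languages can be obtained, and thus the interoperability boundary information of the multiple second languages can be obtained.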
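In a second aspect, an embodiment of the present application provides a language interoperability device, the device including a compiler configured to: acquire first language code and multiple kinds of second language code; generate, according to the multiple kinds of second language code, a unified abstract representation of the interoperability boundary information of multiple second languages, where the unified abstract representation is the binary code of the interoperability boundary information of the multiple second languages, and the interoperability boundary information of the multiple second languages indicates the constituent elements of the multiple second languages that allow mutual access or use with the first language; and compile the first language code according to the unified abstract representation to obtain and output the binary code of the first language code, where the binary code of the first language code, when executed, enables the constituent elements in the first language code and the constituent elements of any of the multiple second languages to access or use each other.
According to the second aspect, in a first possible implementation of the language interoperability device, generating, according to the multiple kinds of second language code, the unified abstract representation of the interoperability boundary information of the multiple second languages includes: identifying the interoperability boundary information of the multiple second languages according to the multiple kinds of second language code; and generating the unified abstract representation according to the interoperability boundary information of the multiple second languages.
According to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the language interoperability device, the interoperability boundary information of the multiple second languages includes at least one repeated constituent element and at least one unique constituent element; the at least one repeated constituent element is a constituent element that appears repeatedly among the constituent elements included in the interoperability boundary information of the multiple second languages, and the at least one unique constituent element is a constituent element that appears only once among them.
According to the second aspect or the first possible implementation of the second aspect, in a third possible implementation of the language interoperability device, the interoperability boundary information of the multiple second languages includes a common part and a unique part; each constituent element in the common part corresponds to at least two of the multiple second languages, and each constituent element in the unique part corresponds to exactly one of the multiple second languages.
According to the second aspect or any of the above possible implementations of the second aspect, in a fourth possible implementation of the language interoperability device, compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: obtaining, according to the differences between the unified abstract representation and the interoperability boundary information of the first language, processing means for handling the unified abstract representation and the semantics of the first language, where the interoperability boundary information of the first language is determined according to the first language code; and using the processing means when compiling the first language code, to obtain and output the binary code of the first language code.
According to the fourth possible implementation of the second aspect, in a fifth possible implementation of the language interoperability device, the processing means include mapping processing, where the mapping processing is: for constituent elements in the first language code that have the same memory as, but different names from, those of the unified abstract representation, compiling them according to the data type of the corresponding memory in a mapping relationship; the mapping relationship indicates the correspondence between the constituent elements of the unified abstract representation, the constituent elements of the interoperability boundary information of the first language, and data types of different memory.
According to the fourth or fifth possible implementation of the second aspect, in a sixth possible implementation of the language interoperability device, the constituent element variables of the unified abstract representation include a null pointer and the constituent element variables of the first language do not include a null pointer; the processing means include first runtime conversion processing, where the first runtime conversion processing is: when runtime conversion code determines that the runtime is null, throwing the currently compiled constituent element variable as an exception value.
According to any one of the fourth to sixth possible implementations of the second aspect, in a seventh possible implementation of the language interoperability device, the constituent element variables of the unified abstract representation include a null pointer and the constituent element variables of the first language do not include a null pointer; the processing means include second runtime conversion processing, where the second runtime conversion processing is: when runtime conversion code determines that the runtime is null, returning the null value of an optional constituent element.
According to the second aspect or any one of the first to third possible implementations of the second aspect, in an eighth possible implementation of the language interoperability device, compiling the first language code according to the unified abstract representation to obtain and output the binary code of the first language code includes: when constituent elements with the same name but different syntax exist in the unified abstract representation and the interoperability boundary information of the first language, adding, to the constituent element of the first language, a mark corresponding to the constituent element of the unified abstract representation, and obtaining and outputting the binary code of the first language code; the mark indicates that, when the marked constituent element in the first language code is executed, the syntax of the constituent element of the unified abstract representation is implemented.
According to the second aspect, in a ninth possible implementation of the language interoperability device, the interoperability boundary information of different second languages is not completely identical.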
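In a third aspect, an embodiment of the present application provides a language interoperability device, including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to execute the language interoperability method of the first aspect or of one or more of the possible implementations of the first aspect.
In a fourth aspect, an embodiment of the present application provides a non-volatile computer-readable storage medium storing computer program instructions that, when executed by a processor, implement the language interoperability method of the first aspect or of one or more of the possible implementations of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in an electronic device, a processor in the electronic device executes the language interoperability method of the first aspect or of one or more of the possible implementations of the first aspect.
These and other aspects of the present application will be clearer and easier to understand from the following description of the embodiment(s).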
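Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the present application together with the specification, and serve to explain the principles of the present application.
Fig. 1 shows an example of implementing interoperability between the Kotlin language and the Java language in the prior art.
Fig. 2 shows an example of implementing interoperability between the Kotlin language and the C language in the prior art.
Fig. 3 shows a schematic diagram of the architecture of the GraalVM multi-language virtual machine in the prior art.
Fig. 4 shows an exemplary application scenario of the language interoperability method according to an embodiment of the present application.
Fig. 5 shows an exemplary schematic diagram of the language interoperability method according to an embodiment of the present application.
Fig. 6 shows an example of generating a unified abstract representation of the interoperability boundary information of multiple second languages according to an embodiment of the present application.
Fig. 7 shows an example of the binary format of the unified abstract representation according to an embodiment of the present application.
Fig. 8 shows an example of acquiring the binary code of the first language code and running the binary code of the first language code according to an embodiment of the present application.
Fig. 9 shows a schematic structural diagram of an exemplary language interoperability device according to an embodiment of the present application.
Fig. 10 shows an exemplary structural diagram of a language interoperability device according to an embodiment of the present application.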
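Detailed Description
Various exemplary embodiments, features and aspects of the present application are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" as used here means "serving as an example, embodiment, or illustration". Any embodiment described here as "exemplary" is not necessarily to be construed as preferred over or better than other embodiments.
In addition, numerous specific details are given in the following detailed description to better explain the present application. Those skilled in the art should understand that the present application can be practiced without some of these details. In some instances, methods, means, elements and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present application.
Terms that may appear herein are explained below.
Constituent element: an element that makes up programming language code. Constituent elements may include at least one of classes, interfaces, functions, data formats, and other content in code; for example, they may include data formats of multiple types such as integer, floating-point and Boolean types, and/or multiple types of classes, interfaces, functions, and so on.
Host language, target language: when language A is given interoperability so that it can interoperate with languages B, C, D and so on, that is, a class written in language A can communicate directly with classes written in languages B/C/D, language A is called the host language and languages B, C, D and so on are called target languages.
Generic erasure: during compilation, all generic information (the actual types of the parameters used by instances of generic constituent elements, that is, the concrete types of classes/interfaces/functions/data structures and so on) is erased. Taking Java generics as an example, a generic type such as List<String> becomes List after compilation; the Java virtual machine cannot obtain the additional generic type information <String> and can only obtain List.
Generic instantiation: similar to the approach used by C++ templates; the process of producing an instance of a constituent element (for example, an instance of a class) for a specific type of a template is called instantiation.
Bytecode: usually refers to intermediate code that has been compiled but is independent of the machine code (code that a computer can execute directly) of the current environment, and that must be translated by an interpreter (or virtual machine) before it becomes machine code. Bytecode is usually generated by a compiler. A typical example is Java bytecode; languages such as Java and Kotlin generally support compilation to Java bytecode.
Garbage collection: a form of automatic memory management in which a garbage collector tries to reclaim memory that the program has allocated but no longer uses; because this memory is no longer referenced, it is called garbage.
Application binary interface (ABI): a set of rules followed by the compiler and the linker, including calling conventions, name mangling, and so on. Calling conventions specify how functions are translated to assembly and how they are called; name mangling describes how functions are exposed.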
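A language interoperability method proposed in the prior art is described below.
The principle of this prior-art method is to design a specific interoperability mechanism for a specific pair of programming languages, so that the two languages can interoperate with each other. The interoperability of the Kotlin language with the Java language and of the Kotlin language with the C language is described below as examples.
Kotlin is a static programming language for modern multiplatform applications. It includes multiple back ends responsible for compilation optimization and target code generation, such as the Java virtual machine (JVM) back end, the Native back end, and the JavaScript back end. The back ends run in different environments and have different capabilities, so they cannot interoperate with one another. On the JVM back end, the Kotlin language can interoperate with the Java language and, by further using the Java Native Interface (JNI), with the C language.
Fig. 1 shows an example of implementing interoperability between the Kotlin language and the Java language in the prior art.
As shown in Fig. 1, a developer programs in the Kotlin language to obtain a Kotlin source file (.kt), and the program calls Java methods, for example classes or interfaces of the Java language. In this case, the following steps are performed in order when the Kotlin source file is compiled:
Step 1: the Kotlin compiler (kotlinc) parses the Java source file (.java) and the Kotlin source file to determine the Java methods called in the Kotlin source file; the Kotlin compiler then compiles the Kotlin source file to obtain a bytecode file (.class).
Step 2: the Java compiler (javac) compiles the bytecode file generated by the Kotlin compiler together with the Java source file to obtain a bytecode file (.class).
Step 3: the bytecode file generated by the Kotlin compiler and the bytecode file generated by the Java compiler are packaged together into a Java archive (.jar), which the Java virtual machine (JVM) runs. In this case, the Kotlin language and the Java language can interoperate.
As the description of Fig. 1 shows, interoperability between Kotlin and Java is achieved mainly by having the Kotlin compiler parse Java source files directly, so that it can determine the Java methods called in the Kotlin source file and produce the corresponding bytecode file. This process is a new mechanism introduced in the Kotlin compiler specifically for interoperability between Kotlin and Java; it can only be used for that purpose and cannot be reused to implement interoperability between Kotlin and other languages.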
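Fig. 2 shows an example of implementing interoperability between the Kotlin language and the C language in the prior art.
Interoperability between the Kotlin language and the C language can be achieved through the Java Native Interface (JNI). JNI is a standard Java virtual machine interface that can be used to create, inspect and update Java objects, call Java methods, and so on. It acts as a bridge between the Java virtual machine and C/C++, so that the languages supported by the Java virtual machine gain interoperability with C/C++. As shown in Fig. 2, for the Kotlin language to operate on the C language, the developer needs to use the application programming interface (API) provided by JNI to wrap, in JNI middle-layer code (.c or .cpp), the methods to be operated through JNI, so that the wrapped C/C++ methods can be called when the Kotlin code executes. For the C language to operate on the Kotlin language, the corresponding capabilities of the API need to be wrapped in the underlying JNI code, for example reducing the impact on garbage collection of the call steps required for interoperation, so that Kotlin/Java methods can be called through the API when the C code executes. In this way, the Kotlin language and the C language gain interoperability.
Writing the code and obtaining a Kotlin source file that calls Java methods may be the developer's first step in implementing interoperability between Kotlin and C. To complete compilation of the Kotlin source file, the developer also needs to perform the following operations in the development environment:
Step 2: compile the Kotlin source file into a bytecode file. Step 3: generate a C header file from the bytecode file. Step 4: write the JNI-related code according to the C header file. Step 5: link the code obtained in step 4 with library files to obtain an executable file.
Thus, the JNI mechanism used for interoperability between Kotlin and C has a complicated workflow that requires multiple steps. Other existing mechanisms for interoperability between Kotlin and C, such as the Java Native Access (JNA) framework, simplify the steps of using JNI, but at the cost of reduced interoperability performance.
From Fig. 1 and Fig. 2 it follows that, with this prior-art method, a language that is to interoperate with several other languages needs a dedicated interoperability mechanism for each of them, so extending a language's interoperability by adding interoperable partners is too costly to be practical. Moreover, developers who design the interoperability mechanisms must master every programming language involved, which places high demands on their individual skills, and developers who use an interoperable programming language for interoperation programming need many steps, which degrades their experience. In addition, after the code is written, the compiler has to implement it, and for the compiler this method is costly to implement and complex to maintain.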
下面介绍现有技术提出的另一种语言互操作方法。
现有技术提出的另一种语言互操作方法的原理是,提供一个通用的虚拟机来运行各种语言代码,可以为能够运行的全部代码中的任意代码组合之间提供语言互操作能力。下面以GraalVM多语言虚拟机为例对其实现多语言互操作的机制进行描述。
图3示出现有技术的GraalVM多语言虚拟机架构示意图。
GraalVM是一个在Java HotSpot虚拟机基础上增强而成的跨语言虚拟机,可以作为多种编程语言的运行平台使用。如图3所示,多种编程语言包括了Java、Scala、Groovy等基于Java虚拟机的语言,还包括了C、C++等基于低级虚拟机(low level virtual machine,LLVM)的语言,同时支持其他像JavaScript、Ruby和R语言等。GraalVM可以混合使用这些编程语 言,支持不同语言中混用对方的接口和对象,也能够支持这些语言使用已经编写好的本地库文件。
Java HotSpot虚拟机通过JVM编译器接口(JVM compiler interface,JVMCI)将编译请求分发给Graal编译器,Graal编译器又通过JVM编译器接口响应Java HotSpot虚拟机发出的编译请求,在编译期间,高级操作(比如加载Java字段)的中间码会被转换为底层操作(比如读取地址+偏移量处的数据)的中间码。而底层操作的中间码最终会被翻译为机器码。
GraalVM的底层是Java HotSpot虚拟机,因此可以直接运行Java语言、Scala语言、groovy语言等基于Java虚拟机的语言。非JVM语言,即C、C++、JavaScript、Ruby和R语言等可以通过Truffle框架来实现在Java HotSpot虚拟机上运行。Truffle框架作为GraalVM多语言互操作机制中的重要组成部分,是一个基于Java的语言实现框架,基于Truffle的语言实现需要使用Java实现语言的词法分析、语法分析以及针对语法分析生成抽象语法树(abstract syntax tree,AST)的解释执行器。基于Truffle的语言实现本身以及Truffle框架均采用Java实现,因此可以运行在任何Java虚拟机JVM上。对于JavaScript、Ruby和R语言等编程语言,采用Truffle框架可以分别得到对应的解释执行器。对于C、C++语言等,采用Truffle框架可以得到称为Sulong的高性能LLVM字节码解释器。Java HotSpot虚拟机将调用Graal编译器所提供的接口,主动触发对非JVM语言的即时编译,将对抽象语法树的解释执行转换为执行即时编译后的机器码。
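Implementing multi-language interoperation with a general-purpose virtual machine still has the following drawbacks. First, every language that is to interoperate must have its lexical analysis, syntax analysis, and AST interpreter completely rewritten with the Truffle framework. In most interoperation scenarios, however, it is enough to focus on the names and calling conventions of the functions called during interoperation, and the target language does not need to be interpreted in full, so the time cost and processing cost of the virtual-machine approach are both high. Second, as an interoperating language evolves, changes to its syntax and other features have a major impact on its interoperation implementation. Even a syntax change to a single symbol may affect how the interoperation mechanism works, so maintainability is poor, which further raises the maintenance cost and operational difficulty of the interoperation implementation.
In view of this, the embodiments of this application provide a language interoperation method and apparatus, a storage medium, and a program product. The language interoperation method according to the embodiments of this application can provide a programming language with language interoperability while reducing the cost and operational difficulty of implementing interoperation and improving the extensibility of the language's interoperability.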
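FIG. 4 shows an exemplary application scenario of the language interoperation method according to an embodiment of this application.
As shown in FIG. 4, the application scenario may be, for example, software development for HarmonyOS. Multiple programming languages may be used in the software development, for example the Cangjie language, Java, C, and JS/TS, and code in these languages may be stored, for example, in a memory. The language interoperation method of the embodiments of this application may be executed by a processor to provide interoperability for one of the programming languages stored in the memory.
Before the method is executed, the host language and the target languages in the current application scenario may be determined. For example, given the current application scenario and the programming languages of the code stored in the memory (Cangjie, Java, C, JS/TS), the programming language best suited to the scenario, for example Cangjie, may be selected as the first language (the host language). According to the requirements of the scenario, some or all of the other languages of the stored code, for example Java, C, and JS/TS, may be selected as the multiple second languages (the target languages).
After the first language and the multiple second languages are determined, the processor may obtain from the memory the first-language code (source files, bytecode files, or the like of the first language) and the code of the multiple second languages (source files, bytecode files, or the like of the second languages), and execute the method. During execution, a unified abstract representation of the interoperability boundary information of the multiple second languages is generated from the second-language code. The unified abstract representation is binary code of that interoperability boundary information. The interoperability boundary information may include those constituent elements of the second languages that the first language is allowed to access or use and that may access or use the first language in turn, for example the names and calling conventions of such functions and/or classes. The first-language code is then compiled according to the unified abstract representation to obtain binary code of the first-language code, for example bytecode. When a virtual machine runs the binary code of the first-language code obtained by this method together with the binary code of the second-language code obtained by existing techniques, the first language is able to interoperate with the multiple second languages.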
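The language interoperation method of the embodiments of this application is described below. FIG. 5 is an exemplary schematic diagram of the method.
As shown in FIG. 5, the language interoperation method according to the embodiments of this application includes steps S1 to S3:
S1: obtain first-language code and code of multiple second languages;
S2: generate, from the code of the multiple second languages, a unified abstract representation of the interoperability boundary information of the multiple second languages, where the unified abstract representation is binary code of that interoperability boundary information, and the interoperability boundary information indicates those constituent elements of the multiple second languages that are allowed to access or use, and be accessed or used by, the first language;
S3: compile the first-language code according to the unified abstract representation to obtain and output binary code of the first-language code, where the binary code, when executed, enables the constituent elements in the first-language code and the constituent elements of any of the multiple second languages to access or use each other.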
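The three steps can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the names `BoundaryDecl`, `acquire_code`, `build_unified`, and `compile_first` are hypothetical, the "code" is a toy list of call statements, and the unified abstract representation is kept as an in-memory table rather than the binary code the method describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundaryDecl:
    """One boundary element: only what a caller needs, never the body."""
    language: str   # second language the element comes from
    kind: str       # e.g. "function" or "method"
    name: str
    signature: str  # calling convention, kept as plain text in this sketch


def acquire_code():
    """S1: stand-in for reading first- and second-language code from storage."""
    first_code = ["call c_add", "call JavaList.size"]
    second_decls = [
        BoundaryDecl("C", "function", "c_add", "(i32, i32) -> i32"),
        BoundaryDecl("Java", "method", "JavaList.size", "() -> i32"),
    ]
    return first_code, second_decls


def build_unified(second_decls):
    """S2: gather the boundary elements of all second languages in one table."""
    return {decl.name: decl for decl in second_decls}


def compile_first(first_code, unified):
    """S3: resolve every foreign reference against the unified table."""
    output = []
    for line in first_code:
        _, name = line.split()
        decl = unified[name]  # KeyError: the name is not on the boundary
        output.append(f"invoke[{decl.language}] {decl.name} {decl.signature}")
    return output


first_code, second_decls = acquire_code()
binary = compile_first(first_code, build_unified(second_decls))
```

In this sketch the first-language compiler never looks at a second-language function body. A reference that is missing from the table fails at compile time instead of at run time.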
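Mutual access or use between the constituent elements in the first-language code and the constituent elements of any of the multiple second languages may mean, for example, that a class in the first-language code and a class of any second language access or use each other, that an interface in the first-language code and an interface of any second language access or use each other, and so on.
According to the language interoperation method of the embodiments of this application, the code of the multiple second languages is obtained and used to generate a unified abstract representation of the interoperability boundary information of the multiple second languages. Because the unified abstract representation is binary code of that interoperability boundary information, it reflects those constituent elements of the second languages that are allowed to access or use, and be accessed or used by, the first language. The first-language code can be compiled according to the unified abstract representation to obtain binary code of the first-language code. When executed, this binary code enables the constituent elements in the first-language code and the constituent elements in the second-language code to access or use each other, that is, it gives the first language the ability to interoperate with the multiple second languages. The method has the following effects:
The unified abstract representation is derived from the interoperability boundary information of the multiple second languages alone. It does not involve the internal methods of the constituent elements and does not require parsing the full syntax of the second languages, so the development cost of the method is low.
Updates to the internal methods of the constituent elements do not affect the interoperability boundary information of the second languages and therefore do not affect the unified abstract representation, which does not need to be maintained in that case, so the maintenance cost of the method is low.
Adding a second language may add constituent elements but does not affect the existing ones, and therefore does not affect the existing content of the unified abstract representation, which makes it easy to further extend the interoperability between the first language and the second languages.
The developer only needs to write the first-language code and start the language interoperation method, which reduces the developer's workload and the operational difficulty of the method.
In summary, the language interoperation method according to the embodiments of this application can provide a programming language with language interoperability while reducing the cost and operational difficulty of implementing interoperation and improving the extensibility of the language's interoperability.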
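With reference to the application scenario of FIG. 4, in step S1 the first language may be Cangjie, and the multiple second languages may be, for example, C, Java, and JS/TS. The code of the multiple second languages may then be C code, Java code, and JS/TS code respectively, where the C code may be a file with the suffix .c or .h, the Java code may be a file with the suffix .java or .class, and the JS/TS code may be a file with the suffix .js or .d.ts.
A person skilled in the art should understand that the first language may differ between application scenarios. For example, in an Android software development scenario the first language may be Java, and in a browser development scenario the first language may be JS. Accordingly, the multiple second languages may also differ between application scenarios. The embodiments of this application do not limit the specific types of the first language and the multiple second languages.
For an exemplary explanation of the interoperability boundary information and the unified abstract representation, refer to the description of the application scenario of FIG. 4 above.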
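An exemplary implementation of step S2 of the embodiments of this application is described below. FIG. 6 shows an example of generating the unified abstract representation of the interoperability boundary information of the multiple second languages according to an embodiment of this application.
In a possible implementation, generating, in step S2, the unified abstract representation of the interoperability boundary information of the multiple second languages from the code of the multiple second languages includes:
identifying the interoperability boundary information of the multiple second languages from the code of the multiple second languages; and
generating the unified abstract representation from the interoperability boundary information of the multiple second languages.
In this way, the unified abstract representation is obtained directly by processing the interoperability boundary information, which avoids analyzing a large amount of the syntax of the second languages and improves the efficiency of the language interoperation method according to the embodiments of this application.
For example, as shown in FIG. 6, the interoperability boundary information of the C language identified from C code may include constituent elements such as function, char*, Struct, Pointer, and Primitive Types; the interoperability boundary information of the Java language identified from Java code may include constituent elements such as interface and class; and the interoperability boundary information of the JS/TS language identified from JS/TS code may include constituent elements such as function, Primitive Types, and Class. Because the constituent elements of different second languages are not exactly the same, the interoperability boundary information of different second languages is not exactly the same either. In this case, the interoperability boundary information of the multiple second languages may include the interoperability boundary information of C, of Java, and of JS/TS described above. In this way, the interoperability boundary information of the multiple second languages can include at least one unique constituent element, so that the specific part of that information can be obtained, and in turn the interoperability boundary information of the multiple second languages as a whole.
A person skilled in the art should understand that the interoperability boundary information of the multiple second languages may include further content related to the constituent elements of the second languages, as long as it includes the necessary information of each constituent element (for example, the names and calling conventions of functions and/or classes) and does not include the internal information of the constituent elements (for example, the internals of a function or the execution details inside a class method). This application does not limit the specific composition of the interoperability boundary information.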
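As a rough illustration of "necessary information only", the Python sketch below pulls function names, return types, and parameter lists out of a C header and ignores everything else. It is an assumption-laden toy, not the identification procedure of the application: the regular expression handles only simple one-line prototypes, and `boundary_from_header` and the sample header are hypothetical.

```python
import re

# Matches simple one-line prototypes such as "int add(int a, int b);".
PROTOTYPE = re.compile(r"(?P<ret>.+?)\s*(?P<name>\w+)\s*\((?P<params>.*)\)\s*;")

HEADER = """
#include <stdint.h>
int add(int a, int b);
char* greet(const char* name);
static int helper(int x);
"""


def boundary_from_header(text):
    """Collect name, return type, and parameters of externally visible functions."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("static"):  # internal linkage: not on the boundary
            continue
        match = PROTOTYPE.fullmatch(line)
        if match is None:              # not a prototype (include, blank line, ...)
            continue
        params = [p.strip() for p in match.group("params").split(",") if p.strip()]
        entries.append({
            "name": match.group("name"),
            "returns": match.group("ret").strip(),
            "params": params,
        })
    return entries


boundary = boundary_from_header(HEADER)
```

The result records what a caller must know about `add` and `greet` and nothing about how they are implemented, so a change inside either function body would leave `boundary` unchanged.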
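An exemplary method of generating the unified abstract representation from the interoperability boundary information of the multiple second languages is described below with reference to FIG. 6 and FIG. 7.
In a possible implementation, the interoperability boundary information of the multiple second languages includes at least one repeated constituent element and at least one unique constituent element. A repeated constituent element is one that appears more than once among the constituent elements included in the interoperability boundary information of the multiple second languages; a unique constituent element is one that appears only once.
In this way, the number of constituent elements in the interoperability boundary information can be reduced, which lowers the memory occupied by the unified abstract representation and reduces the complexity of subsequently compiling the first-language code according to the unified abstract representation.
In a possible implementation, generating the unified abstract representation from the interoperability boundary information of the multiple second languages includes:
generating the unified abstract representation from a common part and a specific part of that interoperability boundary information, where the common part is obtained by treating each of the at least one repeated constituent element as a single constituent element, and the specific part includes the at least one unique constituent element.
For example, referring to the description of FIG. 6 above, the interoperability boundary information of each second language may include multiple constituent elements, and among all of these there may be constituent elements that appear repeatedly and constituent elements that appear only once. The interoperability boundary information of the multiple second languages may then include at least one repeated constituent element and at least one unique constituent element. In the example of FIG. 6, the repeated constituent elements may include the Primitive Types data format (C), the Primitive Types data format (JS/TS), the function (C), the function (JS/TS), the class (Java), and the class (JS/TS), and the unique constituent elements may include the char* data format (C), the Struct type (C), the Pointer (C), and the interface (Java). The at least one repeated constituent element may then be at least one of the Primitive Types data format, the function, and the class, and the at least one unique constituent element may be at least one of the char* data format, the Struct type, the Pointer, and the interface.
In a possible implementation, the interoperability boundary information of the multiple second languages includes a common part and a specific part. Each constituent element in the common part corresponds to at least two of the multiple second languages; each constituent element in the specific part corresponds to exactly one of the multiple second languages.
For example, a common part can be sought in the interoperability boundary information of the multiple second languages. The common part may consist of the identical constituent elements included in the interoperability boundary information of at least two of the second languages, and may be obtained, for example, by treating each of the at least one repeated constituent element as a single constituent element, so that each constituent element in the common part corresponds to at least two of the second languages. In the example of FIG. 6, the at least one repeated constituent element may include the Primitive Types data format (C), the Primitive Types data format (JS/TS), the function (C), the function (JS/TS), the class (Java), and the class (JS/TS). In this case, the common part may include the Primitive Types data format, the function, and the class, where the Primitive Types data format and the function each correspond to two second languages, namely C and JS/TS, and the class corresponds to two second languages, namely Java and JS/TS.
A specific part can likewise be sought. The specific part may, for example, include the at least one unique constituent element, so that each constituent element in the specific part corresponds to one of the second languages. In the example of FIG. 6, the at least one unique constituent element may include the char* data format (C), the Struct type (C), the Pointer (C), and the interface (Java). In this case, the specific part may include the char* data format, the Struct type, the Pointer, and the interface, where the char* data format, the Struct type, and the Pointer each correspond to one second language, namely C, and the interface corresponds to one second language, namely Java.
In this way, each constituent element in the common part corresponds to at least two of the multiple second languages, and each constituent element in the specific part corresponds to exactly one of them, so the common part and the specific part together accurately characterize the features of the multiple second languages.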
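The split into a common part and a specific part can be reproduced on the FIG. 6 example with a few lines of Python. This is a minimal sketch: the element names are lower-cased so that "Class" (JS/TS) and "class" (Java) compare equal, which is an assumption of this illustration, and `split_common_specific` is a hypothetical helper name.

```python
from collections import defaultdict

# Boundary elements per second language, following the FIG. 6 example.
BOUNDARY = {
    "C": ["function", "char*", "struct", "pointer", "primitive types"],
    "Java": ["interface", "class"],
    "JS/TS": ["function", "primitive types", "class"],
}


def split_common_specific(boundary):
    """Group elements by how many second languages expose them."""
    owners = defaultdict(list)
    for language, elements in boundary.items():
        for element in elements:
            owners[element].append(language)
    # Common part: each repeated element is kept once, with all its languages.
    common = {e: langs for e, langs in owners.items() if len(langs) >= 2}
    # Specific part: elements that appear in exactly one language.
    specific = {e: langs[0] for e, langs in owners.items() if len(langs) == 1}
    return common, specific


common, specific = split_common_specific(BOUNDARY)
unified_elements = list(common) + list(specific)
```

Merging the two parts gives seven entries instead of the ten listed per language, which is the reduction in element count that the text attributes to treating each repeated element as one.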
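The common part and the specific part obtained above can be combined into the unified abstract representation. In the example of FIG. 6, the unified abstract representation may include the Primitive Types data format, the function, the class, the char* data format, the Struct type, the Pointer, and the interface. The unified abstract representation may be stored in a binary format, using a struct-like structure. For example, in a unified abstract representation obtained with C, Java, and JS/TS as the second languages, the constituent elements may include class, interface, function, variable, parameter, field, and so on, and the corresponding data types may include i8, i16, i32, i64, u8, u16, u32, u64, f16, f32, f64, char, bool, function, array, class, interface, generics, and so on. FIG. 7 shows an example of the binary format of the unified abstract representation according to an embodiment of this application, in which information about the constituent elements is filled into the Decl section and information about the data types is filled into the Type section.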
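To make the idea of a binary format with a Type section and a Decl section concrete, here is a small Python sketch using the standard `struct` module. The byte layout is invented for illustration and is not the layout of FIG. 7: the two-count header, the little-endian length-prefixed strings, and the `encode`/`decode` helpers are all assumptions.

```python
import struct

KINDS = ["class", "interface", "function", "variable", "parameter", "field"]


def _pack_str(text):
    data = text.encode("utf-8")
    return struct.pack("<H", len(data)) + data  # 2-byte length, then bytes


def _unpack_str(buf, pos):
    (length,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    return buf[pos:pos + length].decode("utf-8"), pos + length


def encode(decls):
    """decls: list of (kind, name, type_name) tuples."""
    types = sorted({type_name for _, _, type_name in decls})
    out = bytearray(struct.pack("<HH", len(types), len(decls)))
    for type_name in types:                       # Type section
        out += _pack_str(type_name)
    for kind, name, type_name in decls:           # Decl section
        out += struct.pack("<BH", KINDS.index(kind), types.index(type_name))
        out += _pack_str(name)
    return bytes(out)


def decode(buf):
    n_types, n_decls = struct.unpack_from("<HH", buf, 0)
    pos = 4
    types = []
    for _ in range(n_types):
        type_name, pos = _unpack_str(buf, pos)
        types.append(type_name)
    decls = []
    for _ in range(n_decls):
        kind_id, type_index = struct.unpack_from("<BH", buf, pos)
        pos += 3
        name, pos = _unpack_str(buf, pos)
        decls.append((KINDS[kind_id], name, types[type_index]))
    return decls


DECLS = [("function", "add", "i32"), ("field", "size", "i64"), ("variable", "flag", "bool")]
blob = encode(DECLS)
```

Each declaration refers to its type by index into the Type section, so a type shared by many declarations is stored once. `decode(blob)` returns the original list.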
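A person skilled in the art should understand that the interoperability boundary information used to generate the unified abstract representation is not limited to that of the three second languages above. When there are more kinds of second languages, the interoperability boundary information of those additional languages can be included, and the unified abstract representation for them can be generated in the same way as in the example above for C, Java, and JS/TS. The language interoperation method according to the embodiments of this application therefore has good extensibility and a low development cost. Moreover, because the unified abstract representation is obtained by processing only the interoperability boundary information of the second languages, the evolution of syntax features and other functions outside that boundary information does not affect the unified abstract representation, which lowers the maintenance cost.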
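An exemplary implementation of step S3 of the embodiments of this application is described below.
In a possible implementation, compiling, in step S3, the first-language code according to the unified abstract representation to obtain and output binary code of the first-language code includes:
obtaining, according to the differences between the unified abstract representation and the interoperability boundary information of the first language, processing means for processing the semantics of the unified abstract representation and of the first language, where the interoperability boundary information of the first language is determined from the first-language code; and
using the processing means when compiling the first-language code, to obtain and output the binary code of the first-language code.
The semantics of the first language describe the behavior a computer exhibits when it executes a program written in the first language, such as logical operations and reading or writing data. The processing means may be means for fusing the semantics of the unified abstract representation with those of the first language. For example, a constituent element of the unified abstract representation and a constituent element of the first language may be "fused" into a single constituent element that is named after its memory size. As another example, a mark may be added to a constituent element of the first language to indicate that the parameters in that element are implemented in the way the corresponding element of the unified abstract representation is implemented, so that the semantics of the element of the unified abstract representation are "fused" into the first language.
In this way, the semantics of the unified abstract representation and of the first language can be processed while the first-language code is compiled, so that first-language code that accesses or uses constituent elements of the second languages compiles successfully. If the processing means are set in advance, the developer does not need to give further instructions during compilation, which further reduces the developer's workload. If the processing means are determined by the developer in real time, the semantics can be processed more flexibly when the first-language code is compiled.
The interoperability boundary information of the first language can be determined from the first-language code and may include the constituent elements in the first-language code. In the example of FIG. 6, the interoperability boundary information of the first language may include the Class, the Interface, and the function.
Different differences between the unified abstract representation and the interoperability boundary information of the first language may call for different processing means. The different processing means and their exemplary implementations are described below.
In a possible implementation, the processing means include mapping processing. In mapping processing, constituent elements in the first-language code that have the same memory size as, but a different name from, constituent elements of the unified abstract representation are compiled according to the data type of the corresponding memory size in a mapping relationship. The mapping relationship indicates the correspondence between the constituent elements of the unified abstract representation, the constituent elements of the interoperability boundary information of the first language, and data types of different memory sizes.
For example, when a constituent element in the first-language code and a constituent element in the unified abstract representation differ only in name and not in memory footprint, processing means that include mapping processing may be selected. With mapping processing, a constituent element of the first language is mapped, according to a preset mapping relationship, to the data type of the corresponding memory size before being compiled. Taking the case where the second languages include C as an example, Table 1 shows an example of the mapping relationship.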
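Table 1
Constituent element in first-language code | Constituent element in unified abstract representation | Data type after mapping
Unit    | void     | u1
Int8    | int8_t   | i8
UInt8   | uint8_t  | u8
Int16   | int16_t  | i16
UInt16  | uint16_t | u16
Int32   | int32_t  | i32
UInt32  | uint32_t | u32
Int64   | int64_t  | i64
UInt64  | uint64_t | u64
Float16 | /        | f16
Float32 | float    | f32
Float64 | double   | f64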
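As Table 1 shows, the mapping relationship indicates the correspondence between the constituent elements of the unified abstract representation, the constituent elements of the interoperability boundary information of the first language (for example, the constituent elements in the first-language code), and data types of different memory sizes. The data type after mapping reflects the memory footprint of the constituent element.
When a constituent element in the first-language code is compiled, only the memory it requires matters, not its name. The constituent elements of the first-language code that have the same memory size as, but a different name from, those of the unified abstract representation can therefore be collected in advance, and the mapping from constituent elements to memory sizes can be determined from them. When the first-language code is compiled, the memory size of a constituent element can then be determined directly from the mapping relationship, which accomplishes the semantic processing of those constituent elements.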
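Mapping processing amounts to a table lookup, as the following Python sketch shows with the rows of Table 1. The helper names `lower_first` and `same_layout` are hypothetical, and the C-side names are written in their usual lower-case C spelling.

```python
# Rows of Table 1: (first-language name, unified-representation name, mapped type).
# None marks the entry for which Table 1 lists no C-side counterpart.
TABLE_1 = [
    ("Unit", "void", "u1"),
    ("Int8", "int8_t", "i8"),
    ("UInt8", "uint8_t", "u8"),
    ("Int16", "int16_t", "i16"),
    ("UInt16", "uint16_t", "u16"),
    ("Int32", "int32_t", "i32"),
    ("UInt32", "uint32_t", "u32"),
    ("Int64", "int64_t", "i64"),
    ("UInt64", "uint64_t", "u64"),
    ("Float16", None, "f16"),
    ("Float32", "float", "f32"),
    ("Float64", "double", "f64"),
]

FIRST_TO_MAPPED = {first: mapped for first, _, mapped in TABLE_1}
UNIFIED_TO_MAPPED = {unified: mapped for _, unified, mapped in TABLE_1 if unified}


def lower_first(name):
    """Replace a first-language type name by its mapped data type."""
    return FIRST_TO_MAPPED[name]


def same_layout(first_name, unified_name):
    """True when both names map to the same data type, despite different names."""
    mapped = FIRST_TO_MAPPED.get(first_name)
    return mapped is not None and mapped == UNIFIED_TO_MAPPED.get(unified_name)
```

With this table, `Int32` and `int32_t` are interchangeable at the boundary because both become `i32`, while `Int32` and `int64_t` are not.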
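When the multiple second languages include Java, an instance variable (hereinafter, a variable) of any Java constituent element may be a null pointer whose value is null, meaning that the object does not exist. Directly accessing a member of a null pointer, or calling an attribute or method of one of its members, throws a null pointer exception (NPE) at run time. Suppose the first language achieves a high level of safety by not allowing any of its variables to be null. When the first language interoperates with Java, a Java variable called by the first language is passed into the first language. If that variable happens to be null, a null-valued variable enters the runtime of the first language, so that a null pointer exception occurs there and the safety of the first language is broken.
To protect the safety of the first language, runtime conversion code may be used when the first-language code is compiled. The runtime conversion code can be used to check whether the runtime value is null. Several processing means may use runtime conversion code, but in different ways. The attributes of the second languages and of the first language may be examined first, and the processing means to use may then be chosen according to those attributes together with the specific requirements of the application scenario.
In a possible implementation, the constituent-element variables of the unified abstract representation include null pointers and the constituent-element variables of the first language do not, and the processing means include first runtime conversion processing. In first runtime conversion processing, when the runtime conversion code determines that the runtime value is null, the constituent-element variable currently being compiled is thrown as an exception value.
A constituent-element variable is an instance variable of a constituent element. Whether the runtime value is null is tied to whether the variable of the constituent element currently being compiled is a null pointer. Determining with the runtime conversion code that the runtime value is null therefore indirectly determines that the variable is a null pointer, so that the effect on the safety of the first language is known before the variable is passed into the first-language code. Throwing the variable as an exception value at that point keeps the exception from being passed into the first language, which preserves the safety of the first language.
For example, when it is determined that the multiple second languages have the nullable attribute and the first language does not, that is, when the constituent-element variables of the unified abstract representation include null pointers and those of the first language do not, at least one of the second languages is considered to have lower safety and the first language is considered to have higher safety. If the application scenario requires the safety of the first language to be guaranteed, processing means that include first runtime conversion processing may be selected. With first runtime conversion processing, runtime conversion code is used, during compilation of the first-language code, wherever a variable of a second language is assigned to a variable of the first language, to check whether the runtime value is null. If it is null, a null pointer exception (NPE) can be thrown directly, that is, the constituent-element variable currently being compiled is thrown as an exception value. In this case the null pointer exception is still thrown, but it occurs only at the interoperability boundary of the first language and does not propagate further into the first language, so the safety of the first language is guaranteed.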
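The first runtime conversion can be pictured as a guard that the compiler inserts wherever a second-language value is assigned to a first-language variable. The Python sketch below is only an analogy: Python's `None` stands in for a Java null, and `NullAtBoundaryError` and `from_second_language_strict` are hypothetical names.

```python
class NullAtBoundaryError(Exception):
    """Raised when a null value tries to cross into the first language."""


def from_second_language_strict(value):
    """Boundary guard: fail here so null never reaches first-language code."""
    if value is None:
        raise NullAtBoundaryError("null value stopped at the interop boundary")
    return value


def first_language_logic(text):
    # Code on this side may assume its argument is never None.
    return text.upper()


def call_across_boundary(java_value):
    return first_language_logic(from_second_language_strict(java_value))


ok_result = call_across_boundary("cangjie")

try:
    call_across_boundary(None)
    stopped_at_boundary = False
except NullAtBoundaryError:
    stopped_at_boundary = True
```

The exception is still raised, but at the guard, so `first_language_logic` never sees a null argument.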
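In a possible implementation, the constituent-element variables of the unified abstract representation include null pointers and the constituent-element variables of the first language do not, and the processing means include second runtime conversion processing. In second runtime conversion processing, when the runtime conversion code determines that the runtime value is null, the null value of an optional constituent element is returned.
Whether the runtime value is null is tied to whether the variable of the constituent element currently being compiled is a null pointer. Determining with the runtime conversion code that the runtime value is null therefore indirectly determines that the variable is a null pointer, so that the effect on the safety of the first language is known before the variable is passed into the first-language code. Returning the null value of an optional constituent element at that point means that no exception arises when the first-language code is compiled, which preserves the safety of the first language.
For example, when it is determined that the multiple second languages have the nullable attribute and the first language does not, that is, when the constituent-element variables of the unified abstract representation include null pointers and those of the first language do not, at least one of the second languages is considered to have lower safety and the first language is considered to have higher safety. If the application scenario requires the safety of the first language to be guaranteed, processing means that include second runtime conversion processing may be selected. With second runtime conversion processing, runtime conversion code is used, during compilation of the first-language code, wherever a variable of a second language is assigned to a variable of the first language, to check whether the runtime value is null. If it is null, the null value (None) of the optional constituent element can be returned; if it is not null, the concrete value of the optional constituent element can be returned. In this case no null pointer exception is thrown, so the safety of the first language is guaranteed.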
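The second runtime conversion replaces the exception by an explicit optional value. In the Python sketch below, the classes `Some` and `Nothing` model the "optional constituent element" and are assumptions of this illustration, as are the function names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Some:
    """Optional holding a concrete value."""
    value: object


@dataclass(frozen=True)
class Nothing:
    """Optional holding no value (the 'None' case of the text)."""


def from_second_language_optional(value):
    """Boundary conversion: wrap instead of throwing."""
    return Nothing() if value is None else Some(value)


def length_or_zero(option):
    # First-language code must handle both cases explicitly.
    if isinstance(option, Some):
        return len(option.value)
    return 0


present = from_second_language_optional("abc")
absent = from_second_language_optional(None)
```

Compared with the first runtime conversion, the caller is not interrupted by an exception. It receives a value whose type forces it to consider the empty case.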
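In a possible implementation, besides scenarios in which the first language has higher safety and that safety must be guaranteed, there may be scenarios in which the safety of the first language does not need to be guaranteed. For example, when it is determined that the multiple second languages have the nullable attribute and the first language does not, that is, when the constituent-element variables of the unified abstract representation include null pointers and those of the first language do not, at least one of the second languages is considered to have lower safety and the first language is considered to have higher safety. If the application scenario does not require the safety of the first language to be guaranteed, the runtime conversion code may be omitted when the first-language code is compiled. This preserves the interoperability between the first language and the multiple second languages to the greatest extent.
In a possible implementation, the first language may itself lack this safety. For example, when it is determined that the first language has the nullable attribute, that is, when the constituent-element variables of the first language include null pointers, the first language does not have higher safety to begin with. Whether or not the second languages have the nullable attribute, that is, whether or not the constituent-element variables of the unified abstract representation include null pointers, the safety of the first language is not reduced further. In this case, the runtime conversion code may be omitted when the first-language code is compiled.
A person skilled in the art should understand that the choice between the processing means of first runtime conversion processing and of second runtime conversion processing may be fixed and preset, or may be made by the developer in real time. This application does not limit this.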
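The cases discussed above can be summarized as one decision function. The Python sketch below follows the text for the cases it covers. The branch for a non-nullable unified abstract representation is an assumption added for completeness, and the names `NullHandling` and `choose_null_handling` and the `prefer_option` switch are hypothetical.

```python
from enum import Enum


class NullHandling(Enum):
    THROW_AT_BOUNDARY = "first runtime conversion processing"
    WRAP_IN_OPTION = "second runtime conversion processing"
    NO_CONVERSION = "no runtime conversion code"


def choose_null_handling(unified_nullable, first_nullable,
                         protect_first_language, prefer_option=False):
    """Pick a processing means from the nullable attributes and the scenario."""
    if first_nullable:
        # The first language already allows null: nothing to protect.
        return NullHandling.NO_CONVERSION
    if not unified_nullable:
        # Assumption of this sketch: no null can arrive, so no check is needed.
        return NullHandling.NO_CONVERSION
    if not protect_first_language:
        # Scenario does not require safety: keep interoperability unrestricted.
        return NullHandling.NO_CONVERSION
    if prefer_option:
        return NullHandling.WRAP_IN_OPTION
    return NullHandling.THROW_AT_BOUNDARY
```

Whether `prefer_option` is fixed in advance or set by the developer per build corresponds to the two ways of determining the processing means mentioned in the text.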
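In a possible implementation, compiling the first-language code according to the unified abstract representation to obtain and output binary code of the first-language code includes: when the unified abstract representation and the interoperability boundary information of the first language contain constituent elements with the same name but different syntax, adding to the constituent element of the first language a mark corresponding to the constituent element of the unified abstract representation, and obtaining and outputting the binary code of the first-language code. The mark indicates that, in the first-language code, the marked constituent element implements the syntax of the constituent element of the unified abstract representation when it executes.
In this way, the syntax that the constituent element currently being compiled should use is indicated at compile time, which avoids the situation in which no choice can be made because one name corresponds to several syntaxes of the first and second languages. This improves the interoperability supported by the language interoperation method of the embodiments of this application.
For example, when a constituent element of the first language and a constituent element of the unified abstract representation have the same name but differ greatly in certain characteristics, such as their implementation mechanism and syntax requirements, that is, when the unified abstract representation and the interoperability boundary information of the first language contain constituent elements with the same name but different syntax, syntax-difference marking may be performed at compile time to obtain and output the binary code. With syntax-difference marking, a constituent element is compiled according to the syntax of the second language that its mark corresponds to. In other words, the marked constituent element implements the syntax of the constituent element of the unified abstract representation when it executes. Suppose the second language is Java. Generics in Java are implemented by type erasure, so if generics in the first language are implemented by instantiation, the two languages conflict sharply in how the semantics are handled. The generic constituent elements in the unified abstract representation can be marked (the mark is stored in the attributes property). To call a Java generic class, interface, or the like from the first language, an explicit marking syntax (such as an annotation or a macro) can be added in the first language, that is, a mark corresponding to the constituent element of the unified abstract representation is added to the constituent element of the first language to distinguish whether a generic class belongs to the first language or to the second language. For example, marking a generic constituent element with @java indicates that the generics in that element are implemented by type erasure.
Syntax-difference marking is illustrated below with an example of first-language code.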
// First-language code example
package example
import java.GenericClassExample

@java
class B <: GenericClassExample {
    override public func foo<T>(t: T) {
        // do something
    }
}

func main() {
    var p = B()
    p.foo(p)
}
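In the first-language code above, @java marks the constituent element class B in the first language. Class B inherits a generic type that comes from Java and overrides the generic function foo. In this case the generic parameter T can be handled with Java's type erasure, and the @java mark distinguishes this from how the first language handles its own generic functions. In other words, a mark corresponding to the constituent element of the unified abstract representation (for example @java) is added to the constituent element of the first language, and the marked constituent element implements the syntax of the constituent element of the unified abstract representation (for example type erasure) when it executes.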
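How a mark could steer the compiler between the two generic strategies is sketched below in Python. The model is deliberately simplified and hypothetical: `GenericFunction`, its `attributes` set, and `emitted_symbols` are illustrative names, and the "symbols" are plain strings rather than real compiler output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenericFunction:
    name: str
    attributes: frozenset = frozenset()  # marks such as "java"


def emitted_symbols(func, type_args):
    """Return the symbols a compiler would emit for the given call sites."""
    if "java" in func.attributes:
        # Type erasure: one shared body, whatever the type arguments are.
        return [f"{func.name}(Object)"]
    # Instantiation: one specialized body per distinct type argument.
    return [f"{func.name}<{arg}>" for arg in dict.fromkeys(type_args)]


marked = GenericFunction("foo", frozenset({"java"}))
native = GenericFunction("foo")

erased = emitted_symbols(marked, ["Int64", "String"])
instantiated = emitted_symbols(native, ["Int64", "String", "Int64"])
```

The same name `foo` yields one erased symbol when marked and one symbol per type argument when unmarked, which is the ambiguity the mark is meant to resolve.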
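Syntax-difference marking can be used to distinguish all kinds of large semantic differences. Generics in a second language (Java) and in the first language are used here only as an example. A person skilled in the art should understand that syntax-difference marking applies equally to constituent elements of other second languages and of the first language, for example C and JS/TS.
With the processing means described above, the binary code of the first-language code, for example bytecode, is obtained by compiling the first-language code. This binary code and the binary code obtained by compiling the code of the multiple second languages can be input to a virtual machine together and run, which achieves interoperation between the first language and the multiple second languages. FIG. 8 shows an example of obtaining and running the binary code of the first-language code according to an embodiment of this application. If the second languages include a relatively low-level language such as C, the library files of that language can be used directly when the virtual machine runs.
As shown in Table 2, the language interoperation method according to the embodiments of this application can generally give the first language basic language interoperability at an implementation cost of 3 person-years, compared with an average of more than 40 person-years in the prior art, so the method is low in cost and workload. Using a unified abstract representation to characterize the interoperability boundary information of the multiple second languages greatly reduces the maintenance cost. Changes to the features of a second language have very little effect on the unified abstract representation, and focusing only on the interoperability boundary information also improves the stability of the solution; the same method can still be used when interoperability with a new second language is added to the first language. The interoperability boundary information can be extended at any time as the capabilities of the first language grow: the unified abstract representation only needs to be extended feature by feature, for example by adding array interoperation with C, without affecting the interoperation functions that already exist. The steps for the user are reduced to two, writing the first-language code and issuing the compile instruction, and the processing means used when compiling the first-language code are simple and clear, which gives the user a good experience.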
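Table 2
                    | Method of this application | Prior art
Implementation cost | 8 person-years             | 40+ person-years
Maintainability     |                            |
Extensibility       |                            |
User experience     |                            |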
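FIG. 9 is an exemplary schematic structural diagram of a language interoperation apparatus according to an embodiment of this application.
As shown in FIG. 9, in a possible implementation, an embodiment of this application provides a language interoperation apparatus. The apparatus includes a compiler 90 configured to: obtain first-language code and code of multiple second languages; generate, from the code of the multiple second languages, a unified abstract representation of the interoperability boundary information of the multiple second languages, where the unified abstract representation is binary code of that interoperability boundary information, and the interoperability boundary information indicates those constituent elements of the multiple second languages that are allowed to access or use, and be accessed or used by, the first language; and compile the first-language code according to the unified abstract representation to obtain and output binary code of the first-language code, where the binary code, when executed, enables the constituent elements in the first-language code and the constituent elements of any of the multiple second languages to access or use each other.
In a possible implementation, generating the unified abstract representation of the interoperability boundary information of the multiple second languages from the code of the multiple second languages includes: identifying the interoperability boundary information of the multiple second languages from the code of the multiple second languages; and generating the unified abstract representation from that interoperability boundary information.
In a possible implementation, the interoperability boundary information of the multiple second languages includes at least one repeated constituent element and at least one unique constituent element. A repeated constituent element is one that appears more than once among the constituent elements included in the interoperability boundary information of the multiple second languages; a unique constituent element is one that appears only once.
In a possible implementation, the interoperability boundary information of the multiple second languages includes a common part and a specific part. Each constituent element in the common part corresponds to at least two of the multiple second languages; each constituent element in the specific part corresponds to exactly one of the multiple second languages.
In a possible implementation, compiling the first-language code according to the unified abstract representation to obtain and output binary code of the first-language code includes: obtaining, according to the differences between the unified abstract representation and the interoperability boundary information of the first language, processing means for processing the semantics of the unified abstract representation and of the first language, where the interoperability boundary information of the first language is determined from the first-language code; and using the processing means when compiling the first-language code, to obtain and output the binary code of the first-language code.
In a possible implementation, the processing means include mapping processing, in which constituent elements in the first-language code that have the same memory size as, but a different name from, constituent elements of the unified abstract representation are compiled according to the data type of the corresponding memory size in a mapping relationship. The mapping relationship indicates the correspondence between the constituent elements of the unified abstract representation, the constituent elements of the interoperability boundary information of the first language, and data types of different memory sizes.
In a possible implementation, the constituent-element variables of the unified abstract representation include null pointers and the constituent-element variables of the first language do not, and the processing means include first runtime conversion processing, in which, when the runtime conversion code determines that the runtime value is null, the constituent-element variable currently being compiled is thrown as an exception value.
In a possible implementation, the constituent-element variables of the unified abstract representation include null pointers and the constituent-element variables of the first language do not, and the processing means include second runtime conversion processing, in which, when the runtime conversion code determines that the runtime value is null, the null value of an optional constituent element is returned.
In a possible implementation, compiling the first-language code according to the unified abstract representation to obtain and output binary code of the first-language code includes: when the unified abstract representation and the interoperability boundary information of the first language contain constituent elements with the same name but different syntax, adding to the constituent element of the first language a mark corresponding to the constituent element of the unified abstract representation, and obtaining and outputting the binary code of the first-language code. The mark indicates that, in the first-language code, the marked constituent element implements the syntax of the constituent element of the unified abstract representation when it executes.
In a possible implementation, the interoperability boundary information of different second languages is not exactly the same.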
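An embodiment of this application provides a language interoperation apparatus, including a processor and a memory for storing instructions executable by the processor, where the processor is configured to implement the foregoing method when executing the instructions.
An embodiment of this application provides a non-volatile computer-readable storage medium on which computer program instructions are stored, where the computer program instructions implement the foregoing method when executed by a processor.
An embodiment of this application provides a computer program product, including computer-readable code or a non-volatile computer-readable storage medium carrying computer-readable code, where, when the computer-readable code runs in a processor of an electronic device, the processor of the electronic device performs the foregoing method.
FIG. 10 is an exemplary schematic structural diagram of a language interoperation apparatus according to an embodiment of this application.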
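As shown in FIG. 10, the language interoperation apparatus may include at least one of a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), an augmented reality (AR) device, a virtual reality (VR) device, an artificial intelligence (AI) device, a wearable device, an in-vehicle device, a smart home device, a smart city device, or a server device. The embodiments of this application do not specially limit the specific type of the language interoperation apparatus.
The language interoperation apparatus may include a processor 110 and a memory 121. It can be understood that the structure illustrated in the embodiments of this application does not constitute a specific limitation on the language interoperation apparatus. In other embodiments of this application, the apparatus may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The components shown may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.
The processor 110 may generate operation control signals according to instruction operation codes and timing signals, to control instruction fetching and instruction execution.
A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 may be a cache. This memory may hold instructions or data that the processor 110 has just used or uses frequently. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory, which avoids repeated accesses, reduces the waiting time of the processor 110, and thus improves system efficiency.
The memory 121 may be used to store computer-executable program code, which includes instructions. The memory 121 may include a program storage area and a data storage area. The program storage area may store an operating system, an application required by at least one function (such as the processing means), and the like. The data storage area may store data created during use of the language interoperation apparatus (such as the unified abstract representation). In addition, the memory 121 may include a high-speed random access memory and may also include a non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS). The processor 110 performs the various functions of the language interoperation apparatus or the foregoing language interoperation method by running instructions stored in the memory 121 and/or instructions stored in the memory provided in the processor.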
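The computer-readable storage medium may be a tangible device that can hold and store instructions used by an instruction execution device. The computer-readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random-access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punched card or a raised structure in a groove on which instructions are stored, and any suitable combination of the foregoing.
The computer-readable program instructions or code described here may be downloaded from the computer-readable storage medium to the respective computing/processing devices, or downloaded to an external computer or external storage device through a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in that computing/processing device.
The computer program instructions for performing the operations of this application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by using state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to implement the aspects of this application.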
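The aspects of this application are described here with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of this application. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create an apparatus that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium, where they cause a computer, a programmable data processing apparatus, and/or other devices to work in a specific manner, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions, and operations of apparatuses, systems, methods, and computer program products according to multiple embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved.
It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by hardware that performs the corresponding functions or actions (for example, a circuit or an application-specific integrated circuit (ASIC)), or by a combination of hardware and software, such as firmware.
Although the present invention is described here with reference to the embodiments, a person skilled in the art, in practicing the claimed invention, can understand and implement other variations of the disclosed embodiments by studying the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill several functions recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that these measures cannot be combined to advantage.
The embodiments of this application have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to a person of ordinary skill in the art without departing from the scope of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical applications, or their improvements over technologies in the market, or to enable other persons of ordinary skill in the art to understand the embodiments disclosed herein.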

Claims (14)

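  1. A language interoperation method, characterized in that the method comprises:
    obtaining first-language code and code of multiple second languages;
    generating, from the code of the multiple second languages, a unified abstract representation of interoperability boundary information of the multiple second languages, wherein the unified abstract representation is binary code of the interoperability boundary information of the multiple second languages, and the interoperability boundary information of the multiple second languages indicates those constituent elements of the multiple second languages that are allowed to access or use, and be accessed or used by, the first language; and
    compiling the first-language code according to the unified abstract representation to obtain and output binary code of the first-language code, wherein the binary code of the first-language code, when executed, enables constituent elements in the first-language code and constituent elements of any of the multiple second languages to access or use each other.
  2. The method according to claim 1, characterized in that generating, from the code of the multiple second languages, the unified abstract representation of the interoperability boundary information of the multiple second languages comprises:
    identifying the interoperability boundary information of the multiple second languages from the code of the multiple second languages; and
    generating the unified abstract representation from the interoperability boundary information of the multiple second languages.
  3. The method according to claim 1 or 2, characterized in that the interoperability boundary information of the multiple second languages comprises at least one repeated constituent element and at least one unique constituent element, the at least one repeated constituent element being a constituent element that appears repeatedly among the constituent elements included in the interoperability boundary information of the multiple second languages, and the at least one unique constituent element being a constituent element that appears only once among the constituent elements included in the interoperability boundary information of the multiple second languages.
  4. The method according to claim 1 or 2, characterized in that the interoperability boundary information of the multiple second languages comprises a common part and a specific part,
    each constituent element in the common part corresponding to at least two of the multiple second languages; and
    each constituent element in the specific part corresponding to exactly one of the multiple second languages.
  5. The method according to any one of claims 1 to 4, characterized in that compiling the first-language code according to the unified abstract representation to obtain and output the binary code of the first-language code comprises:
    obtaining, according to differences between the unified abstract representation and interoperability boundary information of the first language, processing means for processing the semantics of the unified abstract representation and of the first language, wherein the interoperability boundary information of the first language is determined from the first-language code; and
    using the processing means when compiling the first-language code, to obtain and output the binary code of the first-language code.
  6. The method according to claim 5, characterized in that
    the processing means comprise mapping processing, wherein the mapping processing is:
    for constituent elements in the first-language code that have the same memory size as, but a different name from, constituent elements of the unified abstract representation, compiling according to the data type of the corresponding memory size in a mapping relationship, wherein the mapping relationship indicates the correspondence between the constituent elements of the unified abstract representation, the constituent elements of the interoperability boundary information of the first language, and data types of different memory sizes.
  7. The method according to claim 5 or 6, characterized in that
    the constituent-element variables of the unified abstract representation include null pointers and the constituent-element variables of the first language do not include null pointers, and the processing means comprise first runtime conversion processing,
    wherein the first runtime conversion processing is:
    when it is determined by using runtime conversion code that the runtime value is null, throwing the constituent-element variable currently being compiled as an exception value.
  8. The method according to any one of claims 5 to 7, characterized in that
    the constituent-element variables of the unified abstract representation include null pointers and the constituent-element variables of the first language do not include null pointers, and the processing means comprise second runtime conversion processing,
    wherein the second runtime conversion processing is:
    when it is determined by using runtime conversion code that the runtime value is null, returning the null value of an optional constituent element.
  9. The method according to any one of claims 1 to 4, characterized in that compiling the first-language code according to the unified abstract representation to obtain and output the binary code of the first-language code comprises:
    when the unified abstract representation and the interoperability boundary information of the first language contain constituent elements with the same name but different syntax, adding to the constituent element of the first language a mark corresponding to the constituent element of the unified abstract representation, and obtaining and outputting the binary code of the first-language code;
    wherein the mark indicates that, in the first-language code, the marked constituent element implements the syntax of the constituent element of the unified abstract representation when it executes.
  10. The method according to claim 1, characterized in that
    the interoperability boundary information of different second languages is not exactly the same.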
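  11. A language interoperation apparatus, characterized in that the apparatus comprises:
    a compiler configured to:
    obtain first-language code and code of multiple second languages;
    generate, from the code of the multiple second languages, a unified abstract representation of interoperability boundary information of the multiple second languages, wherein the unified abstract representation is binary code of the interoperability boundary information of the multiple second languages, and the interoperability boundary information of the multiple second languages indicates those constituent elements of the multiple second languages that are allowed to access or use, and be accessed or used by, the first language; and
    compile the first-language code according to the unified abstract representation to obtain and output binary code of the first-language code, wherein the binary code of the first-language code, when executed, enables constituent elements in the first-language code and constituent elements of any of the multiple second languages to access or use each other.
  12. A language interoperation apparatus, characterized by comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to implement the method according to any one of claims 1 to 10 when executing the instructions.
  13. A non-volatile computer-readable storage medium on which computer program instructions are stored, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 10.
  14. A computer program product, comprising computer-readable code or a non-volatile computer-readable storage medium carrying computer-readable code, characterized in that, when the computer-readable code runs in an electronic device, a processor in the electronic device performs the method according to any one of claims 1 to 10.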
PCT/CN2022/125164 2021-10-14 2022-10-13 Language interoperation method and apparatus, storage medium, and program product WO2023061452A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22880388.8A EP4361796A1 (en) 2021-10-14 2022-10-13 Language interoperation method and apparatus, storage medium, and program product

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111200966.6 2021-10-14
CN202111200966.6A CN115981652B (zh) 2021-10-14 2021-10-14 Language interoperation method and apparatus, storage medium, and program product

Publications (1)

Publication Number Publication Date
WO2023061452A1 true WO2023061452A1 (zh) 2023-04-20

Family

ID=85968609

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125164 WO2023061452A1 (zh) 2021-10-14 2022-10-13 Language interoperation method and apparatus, storage medium, and program product

Country Status (3)

Country Link
EP (1) EP4361796A1 (zh)
CN (3) CN117389570A (zh)
WO (1) WO2023061452A1 (zh)

Also Published As

Publication number Publication date
CN117389570A (zh) 2024-01-12
CN115981652A (zh) 2023-04-18
CN115981652B (zh) 2023-09-29
CN117406999A (zh) 2024-01-16
EP4361796A1 (en) 2024-05-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22880388

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022880388

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022880388

Country of ref document: EP

Effective date: 20240124

NENP Non-entry into the national phase

Ref country code: DE