CN109255209B - Data processing method, device, equipment and storage medium - Google Patents

Info

Publication number
CN109255209B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710571375.7A
Other languages
Chinese (zh)
Other versions
CN109255209A (en)
Inventor
刘彬彬
朱为贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G06F21/121 Restricting unauthorised execution of programs
    • G06F21/125 Restricting unauthorised execution of programs by manipulating the program code, e.g. source code, compiled code, interpreted code, machine code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Technology Law (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The embodiments of the application provide a data processing method, apparatus, device and storage medium for improving the security of source code. The method comprises: parsing the source code to generate a syntax tree; extracting syntax elements from the syntax tree, encoding the syntax elements, and generating a corresponding intermediate code file; and sending the intermediate code file. Because the intermediate code file is produced by parsing and encoding, it is unreadable to anyone who does not know the encoding and decoding methods, which gives it a degree of confidentiality.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, a data processing apparatus, a device, a storage medium, and an operating system for a device.
Background
JavaScript is an interpreted scripting language whose interpreter is called the JS (JavaScript) engine. JavaScript source code does not need to be precompiled; it is interpreted while the program runs.
Current JS engines parse and execute JS source files directly, so the source code must be delivered in readable form for the engine to parse it. As a result, JS source code is easily leaked during transmission.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present application is to provide a data processing method to improve the security of a source code.
Correspondingly, the embodiment of the application also provides a data processing device, equipment, a storage medium and an operating system, which are used for ensuring the implementation and application of the method.
In order to solve the above problem, an embodiment of the present application discloses a data processing method, including: analyzing the source code to generate a syntax tree; parsing a syntax element from the syntax tree, coding the syntax element, and generating a corresponding intermediate code file; and sending the intermediate code file.
The embodiment of the application also discloses a data processing method, which comprises the following steps: acquiring an intermediate code file; decoding the intermediate code file, determining the corresponding syntax elements, and restoring a corresponding syntax tree according to the syntax elements; and executing the corresponding code according to the syntax tree.
The embodiment of the present application further discloses a data processing apparatus, including: the parsing module is used for parsing the source code to generate a syntax tree; the coding module is used for analyzing the syntax elements from the syntax tree, coding the syntax elements and generating corresponding intermediate code files; and the sending module is used for sending the intermediate code file.
The embodiment of the application also discloses a data processing device, which comprises: the file acquisition module is used for acquiring the intermediate code file; the decoding module is used for decoding the intermediate code file, determining a corresponding syntax element and restoring a corresponding syntax tree according to the syntax element; and the execution module is used for executing the corresponding codes according to the syntax tree.
The embodiment of the application also discloses a device, which comprises: one or more processors; and one or more machine-readable media having instructions stored thereon which, when executed by the one or more processors, cause the device to perform one or more of the methods described for the parsing end in the embodiments of the present application.
The embodiment of the application also discloses one or more machine-readable media having instructions stored thereon which, when executed by one or more processors, cause a device to perform one or more of the methods described for the parsing end in the embodiments of the present application.
The embodiment of the present application further discloses a device, including: one or more processors; and one or more machine-readable media having instructions stored thereon which, when executed by the one or more processors, cause the device to perform one or more of the methods described for the execution end in the embodiments of the present application.
The embodiment of the application also discloses one or more machine-readable media having instructions stored thereon which, when executed by one or more processors, cause a device to perform one or more of the methods described for the execution end in the embodiments of the present application.
The embodiment of the present application further discloses an operating system for a device, including: a communication unit which acquires the intermediate code file; the decoding restoration unit is used for decoding the intermediate code file, determining a corresponding syntax element and restoring a corresponding syntax tree according to the syntax element; and the execution unit executes the corresponding source code according to the syntax tree.
Compared with the prior art, the embodiment of the application has the following advantages:
In the embodiment of the application, the platform-independent parsing is separated from the platform-dependent compilation and execution when the source code of a program is executed. The source code is parsed into a syntax tree at a non-execution end, such as a parsing end, and then encoded to form an intermediate code file; at the execution end, the intermediate code file is decoded to recover the syntax tree, and the corresponding code is then compiled and executed to provide the function of the program. Because the intermediate code file is produced by parsing and encoding, it is unreadable to anyone who does not know the encoding and decoding methods, so it has a degree of confidentiality and can protect the rights and interests of the source code developer. While decoding, compiling and executing at the execution end, the execution engine does not need to repeat lexical and syntax analysis, which saves running time and improves running efficiency.
Drawings
FIG. 1 is a schematic diagram of one embodiment of a data processing system according to the present application;
FIG. 2 is a flow chart of an embodiment of an encoding phase of a data processing method of the present application;
FIG. 3 is a flow chart of an embodiment of a decoding stage in a data processing method according to the present application;
FIG. 4 is a flow chart of an embodiment of an encoding phase of another data processing method of the present application;
FIG. 5 is a flow diagram of one embodiment of lexical parsing of the present application;
FIG. 6 is a flow diagram of one embodiment of a grammar parsing of the present application;
FIG. 7 is a flow diagram of one embodiment of syntax tree coding of the present application;
FIG. 8 is a flow chart of an embodiment of a decoding stage in another data processing method of the present application;
FIG. 9 is a flow chart of the steps of one embodiment of syntax tree restoration of the present application;
FIG. 10 is a block diagram of an embodiment of a data processing apparatus according to the present application;
FIG. 11 is a block diagram of an alternative embodiment of a data processing apparatus according to the present application;
FIG. 12 is a block diagram of another data processing apparatus embodiment of the present application;
FIG. 13 is a block diagram of an alternate embodiment of a data processing apparatus according to the present application;
FIG. 14 is a diagram illustrating a hardware configuration of an apparatus according to an embodiment of the present application;
FIG. 15 is a diagram illustrating a hardware configuration of an apparatus according to another embodiment of the present application;
FIG. 16 is a schematic diagram of an embodiment of an operating system of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
Referring to FIG. 1, a schematic diagram of an embodiment of a data processing system of the present application is shown.
The data processing system of the embodiment of the application comprises a parsing end 10 and an execution end 20, where the parsing end parses the developed source code and the execution end executes the parsed source code. The source code (also called the source program) is an uncompiled text file written according to a programming language specification, that is, a series of human-readable computer language instructions. After developing the source code, the user may parse it on the parsing end 10, which may be located on the development platform used by the user or on another device. The execution end can be any device capable of executing the source code, such as a user terminal (a mobile phone or a PC) or a server.
The source code runs in an interpreted manner, so it can be compiled and executed after parsing. In this embodiment, parsing the source code generates a corresponding syntax tree, which can be an Abstract Syntax Tree (AST), a tree-like representation of the abstract syntax structure of the source code. The syntax tree is then analyzed to obtain syntax elements, the syntax elements are encoded to obtain an intermediate code file, and the intermediate code file is sent to the execution end so that the source code can be executed there. The intermediate code file is a file made up of intermediate codes, and an intermediate code is the data obtained by encoding a syntax element. The intermediate code file can therefore be regarded as an encrypted file that can be read only after decoding: it is unreadable to anyone who does not know the encoding and decoding methods, has a degree of confidentiality, and can protect the rights and interests of the source code developer.
At the execution end, the intermediate code file can be decoded to obtain the corresponding syntax elements, the syntax tree is restored from those syntax elements, and binary code is then compiled from the syntax tree and executed. The execution engine therefore does not need to repeat lexical and syntax analysis during compilation and execution, which saves running time and improves running efficiency.
In the embodiment of the present application, parsing the source code includes lexical parsing and syntax parsing. Lexical parsing may be performed by a scanner (Scanner): it analyzes the words in the source code, checks for spelling errors, checks validity, and so on, and abstracts each word in the source code into a lexical unit (Token) with a specific meaning. Syntax parsing may be performed by a syntax parser (Parser), which analyzes the source sentence by sentence according to the language standard and the lexical units, judges whether each sentence conforms to the grammar rules, and generates an abstract syntax tree (AST) from the parsed syntax elements. In addition, when binary code is compiled from the syntax tree and executed, the binary code to be compiled can be determined according to the running platform of the execution end.
In the embodiment of the application, a lexical unit is data with a marked meaning obtained by lexically parsing the source code, where the marked meaning is determined by the lexical rules. A lexical unit can therefore be a word or a code block. A word is the smallest unit with independent meaning in a language and can be a keyword, identifier, operator, delimiter, constant and the like. A keyword is an identifier with a fixed meaning defined by the programming language, for example begin, end, if and while in Pascal; identifiers represent various names, such as variable names, array names and procedure names; a constant is a fixed value, whose types generally include integer, real, Boolean and character; operators include +, -, / and the like; delimiters include commas, semicolons, brackets and the like. A syntax element is the element corresponding to a node of the Abstract Syntax Tree (AST); it is determined according to the lexical units and the syntax structure, and it may itself be one of the words described above.
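As an illustration only, the two parsing stages described above can be observed with Python's standard library, which exposes a tokenizer and an AST builder for Python source. This is an analogy chosen for brevity, not the JavaScript tooling of the embodiment; the sample statement and variable names are assumptions.

```python
import ast
import io
import tokenize

source = "total = price + 2"  # sample statement, chosen for illustration

# Lexical parsing: split the text into lexical units (tokens).
wanted = {tokenize.NAME, tokenize.NUMBER, tokenize.OP}
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in wanted
]

# Syntax parsing: build the abstract syntax tree for the same text.
tree = ast.parse(source)
assign = tree.body[0]  # the single assignment statement

# Identifier names that appear as nodes in the tree.
names = sorted(n.id for n in ast.walk(tree) if isinstance(n, ast.Name))
```

Here `tokens` holds the identifiers, operators and the constant as separate lexical units, while `assign.value` is a binary-operation node whose children are the syntax elements `price`, `+` and `2`.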
Referring to fig. 2, a flow chart of an embodiment of an encoding stage in a data processing method of the present application is shown.
Step 202, parsing the source code to generate a syntax tree.
Step 204, extracting syntax elements from the syntax tree, encoding the syntax elements, and generating a corresponding intermediate code file.
Step 206, the intermediate code file is sent.
According to the embodiment of the application, the parsing and the execution of the source code are separated and performed at different processing ends. After the program is written and the source code is obtained, the source code can be parsed to generate the corresponding syntax tree: the source code is lexically parsed to determine the lexical units, syntax elements are obtained by syntax parsing of the lexical units, and the syntax tree is generated from the syntax elements. To protect the security of the code, in this embodiment the syntax tree, once obtained, is analyzed to obtain its syntax elements, and the syntax elements are then encoded to obtain the intermediate code file. Syntax elements are of several kinds, such as numbers, character strings and operation codes, so a corresponding encoding method can be chosen for each kind when encoding; the encoding method can be set as required. After the intermediate code file is determined, it may be transmitted. The intermediate code file can be published on a publishing platform, and can also be sent through the publishing platform to an execution end to execute the corresponding program.
Referring to fig. 3, a flow chart of an embodiment of a decoding stage in a data processing method of the present application is shown.
Step 302, acquiring an intermediate code file.
Step 304, decoding the intermediate code file, determining the corresponding syntax elements, and restoring a corresponding syntax tree according to the syntax elements.
Step 306, executing the corresponding code according to the syntax tree.
After the intermediate code file is obtained, the corresponding source code can be executed from it. The intermediate code file is first decoded; the decoding method corresponds to the encoding method and can be preset as required. Once the syntax elements have been determined by decoding, the corresponding syntax tree is restored from them. The syntax tree is then compiled to obtain binary code, and the binary code is executed, so that the source code runs and provides the service function of the program.
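The last step, compiling and executing directly from a syntax tree without any source text, can be sketched with Python's built-in `compile`, which accepts an `ast` tree object. This is an analogy using Python's own AST rather than the JavaScript engine of the embodiment; the hand-built tree below is an assumption that stands in for a tree restored from an intermediate code file.

```python
import ast

# Stand-in for a restored syntax tree representing the expression 2 + 3 * 4.
restored = ast.Expression(
    body=ast.BinOp(
        left=ast.Constant(value=2),
        op=ast.Add(),
        right=ast.BinOp(
            left=ast.Constant(value=3),
            op=ast.Mult(),
            right=ast.Constant(value=4),
        ),
    )
)

# The compiler requires line and column information on every node.
ast.fix_missing_locations(restored)

# Compile the tree itself: no lexical or syntax analysis of text takes place.
code = compile(restored, "<restored>", "eval")
result = eval(code)
```

Because `compile` receives a tree rather than text, the lexical and syntax analysis stages are skipped at this point, which mirrors the efficiency argument made for the execution end.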
In the embodiment of the application, the platform-independent parsing is separated from the platform-dependent compilation and execution when the code of a program is executed. The source code is parsed into a syntax tree at a non-execution end, such as a parsing end, and then encoded to form an intermediate code file; at the execution end, the intermediate code file is decoded to recover the syntax tree, and the corresponding code is compiled and executed to provide the function of the program. Because the intermediate code file is produced by parsing and encoding, it is unreadable to anyone who does not know the encoding and decoding methods, so it has a degree of confidentiality and can protect the rights and interests of the source code developer. While decoding, compiling and executing at the execution end, the execution engine does not need to repeat lexical and syntax analysis, which saves running time and improves running efficiency.
In the embodiment of the present application, if the execution of the source code is divided into stages around the intermediate code file, there are two: an encoding stage and a decoding stage. The parsing and encoding process is the encoding stage, and the decoding, compiling and executing process is the decoding stage; that is, the encoding stage is performed by the parsing end and the decoding stage by the execution end. Taking JavaScript source code as an example, the above process may be referred to as a JS coding and decoding (JSCC) process, where the encoding stage may be implemented by a tool such as an encoder and the decoding stage by a tool such as a decoder, and these tools may run on devices such as servers and user terminals.
The encoder is a tool for encoding the source code into binary intermediate code independent of the platform, for example, a JSCC encoder can be used for JavaScript source code. The encoder is used for pre-compiling source codes written by developers into intermediate codes, the encoder can run on a development server, and the intermediate codes generated by the encoder are independent of a platform. The encoder may comprise: a source Scanner (Scanner), a lexical Parser (Lexer), a syntax Parser (Parser), and an intermediate code Encoder (Encoder).
Referring to fig. 4, a flow chart of an embodiment of an encoding stage in another data processing method of the present application is shown.
Step 402, performing lexical analysis on the source code, and determining at least one lexical unit.
After the program is written to obtain the source code, lexical analysis and syntactic analysis can be performed on the source code, wherein the lexical analysis can be performed by a Scanner (Scanner) and is used for analyzing words in the source code, checking whether spelling errors exist, checking validity and the like, and abstracting each word in the source code into a lexical unit (Token) with specific meaning.
Lexical parsing may include: scanning the characters in the source code to generate a character string to be processed; and marking the character string to be processed to determine at least one lexical unit. Specifically, a source code scanner scans the characters of the source code from left to right to obtain a character string to be processed; the string is input into a lexical analyzer (Lexer), which marks it and classifies it into a lexical unit (Token) with a marked meaning, and a corresponding character string table is established.
Referring to fig. 5, a flow chart of an embodiment of lexical parsing in accordance with the present application is shown.
Step 502, scanning the source code and reading the characters.
Step 504, determine whether the character is a blank character. If the character is a blank character, returning to the step 502 to continue scanning; if not, go to step 506.
Step 506, determining a character string to be processed according to the characters.
Step 508, inputting the character string to be processed into a lexical analyzer.
Step 510, determining whether the character string to be processed is legal. If not, go to step 518; if it is legal, go to step 512.
Step 512, determining whether the character string to be processed is a keyword. If yes, go to step 514; otherwise, go to step 516.
Step 514, a lexical unit marked as a keyword is obtained.
Step 516, a lexical unit marked as an identifier is obtained.
Step 518, an error message is output.
Therefore, a character string to be processed is obtained by scanning the source code and is then input into a lexical analyzer (Lexer) for lexical parsing. The string is first checked for legality; if it is legal, a lexical unit marked as a keyword, an identifier or the like is obtained. The figure above is only one example flow of lexical parsing; in actual processing, lexical units marked as operators, delimiters, constants and the like can also be obtained. Lexical parsing thus yields lexical units with marked meanings, and a corresponding character string table is established.
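The flow of steps 502 to 518 can be sketched as a small regular-expression tokenizer. This is a minimal sketch under stated assumptions, not the lexer of the embodiment: the keyword subset, the token patterns and the function name `tokenize` are all hypothetical and cover only a tiny JavaScript-like fragment.

```python
import re

KEYWORDS = {"var", "function", "if", "while", "return"}  # illustrative subset

# One named alternative per kind of lexical unit (assumed patterns).
TOKEN_PATTERN = re.compile(r"""
    (?P<blank>\s+)
  | (?P<number>\d+(?:\.\d+)?)
  | (?P<word>[A-Za-z_$][A-Za-z0-9_$]*)
  | (?P<operator>[+\-*/=<>!]=?)
  | (?P<delimiter>[(){};,])
""", re.VERBOSE)


def tokenize(source):
    """Scan left to right and return a list of (kind, text) lexical units."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            # Steps 510/518: the string is not legal, report an error.
            raise SyntaxError(f"illegal character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind == "blank":
            continue  # step 504: blank characters are skipped
        if kind == "word":
            # Steps 512-516: a word is either a keyword or an identifier.
            kind = "keyword" if text in KEYWORDS else "identifier"
        tokens.append((kind, text))
    return tokens


units = tokenize("var x = 10;")
```

For the input `var x = 10;` the result contains one keyword, one identifier, one operator, one number and one delimiter, in source order.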
Step 404, performing syntax parsing on the lexical units, determining syntax elements, and generating a corresponding syntax tree according to the syntax elements.
To perform syntax parsing, the lexical units are input into a syntax parser (Parser), which analyzes the source sentence by sentence according to the language standard, judges whether each sentence conforms to the grammar rules, and generates an abstract syntax tree (AST).
The syntax parsing includes the following sub-steps: analyzing the sentence corresponding to the lexical unit to obtain a syntax element in the sentence; and generating a corresponding syntax tree according to the statement and the syntax element.
After the source file has been divided into independent lexical units by lexical parsing, a syntax parser reads the lexical units in sequence and analyzes the sentences they belong to, obtaining the syntax elements in each sentence, such as keywords, ordinary character strings or numbers. The position of each syntax element in the syntax tree is then determined from the sentence, and the corresponding syntax tree is generated. A syntax element can serve as a node of the syntax tree, or as the element attached to a node of the syntax tree.
Referring to FIG. 6, a flow diagram of one embodiment of the syntax parsing of the present application is shown.
Step 602, a lexical unit is read by using a parser.
Step 604, determining whether the lexical unit is a keyword. If yes, go to step 606; if not, go to step 618.
Step 606, judging whether the sentence corresponding to the lexical unit is a declaration statement. If so, go to step 608; if it is not a declaration statement, go to step 616.
Step 608, the declaration statement is analyzed.
Step 610, determining whether the lexical unit corresponds to the function definition. If the function is defined, go to step 612; if not, go to step 614.
Step 612, processing the function definition corresponding to the lexical unit to obtain a corresponding syntax element.
Step 614, determining the variable definition corresponding to the lexical unit, and processing the variable definition to obtain the corresponding syntax element.
Step 616, determining the sentence corresponding to the lexical unit as a compound sentence, and analyzing the compound sentence to obtain a corresponding syntax element.
Step 618, determining the expression corresponding to the lexical unit, and analyzing the expression to obtain the corresponding syntax element.
Step 620, generating a syntax tree according to the syntax element.
Therefore, the sentence corresponding to each lexical unit can be obtained and analyzed according to its sentence type, yielding the sentence structure and the syntax elements it contains. The syntax elements are then added to the syntax tree (AST) according to the sentence structure: for example, the name of the source code can serve as the root node, and the parsed syntax elements are added to the tree according to the sentence order, sentence structure and the like in the source code.
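A reduced version of the decision flow in steps 602 to 620 can be sketched as a recursive-descent parser over (kind, text) lexical units. This is a sketch under stated assumptions: it handles only variable definitions and additive expressions, omits function definitions and compound statements, and the node type names (`Program`, `VarDecl`, `ExprStmt`, `Binary`, `Identifier`, `Number`) are hypothetical.

```python
def parse(tokens):
    """Build a syntax tree (nested dicts) from (kind, text) lexical units."""
    tokens = [tuple(t) for t in tokens]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", "")

    def expect(kind, text=None):
        nonlocal pos
        tok = peek()
        if tok[0] != kind or (text is not None and tok[1] != text):
            raise SyntaxError(f"expected {text or kind}, got {tok[1]!r}")
        pos += 1
        return tok

    def operand():
        kind, text = peek()
        if kind == "number":
            expect("number")
            return {"type": "Number", "value": float(text) if "." in text else int(text)}
        expect("identifier")
        return {"type": "Identifier", "name": text}

    def expression():
        node = operand()
        while peek() in (("operator", "+"), ("operator", "-")):
            op = expect("operator")[1]
            node = {"type": "Binary", "op": op, "left": node, "right": operand()}
        return node

    def statement():
        if peek() == ("keyword", "var"):
            # Keyword and declaration: a variable definition.
            expect("keyword", "var")
            name = expect("identifier")[1]
            expect("operator", "=")
            init = expression()
            expect("delimiter", ";")
            return {"type": "VarDecl", "name": name, "init": init}
        if peek()[0] == "keyword":
            raise SyntaxError(f"keyword {peek()[1]!r} is not handled in this sketch")
        # Not a keyword: treat the sentence as an expression.
        node = expression()
        expect("delimiter", ";")
        return {"type": "ExprStmt", "expr": node}

    body = []
    while peek()[0] != "eof":
        body.append(statement())
    return {"type": "Program", "body": body}  # root node of the syntax tree


sample_tokens = [
    ("keyword", "var"), ("identifier", "x"), ("operator", "="),
    ("identifier", "a"), ("operator", "+"), ("number", "1"), ("delimiter", ";"),
]
program = parse(sample_tokens)
```

The resulting tree has a `Program` root with one `VarDecl` child whose initializer is a `Binary` node, so each syntax element ends up at a position determined by the sentence structure.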
Step 406, traverse the syntax tree and obtain the corresponding syntax elements.
Step 408, analyzing the type of the syntax element, and encoding the syntax element according to the type.
Step 410, generating a corresponding intermediate code file according to the encoded syntax element.
After the syntax tree is determined, in order to prevent the source code from being analyzed and causing the code to be leaked, the intermediate code file can be obtained by encoding according to the syntax tree, so that the source code cannot be leaked by transmitting the intermediate code file, lexical analysis and syntax analysis are not needed at an execution end, resource consumption is reduced, and processing efficiency is improved.
Therefore, the syntax tree can be traversed to obtain the corresponding syntax elements, and the type of each syntax element is then analyzed. The syntax elements in the embodiment of the present application include at least one of the following: numbers, character strings and operation codes. Operation codes are instruction codes or fields related to execution operations in the source code, such as instruction serial numbers and operators; numbers are numeric values in the source code, such as constants; and character strings are the other strings in the source code. The types of syntax elements thus include an operation code type, a number type and/or a character string type. Encoding the syntax element according to its type comprises: encoding the operation codes, the numbers and the character strings respectively according to preset encoding rules. That is, the encoding rule corresponding to the type is determined, the syntax element is encoded with that rule to obtain an encoded syntax element, namely an intermediate code, and the intermediate codes are used to generate the intermediate code file.
Analyzing the type of the syntax element comprises: judging the type of the syntax element; if the type of the syntax element belongs to a first type, determining the syntax element as an operation code; and if the type of the syntax element belongs to a second type, determining the syntax element as a number or a character string.
In this embodiment, the type of the syntax element is judged. If it belongs to the first type, namely the operation type, the syntax element is an operation code; if it belongs to the second type, namely the non-operation type, which comprises the number type and the character string type, the syntax element is determined to be a number or a character string. The following procedure may also be used for the determination.
Referring to fig. 7, a flow diagram of one embodiment of syntax tree coding of the present application is shown.
Step 702, traverse the syntax tree to obtain syntax elements.
Step 704, determine whether the syntax element is an operation code. If the syntax element is an opcode, go to step 708; if the syntax element is not an opcode, step 706 is performed.
Step 706, determine whether the syntax element is a number. If the syntax element is a number, go to step 710; if the syntax element is not a number, step 712 is performed.
Step 708, encoding the operation code according to a preset encoding rule.
Step 710, encoding the number according to a preset encoding rule.
Step 712, determining the syntax element as a character string, and encoding the character string according to a preset encoding rule.
Step 714, an intermediate code file is generated.
In the embodiment of the application, once the syntax tree has been obtained, the intermediate code file can be constructed from it: the syntax tree is traversed to obtain the syntax elements, and each syntax element is examined so that the corresponding encoding method is used to encode it. A character string represents text and can consist of one or more characters. For example, if a sentence is the expression X + Y, the syntax elements X and Y are character strings and + is an operation code; each syntax element is encoded according to the encoding rule for its type to obtain the intermediate codes, from which the intermediate code file is obtained.
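The type-based encoding of steps 702 to 714 can be sketched as follows. The text leaves the "preset encoding rule" open, so everything concrete here is an assumption made for illustration: the opcode table, the one-byte type tags, the big-endian 8-byte float for numbers, the length-prefixed UTF-8 for strings, and the dict-based tree shape with `element` and `children` keys.

```python
import struct

OPCODES = {"+": 0, "-": 1, "*": 2, "/": 3, "=": 4}  # hypothetical opcode table
TAG_OPCODE, TAG_NUMBER, TAG_STRING = 1, 2, 3          # hypothetical type tags


def traverse(node):
    """Yield the syntax elements of the tree in pre-order (step 702)."""
    yield node["element"]
    for child in node.get("children", []):
        yield from traverse(child)


def encode_element(element):
    """Encode one syntax element according to its type (steps 704-712)."""
    if isinstance(element, str) and element in OPCODES:
        # Operation code: tag byte plus its index in the opcode table.
        return bytes([TAG_OPCODE, OPCODES[element]])
    if isinstance(element, (int, float)) and not isinstance(element, bool):
        # Number: tag byte plus an 8-byte big-endian float.
        return bytes([TAG_NUMBER]) + struct.pack(">d", element)
    # Anything else is treated as a character string.
    data = str(element).encode("utf-8")
    return bytes([TAG_STRING]) + struct.pack(">H", len(data)) + data


def encode_tree(tree):
    """Concatenate the encoded elements into intermediate code (step 714)."""
    return b"".join(encode_element(e) for e in traverse(tree))


# The expression X + Y from the text: + is an opcode, X and Y are strings.
expression = {"element": "+", "children": [{"element": "X"}, {"element": "Y"}]}
intermediate = encode_tree(expression)
```

The output is a byte sequence in which neither the identifiers' roles nor the operator are visible without the tag and opcode tables. Note that this simple classification would mistake a string literal such as "+" for an operation code; a fuller design would carry the element type on the tree node instead of inferring it.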
In the embodiment of the application, parameters corresponding to the intermediate code are determined according to the lexical units, and generating the intermediate code file from the intermediate code comprises: generating the intermediate code file according to the intermediate code and the parameters corresponding to the intermediate code. The association relations of the syntax elements, such as the parent-child relation between a node and its child nodes and the sibling relation among nodes, can be obtained from the syntax tree, and the intermediate code parameters are derived from these relations. After the intermediate code is obtained by encoding, the intermediate code file is generated from the intermediate code and its corresponding parameters, so that the position of each syntax element in the tree can be conveniently determined when the syntax tree is restored.
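One way to picture such parameters is to store, next to each element, the index of its parent and its number of children. The record shape below is a hypothetical choice, not the format used by the application; it only shows that these two parameters are enough to put every element back in its place.

```python
def flatten(node, parent=-1, out=None):
    """Return a pre-order list of (element, parent_index, child_count) records."""
    if out is None:
        out = []
    if isinstance(node, tuple):              # assumed shape: (element, child, ...)
        element, children = node[0], node[1:]
    else:
        element, children = node, ()
    index = len(out)
    out.append((element, parent, len(children)))
    for child in children:
        flatten(child, index, out)
    return out


def rebuild(records):
    """Restore the nested tree from the records using the child counts."""
    position = 0

    def build():
        nonlocal position
        element, _parent, child_count = records[position]
        position += 1
        if child_count == 0:
            return element
        return (element, *(build() for _ in range(child_count)))

    return build()


tree = ("=", "Z", ("+", "X", 2))
records = flatten(tree)
print(records)
print(rebuild(records) == tree)  # True
```

The parent index is redundant for pre-order rebuilding but makes sibling and parent-child relations directly readable from the flat list.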
Step 412, sending the intermediate code file.
After the intermediate code file is obtained by encoding, the intermediate code file can be sent, for example, sent to a publishing platform for publishing the source code, or sent to an execution end through the publishing platform for executing a program corresponding to the source code.
In the decoding phase, the intermediate code file may be parsed by a decoder, which is a tool that parses binary intermediate code into a compilable syntax tree, such as a JSCC parser for JavaScript source code. The parser restores the intermediate code file into a syntax tree. The decoder can run on the execution end, and the compiling and executing of the decoded syntax tree depend on the platform corresponding to the execution end. The decoder may comprise at least two parts: one part is platform-independent and decodes the intermediate code sequence; the other part is platform-dependent and needs to call an interface of the running platform to restore the AST.
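The two-part split of the decoder can be sketched as follows. All class names, the `create_node` method and the record layout are hypothetical; the point is only that the sequence decoding does not touch the platform, while node creation is delegated to a platform interface.

```python
from abc import ABC, abstractmethod


class PlatformInterface(ABC):
    """Platform-dependent part (hypothetical stand-in for the platform's AST API)."""

    @abstractmethod
    def create_node(self, kind, value):
        """Create one syntax tree node in the platform's own representation."""


class DictPlatform(PlatformInterface):
    """Toy platform whose tree nodes are plain dictionaries."""

    def create_node(self, kind, value):
        return {"kind": kind, "value": value}


class Decoder:
    def __init__(self, platform):
        self.platform = platform

    @staticmethod
    def decode_sequence(blob):
        """Platform-independent part: split the byte stream into records.

        Assumed record layout: 1-byte kind tag, 1-byte payload length, payload.
        """
        pos = 0
        while pos < len(blob):
            kind, length = blob[pos], blob[pos + 1]
            yield kind, blob[pos + 2:pos + 2 + length].decode("utf-8")
            pos += 2 + length

    def restore(self, blob):
        """Platform-dependent part: hand each record to the platform interface."""
        return [self.platform.create_node(kind, value)
                for kind, value in self.decode_sequence(blob)]


blob = bytes([1, 1]) + b"+" + bytes([3, 1]) + b"X"
print(Decoder(DictPlatform()).restore(blob))
```

Porting such a decoder to another platform would then only require a different `PlatformInterface` implementation.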
The intermediate code file is generated during parsing and encoding. It is unreadable when the encoding and decoding methods are unknown, so it provides a degree of confidentiality and can protect the rights and interests of the source code developer.
Referring to fig. 8, a flow chart of an embodiment of a decoding stage in another data processing method of the present application is shown.
Step 802, obtaining coded syntax elements from the intermediate code file.
Step 804, decoding the coded syntax elements according to the types to obtain corresponding syntax elements.
After the execution end acquires the intermediate code file, the encoded syntax elements can be obtained from it. A decoding method corresponding to the encoding method is then determined, and the encoded syntax elements are decoded according to their types to obtain the corresponding syntax elements. The encoded syntax elements can be decoded according to a preset decoding rule to obtain at least one of the following syntax elements: operation code, number, character string. That is, the decoding method determines the decoding rule for each type, and each encoded syntax element is decoded with the rule for its type. The syntax elements obtained by decoding form the data segment corresponding to the source code, so parsing this data segment yields the data corresponding to the source code, such as the syntax elements and their corresponding parameters.
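Decoding by type (steps 802 and 804) can be sketched as a loop that reads a type tag and then applies the rule for that type. The tag bytes, opcode table and layouts are assumptions for illustration only, since the preset decoding rules are not specified in the source.

```python
import struct

# Hypothetical tags and opcode table, assumed to mirror the encoder's rules.
TAG_OP, TAG_NUM, TAG_STR = 0x01, 0x02, 0x03
OPCODE_NAMES = {0x10: "+", 0x11: "-", 0x12: "*", 0x13: "/"}


def decode_elements(data):
    """Decode a byte string into a list of (type, value) syntax elements."""
    elements, pos = [], 0
    while pos < len(data):
        tag = data[pos]
        pos += 1
        if tag == TAG_OP:                                  # one opcode byte
            elements.append(("opcode", OPCODE_NAMES[data[pos]]))
            pos += 1
        elif tag == TAG_NUM:                               # 8-byte little-endian float
            (value,) = struct.unpack_from("<d", data, pos)
            elements.append(("number", value))
            pos += 8
        elif tag == TAG_STR:                               # 2-byte length, UTF-8 payload
            (length,) = struct.unpack_from("<H", data, pos)
            pos += 2
            elements.append(("string", data[pos:pos + length].decode("utf-8")))
            pos += length
        else:
            raise ValueError(f"unknown type tag {tag:#x} at offset {pos - 1}")
    return elements


sample = bytes.fromhex("01100301005803010059")
print(decode_elements(sample))  # [('opcode', '+'), ('string', 'X'), ('string', 'Y')]
```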
Step 806, determining a corresponding lexical unit according to the character string information and the syntax element.
Step 808, calling a first interface according to the platform corresponding to the execution end, and parsing the lexical unit.
Step 810, generating a corresponding syntax tree according to the syntax elements obtained by parsing.
Corresponding character string information, such as a character string list, is also obtained when the source code is parsed to obtain the lexical units. Therefore, after the syntax elements are obtained by decoding, they can be labeled according to the character string information to determine the corresponding lexical units. The first interface can then be invoked to restore the corresponding syntax tree from the syntax elements: the first interface is called according to the platform corresponding to the execution end, the sentence corresponding to each lexical unit is parsed, the syntax elements obtained by parsing are determined, and the corresponding syntax tree is generated.
The sentence corresponding to the lexical unit can be parsed to obtain the syntax elements in the sentence, and the corresponding syntax tree is generated according to the sentence and the syntax elements obtained by parsing. Specifically, the sentence corresponding to the lexical unit is determined, its grammar is analyzed to obtain the syntax elements in the sentence, and the syntax tree is then restored according to the sentence, the syntax elements obtained by parsing, and the parameters corresponding to the syntax elements in the intermediate code file. The process of restoring the syntax tree after decoding can be regarded as code segment parsing, so the syntax tree is restored by performing data segment parsing and code segment parsing on the intermediate code file. The header data of the intermediate code file can be parsed before the data segment parsing, so that information related to decoding and restoring the syntax tree, such as the decoding method, can be determined.
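The order of header, data segment and code segment parsing can be illustrated with an assumed file layout. The magic value `IMC0`, the version field and the two length fields are invented for this sketch; the application does not describe the actual header contents beyond information such as the decoding method.

```python
import struct

# Hypothetical header: 4-byte magic, 1-byte version, data length, code length.
HEADER = struct.Struct("<4sBII")
MAGIC = b"IMC0"


def pack_file(data_segment, code_segment, version=1):
    """Build an intermediate code file: header, then data segment, then code segment."""
    header = HEADER.pack(MAGIC, version, len(data_segment), len(code_segment))
    return header + data_segment + code_segment


def split_file(blob):
    """Parse the header first, then slice out the data and code segments."""
    magic, version, data_len, code_len = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not an intermediate code file")
    start = HEADER.size
    data_segment = blob[start:start + data_len]
    code_segment = blob[start + data_len:start + data_len + code_len]
    return version, data_segment, code_segment


blob = pack_file(b"abc", b"defgh")
print(split_file(blob))  # (1, b'abc', b'defgh')
```

In such a layout the version field could select the decoding rules, which is one possible reading of "information related to decoding" in the header.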
In an optional embodiment of the present application, the generating a corresponding syntax tree according to the sentence and the syntax element includes at least one of:
For a declaration statement, parsing the syntax element into a function or a variable, and adding the function or the variable to the corresponding syntax tree; that is, after the sentence corresponding to the lexical unit is determined to be a declaration statement, the corresponding syntax element can be further determined to be a function or a variable, and the syntax element is then added to the corresponding syntax tree according to the statement and the corresponding parameters.
For a compound statement, determining each syntax element in the compound statement, and adding the syntax elements to the corresponding syntax tree; that is, after the sentence corresponding to the lexical unit is determined to be a compound statement, each syntax element in the compound statement, such as an operation code, a number or a character string, can be further determined and added to the corresponding syntax tree.
For an expression, determining each syntax element in the expression and adding the syntax elements to the corresponding syntax tree; that is, after the sentence corresponding to the lexical unit is determined to be an expression, each syntax element in the expression can be further determined (for example, an operator in the expression is an operation code and a constant is a number; the syntax elements may also include character strings), and the syntax elements are added to the corresponding syntax tree.
In the embodiment of the present application, a structure of the execution end is shown in fig. 9. The execution end comprises a decoder, a parsing engine and a platform module. The parsing engine can be determined according to the platform and the programming language; for example, a JS engine can be used for the JavaScript language. The parsing engine comprises a platform interface and a compiler: the platform interface is the interface of the platform where the execution end is located and comprises a first interface and a second interface, and the compiler is used for compiling the syntax tree into binary code and executing the binary code. The platform module comprises an operating system and a hardware platform; that is, the platform is determined by both software (the operating system) and hardware.
Referring to FIG. 9, a flowchart illustrating the steps of one embodiment of syntax tree restoration of the present application is shown.
Step 902, parsing the file header of the intermediate code file.
Step 904, performing data segment parsing on the intermediate code file according to the character string.
Step 906, performing code segment analysis on the intermediate code file to determine a corresponding statement.
Step 908, determine if the statement is a declaration statement. If yes, go to step 910; if not, step 914 or step 916 is performed.
Step 910, determine whether the syntax element in the sentence is a function. If yes, go to step 918; if not, step 912 is executed.
Step 912, determine the syntax element as a variable. Step 918 is then performed.
Step 914, confirm the statement is a compound statement, and analyze the compound statement. Step 918 is then performed.
Step 916, the statement is confirmed to be an expression, and the expression is analyzed. Step 918 is then performed.
Step 918, add to syntax tree.
Therefore, the file header of the intermediate code file can be parsed to obtain information such as the decoding rules, and data segment parsing is then performed on the intermediate code file to determine the syntax elements, the lexical units and the corresponding sentences. Next, the type of each sentence is determined through code segment parsing, the syntax elements in the sentence are parsed, and the corresponding syntax tree is restored, so that subsequent compilation and execution can be performed based on the syntax tree.
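The branching of steps 908 to 918 can be sketched as a dispatch on the statement kind. The input record shape (`kind`, `is_function`, `name`, `body`, `elements`) is hypothetical, and the node type names are borrowed from common JavaScript AST conventions purely for readability; neither is taken from the application.

```python
def restore_statement(stmt):
    """Turn one decoded statement record into a syntax tree node."""
    kind = stmt["kind"]
    if kind == "declaration":                               # step 908
        if stmt.get("is_function"):                         # step 910: a function
            body = [restore_statement(s) for s in stmt.get("body", [])]
            return {"type": "FunctionDeclaration", "name": stmt["name"], "body": body}
        return {"type": "VariableDeclaration", "name": stmt["name"]}   # step 912
    if kind == "compound":                                  # step 914
        return {"type": "BlockStatement",
                "body": [restore_statement(s) for s in stmt["body"]]}
    if kind == "expression":                                # step 916
        return {"type": "ExpressionStatement", "elements": list(stmt["elements"])}
    raise ValueError(f"unknown statement kind: {kind!r}")


def restore_tree(statements):
    """Add every restored statement to the tree root (step 918)."""
    root = {"type": "Program", "body": []}
    for stmt in statements:
        root["body"].append(restore_statement(stmt))
    return root


decoded = [
    {"kind": "declaration", "name": "z"},
    {"kind": "compound", "body": [{"kind": "expression", "elements": ["+", "X", "Y"]}]},
]
print(restore_tree(decoded))
```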
Step 812, calling a second interface according to the platform corresponding to the execution end, and compiling the syntax tree to obtain a corresponding binary code.
Step 814, execute the binary code.
Then, a second interface can be called according to the corresponding platform of the execution end, the syntax tree is compiled, and a corresponding binary code is obtained, wherein the binary code can be identified by hardware of the execution end, so that the binary code can be executed through the hardware.
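As an analogy for compiling a restored syntax tree without re-running lexical and syntax analysis, the sketch below uses Python's standard `ast` module and the built-in `compile` and `exec` functions in place of a JS engine's interfaces. This is a substitution for illustration: Python compiles the tree to bytecode for its virtual machine rather than to hardware binary code, and the helper names `build_tree` and `run` are hypothetical.

```python
import ast


def build_tree():
    """Construct the tree for `result = x + y` directly from nodes, with no source text."""
    tree = ast.Module(
        body=[
            ast.Assign(
                targets=[ast.Name(id="result", ctx=ast.Store())],
                value=ast.BinOp(
                    left=ast.Name(id="x", ctx=ast.Load()),
                    op=ast.Add(),
                    right=ast.Name(id="y", ctx=ast.Load()),
                ),
            )
        ],
        type_ignores=[],
    )
    return ast.fix_missing_locations(tree)   # fill in the line/column fields compile() needs


def run(tree, namespace):
    """Compile the tree (analogous to step 812) and execute it (analogous to step 814)."""
    code_object = compile(tree, filename="<restored>", mode="exec")
    exec(code_object, namespace)
    return namespace


env = run(build_tree(), {"x": 2, "y": 3})
print(env["result"])  # 5
```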
In the decoding, compiling and execution process at the execution end, the execution engine does not need to perform lexical analysis and syntax analysis again, which saves running time and improves running efficiency.
Taking JavaScript source code as an example, the JS syntax tree can be encoded so that standard JS code is compiled into a user-defined byte code, which is another representation of the standard JS. This forms an unreadable, encoding-optimized byte stream, which protects the rights and interests of developers and can also speed up the execution of the JS code.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the embodiments are not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the embodiments. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts involved are not necessarily required by the embodiments of the application.
On the basis of the above embodiments, the present embodiment further provides a data processing apparatus, which is applied to the parsing end.
Referring to fig. 10, a block diagram of a data processing apparatus according to an embodiment of the present application is shown, which may specifically include the following modules:
A parsing module 1002, configured to parse the source code to generate a syntax tree.
The encoding module 1004 is configured to parse syntax elements from the syntax tree, encode the syntax elements, and generate corresponding intermediate code files.
A sending module 1006, configured to send the intermediate code file.
The source code runs in an interpreted manner and can therefore be compiled and executed after parsing. In this embodiment, a corresponding syntax tree is generated after the source code is parsed; the syntax tree can be an Abstract Syntax Tree (AST), which is a tree-like representation of the abstract syntax structure of the source code. The syntax tree is then parsed to obtain syntax elements, the syntax elements are encoded to obtain an intermediate code file, and the intermediate code file is sent to an execution end to execute the source code. The intermediate code file is a file formed of intermediate code, and the intermediate code is data obtained by encoding the syntax elements, so the intermediate code file can be regarded as an encrypted file that can be read only after being decoded. The intermediate code file is therefore unreadable when the encoding and decoding methods are unknown, provides a degree of confidentiality, and can protect the rights and interests of source code developers.
Referring to fig. 11, a block diagram of an alternative embodiment of a data processing apparatus according to the present application is shown, which specifically includes the following:
The parsing module 1002 includes: a lexical parsing sub-module 10022 and a syntax parsing sub-module 10024.
The lexical analysis submodule 10022 is configured to perform lexical analysis on the source code, and determine at least one lexical unit.
The syntax parsing sub-module 10024 is configured to perform syntax parsing on the lexical units, determine syntax elements, and generate corresponding syntax trees according to the syntax elements.
The lexical analysis submodule 10022 is configured to scan characters in the source code to generate a character string to be processed; and marking the character string to be processed, and determining at least one lexical unit.
The syntax parsing sub-module 10024 is configured to parse the statement corresponding to the lexical unit, and obtain a syntax element in the statement; and generate a corresponding syntax tree according to the statement and the syntax element. The lexical unit comprises at least one of the following: keywords, identifiers, operators, delimiters, and constants.
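The scanning and marking performed by the lexical parsing sub-module can be sketched with a small regular-expression tokenizer. The keyword set, the operator list and the token patterns below are a simplified assumption covering only a fragment of JavaScript-like syntax, not the application's actual lexer.

```python
import re

# Hypothetical, deliberately small keyword set and token grammar.
KEYWORDS = {"var", "function", "return", "if", "else"}
TOKEN_RE = re.compile(r"""
    (?P<constant>\d+(?:\.\d+)?|"[^"\n]*")
  | (?P<identifier>[A-Za-z_$][A-Za-z0-9_$]*)
  | (?P<operator>==|!=|<=|>=|[+\-*/=<>])
  | (?P<delimiter>[(){};,])
  | (?P<skip>\s+)
""", re.VERBOSE)


def tokenize(source):
    """Scan the characters of `source` and mark them as lexical units."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":                    # whitespace carries no lexical unit
            continue
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"                  # reserved words are marked separately
        tokens.append((kind, text))
    return tokens


for token in tokenize("var z = x + 2;"):
    print(token)
```

Each of the five lexical unit kinds named above (keywords, identifiers, operators, delimiters, constants) appears as a tag in the output of this sketch.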
Wherein the encoding module 1004 includes: an element acquisition sub-module 10042, an element encoding sub-module 10044, and a file generation sub-module 10046.
The element obtaining sub-module 10042 is configured to traverse the syntax tree and obtain a corresponding syntax element.
The element coding sub-module 10044 is configured to analyze a type of the syntax element and code the syntax element according to the type.
The file generating sub-module 10046 is configured to generate a corresponding intermediate code file according to the encoded syntax element.
The element coding sub-module is configured to determine the type of the syntax element; if the type of the syntax element belongs to a first type, determine that the syntax element is an operation code; and if the type of the syntax element belongs to a second type, determine that the syntax element is a number or a character string.
The syntax element comprises at least one of: operation code, number, character string; and the element coding submodule is used for coding the operation code, the number and the character string respectively according to a preset coding rule.
The element coding submodule is also used for determining parameters corresponding to the syntax elements; and generating an intermediate code file according to the coded syntax element and the corresponding parameter.
On the basis of the above embodiments, the present embodiment further provides a data processing apparatus, which is applied to an execution end.
Referring to fig. 12, a block diagram of another data processing apparatus according to another embodiment of the present application is shown, which may specifically include the following modules:
A file obtaining module 1202, configured to obtain the intermediate code file.
The decoding module 1204 is configured to decode the intermediate code file, determine a corresponding syntax element, and restore a corresponding syntax tree according to the syntax element.
The executing module 1206 is configured to execute the corresponding code according to the syntax tree.
The intermediate code file can be parsed at the execution end to obtain the corresponding syntax elements, a syntax tree is then restored according to the syntax elements, and the syntax tree is compiled into binary code, which is executed. Therefore, the execution engine does not need to perform lexical analysis and syntax analysis again during compilation and execution, which saves running time and improves running efficiency.
Referring to fig. 13, a block diagram of another alternative embodiment of the data processing apparatus of the present application is shown, which may specifically include the following modules:
The decoding module 1204 includes: a file decoding submodule 12042 and a restoring submodule 12044.
A file decoding submodule 12042, configured to obtain a coded syntax element from the intermediate code file; and decoding the coded syntax elements according to the types to obtain corresponding syntax elements.
The restoring submodule 12044 is configured to invoke the first interface according to the syntax element, and restore the corresponding syntax tree.
The file decoding submodule is configured to decode the encoded syntax element according to a preset decoding rule, so as to obtain at least one of the following syntax elements: opcode, number, string.
The restoring submodule is further used for determining a corresponding lexical unit according to the character string information and the grammar element.
The restoring submodule is configured to call a first interface according to the platform corresponding to the execution end, parse the lexical unit, and generate a corresponding syntax tree according to the syntax elements obtained by parsing.
The restoring submodule is further configured to parse the sentence corresponding to the lexical unit to obtain the syntax elements in the sentence.
The restoring submodule is configured to generate a corresponding syntax tree according to the sentence and the syntax elements obtained by parsing.
The restoring submodule is configured to: for a declaration statement, parse the syntax element into a function or a variable and add the function or the variable to the corresponding syntax tree; for a compound statement, determine each syntax element in the compound statement and add the syntax elements to the corresponding syntax tree; and for an expression, determine each syntax element in the expression and add the syntax elements to the corresponding syntax tree.
The execution module 1206 includes: a code compiling submodule 12062 and a code executing submodule 12064.
The code compiling submodule 12062 is configured to call a second interface according to the platform corresponding to the execution end, and compile the syntax tree to obtain a corresponding binary code.
The code execution submodule 12064 is configured to execute the binary code.
The present application further provides a non-transitory, readable storage medium, where one or more modules (programs) are stored, and when the one or more modules are applied to a device, the device may execute instructions (instructions) of method steps in this application.
One embodiment of the present application provides one or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform one or more of the methods described above with respect to the parsing end.
One embodiment of the present application provides one or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform one or more of the methods described above as an execution side.
Fig. 14 is a schematic hardware structure diagram of an apparatus according to an embodiment of the present application. As shown in fig. 14, the device may include an input device 140, a processor 141, an output device 142, a memory 143, and at least one communication bus 144. The communication bus 144 is used to enable communication connections between the elements. Memory 143 may include high speed RAM memory and may also include non-volatile storage NVM, such as at least one disk memory, in which various programs may be stored for performing various processing functions and implementing the method steps of the present embodiment.
Alternatively, the processor 141 may be implemented by, for example, a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components, and the processor 141 is coupled to the input device 140 and the output device 142 through a wired or wireless connection.
Optionally, the input device 140 may include a plurality of input devices, for example, at least one of a user interface for a user, a device interface for a device, a programmable interface for software, a camera, and a sensor. Optionally, the device interface facing the device may be a wired interface for data transmission between devices, or may be a hardware plug-in interface (e.g., a USB interface, a serial port, etc.) for data transmission between devices; optionally, the user-facing user interface may be, for example, a user-facing control key, a voice input device for receiving voice input, and a touch sensing device (e.g., a touch screen with a touch sensing function, a touch pad, etc.) for receiving user touch input; optionally, the programmable interface of the software may be, for example, an entry for a user to edit or modify a program, such as an input pin interface or an input interface of a chip; optionally, the transceiver may be a radio frequency transceiver chip with a communication function, a baseband processing chip, a transceiver antenna, and the like. An audio input device such as a microphone may receive voice data. Output device 142 may include an output device such as a display, a sound, etc.
In this embodiment, the processor of the device includes a module for executing the functions of the modules of the data processing apparatus in each device, and specific functions and technical effects are as described in the above embodiments, and are not described herein again.
Fig. 15 is a schematic hardware structure diagram of an apparatus according to another embodiment of the present application. FIG. 15 is a specific embodiment of FIG. 14 in an implementation. As shown in fig. 15, the apparatus of the present embodiment includes a processor 151 and a memory 152.
The processor 151 executes the computer program code stored in the memory 152 to implement the data processing method of fig. 1 to 9 in the above embodiments.
The memory 152 is configured to store various types of data to support operation at the device. Examples of such data include instructions for any application or method operating on the device, such as messages, pictures, videos, and so forth. The memory 152 may include a Random Access Memory (RAM) and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory.
Optionally, the processor 151 is provided in the processing component 150. The apparatus may further include: a communication component 153, a power component 154, a multimedia component 155, an audio component 156, an input/output interface 157 and/or a sensor component 158. The specific components included in the device are set according to actual requirements, which is not limited in this embodiment.
The processing component 150 generally controls the overall operation of the device. Processing components 150 may include one or more processors 151 to execute instructions to perform all or a portion of the steps of the methods of fig. 1-9 described above. Further, the processing component 150 may include one or more modules that facilitate interaction between the processing component 150 and other components. For example, the processing component 150 may include a multimedia module to facilitate interaction between the multimedia component 155 and the processing component 150.
The power supply component 154 provides power to the various components of the device. The power components 154 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for a device.
The multimedia component 155 includes a display screen that provides an output interface between the device and the user. In some embodiments, the display screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the display screen includes a touch panel, the display screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The audio component 156 is configured to output and/or input audio signals. For example, the audio component 156 includes a Microphone (MIC) configured to receive external audio signals when the device is in an operational mode, such as a speech recognition mode. The received audio signal may further be stored in the memory 152 or transmitted via the communication component 153. In some embodiments, audio assembly 156 further includes a speaker for outputting audio signals.
The input/output interface 157 provides an interface between the processing component 150 and peripheral interface modules, which may be click wheels, buttons, etc. These buttons may include, but are not limited to: a volume button, a start button, and a lock button.
The sensor assembly 158 includes one or more sensors for providing various aspects of status assessment for the device. For example, the sensor component 158 may detect the open/closed state of the device, the relative positioning of the components, the presence or absence of user contact with the device. The sensor assembly 158 may include a proximity sensor configured to detect the presence of nearby objects, including detecting the distance between the user and the device, without any physical contact. In some embodiments, the sensor assembly 158 may also include a camera or the like.
The communication component 153 is configured to facilitate wired or wireless communication between the device and other devices. The device may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one embodiment, the device may include a SIM card slot therein for insertion of a SIM card so that the device can log onto a GPRS network to establish communication with a server via the internet.
From the above, the communication component 153, the audio component 156, the input/output interface 157 and the sensor component 158 involved in the embodiment of fig. 15 can be implemented as the input device in the embodiment of fig. 14.
In an apparatus of this embodiment, one or more processors are included; and one or more machine-readable media having instructions stored thereon, which when executed by the one or more processors, cause the apparatus to perform one or more of the methods as described above with respect to the parsing end.
In another apparatus of this embodiment, one or more processors are included; and one or more machine-readable media having instructions stored thereon, which when executed by the one or more processors, cause the apparatus to perform one or more of the methods as described above with respect to the execution side.
An embodiment of the present application further provides an operating system for a device, as shown in fig. 16, the operating system of the device includes: communication unit 1602, decode recovery unit 1604, and execution unit 1606.
The communication unit 1602 acquires the intermediate code file.
The decoding and restoring unit 1604 decodes the intermediate code file, determines a corresponding syntax element, and restores a corresponding syntax tree according to the syntax element.
The execution unit 1606 executes the corresponding source code according to the syntax tree.
Since the device embodiments are basically similar to the method embodiments, they are described briefly; for relevant details, refer to the corresponding description of the method embodiments.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as "first" and "second" may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or terminal that comprises the element.
The data processing method, device, equipment and operating system provided by the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the above embodiments is intended only to aid understanding of the method and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present application, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.
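By way of non-limiting illustration only, the lexical-analysis stage of the method (scanning the characters of the source code and marking them as lexical units such as keywords, identifiers, operators, delimiters, and constants) might be sketched in Python as follows. The keyword set, the token pattern, and the names `KEYWORDS`, `TOKEN_RE`, and `tokenize` are assumptions made for this sketch and are not taken from the present application.

```python
import re

# Hypothetical keyword set for a small script language (an assumption of this sketch).
KEYWORDS = {"var", "function", "return", "if", "else"}

# One named group per kind of lexical unit; whitespace is matched and skipped.
TOKEN_RE = re.compile(r'''
    (?P<constant>\d+(?:\.\d+)?|"[^"\n]*")
  | (?P<identifier>[A-Za-z_]\w*)
  | (?P<operator>==|!=|<=|>=|[+\-*/=<>])
  | (?P<delimiter>[(){};,])
  | (?P<skip>\s+)
''', re.VERBOSE)


def tokenize(source):
    """Scan the source text and mark it as a list of (kind, text) lexical units."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind == "skip":
            continue
        text = match.group()
        # Identifiers that appear in the keyword set are re-marked as keywords.
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, text))
    return tokens


tokens = tokenize("var x = 42;")
```

A syntax analyzer would then consume such a list of lexical units to determine syntax elements and build the syntax tree.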

Claims (26)

1. A data processing method, comprising:
parsing source code to generate a syntax tree;
traversing the syntax tree to acquire corresponding syntax elements;
analyzing the type of the syntax element, and encoding the syntax element according to the type;
generating a corresponding intermediate code file according to the encoded syntax element;
and sending the intermediate code file.
2. The method of claim 1, wherein parsing the source code to generate the syntax tree comprises:
performing lexical analysis on the source code to determine at least one lexical unit;
and performing syntax analysis on the lexical unit to determine syntax elements, and generating a corresponding syntax tree according to the syntax elements.
3. The method of claim 2, wherein performing lexical analysis on the source code to determine at least one lexical unit comprises:
scanning characters in the source code to generate a character string to be processed;
and marking the character string to be processed to determine at least one lexical unit.
4. The method of claim 2, wherein performing syntax analysis on the lexical unit to determine syntax elements and generating a corresponding syntax tree according to the syntax elements comprises:
analyzing the statement corresponding to the lexical unit to obtain the syntax elements in the statement;
and generating a corresponding syntax tree according to the statement and the syntax elements.
5. The method of claim 2 or 4, wherein the lexical unit comprises at least one of: keywords, identifiers, operators, delimiters, and constants.
6. The method of claim 1, wherein analyzing the type of the syntax element comprises:
determining the type of the syntax element;
if the type of the syntax element belongs to a first type, determining that the syntax element is an operation code;
and if the type of the syntax element belongs to a second type, determining that the syntax element is a number or a character string.
7. The method of claim 1, wherein the syntax element comprises at least one of: operation code, number, character string;
and encoding the syntax element according to the type comprises:
encoding the operation code, the number and the character string respectively according to a preset encoding rule.
8. The method of claim 1, further comprising:
determining parameters corresponding to the syntax elements;
wherein generating a corresponding intermediate code file according to the encoded syntax element comprises: generating the intermediate code file according to the encoded syntax element and the corresponding parameters.
9. A data processing method, comprising:
acquiring an intermediate code file;
acquiring encoded syntax elements from the intermediate code file;
decoding the encoded syntax elements according to their types to obtain corresponding syntax elements;
restoring a corresponding syntax tree according to the syntax elements;
and executing corresponding code according to the syntax tree.
10. The method of claim 9, wherein decoding the encoded syntax elements according to their types to obtain corresponding syntax elements comprises:
decoding the encoded syntax elements according to a preset decoding rule to obtain at least one of the following syntax elements: operation code, number, character string.
11. The method of claim 9, wherein restoring the corresponding syntax tree based on the syntax elements comprises:
calling a first interface according to the syntax elements to restore the corresponding syntax tree.
12. The method of claim 11, further comprising:
determining a corresponding lexical unit according to character string information and the syntax element.
13. The method of claim 12, wherein calling the first interface according to the syntax elements to restore the corresponding syntax tree comprises:
calling the first interface according to the platform corresponding to the execution end, parsing the lexical unit, and generating a corresponding syntax tree according to the syntax elements obtained by parsing.
14. The method of claim 13, wherein parsing the lexical unit comprises:
analyzing the statement corresponding to the lexical unit to acquire the syntax elements in the statement.
15. The method of claim 14, wherein generating the corresponding syntax tree according to the syntax elements obtained by parsing comprises:
generating a corresponding syntax tree according to the statement and the syntax elements obtained by parsing.
16. The method of claim 15, wherein generating a corresponding syntax tree according to the statement and the syntax elements obtained by parsing comprises at least one of:
for a declaration statement, parsing the syntax element into a function or a variable, and adding the function or the variable to the corresponding syntax tree;
for a compound statement, determining each syntax element in the compound statement, and adding the syntax element to the corresponding syntax tree;
and for an expression, determining each syntax element in the expression, and adding the syntax element to the corresponding syntax tree.
17. The method of claim 9, wherein executing the corresponding code according to the syntax tree comprises:
calling a second interface according to the platform corresponding to the execution end to compile the syntax tree into corresponding binary code;
and executing the binary code.
18. A data processing apparatus, comprising:
a parsing module, configured to parse source code to generate a syntax tree;
an encoding module, configured to obtain syntax elements from the syntax tree, encode the syntax elements, and generate a corresponding intermediate code file;
and a sending module, configured to send the intermediate code file;
wherein the encoding module comprises:
an element acquisition submodule, configured to traverse the syntax tree and acquire corresponding syntax elements;
an element encoding submodule, configured to analyze the type of the syntax element and encode the syntax element according to the type;
and a file generation submodule, configured to generate a corresponding intermediate code file according to the encoded syntax element.
19. The apparatus of claim 18, wherein the syntax element comprises at least one of: operation code, number, character string;
and the element encoding submodule is configured to encode the operation code, the number and the character string respectively according to a preset encoding rule.
20. A data processing apparatus, comprising:
a file acquisition module, configured to acquire an intermediate code file;
a decoding module, configured to decode the intermediate code file, determine corresponding syntax elements, and restore a corresponding syntax tree according to the syntax elements;
and an execution module, configured to execute corresponding code according to the syntax tree;
wherein the decoding module comprises:
a file decoding submodule, configured to acquire encoded syntax elements from the intermediate code file, and decode the encoded syntax elements according to their types to obtain corresponding syntax elements.
21. The apparatus of claim 20, wherein
the file decoding submodule is configured to decode the encoded syntax elements according to a preset decoding rule to obtain at least one of the following syntax elements: operation code, number, character string.
22. An apparatus, comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the method of any of claims 1-8.
23. One or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform the method of any of claims 1-8.
24. An apparatus, comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the method of any of claims 9-17.
25. One or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform the method of any of claims 9-17.
26. An operating system for a device, comprising:
a communication unit, configured to acquire an intermediate code file;
a decoding restoration unit, configured to acquire encoded syntax elements from the intermediate code file, decode the encoded syntax elements according to their types to obtain corresponding syntax elements, and restore a corresponding syntax tree according to the syntax elements;
and an execution unit, configured to execute the corresponding source code according to the syntax tree.
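
By way of non-limiting illustration only, the overall flow recited in the claims above (parse, traverse, encode by type, transmit, decode by type, restore the syntax tree, compile, execute) might be sketched in Python as follows. Everything specific here is an assumption of the sketch rather than the encoding of the present application: Python's own `ast` module stands in for the syntax tree, the built-in `compile()` and `exec()` stand in for the platform interfaces, and the tag values, the opcode table `OPCODES`, the byte layout, and the function names are hypothetical. Only integers, character strings, lists, and absent fields are covered.

```python
import ast
import struct

# Hypothetical opcode table: AST node class names in sorted order. Both ends
# must run the same Python version for the numbering to agree.
OPCODES = sorted(
    name for name, cls in vars(ast).items()
    if isinstance(cls, type) and issubclass(cls, ast.AST)
)
OPCODE_OF = {name: index for index, name in enumerate(OPCODES)}

# Hypothetical type tags written in front of every encoded element.
T_NODE, T_NUMBER, T_STRING, T_LIST, T_NONE = range(5)


def encode(value) -> bytes:
    """Encode one syntax element (and its children) according to its type."""
    if isinstance(value, ast.AST):  # "first type": becomes an operation code
        out = struct.pack(">BH", T_NODE, OPCODE_OF[type(value).__name__])
        return out + b"".join(encode(getattr(value, f, None)) for f in value._fields)
    if isinstance(value, int) and not isinstance(value, bool):  # number
        return struct.pack(">Bq", T_NUMBER, value)
    if isinstance(value, str):  # character string
        raw = value.encode("utf-8")
        return struct.pack(">BI", T_STRING, len(raw)) + raw
    if isinstance(value, list):  # child list, e.g. a statement body
        return struct.pack(">BI", T_LIST, len(value)) + b"".join(map(encode, value))
    if value is None:  # absent optional field
        return struct.pack(">B", T_NONE)
    raise TypeError(f"element type not covered by this sketch: {type(value).__name__}")


def decode(buf: bytes, pos: int = 0):
    """Decode one element starting at pos; return (element, next position)."""
    tag = buf[pos]
    pos += 1
    if tag == T_NODE:
        (opcode,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        node_cls = getattr(ast, OPCODES[opcode])
        fields = {}
        for name in node_cls._fields:  # same field order as on the encoding side
            fields[name], pos = decode(buf, pos)
        return node_cls(**fields), pos
    if tag == T_NUMBER:
        (number,) = struct.unpack_from(">q", buf, pos)
        return number, pos + 8
    if tag == T_STRING:
        (size,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        return buf[pos:pos + size].decode("utf-8"), pos + size
    if tag == T_LIST:
        (count,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        items = []
        for _ in range(count):
            item, pos = decode(buf, pos)
            items.append(item)
        return items, pos
    if tag == T_NONE:
        return None, pos
    raise ValueError(f"unknown tag {tag}")


def run_intermediate(blob: bytes) -> dict:
    """Restore the syntax tree from the intermediate bytes, compile it, and run it."""
    tree, _ = decode(blob)
    ast.fix_missing_locations(tree)  # line/column information is not transmitted
    code = compile(tree, "<intermediate>", "exec")
    scope = {}
    exec(code, scope)
    return scope


source = "x = 6 * 7\nmsg = 'answer: ' + str(x)\n"
blob = encode(ast.parse(source))   # sending side: syntax tree to intermediate bytes
result = run_intermediate(blob)    # execution side: bytes to syntax tree to execution
print(result["x"], result["msg"])
```

In this sketch the execution side receives only the encoded bytes and never re-parses the original source text, which is the point of restoring the syntax tree directly from the decoded elements.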
CN201710571375.7A 2017-07-13 2017-07-13 Data processing method, device, equipment and storage medium Active CN109255209B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710571375.7A CN109255209B (en) 2017-07-13 2017-07-13 Data processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109255209A (en) 2019-01-22
CN109255209B (en) 2022-05-17

Family

ID=65050636

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710571375.7A Active CN109255209B (en) 2017-07-13 2017-07-13 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109255209B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110018829B (en) * 2019-04-01 2022-11-11 北京东方国信科技股份有限公司 Method and device for improving execution efficiency of PL/SQL language interpreter
CN110096264A (en) * 2019-04-29 2019-08-06 珠海豹好玩科技有限公司 A kind of code operation method and device
CN110309629B (en) * 2019-06-18 2023-10-10 创新先进技术有限公司 Webpage code reinforcement method, device and equipment
CN110457869B (en) * 2019-07-23 2022-03-22 Oppo广东移动通信有限公司 Program compiling and encrypting method and device, storage medium and electronic equipment
CN110647360B (en) * 2019-08-20 2022-05-03 百度在线网络技术(北京)有限公司 Method, device and equipment for processing device execution code of coprocessor and computer readable storage medium
CN111209004B (en) * 2019-12-30 2023-09-01 北京水滴科技集团有限公司 Code conversion method and device
CN111240772B (en) * 2020-01-22 2024-06-18 腾讯科技(深圳)有限公司 Block chain-based data processing method, device and storage medium
CN112069788A (en) * 2020-09-10 2020-12-11 杭州安恒信息技术股份有限公司 Method, device and equipment for analyzing yaml file and storage medium
CN113312880B (en) * 2021-04-02 2024-01-26 飞诺门阵(北京)科技有限公司 Text form conversion method and device and electronic equipment
CN113703779B (en) * 2021-09-06 2024-04-16 王喆 Cross-platform multi-language compiling method and ultra-light Internet of things virtual machine
CN116522295B (en) * 2023-04-26 2024-09-10 北京青萌数海科技有限公司 Method and device for protecting R language source code

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0327058A2 (en) * 1988-02-02 1989-08-09 Nec Corporation Protocol data unit encoding/decoding system
EP0567137A1 (en) * 1992-04-23 1993-10-27 Nec Corporation Protocol data unit encoding/decoding device
US5418963A (en) * 1992-04-23 1995-05-23 Nec Corporation Protocol encoding/decoding device capable of easily inputting/referring to a desired data value
CN102622448A (en) * 2012-03-26 2012-08-01 中山大学 Digital television interactive application page markup language resolving method
CN103677952A (en) * 2013-12-18 2014-03-26 华为技术有限公司 Coder decoder generating device and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Abstract Syntax Tree 抽象语法树简介";Whilefor;《https://zhuanlan.zhihu.com/p/26988179》;20170519;全文 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant