CN118170386A - Term compiling method, term compiling system, storage medium and electronic device - Google Patents

Term compiling method, term compiling system, storage medium and electronic device

Info

Publication number
CN118170386A
Authority
CN
China
Prior art keywords
term
tree
grammar
sentence
analyzed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410304744.6A
Other languages
Chinese (zh)
Inventor
杨清广
李宇哲
王春华
李日璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Youteyun Technology Co ltd
Original Assignee
Guangdong Youteyun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Youteyun Technology Co ltd filed Critical Guangdong Youteyun Technology Co ltd
Priority to CN202410304744.6A priority Critical patent/CN118170386A/en
Publication of CN118170386A publication Critical patent/CN118170386A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

The application provides a term compiling method comprising the following steps: acquiring a sentence to be analyzed; splitting the sentence to be analyzed into words according to the keyword and the word-segmentation terms defined in the glossary, to obtain a word list containing the words; constructing an abstract syntax tree structure containing conditions and results based on the word list; performing grammar analysis on the abstract syntax tree to obtain a grammar tree; performing semantic analysis on the grammar tree to obtain an updated grammar tree; and obtaining the standard statement of the sentence to be analyzed according to the updated grammar tree. The application constructs the abstract syntax tree structure using an algorithm based on predefined term priorities; if the construction succeeds, the grammar check of the general domain language is complete. The terms in the application can be dynamically configured and defined, so that sentences in the general domain language can be conveniently replaced and logic rules can be combined arbitrarily, and the method can adapt to changing application scenarios. The application also provides a term compiling system, a storage medium and an electronic device, which have the same beneficial effects.

Description

Term compiling method, term compiling system, storage medium and electronic device
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a term compiling method, a term compiling system, a storage medium, and an electronic device.
Background
In currently known approaches, the business processes that users describe are all implemented by writing code. If a business scenario changes, code must be rewritten to handle the new scenario, so these approaches cannot adapt quickly to changing scenario requirements.
Disclosure of Invention
The application aims to provide a term compiling method, a term compiling system, a storage medium and electronic equipment, which can adapt to complex and changeable application scenes.
To solve the above technical problem, the application provides a term compiling method with the following technical scheme:
acquiring a sentence to be analyzed;
splitting the sentence to be analyzed into words according to the keyword and the word-segmentation terms defined in the glossary, to obtain a word list containing the words;
constructing an abstract syntax tree structure containing conditions and results based on the word list;
performing grammar analysis on the abstract syntax tree to obtain a grammar tree;
performing semantic analysis on the grammar tree to obtain an updated grammar tree;
and obtaining the standard statement of the sentence to be analyzed according to the updated grammar tree.
Optionally, splitting the sentence to be analyzed into a word list according to the keyword and the word-segmentation terms defined in the glossary includes:
when an operator is detected in the sentence to be analyzed, identifying the not-yet-split character string preceding the operator as a word and adding it to the word list;
retaining the blank characters in the sentence to be analyzed, and saving the blank-content words corresponding to the blank characters to a container;
and merging complete character string content in the sentence to be analyzed into a single character string in the word list.
Optionally, constructing an abstract syntax tree structure containing conditions and results based on the word list includes:
determining the term priority of each word in the word list;
and analyzing the word list according to the term priority, using recursive descent analysis and/or operator priority analysis, to obtain an abstract syntax tree.
Optionally, performing grammar analysis on the abstract syntax tree to obtain a grammar tree includes:
parsing the conditional subtree in the abstract syntax tree;
parsing the result subtree in the abstract syntax tree;
and merging the conditional subtree and the result subtree to obtain the grammar tree.
Optionally, performing semantic analysis on the grammar tree to obtain an updated grammar tree includes:
initializing a lua term converter;
and traversing the grammar tree in post-order, calling the lua term converter to perform semantic analysis, to obtain the updated grammar tree.
Optionally, traversing the grammar tree in post-order and calling the lua term converter to perform semantic analysis to obtain the updated grammar tree further includes:
performing semantic analysis of newly added statements on the json character string returned by the lua term converter;
the semantic analysis of newly added statements comprises the following steps:
determining the newly added abstract syntax tree corresponding to each newly added statement;
performing a post-order traversal of the newly added abstract syntax tree;
judging whether the conversion result is a structured json character string;
if yes, parsing the structured json character string, extracting the statements carrying the newly-added mark, splitting those statements, and adding them to the grammar tree;
and if not, executing the step of calling the lua term converter to perform semantic analysis to obtain the updated grammar tree.
Optionally, if the grammar tree includes user-defined user data table data, performing semantic analysis on the grammar tree to obtain an updated grammar tree further includes:
applying a term replacement rule to the user data table data, and replacing the sentence with a standard statement;
performing word-segmentation term matching on the standard statement;
splitting the matching rule of the matched term to obtain a character string array;
and calling the lua script, according to the replacement rule, to perform data replacement on the character string array.
The application also provides a term compiling system comprising:
a sentence acquisition module for acquiring a sentence to be analyzed;
a word splitting module for splitting the sentence to be analyzed into words according to the keyword and the word-segmentation terms defined in the glossary, to obtain a word list containing the words;
a grammar tree construction module for constructing an abstract syntax tree structure containing conditions and results based on the word list;
a grammar analysis module for performing grammar analysis on the abstract syntax tree to obtain a grammar tree;
a semantic analysis module for performing semantic analysis on the grammar tree to obtain an updated grammar tree;
and a compiling module for obtaining the standard statement of the sentence to be analyzed according to the updated grammar tree.
The application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of the method as described above.
The application also provides an electronic device comprising a memory in which a computer program is stored and a processor which when calling the computer program in the memory implements the steps of the method as described above.
The application provides a term compiling method comprising the following steps: acquiring a sentence to be analyzed; splitting the sentence to be analyzed into words according to the keyword and the word-segmentation terms defined in the glossary, to obtain a word list containing the words; constructing an abstract syntax tree structure containing conditions and results based on the word list; performing grammar analysis on the abstract syntax tree to obtain a grammar tree; performing semantic analysis on the grammar tree to obtain an updated grammar tree; and obtaining the standard statement of the sentence to be analyzed according to the updated grammar tree.
Using an algorithm based on predefined term priorities, the application receives the sentence to be analyzed, splits it, and constructs an abstract syntax tree structure; if the construction succeeds, the grammar check of the general domain language is complete. The terms in the application can also be dynamically configured and defined, so that sentences in the general domain language can be conveniently replaced and logic rules can be combined arbitrarily. The method can therefore adapt to changing application scenarios, describe changing business processes, and finally compile those business processes into executable files.
The application also provides a term compiling system, a storage medium and an electronic device, which have the same beneficial effects and are not described again here.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for compiling terms according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a construction process of an abstract syntax tree structure according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a semantic analysis process of a delay term according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an analysis process of a delay term according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a grammar tree corresponding to a domain statement provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of a compiling system according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
A term definition consists of several parts: the first part is the term name, the second is the description of the term, the third is the matching rule of the term, the fourth is the conversion rule of the term, and the fifth is the priority and the type of the term.
The description of a term currently uses 3 modifiers: variable name, value, and statement. The description is defined with these 3 modifiers. A variable name must be globally unique; a value may be a positive or negative number, a floating point number, or a character string; and a statement is a sentence that can be split again by terms.
The matching rule of a term is used to locate the term correctly and precisely within a sentence and to segment the sentence into parts that conform to the term description. Matching rules are based on regular expressions, so a matching rule is defined in the same way as a regular expression.
A term also has a conversion rule, which specifies the result that the matched text is finally converted into. The rule definitions are shown in table 1 below, which is the conversion rule table of terms:
Table 1 conversion rule table of terms
Taking refrigeration as an example, the result statement "restaurant air-conditioning cooling" is replaced with "restaurant air-conditioning=0".
Here the term is "refrigeration", its description is "refrigeration", its matching rule is "refrigeration", and its conversion rule is flag, sz 1, 0.
In addition, a term replacement rule may call functions. The functions available to replacement rules at present are shown in table 2 below, which is the term rule function table:
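To make the five-part term definition concrete, the following is a minimal Python sketch. The class name `TermDefinition`, its field names, the `"{var}={val}"` conversion format and the sample regular expression are all assumptions introduced for illustration; the application does not specify how a definition is stored, and the conversion rule format of table 1 is not reproduced here.

```python
import re
from dataclasses import dataclass


@dataclass
class TermDefinition:
    """Hypothetical container for the five parts of a term definition."""
    name: str         # part 1: term name
    description: str  # part 2: description (variable name / value / statement)
    pattern: str      # part 3: matching rule, written as a regular expression
    conversion: str   # part 4: conversion rule (format assumed for this sketch)
    priority: int     # part 5a: term priority
    kind: str         # part 5b: term type (labels assumed for this sketch)

    def match(self, sentence):
        """Locate the term in a sentence using its regular-expression rule."""
        return re.search(self.pattern, sentence)

    def convert(self, sentence):
        """Return the converted text, or None when the term does not match."""
        m = self.match(sentence)
        return self.conversion.format(**m.groupdict()) if m else None


# An illustrative "equal to" term; the regex and priority are assumptions.
equal_to = TermDefinition(
    name="equal to",
    description="<variable name> is equal to <value>",
    pattern=r"(?P<var>\w+) is equal to (?P<val>-?\d+(?:\.\d+)?)",
    conversion="{var}={val}",
    priority=4,
    kind="matching",
)

converted = equal_to.convert("B is equal to 3")
```

Keeping the matching rule as an ordinary regular expression mirrors the statement above that matching rules are defined consistently with regular expressions.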
TABLE 2 term rule function table
Given the term definition above, a question arises as to which matching rule is matched first. For example, if both [cancel normally open] and [normally open] can be matched in one sentence, which of them is matched first is determined by setting the term priority. Terms are currently matched in order of priority.
Referring to table 3, table 3 is a schematic table of term priorities provided by an embodiment of the present application:
Table 3 term priority schematic table
Term              Priority
or                1
and               2
at the same time  3
equal to          4
higher than       5
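The effect of priority-ordered matching can be sketched in a few lines of Python. The term tuples, the priority values and the convention that a smaller number is tried first are assumptions for this illustration; the application only states that terms are matched in order of priority.

```python
import re

# Hypothetical terms as (name, regular expression, priority).
# Assumption: a smaller priority number is tried first.
TERMS = [
    ("normally open", r"normally open", 2),
    ("cancel normally open", r"cancel normally open", 1),
]


def first_match(sentence, terms=TERMS):
    """Return the name of the first term, in priority order, that matches."""
    for name, pattern, _priority in sorted(terms, key=lambda t: t[2]):
        if re.search(pattern, sentence):
            return name
    return None
```

With these values, a sentence containing "cancel normally open" resolves to the longer term even though the shorter pattern would also match.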
Referring to FIG. 1, FIG. 1 is a flowchart of a term compiling method according to an embodiment of the application. The method includes:
S101: acquiring a sentence to be analyzed;
S102: splitting the sentence to be analyzed into words according to the keyword and the word-segmentation terms defined in the glossary, to obtain a word list containing the words;
S103: constructing an abstract syntax tree structure containing conditions and results based on the word list;
S104: performing grammar analysis on the abstract syntax tree to obtain a grammar tree;
S105: performing semantic analysis on the grammar tree to obtain an updated grammar tree;
S106: obtaining the standard statement of the sentence to be analyzed according to the updated grammar tree.
The sentence to be analyzed generally refers to a configuration statement input by a user that can be used to execute business scenario processing. The standard statement finally obtained serves as an executable file for the electronic device; for example, the standard statement may be written in a programming language such as Java. How the sentence to be analyzed is obtained is not limited here: it may be input by the user through an input device such as a keyboard or a touch screen, or obtained through speech recognition or text recognition. In other words, a business configuration statement in any business scenario can serve as the sentence to be analyzed in the embodiments of the application.
After the sentences to be analyzed are obtained, the lexical analyzer takes each sentence as input and splits it into a word list according to the @ keyword and the word-segmentation terms defined in the glossary. The word-segmentation terms include, but are not limited to, "or", "and", "at the same time", the left bracket and the right bracket. Step S102 is essentially a lexical analysis process, and in one possible embodiment a lexical analyzer may be applied directly. Specifically, the sentence to be analyzed is segmented first. To simplify the algorithm, the lexical analyzer mainly segments on @, "or", "and", "at the same time", the left bracket and the right bracket; when an operator is encountered, the preceding character string that has not yet been split is identified as a word and added to the word list. After segmentation, the blank characters in the sentence to be analyzed are retained, and the blank-content words corresponding to them are stored in a container. Blank-content words are stored separately so that the sentence can be restored in later processing and saved to the token, and so that the sentence can be highlighted; this avoids the problem of a sentence that cannot be restored because its blank content was deleted. In addition, complete character string content in the sentence to be analyzed is merged into a single character string in the word list. A complete string may be split into several words during segmentation, and these words need to be merged back into one string, for example for content enclosed in quotation marks or brackets.
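The segmentation behaviour described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the application's lexical analyzer: the function name `tokenize`, the English spellings of the operators, the use of double quotes to mark a complete string, and the `(index, text)` layout of the blank container are all hypothetical.

```python
import re

# Quoted strings are tried first so that they stay in one piece;
# word operators need word boundaries so they do not match inside other words.
TOKEN_RE = re.compile(
    r'"[^"]*"'                           # a complete quoted string
    r"|\b(?:at the same time|and|or)\b"  # word operators
    r"|[@()]"                            # symbol operators
)


def tokenize(sentence):
    """Split a sentence into a word list plus a container of blank content."""
    words = []    # the word list
    blanks = []   # (index into words where the blank occurred, blank text)
    pending = ""  # text seen since the last operator, not yet split

    def flush():
        nonlocal pending
        stripped = pending.strip()
        lead = pending[: len(pending) - len(pending.lstrip())]
        trail = pending[len(pending.rstrip()):] if stripped else ""
        if lead:
            blanks.append((len(words), lead))
        if stripped:
            words.append(stripped)
        if trail:
            blanks.append((len(words), trail))
        pending = ""

    pos = 0
    for m in TOKEN_RE.finditer(sentence):
        pending += sentence[pos:m.start()]
        if m.group().startswith('"'):
            pending += m.group()   # keep the complete string merged
        else:
            flush()                # the unsplit text before the operator
            words.append(m.group())
        pos = m.end()
    pending += sentence[pos:]
    flush()
    return words, blanks


words, blanks = tokenize("A is greater than 1 and B is less than 2@B is equal to 3")
```

Storing the blank runs with their position in the word list is one simple way to make the original sentence restorable, which is the purpose the text gives for keeping them.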
After the word list is obtained, an abstract syntax tree structure containing conditions and results is constructed. A statement written by the user contains a condition and a result in the statement table, and the statement table allows the statement to be spelled as condition@result. The syntax tree of the statement consists of a conditional subtree, an @ root node and a result subtree. Both the conditional subtree and the result subtree are segmented and built into a grammar tree according to the term definition priority table; that is, the word-segmentation terms in the term definition priority table cut each element into tokens.
For example, take "A is greater than 1 and B is less than 2@B is equal to 3". According to the term priority, "A is greater than 1 and B is less than 2" is first segmented by the word-segmentation terms into 3 tokens, delimited here by "[ ]": [A is greater than 1] [and] [B is less than 2]. A conditional subtree is then built from these 3 tokens, and the result subtree is built in the same way. During semantic analysis, the tokens [A is greater than 1] and [B is less than 2] are matched against the matching-class terms in the term definition priority table and their data is replaced. The abstract syntax tree structure of conditions and results is constructed around the @ keyword. An abstract syntax tree (AST) is a tree representation of the abstract syntactic structure of source code: a structured representation that exposes the syntactic structure of the code as a tree, in which each node corresponds to a construct in the source code, such as an expression, a statement or a variable declaration.
To obtain the abstract syntax tree structure, two techniques can be applied together: recursive descent analysis by term priority and operator priority analysis. These are two different parsing techniques, and both are used in compilers and interpreters.
Recursive descent analysis is a top-down parsing method that starts from the root node and derives the expression step by step according to the grammar rules. It typically uses a stack to store intermediate results and backtracks to the previous state when an error is encountered. Its advantage is that it can handle arbitrarily complex syntax structures; its disadvantage is that it may require a large number of backtracking operations, which reduces parsing efficiency.
Operator priority analysis is a bottom-up parsing method that first determines the priorities of the operators in an expression and then evaluates the expression in order of priority from low to high. It is typically implemented with two stacks, one for operands and one for operators. Its advantage is higher parsing efficiency; its disadvantage is that it may be unable to handle some complex syntax structures.
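The two-stack form of operator priority analysis can be illustrated with a short Python sketch that builds a tree of tuples instead of computing a value. The `PRIORITY` table and the convention that a larger number binds more tightly are assumptions for this example, and the sketch expects well-formed input (balanced brackets, operators between operands).

```python
# Assumption: a larger number binds more tightly, so "and" groups before "or".
PRIORITY = {"or": 1, "and": 2}


def build_tree(tokens):
    """Build a nested (operator, left, right) tuple using two stacks."""
    operands, operators = [], []

    def reduce():
        # Pop one operator and its two operands, push the combined subtree.
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append((op, left, right))

    for tok in tokens:
        if tok == "(":
            operators.append(tok)
        elif tok == ")":
            while operators[-1] != "(":
                reduce()
            operators.pop()  # discard the matching "("
        elif tok in PRIORITY:
            # Reduce everything on the stack that binds at least as tightly.
            while (operators and operators[-1] != "("
                   and PRIORITY[operators[-1]] >= PRIORITY[tok]):
                reduce()
            operators.append(tok)
        else:
            operands.append(tok)

    while operators:
        reduce()
    return operands[0]
```

For the token list `["A", "or", "B", "and", "C"]` this yields `("or", "A", ("and", "B", "C"))`, so the tighter-binding "and" ends up beneath "or".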
First, lexical analysis is performed. The overall idea is to analyze the expression first and judge whether it is a bracketed expression; for a bracketed expression, a bracket attribute is added to the node. For an expression without brackets, the tree is built by operator priority, that is, according to the priority of the word-segmentation terms, with lower-priority terms placed below higher-priority ones. Specifically, the conditional expression is parsed to generate a conditional subtree, and the root node of the conditional subtree is returned. The result expression is then parsed to generate a result subtree, and the root node of the result subtree is returned. Finally, the grammar tree is combined to complete the lexical analysis: under the root node @, the root node of the conditional subtree is added on the left and the root node of the result subtree is added on the right.
The lexical analysis process is described below using the conditional subtree as an example. The result subtree is analyzed in the same way, so its process can be understood by reference to that of the conditional subtree.
The lexical analysis process mainly comprises the following two steps:
First step: expression analysis, CSyntaxTreeNode* LHS = PARSEPRIMARY();
Second step: building the grammar tree according to operator priority, ParseOpreatorRHS(expPriority, LHS).
Here expPriority is the incoming priority and LHS is the left subtree.
ParseOpreatorRHS(expPriority, LHS) represents the parsing process that generates the left and right subtrees. By default it starts with the highest priority, 0. The procedure calls step A6 below to obtain the right subtree RHS, and then combines the left subtree LHS and the right subtree RHS to obtain the conditional subtree or the result subtree.
The first step, expression analysis, proceeds as follows. Judge whether the current token is a word-segmentation term. If it is, judge whether it is "(". If it is "(", parse the conditional expression to generate a conditional subtree and return its root node, add "(" to that root node, and read the next token; if the next token is ")", add ")" to the root node of the conditional subtree, and if it is not ")", report an error. If the current token is not a word-segmentation term, add the current token as a new node.
The second step, building the grammar tree according to operator priority, comprises the following steps:
A1: judging whether the current token is the @ key character; if not, entering A2; if yes, entering A3;
A2: judging whether the current token is a word-segmentation term; if not, entering A3; if yes, entering A4;
A3: returning the incoming node LHS;
A4: acquiring the priority nCurrPriority of the current word-segmentation term;
A5: judging whether the priority nCurrPriority of the current term is less than the incoming priority expPriority; if yes, executing A3; if not, entering A6;
A6: performing expression analysis, CSyntaxTreeNode* RHS = PARSEPRIMARY();
A7: reading nextToken, the token that follows the analyzed expression;
A8: judging whether nextToken is a word-segmentation term; if yes, entering A9; if not, entering A12;
A9: acquiring the priority nNextPriority of nextToken;
A10: judging whether the current priority nCurrPriority is less than the priority nNextPriority of the next token; if yes, entering A11; if not, entering A12;
Note: the lower-order word-segmentation term is placed under the token with high priority to form a subtree.
A11: building the grammar tree according to operator priority, ParseOpreatorRHS(nCurrPriority + 1, LHS);
A12: assembling the grammar subtrees (LHS and RHS). After A12, the process may return to A1 and loop, combining the grammar subtrees according to priority, and finally return a conditional subtree.
The A1-A12 process above is the PARSEPRIMARY algorithm process. RHS refers to the right subtree and LHS to the left subtree, and the PARSEPRIMARY algorithm eventually combines them into a conditional subtree. The PARSEPRIMARY algorithm can likewise be applied to obtain a result subtree, and finally the conditional subtree and the result subtree are combined into an AST through @.
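The two steps and the A1-A12 loop can be sketched in Python as a precedence-climbing parser. This is an illustration only: the `Node` and `Parser` classes, the method names and the `PRIORITY` table are hypothetical, a larger number is assumed to bind more tightly, and the recursion in the A11 position follows the conventional form of the technique by recursing on the right operand.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: a larger number binds more tightly and therefore sits lower.
PRIORITY = {"or": 1, "and": 2}


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    bracketed: bool = False  # brackets are an attribute, not separate nodes


class Parser:
    def __init__(self, tokens: List[str]):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_primary(self) -> Node:
        """First step: expression analysis, including bracketed expressions."""
        tok = self.next()
        if tok == "(":
            node = self.parse_expression()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            node.bracketed = True
            return node
        if tok is None or tok in PRIORITY or tok in ("@", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return Node(tok)

    def parse_operator_rhs(self, min_priority: int, lhs: Node) -> Node:
        """Second step: build the tree by operator priority."""
        while True:
            op = self.peek()
            # A1-A5: stop at '@', at a non-operator, or at a lower priority.
            if op not in PRIORITY or PRIORITY[op] < min_priority:
                return lhs
            self.next()
            rhs = self.parse_primary()                                # A6
            nxt = self.peek()                                         # A7
            if nxt in PRIORITY and PRIORITY[op] < PRIORITY[nxt]:      # A8-A10
                rhs = self.parse_operator_rhs(PRIORITY[op] + 1, rhs)  # A11
            lhs = Node(op, lhs, rhs)                                  # A12

    def parse_expression(self) -> Node:
        return self.parse_operator_rhs(0, self.parse_primary())

    def parse_statement(self) -> Node:
        """Combine the conditional and result subtrees under the @ root."""
        condition = self.parse_expression()
        if self.next() != "@":
            raise SyntaxError("expected '@'")
        result = self.parse_expression()
        if self.peek() is not None:
            raise SyntaxError("unexpected trailing tokens")
        return Node("@", condition, result)


tree = Parser(["A", "or", "B", "and", "C", "@", "D"]).parse_statement()
```

In the resulting tree the root is "@", its left child is the "or" node with the "and" node beneath it, and its right child is the result node "D". A successful parse doubles as the grammar check mentioned earlier, since malformed input raises `SyntaxError`.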
Referring to FIG. 2, FIG. 2 is a schematic diagram of a parsing process according to an embodiment of the present application. Note that the left bracket and the right bracket are not represented as separate nodes but as an attribute within a node.
After the abstract syntax tree is constructed, syntax analysis and semantic analysis can be performed successively.
For grammar analysis, the conditional subtree and the result subtree in the abstract syntax tree are parsed separately, and their parsing results are merged to obtain the grammar tree. Specifically, basic operands and binary operations need to be parsed, and this applies to both conditional expressions and result expressions. Here the basic-operand step only handles (); no other operand symbols are involved. Parsing binary operations and parsing basic operands can be merged into a single process, called building the grammar tree. Because the binary symbols "+", "-", "/" and "×" and the basic symbols are all replaced by terms, parsing binary operations and basic operands amounts to building a grammar tree according to the priorities of the word-segmentation terms, and the process is nested because it builds the grammar tree by recursive descent. Each token must also be checked for (), so the binary operation process includes the check for basic operands. The priorities of two word-segmentation terms are obtained in sequence because the process recurses continuously by priority: the priority of the current word-segmentation term is compared with that of the next one to appear, so that the higher-priority node sits above the lower-priority node. For example, if the priority of "or" is 0 and the priority of "and" is 1, then "and" sits under "or" in the grammar tree.
After grammar analysis, semantic analysis is performed. In the semantic analysis process, the lua term converter may be applied. The process specifically comprises the following steps:
a first step of initializing the lua term converter;
a second step of traversing the grammar tree in post-order and calling the lua term converter to perform semantic analysis, to obtain an updated grammar tree.
During the post-order traversal of the grammar tree, the context data position and the node type are passed in, after which the lua term converter can be applied directly to convert terms into standard statements.
After the lua term converter is initialized, the first "?" is located, because the condition tree is divided by the question mark: the left side is the trigger condition and the right side is the trigger result.
It should be noted that the lua term converter is a term converter written in the lua language, and it returns a structured json character string after converting a term. The general semantic analysis flow does not give the delay term any special treatment; instead, the returned structured json string is used to determine whether the term content requires a newly added statement. Before the lua term converter is initialized, the term definition file and the file path of the lua script that the converter corresponds to need to be determined, and custom data passed by the user to the lua script may also be included. Initializing the lua term converter mainly consists of initializing the user-defined data and the term definition table.
When the lua term converter is applied, the term matching interface is called to match the matching-class terms, the functions in the lua script are applied to the matched data to convert the term, and the statement conversion result is returned. After that, the business data can be updated. The result of the lua script conversion is a standard statement; for example, the token [A equals 1] is converted to A=1 by calling the lua term converter, and the result is returned to the token. Finally, the original grammar tree is traversed in-order and the sentence is assembled to obtain the standard statement. If the result were not returned to the token for saving, the token would still be [A equals 1] and could not be converted into the final standard statement.
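The post-order conversion pass followed by in-order assembly can be sketched in Python. A small table of regular expressions stands in for the lua term converter here; the `Node` class, the `RULES` table, the function names and the exact spelling of the output are assumptions made for the illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    token: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


# Hypothetical stand-in for the lua term converter: (matching rule, replacement).
RULES = [
    (re.compile(r"^(\w+) is equal to (\S+)$"), r"\1=\2"),
    (re.compile(r"^(\w+) is greater than (\S+)$"), r"\1>\2"),
    (re.compile(r"^(\w+) is less than (\S+)$"), r"\1<\2"),
]


def convert(token: str) -> str:
    """Convert one token; tokens without a matching rule are left unchanged."""
    for pattern, replacement in RULES:
        if pattern.match(token):
            return pattern.sub(replacement, token)
    return token


def semantic_pass(node: Optional[Node]) -> None:
    """Post-order traversal: children first, then the node itself."""
    if node is None:
        return
    semantic_pass(node.left)
    semantic_pass(node.right)
    node.token = convert(node.token)  # the result is written back to the token


def assemble(node: Optional[Node]) -> str:
    """In-order traversal that joins the converted tokens into one statement."""
    if node is None:
        return ""
    parts = [assemble(node.left), node.token, assemble(node.right)]
    return " ".join(p for p in parts if p)


tree = Node(
    "@",
    Node("and", Node("A is greater than 1"), Node("B is less than 2")),
    Node("B is equal to 3"),
)
semantic_pass(tree)
statement = assemble(tree)
```

Writing the converted text back into each node is what lets the final in-order pass produce the standard statement, which matches the reason given above for returning the result to the token.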
It should be noted that the lua script used by the lua term converter must define a global variable g_errorStr to hold error information. If a specific business logic error occurs in the script, the error information can be passed on by assigning a value to g_errorStr.
Referring to FIG. 3, FIG. 3 is a schematic diagram of a semantic analysis process of a delay term according to an embodiment of the present application. In FIG. 3, the delay term before replacement is "delay operation 0 is on after 30 seconds before the living room", and the following steps are performed in sequence:
1. Parse the structured json character string in the semantic analysis of the newly added statement.
2. Extract the statement whose type is 1 and perform @ node analysis on the character string.
3. For DelayVar==0@parlor front spotlight=3, create a new AST syntax tree and add it to the AST syntax tree queue.
The delay term is replaced by "[{\"SrcValue\":\"Delay(0,30)\",\"type\":0},{\"SrcValue\":\"delayVar=0@living room front lamp=1\",\"type\":1}]".
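Handling of the structured json string can be sketched in Python. The function name `split_converter_result` and the reading that entries with type 0 carry the in-place replacement are assumptions; the text above only states that statements with type 1 are extracted as newly added statements.

```python
import json
from collections import deque


def split_converter_result(result: str):
    """Separate replacement text from newly added statements.

    Returns (replacement_text, new_statements). A result that is not a
    structured json list is treated as an ordinary replacement.
    """
    try:
        items = json.loads(result)
    except ValueError:
        return result, []
    if not (isinstance(items, list) and all(isinstance(i, dict) for i in items)):
        return result, []
    # Assumption: type 0 marks the in-place replacement text.
    replaced = "".join(i.get("SrcValue", "") for i in items if i.get("type") == 0)
    # Per the steps above: type 1 marks a newly added statement.
    added = [i.get("SrcValue", "") for i in items if i.get("type") == 1]
    return replaced, added


raw = ('[{"SrcValue":"Delay(0,30)","type":0},'
       '{"SrcValue":"delayVar=0@living room front lamp=1","type":1}]')
replaced, added = split_converter_result(raw)

# Each new statement is split at '@' and queued for its own tree construction.
ast_queue = deque(tuple(s.split("@", 1)) for s in added)
```

Falling back to the raw string when the result is not json keeps plain conversions such as `A=1` on the ordinary replacement path.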
A term is parsed as follows: each token morpheme calls the lua script for execution, and the lua script applies the replacement rule defined by the term to produce the result.
The present embodiment supports two input modes: script path and character stream. When a character stream is used as input, it cannot be plaintext; it must be base64-encoded so that the data is transferred intact.
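The two input modes can be sketched as below. This is only an illustration in Python: the function name `load_script_source`, the `is_stream` flag and the choice of UTF-8 are assumptions, since the embodiment does not specify them.

```python
import base64
import binascii


def load_script_source(value: str, is_stream: bool) -> str:
    """Return the script text from a file path or a base64 character stream."""
    if not is_stream:
        # Script path mode: read the script from disk.
        with open(value, encoding="utf-8") as handle:
            return handle.read()
    try:
        # Character stream mode: plaintext is rejected, base64 is decoded.
        return base64.b64decode(value, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError) as exc:
        raise ValueError("character stream must be base64-encoded text") from exc


encoded = base64.b64encode("function luaGlossary() end".encode("utf-8")).decode("ascii")
print(load_script_source(encoded, is_stream=True))  # function luaGlossary() end
```

Passing `validate=True` makes the decoder reject characters outside the base64 alphabet instead of silently skipping them, which is how the sketch refuses a plaintext stream.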
The return value of the lua script function, that is, the conversion result of the lua term converter, may be either an ordinary Lua data type or a json-format string; both are supported. The json string mainly addresses the case where, during term conversion, a term requires newly added statements rather than only a replacement of the term itself.
The Lua script can be loaded either from a file through loadfile or by passing the script content into memory for execution. The name of the main Lua function is fixed as luaGlossary, which is the entry function executed for the passed-in lua script. The present embodiment executes, in the lua script, the corresponding strings passed in through term matching, thereby replacing statements according to the rules.
A json formatted data stream. The format is as follows:
The above data stream contains the following table 4 contents:
Table 4 Contents of the json data stream
In addition, the statement to be analyzed may also involve user-defined user data tables. In this case, the user data table data is determined first, and the statement is then analyzed according to the input term rule data and the user data table data. The user data table data may include time period table data, scene table data, region table data, constant table data, variable table data, device table data, adaptive scene table data, and the like.
Term matching and replacement takes a conditional statement or a result statement, together with the user data table data, as input, and replaces the statement with a standard statement according to the term replacement rules. The term matching and replacement process is as follows:
1) Perform word-segmentation-class term matching on the statement in sequence;
2) Split each matched term into a string array according to its matching rule;
3) Call the lua script to replace the data according to the replacement rule.
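The three steps above can be sketched as follows. This is a simplified Python illustration, not the embodiment's rule format: the `RULES` table, the regular expressions used as matching rules and the callbacks that stand in for functions in the lua script are all hypothetical.

```python
import re
from typing import Callable, List, Pattern, Tuple

# Hypothetical term definitions: a matching rule (regex) and a replacement
# callback that stands in for the function invoked in the lua script.
Rule = Tuple[Pattern[str], Callable[[List[str]], str]]

RULES: List[Rule] = [
    (re.compile(r"\s+(and)\s+"), lambda parts: "&"),
    (re.compile(r"\s+(simultaneously)\s+"), lambda parts: ";"),
    (re.compile(r"([a-z]+(?: [a-z]+)*?) is greater than (\d+)"),
     lambda parts: f"{parts[0]}>{parts[1]}"),
    (re.compile(r"([a-z]+(?: [a-z]+)*?) is equal to (\d+)"),
     lambda parts: f"{parts[0]}={parts[1]}"),
]


def replace_terms(statement: str) -> str:
    """For each rule: match the term, split it into a string array, replace it."""
    for pattern, replace in RULES:
        statement = pattern.sub(lambda m: replace(list(m.groups())), statement)
    return statement


text = ("relay is greater than 1 and left relay is equal to 1 "
        "simultaneously right relay is equal to 0")
print(replace_terms(text))  # relay>1&left relay=1;right relay=0
```

The capture groups of each pattern play the role of the string array in step 2), and the callback receives that array in the same way a replacement function in the script would.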
Referring to fig. 4, fig. 4 is a schematic diagram of the analysis process of a delay term according to an embodiment of the present application, which includes the following steps:
1. Split the statement to obtain: delay operation 0 after 30 seconds; the left key of the living room switch is equal to 1; the upper key of the living room switch is equal to 2;
2. Split each sub-statement into strings by keyword;
3. Pass in the context information of each sub-statement, call the matching function, and execute the lua script;
4. Return the sub-statements;
5. Finally, return the matching data of all the split statements.
The above analysis procedure is illustrated with the following field statement:
[relay is greater than 1 and 15 minutes 5 seconds @ left relay is equal to 1 and right relay is equal to 1 at 1/2001]. Its syntax tree is shown in fig. 5:
First step: the 6 tokens are matched against the matching rules of all defined terms. 6 terms are matched, of the kinds [equal to], [and], [simultaneously], [greater than], and [year, month, day, hour, minute, second].
Second step: according to the replacement rule of [and], and -> &.
Third step: according to the replacement rule of [simultaneously], simultaneously -> ;.
Fourth step: according to the replacement rule of [year, month, day, hour, minute, second], the date and time expression in the statement (15 minutes, 5 seconds at 1 day, 0, 2001) -> Calendar(1,5,1)&TimeValue=905, where 905 is 15 minutes 5 seconds expressed in seconds (15 x 60 + 5).
Fifth step: according to the replacement rule of [greater than], which contains a rule function, the lua function coverOperator is first called to perform the check, and the replacement is then carried out: relay is greater than 1 -> relay>1.
Sixth step: according to the replacement rule of [equal to], which contains a rule function, the lua function coverEqual is first called to perform the check, and the replacement is then carried out: left relay is equal to 1 -> left relay=1.
Seventh step: the final field statement is Calendar(1,5,1)&TimeValue=905@relay>1@left relay=1;right relay=0. Reverse Polish conversion is then performed on the replaced token morphemes to generate the compiled executable file. A reverse Polish expression, also called a postfix expression, places each operator after its operands. The advantage of this notation is that it needs no brackets to indicate operation priority, which simplifies parsing of the expression.
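The reverse Polish conversion mentioned in the seventh step can be illustrated with the classic shunting-yard algorithm. The sketch below is in Python and makes several assumptions: the embodiment does not state which algorithm it uses, and the `PRECEDENCE` values for @, &, > and = are hypothetical, chosen only so that comparisons bind tighter than & and @.

```python
from typing import Dict, List

# Hypothetical operator priorities; all operators are treated as left-associative.
PRECEDENCE: Dict[str, int] = {"@": 1, "&": 2, ">": 3, "=": 3}


def to_rpn(tokens: List[str]) -> List[str]:
    """Convert an infix token list to reverse Polish (postfix) order."""
    output: List[str] = []
    stack: List[str] = []
    for token in tokens:
        if token == "(":
            stack.append(token)
        elif token == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced brackets")
            stack.pop()  # discard the matching "("
        elif token in PRECEDENCE:
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[token]):
                output.append(stack.pop())
            stack.append(token)
        else:
            output.append(token)  # operand
    while stack:
        operator = stack.pop()
        if operator == "(":
            raise ValueError("unbalanced brackets")
        output.append(operator)
    return output


print(to_rpn(["relay", ">", "1", "@", "left relay", "=", "1"]))
# ['relay', '1', '>', 'left relay', '1', '=', '@']
```

In the postfix output each operator follows its two operands, so an evaluator can process the list left to right with a single stack and never needs brackets.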
The term compiling system provided in the embodiments of the present application is described below, and the term compiling system described below and the term compiling method described above may be referred to correspondingly.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a term compiling system according to an embodiment of the present application. The present application further provides a term compiling system, including:
a statement acquisition module, used for acquiring a statement to be analyzed;
a word splitting module, used for splitting the statement to be analyzed into words according to keywords and the word-segmentation terms defined in the glossary, to obtain a word list containing the words;
a syntax tree construction module, used for constructing an abstract syntax tree structure containing conditions and results based on the word list;
a syntax analysis module, used for performing syntax analysis on the abstract syntax tree to obtain a syntax tree;
a semantic analysis module, used for performing semantic analysis on the syntax tree to obtain an updated syntax tree;
and a compiling module, used for obtaining the standard statement of the statement to be analyzed according to the updated syntax tree.
Based on the above embodiment, as a preferred embodiment, the word splitting module includes:
a detection unit, configured to, when an operator is detected in the statement to be analyzed, identify the not-yet-split character string before the operator as a word and add it to the word list;
a blank character retaining unit, configured to retain the blank characters in the statement to be analyzed and save the blank-content words corresponding to the blank characters into a container;
and a content merging unit, configured to merge each complete string content in the statement to be analyzed into a single string in the word list.
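A word splitter with these three behaviours might look like the following Python sketch. The operator set, the use of double quotes to delimit "complete string content", and the decision to let a blank character end the current word are assumptions made for the example; the embodiment does not define them.

```python
from typing import List, Tuple

OPERATORS = {"=", ">", "<", "&", ";", "@"}  # hypothetical operator set


def split_words(statement: str) -> Tuple[List[str], List[str]]:
    """Return (word list, container of retained blank characters)."""
    words: List[str] = []
    blanks: List[str] = []
    pending = ""

    def flush() -> None:
        nonlocal pending
        if pending:
            words.append(pending)
            pending = ""

    i = 0
    while i < len(statement):
        ch = statement[i]
        if ch == '"':
            # Content merging: a quoted string stays one word.
            end = statement.find('"', i + 1)
            if end == -1:
                raise ValueError("unterminated string")
            flush()
            words.append(statement[i:end + 1])
            i = end + 1
            continue
        if ch in OPERATORS:
            # Detection: the unsplit text before the operator becomes a word.
            flush()
            words.append(ch)
        elif ch.isspace():
            # Blank retention: keep the blank character in its own container.
            flush()
            blanks.append(ch)
        else:
            pending += ch
        i += 1
    flush()
    return words, blanks


print(split_words('name="living room" & level>1'))
```

For the sample input the word list is `name`, `=`, `"living room"`, `&`, `level`, `>`, `1`, and the two blanks around `&` end up in the separate container.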
Based on the above embodiment, as a preferred embodiment, the syntax tree construction module includes:
a term priority determining unit, used for determining the term priority of each word in the word list;
and a syntax tree generating unit, used for parsing the word list according to the term priority by using a recursive descent parsing method and/or an operator precedence parsing method to obtain the abstract syntax tree.
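One common way to combine recursive descent with operator priorities is precedence climbing, sketched below in Python. It is offered only as an illustration of the idea: the `PRIORITY` table, the tuple representation of tree nodes and the function names are hypothetical, and the embodiment may build its tree differently.

```python
from typing import Dict, List, Tuple, Union

# A tree is either an operand (str) or (operator, left subtree, right subtree).
Ast = Union[str, Tuple[str, "Ast", "Ast"]]

# Hypothetical term priorities: a larger number binds more tightly.
PRIORITY: Dict[str, int] = {"@": 1, "&": 2, ">": 3, "=": 3}


def _parse_expr(words: List[str], pos: int, min_priority: int) -> Tuple[Ast, int]:
    """Parse operands and operators whose priority is at least min_priority."""
    if pos >= len(words) or words[pos] in PRIORITY:
        raise ValueError("operand expected")
    left: Ast = words[pos]
    pos += 1
    while pos < len(words) and PRIORITY.get(words[pos], 0) >= min_priority:
        operator = words[pos]
        # Recurse with a higher threshold so equal priorities associate left.
        right, pos = _parse_expr(words, pos + 1, PRIORITY[operator] + 1)
        left = (operator, left, right)
    return left, pos


def parse(words: List[str]) -> Ast:
    """Build an abstract syntax tree from a word list."""
    tree, pos = _parse_expr(words, 0, 1)
    if pos != len(words):
        raise ValueError(f"unexpected word {words[pos]!r}")
    return tree


print(parse(["relay", ">", "1", "@", "left relay", "=", "1"]))
# ('@', ('>', 'relay', '1'), ('=', 'left relay', '1'))
```

With @ given the lowest priority, the root of the tree separates the condition subtree on the left from the result subtree on the right.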
Based on the above embodiment, as a preferred embodiment, the syntax analysis module includes:
a first parsing unit, used for parsing the conditional subtree in the abstract syntax tree;
a second parsing unit, used for parsing the result subtree in the abstract syntax tree;
and a merging unit, used for merging the conditional subtree and the result subtree to obtain the syntax tree.
Based on the above embodiment, as a preferred embodiment, the semantic analysis module includes:
an initializing unit, used for initializing the lua term converter;
and a semantic analysis unit, used for performing a post-order traversal of the syntax tree and calling the lua term converter to perform semantic analysis, to obtain the updated syntax tree.
Based on the above embodiment, as a preferred embodiment, the semantic analysis module further includes:
a newly-added-statement semantic analysis unit, used for performing semantic analysis of newly added statements on the json string returned by the lua term converter;
the newly-added-statement semantic analysis unit is used for executing the following steps:
determining the newly added abstract syntax tree corresponding to each newly added statement;
performing a post-order traversal of the newly added abstract syntax tree;
judging whether the conversion result is a structured json string;
if yes, parsing the structured json string, extracting the statements carrying a newly-added mark, splitting those statements, and adding them to the syntax tree;
and if not, executing the step of calling the lua term converter to perform semantic analysis to obtain the updated syntax tree.
Based on the above embodiment, as a preferred embodiment, the system further includes:
a replacing module, used for, if the syntax tree includes user-defined user data table data, applying a term replacement rule to the user data table data and replacing the statement with a standard statement; performing word-segmentation-class term matching on the standard statement; splitting the matched term according to its matching rule to obtain a string array; and calling the lua script to perform data replacement on the string array according to the replacement rule.
The present application also provides a computer-readable storage medium on which a computer program is stored; when executed, the computer program performs the steps provided by the above embodiments. The storage medium may include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The present application also provides an electronic device, which may include a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps provided by the above embodiments when calling the computer program in the memory. The electronic device may of course also include various network interfaces, a power supply, and the like.
The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be referred to one another. Since the system provided by the embodiments corresponds to the method provided by the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
The principles and embodiments of the present application have been described herein with reference to specific examples; the description of these examples is intended only to facilitate an understanding of the method of the present application and its core ideas. It should be noted that those skilled in the art can make various modifications and adaptations to the application without departing from its principles, and these modifications and adaptations also fall within the scope of the application as defined in the following claims.
It should also be noted that in this specification, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

Claims (10)

1. A term compiling method, comprising:
acquiring a statement to be analyzed;
splitting the statement to be analyzed into words according to keywords and the word-segmentation terms defined in the glossary, to obtain a word list containing the words;
constructing an abstract syntax tree structure containing conditions and results based on the word list;
performing syntax analysis on the abstract syntax tree to obtain a syntax tree;
performing semantic analysis on the syntax tree to obtain an updated syntax tree;
and obtaining the standard statement of the statement to be analyzed according to the updated syntax tree.
2. The term compiling method according to claim 1, wherein splitting the statement to be analyzed into a word list according to keywords and the word-segmentation terms defined in the glossary comprises:
when an operator is detected in the statement to be analyzed, identifying the not-yet-split character string before the operator as a word and adding it to the word list;
retaining the blank characters in the statement to be analyzed, and saving the blank-content words corresponding to the blank characters into a container;
and merging each complete string content in the statement to be analyzed into a single string in the word list.
3. The term compiling method according to claim 1, wherein constructing an abstract syntax tree structure containing conditions and results based on the word list comprises:
determining the term priority of each word in the word list;
and parsing the word list according to the term priority by using a recursive descent parsing method and/or an operator precedence parsing method to obtain the abstract syntax tree.
4. The term compiling method according to claim 1, wherein performing syntax analysis on the abstract syntax tree to obtain a syntax tree comprises:
parsing the conditional subtree in the abstract syntax tree;
parsing the result subtree in the abstract syntax tree;
and merging the conditional subtree and the result subtree to obtain the syntax tree.
5. The term compiling method according to claim 1, wherein performing semantic analysis on the syntax tree to obtain an updated syntax tree comprises:
initializing a lua term converter;
and performing a post-order traversal of the syntax tree and calling the lua term converter to perform semantic analysis, to obtain the updated syntax tree.
6. The term compiling method according to claim 5, wherein performing a post-order traversal of the syntax tree and calling the lua term converter to perform semantic analysis to obtain the updated syntax tree further comprises:
performing semantic analysis of newly added statements on the json string returned by the lua term converter;
wherein the semantic analysis of newly added statements comprises the following steps:
determining the newly added abstract syntax tree corresponding to each newly added statement;
performing a post-order traversal of the newly added abstract syntax tree;
judging whether the conversion result is a structured json string;
if yes, parsing the structured json string, extracting the statements carrying a newly-added mark, splitting those statements, and adding them to the syntax tree;
and if not, executing the step of calling the lua term converter to perform semantic analysis to obtain the updated syntax tree.
7. The term compiling method according to claim 1, wherein, if the syntax tree includes user-defined user data table data, performing semantic analysis on the syntax tree to obtain an updated syntax tree further comprises:
applying a term replacement rule to the user data table data, and replacing the statement with a standard statement;
performing word-segmentation-class term matching on the standard statement;
splitting the matched term according to its matching rule to obtain a string array;
and calling the lua script to perform data replacement on the string array according to the replacement rule.
8. A term compiling system, comprising:
a statement acquisition module, used for acquiring a statement to be analyzed;
a word splitting module, used for splitting the statement to be analyzed into words according to keywords and the word-segmentation terms defined in the glossary, to obtain a word list containing the words;
a syntax tree construction module, used for constructing an abstract syntax tree structure containing conditions and results based on the word list;
a syntax analysis module, used for performing syntax analysis on the abstract syntax tree to obtain a syntax tree;
a semantic analysis module, used for performing semantic analysis on the syntax tree to obtain an updated syntax tree;
and a compiling module, used for obtaining the standard statement of the statement to be analyzed according to the updated syntax tree.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the term compiling method according to any one of claims 1-7.
10. An electronic device, comprising a memory in which a computer program is stored and a processor, wherein the processor implements the steps of the term compiling method according to any one of claims 1-7 when calling the computer program in the memory.
CN202410304744.6A 2024-03-15 2024-03-15 Term compiling method, term compiling system, storage medium and electronic device Pending CN118170386A (en)

Publications (1)

Publication Number Publication Date
CN118170386A true CN118170386A (en) 2024-06-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination