CN110489128A - Method and apparatus for converting a feature calculation script into underlying program code - Google Patents

Method and apparatus for converting a feature calculation script into underlying program code

Publication number
CN110489128A
Authority
CN
China
Prior art keywords
statement
code
node
block
subgraph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910782987.XA
Other languages
Chinese (zh)
Other versions
CN110489128B (en)
Inventor
肖羽
陈靓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd
Priority to CN201910782987.XA
Publication of CN110489128A
Application granted
Publication of CN110489128B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/40: Transformation of program code
    • G06F8/41: Compilation
    • G06F8/43: Checking; Contextual analysis
    • G06F8/433: Dependency analysis; Data or control flow analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/40: Transformation of program code
    • G06F8/41: Compilation
    • G06F8/44: Encoding
    • G06F8/447: Target code generation
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Disclosed are a method and an apparatus for converting a feature calculation script into underlying program code. The method includes: parsing the feature calculation script to generate at least one statement group; generating a plurality of statement blocks based on the at least one statement group, wherein an execution exception of a sentence in any statement block does not affect the execution of sentences in same-level statement blocks; and generating underlying program code based on the plurality of statement blocks. According to the disclosure, the execution success rate of the converted underlying program code can be improved.

Description

Method and apparatus for converting a feature calculation script into underlying program code
Technical field
The present disclosure relates generally to the field of machine learning and, more particularly, to a method and an apparatus for converting a feature calculation script into underlying program code.
Background
In a machine learning process, features must be derived from raw data before training. This process is referred to as feature engineering, and it is an important part of the machine learning process.
Various machine learning frameworks exist. Most of them focus on model training, while only a few focus on feature engineering, and the feature engineering functionality of those few is incomplete.
In existing feature engineering frameworks, the upper-layer feature calculation script written by the user is converted into underlying program code, which is then executed to complete the feature calculation. The sentences in the converted underlying program code are executed sequentially. If any sentence raises an exception during sequential execution, execution stops, so none of the sentences after the failing one are executed, the corresponding feature calculation results are not output, and feature calculation efficiency is low.
Summary of the invention
Exemplary embodiments of the disclosure provide a method for converting a feature calculation script into underlying program code, in which a fault-tolerance mechanism improves the execution success rate of the converted underlying program code expressed as sentences and, correspondingly, the execution success rate of the feature calculation script.
According to an exemplary embodiment of the disclosure, a method for converting a feature calculation script into underlying program code is provided, the method including: parsing the feature calculation script to generate at least one statement group; generating a plurality of statement blocks based on the at least one statement group, wherein an execution exception of a sentence in any statement block does not affect the execution of sentences in same-level statement blocks; and generating underlying program code based on the plurality of statement blocks.
Optionally, the step of generating a plurality of statement blocks based on the at least one statement group includes: for each statement group of the at least one statement group, obtaining a list of the statements in the statement group, each statement being expressed as a sentence, and constructing a directed acyclic graph (DAG) according to the dependencies among the sentences in the list, wherein the nodes of the directed acyclic graph represent variables and the directed edges represent calculation logic and the order of the calculation logic; and constructing statement blocks based on the directed acyclic graph, wherein nodes belonging to different statement blocks are independent of one another, calculation logic belonging to different statement blocks is independent of one another, and within a single statement block there are dependencies between nodes, between pieces of calculation logic, and/or between nodes and calculation logic.
Optionally, the step of constructing statement blocks based on the directed acyclic graph includes: removing, from the directed acyclic graph, its root node and the edges associated with the root node, and obtaining the first subgraphs produced by the removal; removing, from each first subgraph that forms a directed acyclic graph and has 2 or more nodes, its root node and the edges associated with that root node, and obtaining the second subgraphs produced by the removal; repeating the step of removing the root node and its edges from each subgraph that forms a directed acyclic graph and has 2 or more nodes, until every subgraph produced by removal has 2 or fewer nodes; and constructing statement blocks based on the subgraphs produced by removal, wherein same-level statement blocks are the multiple statement blocks constructed from the remainder of one and the same directed acyclic graph after its root node has been removed.
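The recursive root removal described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names (`weak_components`, `pick_root`, `build_blocks`), the nested-dictionary output, and the use of list order as a stand-in for the logical order of the script are all assumptions made for the example. A root is taken to be a node with no incoming edge, and recursion stops once a subgraph has 2 or fewer nodes.

```python
from collections import defaultdict

def weak_components(nodes, edges):
    """Split nodes into weakly connected components, preserving script order."""
    adj = defaultdict(set)
    for src, dst in edges:
        adj[src].add(dst)
        adj[dst].add(src)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append([n for n in nodes if n in comp])
    return comps

def pick_root(nodes, edges):
    """Assumed tie-break: the first node in script order with no incoming edge."""
    targets = {dst for _, dst in edges}
    return next(n for n in nodes if n not in targets)

def build_blocks(nodes, edges):
    """Remove roots recursively; sibling entries in 'children' are same-level blocks."""
    if len(nodes) <= 2:
        return {"leaf": list(nodes)}
    root = pick_root(nodes, edges)
    rest = [n for n in nodes if n != root]
    rest_edges = [(s, d) for s, d in edges if root not in (s, d)]
    children = [
        build_blocks(comp, [(s, d) for s, d in rest_edges if s in comp and d in comp])
        for comp in weak_components(rest, rest_edges)
    ]
    return {"root": root, "children": children}

# Hypothetical DAG: t feeds a and b; a feeds c; b feeds d.
tree = build_blocks(
    ["t", "a", "b", "c", "d"],
    [("t", "a"), ("t", "b"), ("a", "c"), ("b", "d")],
)
```

In this example, removing `t` leaves two independent subgraphs, `[a, c]` and `[b, d]`, which become two same-level blocks.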
Optionally, the step of constructing statement blocks based on the directed acyclic graph further includes: determining, according to the directions of the directed edges of a node in the directed acyclic graph, whether the node can serve as a root node, wherein every directed edge connected to a node that can serve as a root node points to a node other than that node; and, when two or more nodes can serve as a root node, determining one of them as the root node according to the logical order of the feature calculation script corresponding to the nodes.
Optionally, the step of generating underlying program code based on the plurality of statement blocks includes: each time a root node is removed, generating code according to the statement corresponding to the removed root node; and, when removing a root node yields a subgraph consisting of a single node or a subgraph containing two nodes, generating code according to the statements corresponding to that subgraph, wherein code corresponding to a node removed earlier is placed before code corresponding to a node removed later, code corresponding to a subgraph generated earlier is placed before code corresponding to a subgraph generated later, and the order of the code corresponding to the statement blocks among same-level statement blocks is determined according to the logical order of the feature calculation script.
Optionally, the method further includes: adding an exception handling mechanism to at least one piece of the generated code so that, when that code throws an exception at run time, the thrown exception is caught and handled, wherein same-level statement blocks correspond to parallel subgraphs, and the code to which the exception handling mechanism is added includes at least one of the following: the code corresponding to each subgraph, among the parallel subgraphs, that is not a single node; and the code corresponding to the parallel subgraphs as a whole, the parallel subgraphs being the multiple subgraphs produced after the root node is removed from one and the same directed acyclic graph.
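The effect of catching exceptions per same-level block can be illustrated with a small Python analogue. The sketch below is an assumption-laden illustration rather than the generated code itself: `run_blocks`, the block names, and the lambda-based sentences are all hypothetical, and Python's `try`/`except` stands in for the exception handling added to the generated code.

```python
import math

def run_blocks(blocks, env):
    """Run each same-level block in isolation; record failures instead of aborting."""
    failures = {}
    for name, sentences in blocks.items():
        try:
            for target, fn in sentences:
                env[target] = fn(env)
        except Exception as exc:  # caught and handled, so sibling blocks still run
            failures[name] = repr(exc)
    return failures

env = {"a": 1.0, "b": -1.0}
blocks = {
    # c becomes 0.0, so math.log(c) raises ValueError and d is never set
    "block1": [("c", lambda e: e["a"] + e["b"]),
               ("d", lambda e: math.log(e["c"]))],
    # independent of block1, so it still produces f
    "block2": [("f", lambda e: e["a"] * 2)],
}
failures = run_blocks(blocks, env)
```

After the run, `env` contains `f` even though `block1` failed, which is the behaviour the exception handling mechanism is meant to provide.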
According to another exemplary embodiment of the disclosure, a feature calculation method based on a feature calculation script is provided, the feature calculation method including: generating underlying program code according to the method described above; and executing the generated underlying program code.
According to another exemplary embodiment of the disclosure, a system including at least one computing device and at least one storage device storing instructions is provided, wherein the instructions, when run by the at least one computing device, cause the at least one computing device to perform the method described above.
According to another exemplary embodiment of the disclosure, a computer-readable storage medium storing instructions is provided, wherein the instructions, when run by at least one computing device, cause the at least one computing device to perform the method described above.
According to another exemplary embodiment of the disclosure, an apparatus for converting a feature calculation script into underlying program code is provided, the apparatus including: a script parsing and statement group generation unit, which parses the feature calculation script to generate at least one statement group; a statement block generation unit, which generates a plurality of statement blocks based on the at least one statement group, wherein an execution exception of a sentence in any statement block does not affect the execution of sentences in same-level statement blocks; and a code generation unit, which generates underlying program code based on the plurality of statement blocks.
Optionally, the statement block generation unit generates statement blocks by the following operations: for each statement group of the at least one statement group, obtaining a list of the statements in the statement group, each statement being expressed as a sentence, and constructing a directed acyclic graph according to the dependencies among the sentences in the list, wherein the nodes of the directed acyclic graph represent variables and the directed edges represent calculation logic and the order of the calculation logic; and constructing statement blocks based on the directed acyclic graph, wherein nodes belonging to different statement blocks are independent of one another, calculation logic belonging to different statement blocks is independent of one another, and within a single statement block there are dependencies between nodes, between pieces of calculation logic, and/or between nodes and calculation logic.
Optionally, the statement block generation unit removes, from the directed acyclic graph, its root node and the edges associated with the root node, and obtains the first subgraphs produced by the removal; removes, from each first subgraph that forms a directed acyclic graph and has 2 or more nodes, its root node and the edges associated with that root node, and obtains the second subgraphs produced by the removal; repeats the step of removing the root node and its edges from each subgraph that forms a directed acyclic graph and has 2 or more nodes, until every subgraph produced by removal has 2 or fewer nodes; and constructs statement blocks based on the subgraphs produced by removal, wherein same-level statement blocks are the multiple statement blocks constructed from the remainder of one and the same directed acyclic graph after its root node has been removed.
Optionally, the statement block generation unit further determines, according to the directions of the directed edges of a node in the directed acyclic graph, whether the node can serve as a root node, wherein every directed edge connected to a node that can serve as a root node points to a node other than that node; and, when two or more nodes can serve as a root node, one of them is determined as the root node according to the logical order of the feature calculation script corresponding to the nodes.
Optionally, each time a root node is removed, the code generation unit generates code according to the statement corresponding to the removed root node; and, when removing a root node yields a subgraph consisting of a single node or a subgraph containing two nodes, generates code according to the statements corresponding to that subgraph, wherein code corresponding to a node removed earlier is placed before code corresponding to a node removed later, code corresponding to a subgraph generated earlier is placed before code corresponding to a subgraph generated later, and the order of the code corresponding to the statement blocks among same-level statement blocks is determined according to the logical order of the feature calculation script.
Optionally, the code generation unit adds an exception handling mechanism to at least one piece of the generated code so that, when that code throws an exception at run time, the thrown exception is caught and handled, wherein same-level statement blocks correspond to parallel subgraphs, and the code to which the exception handling mechanism is added includes at least one of the following: the code corresponding to each subgraph, among the parallel subgraphs, that is not a single node; and the code corresponding to the parallel subgraphs as a whole, the parallel subgraphs being the multiple subgraphs produced after the root node is removed from one and the same directed acyclic graph.
According to another exemplary embodiment of the disclosure, a feature calculation apparatus based on a feature calculation script is provided, the feature calculation apparatus including: the apparatus for converting a feature calculation script into underlying program code described above; and an execution unit, which executes the converted underlying program code.
According to exemplary embodiments of the disclosure, when implementing a machine learning process such as feature engineering, a feature calculation script needs to be converted into underlying program code (for example, a script based on Feature Query Language (FeQL) is converted (or compiled, parsed, etc.) into Java code based on a directed acyclic graph). The conversion uses statement blocks, each of which may correspond to a portion of the sentences. If the converted code is executed sequentially, then when a run-time exception occurs in one of the sentences corresponding to a code block, the exception is thrown and the code block terminates, so that sentences that could otherwise have run normally, and the corresponding feature calculations, are also interrupted. To solve this problem, a fault-tolerance mechanism is added to the converted code in the exemplary embodiments of the disclosure so that as many sentences as possible execute successfully, thereby improving the success rate of feature extraction and calculation.
According to exemplary embodiments of the disclosure, statement blocks can be generated based on statement groups; for example, the directed acyclic graph and its subgraphs can be divided to generate statement blocks, and a fault-tolerance mechanism is added to the code converted from the statement blocks, wherein the generated statement blocks do not depend on one another. In addition, the execution order of the converted code may differ from that of the original script; for example, code with no mutual dependencies can run separately, and an execution exception of a sentence in one of the same-level statement blocks does not affect the execution of sentences in the other statement blocks, which reduces the scope affected by faulty code. This avoids the feature loss that occurs when execution follows the original order of the script and an exception terminates the feature calculation, and it improves the fault tolerance of feature calculation. Specifically, the fault-tolerance mechanism is added by inserting a TryCatch exception handling mechanism during code generation. Compared with general-purpose schemes that achieve fault tolerance by splitting the data, the disclosure splits statement groups (sets of statements) to generate statement blocks and incorporates the fault-tolerance mechanism into the sentences of the statement blocks, and can therefore handle feature calculation based on a feature query language or the like more effectively.
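As a rough illustration of adding TryCatch handling during code generation, the Python sketch below emits Java-like source text in which each statement block is wrapped in its own try/catch. Everything here is assumed for the example: the `emit_java` function, the block names, the Java sentences, and the `log(...)` helper in the emitted text are hypothetical and do not come from the disclosure.

```python
def emit_java(blocks):
    """Emit Java-like source where every statement block gets its own try/catch."""
    lines = []
    for name, sentences in blocks:
        lines.append(f"// {name}")
        lines.append("try {")
        lines.extend(f"    {s}" for s in sentences)
        lines.append("} catch (Exception ex) {")
        # log(...) is a hypothetical helper in the emitted code, not a real API
        lines.append(f'    log("{name} failed: " + ex);')
        lines.append("}")
    return "\n".join(lines)

source = emit_java([
    ("block1", ["c = a + b;", "e = Math.log(c);"]),
    ("block2", ["f = a * 2;"]),
])
print(source)
```

Because each block has its own handler in the emitted text, an exception thrown inside `block1` would be caught before control reaches `block2`.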
Additional aspects and/or advantages of the general inventive concept of the disclosure will be set forth in part in the description that follows, and in part will be apparent from the description or may be learned by practice of the general inventive concept.
Brief description of the drawings
The above and other objects and features of the exemplary embodiments of the disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings, which illustrate the embodiments by way of example, in which:
Fig. 1 is a schematic diagram of a code conversion process according to an exemplary embodiment of the disclosure;
Fig. 2 is a schematic diagram of the correspondence between a directed acyclic graph and expressions according to an exemplary embodiment of the disclosure;
Fig. 3 is a schematic diagram of the conversion process from script to code according to an exemplary embodiment;
Fig. 4 is a schematic diagram of generating a directed acyclic graph based on a script according to an exemplary embodiment of the disclosure;
Fig. 5 is a schematic diagram of the process of generating statement groups based on a directed acyclic graph according to an exemplary embodiment of the disclosure;
Fig. 6 is a schematic diagram of the process of converting statement groups into code blocks according to an exemplary embodiment of the disclosure;
Fig. 7 is a schematic diagram of a case in which an exception occurs while executing the code corresponding to a statement group according to an exemplary embodiment of the disclosure;
Fig. 8A and Fig. 8B are schematic diagrams of the code conversion process based on the old scheme and the new scheme, respectively, according to an exemplary embodiment of the disclosure;
Fig. 9A and Fig. 9B are schematic diagrams of the code generation stage based on the old scheme and the new scheme according to an exemplary embodiment of the disclosure;
Fig. 10 is a flowchart of a method for converting a feature calculation script into underlying program code according to an exemplary embodiment of the disclosure;
Fig. 11 is a schematic diagram of statement block generation according to an exemplary embodiment of the disclosure;
Fig. 12 is a decomposition diagram of statement block generation according to an exemplary embodiment of the disclosure;
Fig. 13 is a schematic diagram of generating statement blocks based on one directed acyclic graph according to an exemplary embodiment of the disclosure;
Fig. 14A and Fig. 14B are schematic diagrams of code generation strategies according to an exemplary embodiment of the disclosure;
Fig. 15 is a schematic diagram of adding an exception handling mechanism according to an exemplary embodiment of the disclosure;
Fig. 16A is a schematic diagram of the code conversion process based on the old scheme according to an exemplary embodiment of the disclosure;
Fig. 16B is a schematic diagram of the code conversion process based on the new scheme according to an exemplary embodiment of the disclosure.
Detailed description
Reference will now be made in detail to embodiments of the disclosure, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like components throughout. The embodiments are described below with reference to the drawings in order to explain the disclosure.
Fig. 1 is a schematic diagram of a code conversion process according to an exemplary embodiment of the disclosure. In an exemplary embodiment of the disclosure, a feature engineering computation framework can be implemented with Spark-ecosystem tools such as SparkSQL and PySpark. SparkSQL is an important component of the Spark framework and is mainly used for structured data processing and SQL-like (Structured Query Language) queries. The layer beneath SparkSQL is SparkCore. Through a data abstraction layer, DataFrame (which can be understood as a data frame, that is, a data structure), SparkSQL lets users write scripts in SQL-like syntax at the interface. PySpark differs from SparkSQL in that users write scripts in Python-like syntax.
In various machine learning scenarios, especially large-scale scenarios requiring real-time computation, feature engineering can be implemented in the above manner, in particular with tools such as SparkSQL and PySpark. During feature engineering, the underlying part (for example, the unit denoted SparkCore or Java in Fig. 1) generally cannot recognize the user-written script received through the programming interface. For example, under the Spark framework, SparkCore as the underlying part is written in Scala, whereas what is received through the programming interface is a script written in Python-like or SQL-like syntax. In this case, the upper-layer script (for example, a script written in Python-like or SQL-like syntax) needs to be translated into underlying program code.
As shown in Fig. 1, under the Spark framework, scripts written with SparkSQL and code written with PySpark need to be converted into code that SparkCore can recognize. In a feature calculation engine, a script written in a scripting language can be converted into Java code, thereby translating from the scripting language to a programming language. The Java code can in turn be converted into code that SparkCore can recognize.
The scripting language in Fig. 1 may be a feature query language (FeQL, which can be read as an abbreviation of Feature Engineering Query Language or Feature Query Language), a feature calculation language developed in-house by 4Paradigm and an upper-layer scripting language for users. Exemplary embodiments of the disclosure are described using the conversion from a FeQL script (a script written in FeQL) to Java code as an example. In a machine learning process according to an exemplary embodiment of the disclosure, the script that a user enters in the user-facing front-end interface can be converted into underlying program code, and the converted code is then executed. "Underlying" here is relative to the front end with which the user interacts, and the script may be a feature calculation script or the like. The conversion from a FeQL script to Java code in the exemplary embodiments is merely illustrative and does not limit the scope of protection of the disclosure.
In an exemplary embodiment of the disclosure, a layer of secondary development in Java can be built on top of the Spark framework to form the back end; at the front end, the user input includes a FeQL script. When the FeQL script runs, the upper-layer FeQL script is translated (converted) into Java code in the next layer down. This process resembles the translation from SparkSQL to SparkCore in that both convert a script into programming-language code. The conversion from Python-like PySpark code to SparkCore resembles the conversion from Java code to SparkCore in that both convert code into code. A fault-tolerance mechanism can be added during conversion; for example, it can be added to the Java code converted from the FeQL script.
In an exemplary embodiment of the disclosure, a directed acyclic graph (DAG) is used to express variables, the calculation logic between variables, and the order of the calculation logic: nodes of the DAG represent variables, edges represent calculation logic, and the arrows on the edges represent the order of the calculation logic. When the calculation logic can be expressed as an equation, the variable that the arrow points to is on the right of the equals sign, and the variable at the end of the edge without an arrow is on the left. For example, referring to Fig. 2, c = a + b and e = log(c) can be represented by a DAG; the left side of Fig. 2 shows the expressions and the right side shows the corresponding DAG.
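The two expressions c = a + b and e = log(c) can be encoded as a small dependency graph and evaluated in dependency order. The sketch below is only an illustration under stated assumptions: the `depends_on` and `logic` dictionaries and the `evaluate` function are hypothetical names, and the graph is stored as a neutral "depends on" mapping rather than as drawn arrows. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+), which takes a mapping from each node to its predecessors.

```python
import math
from graphlib import TopologicalSorter

# Each key is a variable (node); each value is the set of variables it depends on.
depends_on = {"c": {"a", "b"}, "e": {"c"}}

# Calculation logic attached to each derived variable.
logic = {
    "c": lambda v: v["a"] + v["b"],
    "e": lambda v: math.log(v["c"]),
}

def evaluate(inputs):
    """Compute every derived variable after the variables it depends on."""
    values = dict(inputs)
    for name in TopologicalSorter(depends_on).static_order():
        if name in logic:
            values[name] = logic[name](values)
    return values

result = evaluate({"a": 1.0, "b": 2.0})
```

The topological order guarantees that `c` is computed before `e`, which mirrors the ordering role that the directed edges play in the DAG.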
In an exemplary embodiment of the disclosure, the use of a DAG to express variables, the calculation logic between variables, and the order of the calculation logic is for purposes of illustration only and does not limit the scope of protection of the disclosure; any way of expressing variables, the calculation logic between variables, and the order of the calculation logic is feasible.
In an exemplary embodiment of the disclosure, a compute node (ComputeNode) denotes a computing unit node that performs feature calculation (extraction) within a window during feature engineering. The nodes described below may be compute nodes. A single node may correspond to a single variable; the relationships between variables include calculation relationships, which can be represented by the edges of the DAG. A statement (Statement) can be used to define a variable, constant, function, class, and so on, and can be expressed as a sentence. Multiple statements can be divided into statement groups (StatementGroup) according to their dependencies.
In an exemplary embodiment of the disclosure, a statement block (StatementBlock) is also defined. It can be understood as an execution block of statements and contains one or more sentences (executable sentences). The feature calculation engine can use statement blocks to maintain statements so as to express finer-grained dependencies and scope relationships. The code converted from statement blocks incorporates a fault-tolerance mechanism, so that an execution exception of a sentence in one statement block does not affect the execution of sentences in same-level statement blocks.
In an exemplary embodiment of the disclosure, FeQL may be the same as or different from existing programming languages and can be used for table joining, data processing, and so on. FeQL can be applied to time-series feature construction, ensuring that online features are consistent with offline features and improving the predictive capability of models. A FeQL-based script can be entered in the user interface, so FeQL can be understood as an upper-layer application language that users can edit. Users can write or edit a script in the script box of the user interface according to FeQL's syntax; the script can be run after it passes validation, and a running script is compiled into underlying program code so that the code is executed.
In converting a feature calculation script into underlying program code, the DAG acts as a bridge. That is, the feature calculation script is first converted into a DAG, and underlying program code is then generated from that DAG, so that a script written in a language such as FeQL is converted into underlying program code such as Java code. The conversion may include the following steps or stages: script parsing and plan generation, statement group generation, and code generation.
Fig. 3 is a schematic diagram of the conversion process from script to code according to an exemplary embodiment. The conversion process from script to code is described in detail with reference to Fig. 3.
In the script parsing and plan generation stage (step), the script written in FeQL can be parsed in various ways to generate at least one statement group. A statement group is generated based on the dependencies among the sentences in a list of statements and can be represented by a DAG. The DAG is generated according to the dependencies of the sentences in the list, which may be the list of statements in a single statement group; a statement can be expressed as a sentence. The process of parsing the script and generating the DAG is referred to as plan generation. The nodes of the generated DAG include table-join nodes (JoinTableNode) and compute nodes.
Fig. 4 is a schematic diagram of generating a DAG based on a script according to an exemplary embodiment of the disclosure; the left side of Fig. 4 shows the script and the right side shows the DAG. For the unit or step that implements the script parsing and plan generation stage, the input is the script and the output is the DAG.
With continued reference to Fig. 3, in the statement group generation stage, the statements related to one calculation node can form one statement group, and the list of statements in a single statement group can be obtained. Among the statements related to a calculation node, every node other than that calculation node depends on it. Each statement can correspond to one code statement. Within a statement group the code statements are executed sequentially, so when the code statement corresponding to any statement raises an execution exception, the execution of related code statements is affected. For example, within a statement group, if the code statement corresponding to a node fails, this affects the execution of the code statements that depend on that node and the output of the downstream nodes that follow it in order (the values of the variables corresponding to those nodes), but it does not in principle affect the code statements and outputs that do not depend on that node. However, because the code statements are executed serially, a failure at that node terminates execution, so the outputs that do not depend on the node cannot be obtained either and the corresponding features are lost. To obtain as many features as possible when an execution exception occurs, the code statements need to be divided into multiple statement groups according to their dependencies, so that the code statements of each statement group run independently and do not affect one another.
Fig. 5 is a schematic diagram of the process of generating statement groups from a DAG according to an exemplary embodiment of the disclosure. As shown in Fig. 5, the left side shows the DAG and the right side shows the statement groups generated from it. For the unit or step corresponding to the statement group generation stage, the input is the DAG and the output is the statement groups. Referring to Fig. 5, three statement groups (statement group 1, statement group 2 and statement group 3) can be generated from the DAG on the left. Each statement group can itself form a DAG, and dependencies exist between the nodes inside a statement group; for example, the value of a node is calculated from at least one other node in the same statement group. No dependencies exist between the nodes of different statement groups; for example, the value of a node is calculated without relying on nodes in any statement group other than its own.
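As a non-limiting illustration, the property that no dependency crosses a statement group boundary can be obtained by treating each dependency as an undirected link and collecting connected components. The following is a minimal sketch under that assumption; the disclosure does not state which grouping algorithm is used, and the class name StatementGroupSketch and the variable names x and y are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StatementGroupSketch {
    /** Splits variables into groups such that no dependency crosses a group boundary. */
    static List<List<String>> group(List<String> vars, String[][] deps) {
        Map<String, String> parent = new HashMap<>();
        for (String v : vars) parent.put(v, v);
        // Merge the two endpoints of every dependency edge (union-find).
        for (String[] d : deps) parent.put(find(parent, d[0]), find(parent, d[1]));
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String v : vars) {
            groups.computeIfAbsent(find(parent, v), k -> new ArrayList<>()).add(v);
        }
        return new ArrayList<>(groups.values());
    }

    /** Union-find lookup with path halving. */
    static String find(Map<String, String> parent, String v) {
        while (!parent.get(v).equals(v)) {
            parent.put(v, parent.get(parent.get(v)));
            v = parent.get(v);
        }
        return v;
    }

    /** Hypothetical input: three unrelated dependency chains. */
    static List<List<String>> demo() {
        List<String> vars = List.of("w", "var1", "f1", "x", "f7", "y", "f8");
        String[][] deps = {{"w", "var1"}, {"var1", "f1"}, {"x", "f7"}, {"y", "f8"}};
        return group(vars, deps);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[w, var1, f1], [x, f7], [y, f8]]
    }
}
```

Each resulting list plays the role of one statement group: its members depend only on one another.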
With continued reference to Fig. 3, in the code generation stage, code blocks can be generated from the statement groups. Each code block corresponds to at least one line of code (at least one code statement), and the corresponding operations of the script are implemented by the generated code.
Fig. 6 is a schematic diagram of the process of converting statement groups into code blocks according to an exemplary embodiment of the disclosure.
As shown in Fig. 6, the left side shows 3 statement groups and the right side shows the 3 corresponding code blocks, each of which contains multiple code statements. For the code generation step or the code generation unit, the input is the DAG and the output is the code blocks. In the approach of Fig. 6, a single statement group is converted directly into one code block. Code block 1, converted from statement group 1, contains 13 code statements; code block 2, converted from statement group 2, contains 2 code statements; and code block 3, converted from statement group 3, contains 2 code statements. The code statements in each code block can be expressed in the Java language. Code can be generated and executed in the order of the code statements in a code block; for code block 1, for example, code is generated and executed in order from the 1st code statement to the 13th, and each code statement can correspond to one line of code.
As described above, when the converted code is executed serially, execution may fail. For example, as shown in Fig. 6, the code statements converted from the same statement group are executed serially (for example, in order from the 1st code statement to the 13th), so if one code statement raises an exception, the execution of the subsequent code statements is affected. For example, when the code statement var5=double(var3) raises an exception (for example, fails to execute), the execution of all or some of the code statements that follow it is affected, such as the code statement var6=double(var4) and the code statements after it.
In this case, for statement group 1, the execution result of the code can include only the results produced by the code statements that have already been executed. For example, when the code statement var5=double(var3) raises an exception, only the values of the intermediate variables w, var1 and var2 can finally be returned. The calculation return rate (the ratio of the values obtained to the values that should have been obtained) is only 3/13. The variable var6 and the features f1 to f6 that are ultimately needed cannot be obtained. Therefore, over the whole process of converting the script into code, if code block 1 raises an execution exception while code block 2 and code block 3 do not, the program can return only the execution results of code block 2 and code block 3 and the partial execution results of code block 1.
Fig. 7 is a schematic diagram of a case in which execution of the code corresponding to a statement group raises an exception according to an exemplary embodiment of the disclosure.
As shown on the right side of Fig. 7, if a statement group is marked "X", execution of the corresponding code raised an exception (which can be understood as an execution failure); for example, execution of the code corresponding to statement group 1 raised an exception. If a statement group is marked "V", the corresponding code was executed successfully; for example, the code corresponding to statement group 2 and statement group 3 was executed successfully. Code that cannot be executed successfully, or that fails during execution, causes variable and/or feature calculation to fail. In addition, because the code is executed in order, a failed variable calculation prevents at least some of the subsequent variables and/or features from being obtained.
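The effect of serial execution described above can be reproduced with a small, self-contained sketch. The step names a, b and c and the class name SerialExecutionSketch are hypothetical and are not taken from the script in the figures; the point is only that step c never runs even though it does not depend on the failing step b.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

class SerialExecutionSketch {
    /** Runs named steps in order; the first exception ends the whole run. */
    static Map<String, Object> runSerially(Map<String, Supplier<Object>> steps) {
        Map<String, Object> results = new LinkedHashMap<>();
        try {
            for (Map.Entry<String, Supplier<Object>> step : steps.entrySet()) {
                results.put(step.getKey(), step.getValue().get());
            }
        } catch (RuntimeException e) {
            // Nothing after the failing step runs, even steps that are independent of it.
        }
        return results;
    }

    /** Hypothetical three-step run in which the middle step fails. */
    static Map<String, Object> demo() {
        Map<String, Supplier<Object>> steps = new LinkedHashMap<>();
        steps.put("a", () -> 1.0);
        steps.put("b", () -> { throw new IllegalStateException("b failed"); });
        steps.put("c", () -> 2.0); // independent of b, but never reached
        return runSerially(steps);
    }

    public static void main(String[] args) {
        System.out.println(demo().keySet()); // prints [a]
    }
}
```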
As described above, if a calculation fails during feature engineering (for example, feature calculation), the desired result cannot be obtained and computing capability is degraded. To avoid this, a fault-tolerance mechanism needs to be introduced into feature engineering; in particular, an execution fault-tolerance mechanism needs to be added to the converted code.
However, adding (introducing) a fault-tolerance mechanism to all of the code makes the code complicated and disorderly, and also reduces its execution efficiency. The code therefore needs to be partitioned into blocks, and the fault-tolerance mechanism added based on the partitioning result.
When a fault-tolerance mechanism is introduced, especially for DAG-based feature calculation, the following need to be considered:
1. Feature calculation should return as many correct features as possible;
2. Feature calculation should not lose features because the calculation of an intermediate statement fails;
3. A failed intermediate variable calculation should not cause the entire feature calculation task to fail with no feature output.
Fig. 8A and Fig. 8B are schematic diagrams of the code conversion process under the old scheme and the new scheme, respectively, according to an exemplary embodiment of the disclosure. Fig. 9A and Fig. 9B are schematic diagrams of obtaining code from statement groups under the old scheme and the new scheme, respectively. As shown in Fig. 8A to Fig. 9B, compared with the old scheme, the new scheme adds a block generation step (stage). Under the old scheme, three statement groups are obtained from the FeQL script, and the corresponding Java code is produced from each statement group. The code statements generated from the same statement group are executed serially; if one of them reports an error because of an exception, execution of the code statements generated from that statement group is blocked, that is, execution jumps out of those code statements and ends. In this case, the code statements that follow the failing code statement cannot be executed. Considering that the code statements that cannot be executed may not depend on whether the failing code statement succeeded, the new scheme divides the statements according to their dependencies to generate statement blocks and adds a fault-tolerance mechanism so that those subsequent code statements can be executed, as detailed below.
Under the new scheme, a statement block generation stage (step) is added between the statement group generation stage and the code generation stage, in order to generate one or more statement blocks from each statement group. Accordingly, in the code generation stage, Java code is generated from the statement blocks rather than from the statement groups.
Fig. 10 is a schematic diagram of a method for converting a feature calculation script into underlying program code according to an exemplary embodiment of the disclosure. Fig. 10 corresponds to the new scheme for generating code. Referring to Fig. 10, the method for converting a feature calculation script into underlying program code according to an exemplary embodiment of the disclosure may include step 110 to step 130.
In step 110, the feature calculation script is parsed to generate at least one statement group. In step 120, multiple statement blocks are generated based on the at least one statement group, where an execution exception of a code statement in any statement block does not affect the execution of the code statements in peer statement blocks. In step 130, underlying program code is generated based on the multiple statement blocks.
For example, the Java code generated is as follows:
As an example, the step of generating multiple statement blocks based on the at least one statement group includes performing the following operations for each statement group of the at least one statement group: obtaining a list of the statements in the statement group, each expressed as a code statement; constructing a directed acyclic graph according to the dependencies of the code statements in the list, where the nodes of the directed acyclic graph represent variables and the directed edges represent calculation logic and the order of the calculation logic; and constructing statement blocks based on the directed acyclic graph, where the nodes belonging to different statement blocks are independent of one another, the calculation logic belonging to different statement blocks is independent of one another, and within the same statement block dependencies exist between nodes, between calculation logic, and/or between nodes and calculation logic.
As an example, the step of constructing statement blocks based on the directed acyclic graph includes: removing from the directed acyclic graph its root node and the edges associated with the root node, and obtaining the first subgraphs produced by the removal; from each first subgraph that can form a directed acyclic graph and has 2 or more nodes, removing the root node and the edges associated with the root node, and obtaining the second subgraphs produced by the removal; repeating the step of removing the root node and edges from subgraphs that can form a directed acyclic graph and have 2 or more nodes, until every subgraph produced by removal has 2 or fewer nodes; and constructing statement blocks based on the subgraphs produced by removal, where peer statement blocks are the multiple statement blocks constructed from what remains of the same directed acyclic graph after its root node is removed.
As an example, the step of constructing statement blocks based on the directed acyclic graph further includes: determining, from the directions of the directed edges of a node in the directed acyclic graph, whether the node can serve as a root node, where every directed edge connected to a node that can serve as a root node points to a node other than that node; and, when two or more nodes can serve as the root node, determining one of the two or more nodes as the root node according to the logical order of the feature calculation script corresponding to the nodes.
As an example, the step of generating underlying program code based on the multiple statement blocks includes: whenever a root node is removed, generating code according to the statement corresponding to the removed root node; and, when a subgraph consisting of a single node or of two nodes is obtained after a root node is removed, generating code according to the statements corresponding to that subgraph. The code corresponding to a node removed earlier is placed before the code corresponding to a node removed later, the code corresponding to a subgraph generated earlier is placed before the code corresponding to a subgraph generated later, and the order of the code corresponding to the individual statement blocks among peer statement blocks is determined according to the logical order of the feature calculation script.
As an example, the method further includes adding an exception handling mechanism to at least one piece of generated code, so that when that code throws an exception at run time, the thrown exception is caught and handled. Peer statement blocks correspond to parallel subgraphs, and the code to which the exception handling mechanism is added includes at least one of the following: the code corresponding to each subgraph, among multiple parallel subgraphs, that is not a single node; and the code corresponding to the multiple parallel subgraphs as a whole. Parallel subgraphs are the multiple subgraphs produced after the root node is removed from the same directed acyclic graph.
In an exemplary embodiment of the disclosure, when, for example, upstream logic has converted the script into multiple statement groups, the statements in each statement group can be partitioned to generate statement blocks. The goal of statement block generation is that an execution exception of a code statement in any statement block does not affect the execution of the code statements in peer statement blocks. Code statements are used to express the statements in a statement group, and the code statements in one statement group depend on one another, so one DAG can be generated from one statement group. The input of the statement block generation process is the list of statements contained in a statement group, and the output is the statement blocks. Statement blocks are constructed according to the dependencies of the code statements in the statement list, such that the calculation nodes of different statement blocks are independent of one another and the calculation logic of different statement blocks is independent of one another, while dependencies exist between the calculation nodes of the same statement block and between the calculation logic of the same statement block. How statement blocks are generated can be understood with reference to Fig. 11.
Fig. 11 is a schematic diagram of statement block generation according to an exemplary embodiment of the disclosure. Fig. 12 is an exploded schematic diagram of statement block generation according to an exemplary embodiment of the disclosure. Fig. 13 is a schematic diagram of generating statement blocks from one directed acyclic graph according to an exemplary embodiment of the disclosure.
As shown in Fig. 11 to Fig. 13, in order to generate multiple statement blocks whose nodes and calculation logic are independent of one another, while the nodes and calculation logic inside each statement block depend on one another, statement blocks can be constructed using a dynamically pruned directed acyclic subgraph generation strategy. Calculation starts from the root node of the DAG; a node that has been calculated is pruned away and the remainder forms the subgraphs; the next iteration then starts from the root node of each subgraph.
Specifically, the aforementioned statement group 1 can be converted into statement blocks. The root node of the DAG of statement group 1 is w, and the code statement corresponding to root node w is w=window(t). First, root node w and the directed edges connected to it are removed. This produces a DAG rooted at var1; this DAG is a subgraph and corresponds to one statement block, which is defined as the first-layer statement block. Pruning (root node removal) then continues. Removing node var1 (the corresponding code statement is var1=w.col1[0]/w.col2[0]) and the directed edges connected to it produces two subgraphs: one consists of the single node f1, and the other is a DAG whose root node is either candidate root node var2 or candidate root node var3. Each of the two subgraphs corresponds to one statement block, and the two statement blocks are defined as second-layer statement blocks.
Next, according to an exemplary embodiment of the disclosure, a root node is selected from candidate root node var2 and candidate root node var3. When node var2 is selected as the root node, node var2 and its directed edges are removed, producing a subgraph rooted at var3; this subgraph corresponds to one statement block, which is defined as the third-layer statement block.
Then, root node var3 and its directed edges are removed, producing three subgraphs: the subgraph consisting of node f3, the subgraph containing node var5 and node f5, and the subgraph rooted at var4. Each of the three subgraphs corresponds to one statement block, and the three statement blocks are defined as fourth-layer statement blocks.
Next, root node var4 and its directed edges are removed, producing three subgraphs: the subgraph consisting of node f2, the subgraph consisting of node f4, and the subgraph containing node var6 and node f6. Each of the three subgraphs corresponds to one statement block, and the three statement blocks are defined as fifth-layer statement blocks.
In the above operations, the code of statement blocks in the same layer is parallel (peer) and can be executed in parallel. Because peer code is not interdependent, an exception handling mechanism can be added so that an execution exception in one piece of code does not affect the execution of its peer code.
When the root node is determined (the root node is selected from the candidate root nodes), the determination is based on the directions of the directed edges: if a node only has outgoing directed edges pointing to other nodes and has no incoming directed edges from other nodes, the node is a candidate root node. If there are multiple candidate root nodes, the choice is made according to the line number (logical order) of the script statement associated with each candidate root node. For example, the candidate root node whose associated script line number is smaller is preferentially selected as the root node.
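The pruning walk-through and the root selection rule above can be sketched as follows. This is an illustrative reconstruction, not the implementation of the disclosure: the edge list is inferred from the walk-through (the edge var2 -> var4 in particular is an assumption, since the figures are not reproduced here), declaration order stands in for script line numbers, recursion stops at subgraphs of two or fewer nodes as in the walk-through, and the class name PruningSketch is hypothetical.

```java
import java.util.*;

class PruningSketch {
    // Declaration order stands in for script line numbers (assumed).
    static final List<String> ORDER = List.of("w", "var1", "var2", "var3", "var4", "var5",
            "var6", "f1", "f2", "f3", "f4", "f5", "f6");
    // Dependency edges {from, to}, inferred from the walk-through; var2 -> var4 is assumed.
    static final String[][] EDGES = {
        {"w", "var1"}, {"var1", "f1"}, {"var1", "var2"}, {"var1", "var3"}, {"var2", "var4"},
        {"var3", "f3"}, {"var3", "var5"}, {"var3", "var4"}, {"var5", "f5"},
        {"var4", "f2"}, {"var4", "f4"}, {"var4", "var6"}, {"var6", "f6"}};

    /** Removes the root, splits the rest into subgraphs, records them per layer, recurses. */
    static void prune(Set<String> nodes, int layer, Map<Integer, List<Set<String>>> layers) {
        if (nodes.size() <= 2) return;            // small subgraphs are kept as they are
        Set<String> rest = new HashSet<>(nodes);
        rest.remove(pickRoot(nodes));
        for (Set<String> sub : components(rest)) {
            layers.computeIfAbsent(layer, k -> new ArrayList<>()).add(sub);
            prune(sub, layer + 1, layers);
        }
    }

    /** A candidate root has no incoming edge inside the subgraph; earliest in ORDER wins. */
    static String pickRoot(Set<String> nodes) {
        for (String n : ORDER) {
            if (!nodes.contains(n)) continue;
            boolean hasIncoming = false;
            for (String[] e : EDGES) {
                if (e[1].equals(n) && nodes.contains(e[0])) hasIncoming = true;
            }
            if (!hasIncoming) return n;
        }
        throw new IllegalStateException("no root found: the graph is not acyclic");
    }

    /** Weakly connected components of the subgraph induced by the given nodes. */
    static List<Set<String>> components(Set<String> nodes) {
        List<Set<String>> result = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String start : ORDER) {
            if (!nodes.contains(start) || !seen.add(start)) continue;
            Set<String> comp = new LinkedHashSet<>();
            Deque<String> queue = new ArrayDeque<>(List.of(start));
            while (!queue.isEmpty()) {
                String n = queue.poll();
                comp.add(n);
                for (String[] e : EDGES) {
                    String other = e[0].equals(n) ? e[1] : e[1].equals(n) ? e[0] : null;
                    if (other != null && nodes.contains(other) && seen.add(other)) queue.add(other);
                }
            }
            result.add(comp);
        }
        return result;
    }

    static Map<Integer, List<Set<String>>> layersOfGroup1() {
        Map<Integer, List<Set<String>>> layers = new TreeMap<>();
        prune(new HashSet<>(ORDER), 1, layers);
        return layers;
    }

    public static void main(String[] args) {
        layersOfGroup1().forEach((layer, blocks) ->
                System.out.println("layer " + layer + ": " + blocks));
    }
}
```

Under these assumed edges the sketch yields five layers containing 1, 2, 1, 3 and 3 subgraphs respectively, matching the first-layer to fifth-layer statement blocks described above.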
The following describes in detail how the TryCatch exception handling mechanism is added during code generation. TryCatch is one of the exception handling mechanisms of the Java language. In a Java application, exception handling consists of throwing an exception and catching the exception. Throwing an exception means that when an error in a method (function) causes an exception, an exception object is created and handed to the runtime system; the exception object contains exception information such as the exception type and the program state when the exception occurred. The runtime system is responsible for finding and executing the code that handles the exception. Catching an exception means that after a method throws an exception, the runtime system looks for a suitable exception handler. An exception is always thrown first and caught afterwards; an exception that can be caught must be one that has been thrown. In the Java language, exceptions are caught with the try-catch statement, whose general form is: try { monitored code } catch (ExceptionType e) { handling code }.
The pair of braces after the keyword try encloses a piece of code in which an exception may occur, called the monitored region. If an exception occurs while a Java method is running, an exception object is created. The exception is thrown out of the monitored region, and the Java runtime system tries to find a matching catch clause to catch it. If there is a matching catch clause, the exception handling code of that catch clause is run, and the try-catch statement then ends. The rule for matching a catch clause is: if the thrown exception object belongs to the exception class of the catch clause, or to a subclass of that exception class, the generated exception object is considered to match the exception type caught by the catch block.
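The matching rule can be seen in a minimal example. ArrayIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException in the Java standard library, so it is caught by the first clause below; the class name CatchMatchingSketch and the returned labels are hypothetical.

```java
class CatchMatchingSketch {
    /** Runs the body inside a monitored region and reports which catch clause matched. */
    static String classify(Runnable body) {
        try {
            body.run();                          // monitored region
            return "no exception";
        } catch (IndexOutOfBoundsException e) {  // matches this class and its subclasses
            return "index";
        } catch (RuntimeException e) {           // matches any other unchecked exception
            return "other runtime";
        }
    }

    public static void main(String[] args) {
        // Throws ArrayIndexOutOfBoundsException, a subclass of IndexOutOfBoundsException.
        System.out.println(classify(() -> { int[] a = new int[1]; a[2] = 0; }));
        System.out.println(classify(() -> { throw new IllegalStateException("boom"); }));
        System.out.println(classify(() -> { }));
    }
}
```

The order of the catch clauses matters: the more specific exception class is listed before RuntimeException, otherwise the compiler rejects the unreachable clause.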
When code is generated with the TryCatch exception handling mechanism: first, the statements of the same statement group have dependencies and can be organized into a single-input DAG; second, the start node is removed; then, the remainder of the DAG is split into one or more DAG subgraphs. The specific rules for TryCatch partitioning are: a subgraph consisting of a single node (a single node for short) does not need TryCatch, whereas multiple parallel subgraphs (multiple single nodes, or a single node and a subgraph) need TryCatch. Fig. 14A and Fig. 14B are schematic diagrams of code generation strategies according to an exemplary embodiment of the disclosure.
Fig. 14A shows the old scheme and Fig. 14B shows the new scheme. Under the old scheme, it is determined whether the script corresponds to multiple statement groups; if there are multiple statement groups, each statement group is converted into code separately; otherwise, the single statement group is converted into code.
Under the new scheme, statement blocks are generated from the statement groups. For each statement group from which statement blocks are generated, it is determined whether multiple statement blocks were generated. If so, the TryCatch exception handling mechanism is added, in order (for example, the logical order of the script, or the line numbers of the script or code statements), to the code corresponding to each code block. If there is only one, it is determined whether the code block corresponds to a single code statement; if it does, the code statement is generated directly from the code block; otherwise (for example, when the generated code block is a DAG), an empty result is returned.
Fig. 15 is a schematic diagram of adding the exception handling mechanism according to an exemplary embodiment of the disclosure. The left side of Fig. 15 shows the script and the right side shows how the exception handling mechanism is added; code blocks to which the exception handling mechanism has been added (the mechanism is implemented by an exception handling block, that is, a try-catch statement) are marked TryCatch. Pseudocode with the exception handling mechanism added according to an exemplary embodiment of the disclosure is as follows:
In the above pseudocode, the positions of the added try-catch statements are marked. The Java code based on that pseudocode is the code shown above, and the positions of the added try-catch statements are likewise marked in the Java code. For the generated first-layer to fifth-layer statement blocks, the exception handling block of a deeper layer (a higher layer number) is located inside the exception handling block of a shallower layer.
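As a rough illustration of how nested exception handling blocks could be emitted, the sketch below walks a tree of blocks and wraps each of several peer blocks in its own try-catch, so deeper layers end up inside the handlers of shallower layers. It follows one reading of the rules above (every peer is wrapped when there is more than one), which is a simplification; the Block class, the emit method and the statements a, b, c and d are hypothetical and are not the generated code of the disclosure.

```java
import java.util.Arrays;
import java.util.List;

class TryCatchEmitterSketch {
    /** A block: statements emitted in order, followed by nested peer blocks. */
    static final class Block {
        final List<String> statements;
        final List<Block> peers;

        Block(List<String> statements, Block... peers) {
            this.statements = statements;
            this.peers = Arrays.asList(peers);
        }
    }

    /** Emits source text; several peers each get a guard, a lone child gets none. */
    static String emit(Block block, String indent) {
        StringBuilder out = new StringBuilder();
        for (String s : block.statements) out.append(indent).append(s).append('\n');
        boolean wrap = block.peers.size() > 1;
        for (Block peer : block.peers) {
            if (wrap) {
                out.append(indent).append("try {\n")
                   .append(emit(peer, indent + "    "))
                   .append(indent).append("} catch (Exception e) {\n")
                   .append(indent).append("    // handle, then continue with the next peer\n")
                   .append(indent).append("}\n");
            } else {
                out.append(emit(peer, indent));
            }
        }
        return out.toString();
    }

    /** Hypothetical tree: one root statement followed by two peer blocks. */
    static String sample() {
        Block tree = new Block(List.of("a = load();"),
                new Block(List.of("b = f(a);")),
                new Block(List.of("c = g(a);", "d = h(c);")));
        return emit(tree, "");
    }

    public static void main(String[] args) {
        System.out.print(sample());
    }
}
```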
Fig. 16A is a schematic diagram of the code conversion process under the old scheme according to an exemplary embodiment of the disclosure; Fig. 16B is a schematic diagram of the code conversion process under the new scheme according to an exemplary embodiment of the disclosure. The code according to an exemplary embodiment of the disclosure is as follows:
As shown in Fig. 16A and Fig. 16B, under the old scheme the code statements are executed in order from line 1 to line 17, beginning with the code statements corresponding to statement group 1. When line 6 raises an execution exception, lines 7 to 13 are not executed; lines 14 and 15, which correspond to statement group 2, are executed, and lines 16 and 17, which correspond to statement group 3, are executed. In this case, variable var5, variable var6 and features f1 to f6 cannot be obtained.
Under the new scheme, when the code statements generated from the statement blocks are executed, if the code statement corresponding to variable var5 raises an exception, variable var6 and features f1 to f6 can still be obtained, which raises the execution rate of the code statements or code and improves the success rate of feature calculation.
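The isolation between peer blocks can be sketched by hand for the two blocks that contain var5 and var6. The sketch is illustrative only: Double.parseDouble stands in for double(...), the feature expressions are placeholders because they are not given in this text, var3 and var4 are passed in as inputs, and f5 is assumed to depend on var5 because the two share a block, so in this sketch f5 is lost together with var5.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BlockIsolationSketch {
    /** Computes two peer blocks; a failure in one does not stop the other. */
    static Map<String, Double> run(String var3, String var4) {
        Map<String, Double> out = new LinkedHashMap<>();
        try {                                          // peer block {var5, f5}
            double var5 = Double.parseDouble(var3);    // stands in for var5 = double(var3)
            out.put("var5", var5);
            out.put("f5", var5 * 2);                   // placeholder feature expression
        } catch (Exception e) {
            // caught and discarded so that the peer block below still runs
        }
        try {                                          // peer block containing {var6, f6}
            double var6 = Double.parseDouble(var4);    // stands in for var6 = double(var4)
            out.put("var6", var6);
            out.put("f6", var6 * 2);                   // placeholder feature expression
        } catch (Exception e) {
            // caught and discarded for the same reason
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run("oops", "1.5")); // prints {var6=1.5, f6=3.0}
    }
}
```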
According to another exemplary embodiment of the disclosure, a feature calculation method based on a feature calculation script is provided, where the feature calculation method includes: generating underlying program code according to the method described above; and executing the underlying program code obtained by the conversion.
According to another exemplary embodiment of the disclosure, an apparatus for converting a feature calculation script into underlying program code is provided, where the apparatus includes: a script parsing and statement group generation unit, which parses the feature calculation script to generate at least one statement group; a statement block generation unit, which generates multiple statement blocks based on the at least one statement group, where an execution exception of a code statement in any statement block does not affect the execution of the code statements in peer statement blocks; and a code generation unit, which generates underlying program code based on the multiple statement blocks.
As an example, the statement block generation unit generates statement blocks by the following operations: for each statement group of the at least one statement group, obtaining a list of the statements in the statement group, each expressed as a code statement; constructing a directed acyclic graph according to the dependencies of the code statements in the list, where the nodes of the directed acyclic graph represent variables and the directed edges represent calculation logic and the order of the calculation logic; and constructing statement blocks based on the directed acyclic graph, where the nodes belonging to different statement blocks are independent of one another, the calculation logic belonging to different statement blocks is independent of one another, and within the same statement block dependencies exist between nodes, between calculation logic, and/or between nodes and calculation logic.
As an example, the statement block generation unit removes from the directed acyclic graph its root node and the edges associated with the root node, and obtains the first subgraphs produced by the removal; from each first subgraph that can form a directed acyclic graph and has 2 or more nodes, removes the root node and the edges associated with the root node, and obtains the second subgraphs produced by the removal; repeats the step of removing the root node and edges from subgraphs that can form a directed acyclic graph and have 2 or more nodes, until every subgraph produced by removal has 2 or fewer nodes; and constructs statement blocks based on the subgraphs produced by removal, where peer statement blocks are the multiple statement blocks constructed from what remains of the same directed acyclic graph after its root node is removed.
As an example, the statement block generation unit further determines, from the directions of the directed edges of a node in the directed acyclic graph, whether the node can serve as a root node, where every directed edge connected to a node that can serve as a root node points to a node other than that node; and, when two or more nodes can serve as the root node, determines one of the two or more nodes as the root node according to the logical order of the feature calculation script corresponding to the nodes.
As an example, whenever a root node is removed, the code generation unit generates code according to the statement corresponding to the removed root node; and, when a subgraph consisting of a single node or of two nodes is obtained after a root node is removed, generates code according to the statements corresponding to that subgraph. The code corresponding to a node removed earlier is placed before the code corresponding to a node removed later, the code corresponding to a subgraph generated earlier is placed before the code corresponding to a subgraph generated later, and the order of the code corresponding to the individual statement blocks among peer statement blocks is determined according to the logical order of the feature calculation script.
As an example, the code generation unit adds an exception handling mechanism to at least one piece of generated code, so that when that code throws an exception at run time, the thrown exception is caught and handled. Peer statement blocks correspond to parallel subgraphs, and the code to which the exception handling mechanism is added includes at least one of the following: the code corresponding to each subgraph, among multiple parallel subgraphs, that is not a single node; and the code corresponding to the multiple parallel subgraphs as a whole. Parallel subgraphs are the multiple subgraphs produced after the root node is removed from the same directed acyclic graph.
According to another exemplary embodiment of the disclosure, a feature calculation apparatus based on a feature calculation script is provided, where the feature calculation apparatus includes: the apparatus for converting a feature calculation script into underlying program code as described above; and an execution unit, which executes the underlying program code obtained by the conversion.
According to an exemplary embodiment of the disclosure, a fault-tolerance mechanism is added to the converted code. Compared with the old scheme, the amount of code change is small, while computational efficiency and fault tolerance, in particular the fault tolerance of feature calculation, are improved, as are task execution efficiency and reliability. For example, in the above embodiment, when the converted code is executed and the calculation of var3 fails, the old scheme can return only features f7 and f8, whereas the new scheme can return features f1, f2, f3, f4, f6, f7 and f8. This result is achieved because the block-based fault-tolerance mechanism of the disclosure is used. Compared with the old scheme, the fault-tolerance rate of the new scheme (the number of values returned when an error occurs) is improved by 2.5 times, from 2 features to 7 features, that is, (7 - 2) / 2 = 2.5. In machine learning scenarios, the scripts used for feature engineering are usually very long (hundreds of lines, or even a thousand lines or more), so the improvement in fault-tolerance rate can reach a hundredfold or even a thousandfold or more, and feature calculation can be carried out more stably and reliably both offline and online.
It should be understood that specific implementations of the apparatus according to exemplary embodiments of the present disclosure may be realized with reference to the related implementations described with reference to Fig. 1 to Fig. 16B, and are not repeated here.
The units included in the apparatus according to exemplary embodiments of the present disclosure may each be configured as software, hardware, firmware, or any combination thereof that performs a specific function. For example, these units may correspond to application-specific integrated circuits, to pure software code, or to modules combining software and hardware. In addition, one or more functions implemented by these units may also be performed collectively by components of a physical device (for example, a processor, a client, or a server).
It should be understood that the method according to exemplary embodiments of the present disclosure may be implemented by a program recorded on a computer-readable medium. For example, according to an exemplary embodiment of the present disclosure, a computer-readable medium for converting a feature calculation script into underlying program code may be provided, wherein a computer program for performing the following method steps is recorded on the computer-readable medium: parsing the feature calculation script to generate at least one statement group; generating multiple statement blocks based on the at least one statement group, wherein an execution exception of a statement in any statement block does not affect the execution of statements in same-level statement blocks; and generating underlying program code based on the multiple statement blocks.
The computer program in the above computer-readable medium may run in an environment deployed on computer equipment such as a client, a host, an agent device, or a server. It should be noted that the computer program may also be used to perform additional steps beyond those above, or to perform more specific processing when executing those steps. These additional steps and this further processing have been described with reference to Fig. 1 to Fig. 16B and are not repeated here.
It should be noted that the apparatus according to exemplary embodiments of the present disclosure may rely entirely on the running of the computer program to realize the corresponding functions; that is, each unit corresponds to a step in the functional architecture of the computer program, so that the whole system is invoked through a dedicated software package (for example, a lib library) to realize the corresponding functions.
On the other hand, each unit included in the apparatus according to exemplary embodiments of the present disclosure may also be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments for performing the corresponding operations may be stored in a computer-readable medium such as a storage medium, so that a processor can perform the corresponding operations by reading and running that program code or those code segments.
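The three method steps (parse, build statement blocks, generate code) can be sketched end to end as follows. Several simplifying assumptions apply and none of them come from the patent: the feature calculation script is taken to be a sequence of simple Python-style assignments, the whole script is treated as a single statement group, blocks are formed by grouping statements that share variables rather than by the root-removal procedure of the disclosure, and the generated code is Python. The function names are hypothetical.

```python
import ast


def parse_script(script):
    """Step 1: parse `name = expression` lines into (target, names read, source)."""
    statements = []
    for node in ast.parse(script).body:
        if not (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            raise ValueError("this sketch supports simple assignments only")
        reads = {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
        statements.append((node.targets[0].id, reads, ast.unparse(node)))
    return statements


def build_blocks(statements):
    """Step 2: group statements so that no block reads a variable of another block."""
    block_of = {}  # variable name -> index of the block that defines it
    blocks = []    # each block is a list of source lines in executable order
    for target, reads, source in statements:
        linked = sorted({block_of[name] for name in reads if name in block_of})
        if linked:
            keep = linked[0]
            for other in linked[1:]:  # merge blocks joined by this statement
                blocks[keep].extend(blocks[other])
                blocks[other] = []
                for var in block_of:
                    if block_of[var] == other:
                        block_of[var] = keep
        else:
            blocks.append([])
            keep = len(blocks) - 1
        blocks[keep].append(source)
        block_of[target] = keep
    return [block for block in blocks if block]


def generate_code(blocks):
    """Step 3: emit one guarded region per block so failures stay local."""
    lines = []
    for index, block in enumerate(blocks):
        lines.append("try:")
        lines.extend("    " + source for source in block)
        lines.append("except Exception as exc:")
        lines.append(f"    errors[{index}] = repr(exc)")
    return "\n".join(lines) + "\n"


# Hypothetical script: the statements involving `b` fail at run time.
script = """
a = 10
f1 = a + 1
b = int("oops")
f2 = b * 2
f3 = a * 3
"""
code = generate_code(build_blocks(parse_script(script)))
env = {"errors": {}}
exec(code, env)
print(code)
print(env["f1"], env["f3"], env["errors"])
```

Here the failure while computing `b` prevents only `f2`; `f1` and `f3` live in a different block and are still computed. `ast.unparse` requires Python 3.9 or later.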
For example, exemplary embodiments of the present disclosure may also be implemented as a computing device that includes a storage component and a processor. The storage component stores a set of computer-executable instructions which, when executed by the processor, performs the method for converting a feature calculation script into underlying program code.
In particular, the computing device may be deployed on a server or a client, or on a node device in a distributed network environment. In addition, the computing device may be a PC, a tablet device, a personal digital assistant, a smartphone, a web application, or any other device capable of executing the above set of instructions.
Here, the computing device is not necessarily a single computing device; it may be any collection of devices or circuits that can execute the above instructions (or set of instructions) individually or jointly. The computing device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (for example, via wireless transmission).
In the computing device, the processor may include a central processing unit (CPU), a graphics processing unit (GPU), a programmable logic device, a dedicated processor system, a microcontroller, or a microprocessor. By way of example and not limitation, the processor may also include an analog processor, a digital processor, a microprocessor, a multi-core processor, a processor array, a network processor, and the like.
Some of the operations described in the method for converting a feature calculation script into underlying program code according to exemplary embodiments of the present disclosure may be implemented in software and some in hardware; these operations may also be implemented by a combination of software and hardware.
The processor may run instructions or code stored in one of the storage components, where the storage component may also store data. Instructions and data may also be sent and received over a network via a network interface device, where the network interface device may use any known transport protocol.
The storage component may be integrated with the processor, for example with RAM or flash memory arranged within an integrated circuit microprocessor. In addition, the storage component may include a separate device, such as an external disk drive, a storage array, or any other storage device usable by a database system. The storage component and the processor may be operatively coupled, or may communicate with each other, for example through an I/O port or a network connection, so that the processor can read files stored in the storage component.
In addition, the computing device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, a mouse, or a touch input device). All components of the computing device may be connected to each other via a bus and/or a network.
The operations involved in the method for converting a feature calculation script into underlying program code according to exemplary embodiments of the present disclosure may be described as various interconnected or coupled functional blocks or functional diagrams. However, these functional blocks or functional diagrams may equally be integrated into a single logic device or operate according to non-exact boundaries.
For example, as described above, the computing device for converting a feature calculation script into underlying program code according to exemplary embodiments of the present disclosure may include a storage component and a processor, wherein the storage component stores a set of computer-executable instructions which, when executed by the processor, performs the following steps: parsing the feature calculation script to generate at least one statement group; generating multiple statement blocks based on the at least one statement group, wherein an execution exception of a statement in any statement block does not affect the execution of statements in same-level statement blocks; and generating underlying program code based on the multiple statement blocks.
The exemplary embodiments of the present disclosure have been described above. It should be understood that the foregoing description is exemplary only and not exhaustive, and that the present disclosure is not limited to the disclosed exemplary embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the present disclosure. Therefore, the scope of protection of the present disclosure shall be defined by the scope of the claims.

Claims (10)

1. A method for converting a feature calculation script into underlying program code, wherein the method comprises:
parsing the feature calculation script to generate at least one statement group;
generating multiple statement blocks based on the at least one statement group, wherein an execution exception of a statement in any statement block does not affect the execution of statements in same-level statement blocks; and
generating underlying program code based on the multiple statement blocks.
2. The method according to claim 1, wherein the step of generating multiple statement blocks based on the at least one statement group comprises, for each statement group of the at least one statement group:
obtaining a list containing the declarations expressed by the statements in the statement group, and building a directed acyclic graph according to the dependencies among the statements in the list, wherein the nodes of the directed acyclic graph represent variables, and the directed edges of the directed acyclic graph represent the calculation logic and the order of the calculation logic;
building statement blocks based on the directed acyclic graph, wherein nodes belonging to different statement blocks are mutually independent, calculation logic belonging to different statement blocks is mutually independent, and dependencies exist between nodes, between calculation logic, and/or between nodes and calculation logic within the same statement block.
3. The method according to claim 2, wherein the step of building statement blocks based on the directed acyclic graph comprises:
removing from the directed acyclic graph its root node and the edges associated with the root node, and obtaining the first subgraphs generated by the removal;
removing, from each subgraph among the first subgraphs that can form a directed acyclic graph and has two or more nodes, the root node and the edges associated with that root node, and obtaining the second subgraphs generated by the removal;
repeating the step of removing the root node and edges from each subgraph that can form a directed acyclic graph and has two or more nodes, until every subgraph generated by the removal has two or fewer nodes;
building statement blocks based on the subgraphs generated by the removal,
wherein same-level statement blocks comprise the multiple statement blocks built from what remains of the same directed acyclic graph after its root node is removed.
4. The method according to claim 3, wherein the step of building statement blocks based on the directed acyclic graph further comprises: determining, according to the direction of the directed edges of a node in the directed acyclic graph, whether the node can serve as a root node,
wherein the directed edges connected to a node that can serve as a root node point to nodes other than that node,
wherein, when two or more nodes can serve as the root node, one of the two or more nodes is determined to be the root node according to the logical order of the feature calculation script corresponding to the nodes.
5. The method according to claim 3, wherein the step of generating underlying program code based on the multiple statement blocks comprises:
when a root node is removed, generating code according to the declaration corresponding to the removed root node;
when a subgraph consisting of a single node or a subgraph including two nodes is obtained after a root node is removed, generating code according to the declaration corresponding to that subgraph,
wherein the code corresponding to a node removed earlier is ordered before the code corresponding to a node removed later, the code corresponding to a subgraph generated earlier is ordered before the code corresponding to a subgraph generated later, and the order of the code corresponding to each statement block among same-level statement blocks is determined according to the logical order of the feature calculation script.
6. The method according to any one of claims 2 to 5, wherein the method further comprises:
adding exception handling to at least one piece of the generated code, so that when that code throws an exception at run time, the exception is caught and handled,
wherein same-level statement blocks correspond to parallel subgraphs, and the code to which exception handling is added includes at least one of the following: the code corresponding to each subgraph, among the multiple parallel subgraphs, that is not a single node; and the code corresponding to the multiple parallel subgraphs as a whole,
and the parallel subgraphs comprise the multiple subgraphs generated after the root node is removed from the same directed acyclic graph.
7. A feature calculation method based on a feature calculation script, wherein the feature calculation method comprises:
generating underlying program code according to the method of any one of claims 1 to 6;
executing the converted underlying program code.
8. A system comprising at least one computing device and at least one storage device storing instructions, wherein the instructions, when run by the at least one computing device, cause the at least one computing device to perform the method of any one of claims 1 to 7.
9. A computer-readable storage medium storing instructions, wherein the instructions, when run by at least one computing device, cause the at least one computing device to perform the method of any one of claims 1 to 7.
10. An apparatus for converting a feature calculation script into underlying program code, wherein the apparatus comprises:
a script parsing and statement group generating unit, which parses the feature calculation script to generate at least one statement group;
a statement block generating unit, which generates multiple statement blocks based on the at least one statement group, wherein an execution exception of a statement in any statement block does not affect the execution of statements in same-level statement blocks;
a code generating unit, which generates underlying program code based on the multiple statement blocks.
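
The root-removal procedure recited in claims 3 and 4 can be sketched as a small recursive decomposition. The sketch rests on assumptions that are not stated in the claims: an edge `(src, dst)` is taken to mean that `dst` is computed from `src`, a root is taken to be a node with no incoming edge, script order is represented by the order of the `nodes` list, "subgraphs generated by removal" are read as weakly connected components, and decomposition stops once a subgraph has two or fewer nodes. All function and variable names are hypothetical.

```python
def candidate_roots(nodes, edges):
    """Nodes whose edges only point away from them, in script order."""
    targets = {dst for _, dst in edges}
    return [node for node in nodes if node not in targets]


def components(nodes, edges):
    """Weakly connected components, each listed in script order."""
    neighbours = {node: set() for node in nodes}
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        result.append([node for node in nodes if node in component])
    return result


def decompose(nodes, edges):
    """Return (root, [same-level sub-decompositions]) or a leaf list of <= 2 nodes."""
    if len(nodes) <= 2:
        return list(nodes)
    root = candidate_roots(nodes, edges)[0]  # ties resolved by script order
    rest = [node for node in nodes if node != root]
    rest_edges = [(s, d) for s, d in edges if root not in (s, d)]
    return (root, [
        decompose(part, [(s, d) for s, d in rest_edges if s in part])
        for part in components(rest, rest_edges)
    ])


# Hypothetical dependency graph: t feeds a and b, which each feed two features.
nodes = ["t", "a", "f1", "f2", "b", "f3", "f4"]
edges = [("t", "a"), ("t", "b"), ("a", "f1"), ("a", "f2"), ("b", "f3"), ("b", "f4")]
tree = decompose(nodes, edges)
print(tree)
```

In the resulting structure, the entries listed under the same removed root play the role of same-level statement blocks: removing `t` leaves two parallel subgraphs, rooted at `a` and `b`, which can then be guarded independently.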
CN201910782987.XA 2019-08-23 2019-08-23 Method and apparatus for converting feature computation script into underlying program code Active CN110489128B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910782987.XA CN110489128B (en) 2019-08-23 2019-08-23 Method and apparatus for converting feature computation script into underlying program code

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910782987.XA CN110489128B (en) 2019-08-23 2019-08-23 Method and apparatus for converting feature computation script into underlying program code

Publications (2)

Publication Number Publication Date
CN110489128A (en) 2019-11-22
CN110489128B (en) 2023-08-29

Family

ID=68553134

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910782987.XA Active CN110489128B (en) 2019-08-23 2019-08-23 Method and apparatus for converting feature computation script into underlying program code

Country Status (1)

Country Link
CN (1) CN110489128B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723940A (en) * 2020-05-22 2020-09-29 第四范式(北京)技术有限公司 Method, device and equipment for providing pre-estimation service based on machine learning service system

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090172638A1 (en) * 2001-02-28 2009-07-02 Computer Associates Think, Inc. Adding Functionality To Existing Code At Exits
US20090254881A1 (en) * 2008-04-04 2009-10-08 Microsoft Corporation Code generation techniques for administrative tasks
US20090276766A1 (en) * 2008-05-01 2009-11-05 Yonghong Song Runtime profitability control for speculative automatic parallelization
US20090313600A1 (en) * 2008-06-13 2009-12-17 Microsoft Corporation Concurrent code generation
CN102246150A (en) * 2008-12-16 2011-11-16 微软公司 Transforming user script code for debugging
US20130007722A1 (en) * 2011-06-28 2013-01-03 International Business Machines Corporation Method, system and program storage device that provide for automatic programming language grammar partitioning
CN104991773A (en) * 2015-06-30 2015-10-21 小米科技有限责任公司 Program generation method and apparatus
CN107436762A (en) * 2017-07-03 2017-12-05 北京东土军悦科技有限公司 A kind of register Code document generating method, device and electronic equipment
US20180088937A1 (en) * 2016-09-29 2018-03-29 Microsoft Technology Licensing, Llc Code refactoring mechanism for asynchronous code optimization using topological sorting
CN108153897A (en) * 2018-01-10 2018-06-12 中国银行股份有限公司 A kind of PLSQL program codes generation method and system
CN110134378A (en) * 2018-02-08 2019-08-16 腾讯科技(深圳)有限公司 Application program creation method and device, computer equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Song Daoyuan et al.: "Research and Design of a Plug-in for Analyzing Java Program Exception Information", Computer Science, vol. 41, no. 8, 15 August 2014 (2014-08-15), pages 106 - 108 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723940A (en) * 2020-05-22 2020-09-29 第四范式(北京)技术有限公司 Method, device and equipment for providing pre-estimation service based on machine learning service system
CN111723940B (en) * 2020-05-22 2023-08-22 第四范式(北京)技术有限公司 Method, device and equipment for providing estimated service based on machine learning service system

Also Published As

Publication number Publication date
CN110489128B (en) 2023-08-29

Similar Documents

Publication Publication Date Title
Bhattacharya et al. Graph-based analysis and prediction for software evolution
del Mar Gallardo et al. Debugging UML designs with model checking
CN110471913A (en) A kind of data cleaning method and device
Wang et al. TranS^ 3: A transformer-based framework for unifying code summarization and code search
CN110018829A (en) Improve the method and device of PL/SQL language interpreter execution efficiency
WO2017095720A1 (en) Techniques to identify idiomatic code in a code base
Hodován et al. Coarse hierarchical delta debugging
van der Aalst et al. Process discovery and conformance checking using passages
Salah et al. Scenariographer: A tool for reverse engineering class usage scenarios from method invocation sequences
Hegedűs et al. A drill-down approach for measuring maintainability at source code element level
Remenska et al. Using model checking to analyze the system behavior of the LHC production grid
CN110688121A (en) Code completion method, device, computer device and storage medium
EP3230869A1 (en) Separating test verifications from test executions
Zhou et al. Confmapper: Automated variable finding for configuration items in source code
WO2012051844A1 (en) Intelligent network platform, method for executing services and method for analyzing service abnormity
Blech et al. Formal verification of java code generation from UML models
Wang et al. Intelligent test oracle construction for reactive systems without explicit specifications
CN110109658B (en) ROS code generator based on formalized model and code generation method
WO2020038376A1 (en) Method and system for uniformly performing feature extraction
CN110489128A (en) The method and apparatus that feature calculation script is converted into underlying programs code
Desprez et al. Assessing the performance of MPI applications through time-independent trace replay
CN114840410A (en) Test analysis method and device, computer equipment and storage medium
Xu et al. Mining executable specifications of web applications from selenium ide tests
CN111159203B (en) Data association analysis method, platform, electronic equipment and storage medium
CN115454702A (en) Log fault analysis method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant