CN110020004A - Data computing method and engine - Google Patents

Data computing method and engine

Publication number
CN110020004A
Authority
CN
China
Prior art keywords
node
data
target
current layer
input parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910125629.1A
Other languages
Chinese (zh)
Other versions
CN110020004B (en
Inventor
赵亮星云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201910125629.1A priority Critical patent/CN110020004B/en
Publication of CN110020004A publication Critical patent/CN110020004A/en
Priority to TW108132569A priority patent/TWI723535B/en
Priority to PCT/CN2020/073843 priority patent/WO2020168901A1/en
Application granted granted Critical
Publication of CN110020004B publication Critical patent/CN110020004B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/90335: Query processing

Abstract

The present invention provides a data computing method and engine. The method comprises: receiving a data computation request, wherein the request includes identifiers of several target data views; determining, according to a preset DAG configuration corresponding to each data view, the current-layer DS node of each target data view and the input parameters of that current-layer DS node; determining, according to the current-layer DS nodes and their input parameters, several current-layer target DS nodes and their input parameters, wherein the current-layer target DS nodes do not contain a first DS node and a second DS node such that the first DS node is identical to the second DS node and the input parameters of the first DS node are identical to those of the second DS node; executing each current-layer target DS node according to its input parameters; and determining the current-layer data computation result of each target data view according to the execution results of the current-layer target DS nodes and the DAG configuration corresponding to each target data view.

Description

Data computing method and engine
Technical field
The present invention relates to the field of computer technology, and in particular to a data computing method and engine.
Background
A business system generates a large amount of data while it runs. In practice, developers typically write scripts according to their own needs and use those scripts to perform computations on the data generated by the business system. The computation results can be used, for example, to analyze user needs.
For example, the following two query processes each correspond to one script. The first queries the common IPs of user 1 and then queries the devices commonly used with those IPs. The second queries the common IPs of user 1 and then queries the most recent time those IPs were used. Each query is carried out by executing its script in full.
Both query processes contain the step "query the common IPs of user 1". Because the whole data query process is packaged in a single block of code, the data "common IPs of user 1" has to be queried twice in practice. Querying the same data repeatedly increases the IO consumption of the business system.
Summary of the invention
In view of this, embodiments of the present invention provide a data computing method and engine that can reduce the IO consumption of a business system.
In a first aspect, an embodiment of the present invention provides a data computing method, comprising:
receiving a data computation request, wherein the data computation request includes identifiers of several target data views;
determining, according to a preset DAG (Directed Acyclic Graph) configuration corresponding to each data view, the current-layer DS (Data Source) node of each target data view and the input parameters of the current-layer DS node;
determining, according to each current-layer DS node and its input parameters, several current-layer target DS nodes and their input parameters, wherein the current-layer target DS nodes do not contain a first DS node and a second DS node such that the first DS node is identical to the second DS node and the input parameters of the first DS node are identical to those of the second DS node;
executing each current-layer target DS node according to its input parameters;
determining the current-layer data computation result of each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view.
In a second aspect, an embodiment of the present invention provides a data computing engine, comprising:
a receiving unit, configured to receive a data computation request, wherein the data computation request includes identifiers of several target data views;
a determination unit, configured to determine, according to a preset DAG configuration corresponding to each data view, the current-layer DS node of each target data view and the input parameters of the current-layer DS node;
a merging unit, configured to determine, according to each current-layer DS node and its input parameters, several current-layer target DS nodes and their input parameters, wherein the current-layer target DS nodes do not contain a first DS node and a second DS node such that the first DS node is identical to the second DS node and the input parameters of the first DS node are identical to those of the second DS node;
an execution unit, configured to execute each current-layer target DS node according to its input parameters;
a computing unit, configured to determine the current-layer data computation result of each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view.
At least one of the above technical solutions adopted in the embodiments of the present invention can achieve the following beneficial effects. The method abstracts data computation into data-fetching logic and data-processing logic, where the data-fetching logic is implemented by DS nodes (the data source layer) and the data-processing logic is implemented by the DAG configuration (the data view layer). When a data computation request is received, the method collects the DS nodes (IO nodes) layer by layer according to the DAG configuration and executes the de-duplicated DS nodes, which reduces the number of accesses to the business system and therefore its IO consumption.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a data computing method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a DAG configuration provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of another DAG configuration provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of yet another DAG configuration provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of still another DAG configuration provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a data computing engine provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
As shown in Fig. 1, an embodiment of the present invention provides a data computing method, which may include the following steps:
Step 101: receive a data computation request, wherein the data computation request includes identifiers of several target data views.
The data computation request further includes the input parameters of the first-layer DS node of each target data view. One data computation request may target one or more data views.
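The request described in step 101 can be pictured as a small record holding the view identifiers and the first-layer input parameters. The following is a minimal sketch only; the class name `DataComputeRequest`, its field names, and the sample values are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class DataComputeRequest:
    # Identifiers of the target data views to compute (hypothetical field name).
    view_ids: List[str]
    # Input parameters of each view's first-layer DS node, keyed by view identifier.
    first_layer_inputs: Dict[str, Dict[str, Any]] = field(default_factory=dict)


# One request may target one or more data views.
request = DataComputeRequest(
    view_ids=["view1", "view2"],
    first_layer_inputs={
        "view1": {"userId": "u1"},
        "view2": {"userId": "u1"},
    },
)
```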
Step 102: determine, according to the preset DAG configuration corresponding to each data view, the current-layer DS node of each target data view and the input parameters of the current-layer DS node.
The DAG configuration is the representation of a data view. A DAG has a layered structure, which makes it convenient to determine the target DS nodes of each layer. A DAG configuration may contain multiple layers, and each layer can be processed using the method provided in step 102.
In the embodiments of the present invention, for the first layer, the input parameters of a DS node are the first-layer DS-node input parameters of the data view carried in the data computation request; for every layer other than the first, the input parameters of a DS node are the data computation result of the previous layer.
Fig. 2 shows the DAG configuration corresponding to one data view. The business purpose of this data view is to obtain, from an input user ID, the common IPs associated with that user ID, and then to obtain, from that IP list, the total number of accounts that have logged in using those IPs.
This DAG configuration contains two layers. The task of the first-layer DS node is "get the user's common IP list", and its input parameter is the user ID. The task of the second-layer DS node is "number of accounts that have appeared on the IP", and its input parameter is the data computation result of the first layer. The final data computation result is "the number of accounts that can be associated through the common IPs of the user ID".
Note that in the DAG configuration the first-layer DS node and the second-layer DS node sit at different layers of the DAG tree; that is, each data view itself has multiple levels to compute. At the logical level, however, both the first-layer and second-layer DS nodes belong to the data source layer.
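The two-layer configuration of Fig. 2 could be written down as plain data, for example as follows. This is only an illustrative sketch: the class names `DagLayer` and `DagConfig`, the field names, and the node names are assumptions, not the configuration format used by the patent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DagLayer:
    ds_node: str      # name of the DS node executed at this layer (hypothetical)
    input_from: str   # "request" for the first layer, "previous_layer" otherwise


@dataclass(frozen=True)
class DagConfig:
    view_id: str
    layers: Tuple[DagLayer, ...]


# Fig. 2: user ID -> common IP list -> number of accounts seen on those IPs.
fig2 = DagConfig(
    view_id="accounts_by_common_ip",
    layers=(
        DagLayer(ds_node="query_common_ip_list", input_from="request"),
        DagLayer(ds_node="count_accounts_by_ip", input_from="previous_layer"),
    ),
)
```

Keeping the layers in an ordered tuple makes "the current layer" a simple index, which is all that steps 102 and 103 need.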
Step 103: determine, according to each current-layer DS node and its input parameters, several current-layer target DS nodes and their input parameters, wherein the current-layer target DS nodes do not contain a first DS node and a second DS node such that the first DS node is identical to the second DS node and the input parameters of the first DS node are identical to those of the second DS node.
Note that the first DS node and the second DS node described above are duplicate nodes: the two DS nodes correspond to the same execution task and have the same input parameters.
When the DAG configurations contain multiple layers, the target DS nodes and their input parameters must be determined for each layer. Step 103 is described in detail below, taking the first layer of the three DAG configurations shown in Figs. 3 to 5 as an example.
The first-layer DS nodes of the three DAG configurations are DS4, DS1, and DS1, respectively, and the input parameter of each is the user ID. Because the DS1 in Fig. 4 is identical to the DS1 in Fig. 5 and both take the user ID as input, the two can be merged and executed once. The first-layer target DS nodes are therefore DS4 and DS1, each with the user ID as input. In this example, the number of current-layer target DS nodes after merging (two) is smaller than the number of current-layer DS nodes before merging (three).
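The merge in step 103 amounts to de-duplicating (DS node, input parameter) pairs. The sketch below assumes each call can be represented as a hashable tuple; the function name `merge_ds_calls` and the view and user identifiers are hypothetical.

```python
from typing import Dict, Hashable, List, Tuple

# A call is a (DS node name, input parameter) pair; both parts must be hashable.
Call = Tuple[str, Hashable]


def merge_ds_calls(calls_by_view: Dict[str, Call]) -> List[Call]:
    """Return the current-layer target calls: each distinct (node, input) once."""
    targets: List[Call] = []
    seen = set()
    for call in calls_by_view.values():
        if call not in seen:      # same node AND same input => duplicate
            seen.add(call)
            targets.append(call)
    return targets


# First layer of Figs. 3 to 5: DS4, DS1, DS1, all taking the same user ID.
first_layer = {
    "view1": ("DS4", "user-1"),
    "view2": ("DS1", "user-1"),
    "view3": ("DS1", "user-1"),
}
targets = merge_ds_calls(first_layer)
```

Two calls to the same node with different inputs are deliberately kept apart, since the text requires both the node and its input parameters to match before merging.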
Step 104: execute each current-layer target DS node according to its input parameters.
In the prior art, the large environmental differences between online and offline business systems mean that data computation logic has to be defined separately for each: the same data requirement (including complex data configuration logic such as data fetching and data processing) must be developed twice, independently, once per environment. This makes development and labor costs high and makes it hard to achieve truly equivalent data logic.
In view of this, the method distinguishes the following two cases according to the environment in which it is applied:
Case 1: the current environment is an online environment.
In this case, step 104 specifically includes:
A1: call the TR service interface.
A2: provide the input parameters of the current-layer target DS node to the TR service interface, so that the TR service interface obtains the data matching those input parameters.
Case 2: the current environment is an offline environment.
In this case, step 104 specifically includes:
filtering, from an offline database, the data matching the input parameters of the current-layer target DS node.
Taking the DAG configuration shown in Fig. 2 as an example, for the first-layer DS node, step 104 in an online environment can be implemented by calling a TR service interface: return IpService.queryIpList(userId). In an offline environment, step 104 can be implemented by an SQL statement: select ip from table1 where userId = "userId".
For the second-layer DS node, step 104 in an online environment can be implemented by calling a TR service interface: return IpService.queryUserIdCount(ipList). In an offline environment, step 104 can be implemented by an SQL statement: select count(userId) from table1 where ip in ipList.
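The idea that one DS node has an online and an offline implementation behind the same signature can be sketched as follows. Everything here is an assumption made for illustration: `FakeIpService` merely stands in for the TR service interface (whose real API the text does not describe), the offline database is an in-memory SQLite table, and `DISTINCT`/`ORDER BY` are added to the query only to make the two results comparable.

```python
import sqlite3

ROWS = [("u1", "10.0.0.1"), ("u1", "10.0.0.2"), ("u2", "10.0.0.1")]


class FakeIpService:
    """Stand-in for the online TR service; a real one would be a remote call."""

    def __init__(self, rows):
        self._rows = rows

    def query_ip_list(self, user_id):
        return sorted({ip for uid, ip in self._rows if uid == user_id})


def make_offline_db(rows):
    """Build a hypothetical offline table1(userId, ip) in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (userId TEXT, ip TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
    return conn


def build_ds_node(env, service=None, conn=None):
    """Return a callable user_id -> ip list for the given environment."""
    if env == "online":
        return service.query_ip_list
    if env == "offline":
        def query_offline(user_id):
            cur = conn.execute(
                "SELECT DISTINCT ip FROM table1 WHERE userId = ? ORDER BY ip",
                (user_id,),
            )
            return [row[0] for row in cur.fetchall()]
        return query_offline
    raise ValueError(f"unknown environment: {env}")


online_node = build_ds_node("online", service=FakeIpService(ROWS))
offline_node = build_ds_node("offline", conn=make_offline_db(ROWS))
```

Because both callables share one signature, the DAG configuration above them does not need to know which environment it is running in.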
In the embodiments of the present invention, the data engine supports configurable IO merging, which saves as much IO consumption as possible for each business system. The data engine also supports configuring once and applying the configuration to both online and offline environments, which greatly reduces data development cost and improves the consistency of online and offline data.
Although DS nodes still need to be adapted separately for the online and offline environments, this process is simple and controllable and does not increase development complexity, for the following two reasons:
(1) A DS node contains only the most basic data-fetching logic and no complex processing logic, so the offline and online versions are easy to align.
(2) In data computation scenarios, the basic data logic is usually a very small set. Most data is derived through processing.
Note that in step 104 the current-layer target DS nodes are executed concurrently to improve the efficiency of data computation.
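Concurrent execution of the merged target DS nodes could look like the sketch below, which uses a thread pool from the Python standard library. The function name `run_layer_concurrently`, the lambda DS functions, and the sample inputs are all hypothetical; the patent does not say which concurrency mechanism is used.

```python
from concurrent.futures import ThreadPoolExecutor


def run_layer_concurrently(targets, ds_functions, max_workers=4):
    """Run every (node, input) call of one layer in parallel; return results by call."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all calls first so they overlap, then collect the results.
        futures = {
            call: pool.submit(ds_functions[call[0]], call[1]) for call in targets
        }
        return {call: future.result() for call, future in futures.items()}


# Hypothetical DS functions standing in for real IO calls.
ds_functions = {
    "DS1": lambda user_id: [f"{user_id}-ip"],
    "DS4": lambda user_id: len(user_id),
}
layer_results = run_layer_concurrently(
    [("DS4", "u1"), ("DS1", "u1")], ds_functions
)
```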
Step 105: determine the current-layer data computation result of each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view.
Step 105 specifically includes:
B1: determine the execution result corresponding to each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view.
The DS node of each layer can be determined from the DAG configuration of a target data view, and the target DS node corresponding to the target data view can be determined from that DS node. The execution result of that target DS node is the execution result corresponding to the target data view.
The execution result corresponding to a target data view falls into one of two kinds. One is success: the current-layer target DS node corresponding to the target data view obtains the data matching its input parameters within a preset execution time range. The other is failure: the current-layer target DS node corresponding to the target data view does not obtain the data matching its input parameters within the execution time range.
B2: perform data computation according to the execution result corresponding to each target data view and the DAG configuration, to obtain the current-layer data computation result of each target data view, wherein the data computations of different target data views are executed serially.
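Steps B1 and B2 can be read as a fan-out: each view looks up the shared result of its (possibly merged) target DS node and then applies its own processing, one view at a time. The sketch below assumes the per-view processing is a plain function; `compute_layer_results` and the sample processing functions are hypothetical.

```python
def compute_layer_results(calls_by_view, results_by_call, process_by_view):
    """Map merged DS results back to views (B1), then process each view serially (B2)."""
    layer_results = {}
    for view_id, call in calls_by_view.items():   # serial: one view after another
        raw = results_by_call[call]               # result shared by merged views
        layer_results[view_id] = process_by_view[view_id](raw)
    return layer_results


# Views 2 and 3 share one DS1 execution but process its result differently.
calls_by_view = {"view2": ("DS1", "u1"), "view3": ("DS1", "u1")}
results_by_call = {("DS1", "u1"): ["10.0.0.2", "10.0.0.1", "10.0.0.1"]}
process_by_view = {
    "view2": lambda ips: sorted(set(ips)),   # e.g. de-duplicate
    "view3": lambda ips: len(ips),           # e.g. count
}
per_view = compute_layer_results(calls_by_view, results_by_call, process_by_view)
```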
In the embodiments of the present invention, the preset execution time range bounds the time spent executing target DS nodes, which improves the efficiency of data computation. The execution time range prevents the data computation process of one target data view from hanging, so that the progress of the data computation of the other target data views is not affected. If a DS node has not finished within the execution time range, the computation of that DS node is moved into the subsequent DS parameter preparation process and computed serially there.
For the two kinds of execution result above, performing data computation according to the execution result corresponding to each target data view and the DAG configuration is divided into the following two cases:
(1) When the current-layer target DS node corresponding to a target data view obtains the data matching its input parameters within the preset execution time range, data computation is performed according to that data and the DAG configuration corresponding to the target data view.
(2) When the current-layer target DS node corresponding to a target data view does not obtain the data matching its input parameters within the execution time range, the current-layer target DS node corresponding to the target data view is re-executed according to its input parameters; once it obtains the data matching its input parameters within the execution time range, data computation is performed according to the DAG configuration corresponding to the target data view.
Of course, in practice, when the current-layer target DS node corresponding to a target data view does not obtain the data matching its input parameters within the execution time range, the data computation process of that target data view may instead be terminated. Note that terminating the data computation process of one target data view does not affect the data computation processes of the other target data views.
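The time-limit handling described above (re-execute on timeout, or give up on that view) might be sketched as follows. This is an assumption-laden illustration: `run_with_time_limit`, the `retries` parameter, and the sleeping test functions are hypothetical, and a thread-based timeout cannot actually interrupt a call that is still running.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_time_limit(fn, arg, timeout_s, retries=1):
    """Return (ok, value); re-execute up to `retries` times after a timeout."""
    pool = ThreadPoolExecutor(max_workers=retries + 1)
    try:
        for _ in range(retries + 1):
            future = pool.submit(fn, arg)
            try:
                return True, future.result(timeout=timeout_s)
            except FutureTimeout:
                continue          # case (2): try the same node and input again
        return False, None        # caller may terminate this view's computation
    finally:
        pool.shutdown(wait=False)  # do not block on a call that is still running


attempts = {"n": 0}


def flaky_ds(user_id):
    """Hypothetical DS node: slow on the first call, fast afterwards."""
    attempts["n"] += 1
    if attempts["n"] == 1:
        time.sleep(0.6)
    return [f"{user_id}-ip"]


def slow_ds(user_id):
    """Hypothetical DS node that always exceeds the time limit."""
    time.sleep(0.6)
    return [user_id]


retried = run_with_time_limit(flaky_ds, "u1", timeout_s=0.2, retries=1)
gave_up = run_with_time_limit(slow_ds, "u1", timeout_s=0.05, retries=0)
```

Returning a success flag rather than raising lets the caller stop one view's computation while the other views continue, as the text requires.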
The method abstracts data computation into data-fetching logic and data-processing logic, where the data-fetching logic is implemented by DS nodes (the data source layer) and the data-processing logic is implemented by the DAG configuration (the data view layer). When a data computation request is received, the method collects the DS nodes (IO nodes) layer by layer according to the DAG configuration and executes the de-duplicated DS nodes, which reduces the number of accesses to the business system and therefore its IO consumption.
Taking the DAG configurations corresponding to the three data views shown in Figs. 3 to 5 as an example, the data computing method is described in detail below. The method includes:
S1: receive a data computation request, wherein the data computation request includes identifiers of several target data views and the input parameters of the first-layer DS node of each target data view.
Assume that the DAG configuration shown in Fig. 3 corresponds to data view 1, the DAG configuration shown in Fig. 4 corresponds to data view 2, and the DAG configuration shown in Fig. 5 corresponds to data view 3.
The data computation request includes the identifiers 1, 2, and 3 of the target data views, and the input parameter of each corresponding first-layer DS node is the user ID.
S2: determine, according to the preset DAG configuration corresponding to each data view, the first-layer DS node of each target data view and the input parameters of the first-layer DS node.
The first-layer DS node of target data view 1 is DS4, and its input parameter is the user ID. The first-layer DS node of target data view 2 is DS1, and its input parameter is the user ID. The first-layer DS node of target data view 3 is DS1, and its input parameter is the user ID.
S3: determine, according to each first-layer DS node and its input parameters, several first-layer target DS nodes and their input parameters, wherein the first-layer target DS nodes do not contain a first DS node and a second DS node such that the first DS node is identical to the second DS node and the input parameters of the first DS node are identical to those of the second DS node.
The first-layer target DS nodes are DS1 and DS4, and the input parameter of each is the user ID.
S4: execute each first-layer target DS node according to its input parameters.
Taking target data view 1 as an example, when the current environment is an online environment, S4 specifically includes: calling the TR service interface; and providing the user ID to the TR service interface, so that the TR service interface obtains the data matching the user ID.
When the current environment is an offline environment, S4 specifically includes: filtering the data matching the user ID from the offline database.
S5: determine the execution result corresponding to each target data view according to the execution result of each first-layer target DS node and the DAG configuration corresponding to each target data view.
The execution result corresponding to target data view 1 is the execution result of DS4; the execution result corresponding to target data views 2 and 3 is the execution result of DS1.
S6: perform data computation according to the execution result corresponding to each target data view and the DAG configuration, to obtain the first-layer data computation result of each target data view, wherein the data computations of different target data views are executed serially.
The three target data views are computed serially, but the specific order is not limited. For example, the first-layer data computation results of the three target data views may be computed in the order of target data views 1, 2, 3.
Taking target data view 1 as an example, when DS4 obtains the data matching its input parameters within the preset execution time range, data computation is performed according to that data and the DAG configuration corresponding to target data view 1. The data computation may be, for example, data filtering or data verification.
When DS4 does not obtain the data matching the user ID within the execution time range, the DS4 corresponding to target data view 1 is re-executed according to the user ID corresponding to target data view 1; once DS4 obtains the data matching the user ID within the execution time range, data computation is performed according to the DAG configuration corresponding to target data view 1.
After the first-layer data of target data view 1 has been computed, the first-layer data computations of target data view 2 and target data view 3 are carried out in turn.
S7: determine, according to the preset DAG configuration corresponding to each data view, the second-layer DS node of each target data view and the input parameters of the second-layer DS node.
The second-layer DS node of target data view 1 is DS2, and its input parameter is the data computation result of its first layer. The second-layer DS node of target data view 2 is DS2, and its input parameter is the data computation result of its first layer. The second-layer DS node of target data view 3 is DS3, and its input parameter is the data computation result of its first layer.
S8: determine, according to each second-layer DS node and its input parameters, several second-layer target DS nodes and their input parameters, wherein the second-layer target DS nodes do not contain a first DS node and a second DS node such that the first DS node is identical to the second DS node and the input parameters of the first DS node are identical to those of the second DS node.
The second-layer target DS nodes are DS2 and DS3, and the input parameter of each is the data computation result of the previous layer.
S9: execute each second-layer target DS node according to its input parameters.
Taking target data view 1 as an example, when the current environment is an online environment, S9 specifically includes: calling the TR service interface; and providing the first-layer data computation result to the TR service interface, so that the TR service interface obtains the data matching the first-layer data computation result.
When the current environment is an offline environment, S9 specifically includes: filtering the data matching the first-layer data computation result from the offline database.
S10: determine the execution result corresponding to each target data view according to the execution result of each second-layer target DS node and the DAG configuration corresponding to each target data view.
The execution result corresponding to target data views 1 and 2 is the execution result of DS2; the execution result corresponding to target data view 3 is the execution result of DS3.
S11: perform data computation according to the execution result corresponding to each target data view and the DAG configuration, to obtain the second-layer data computation result of each target data view, wherein the data computations of different target data views are executed serially.
The second-layer data computation results of the three target data views are computed in the order of target data views 1, 2, 3.
Taking target data view 1 as an example, when DS2 obtains the data matching the first-layer data computation result within the preset execution time range, data computation is performed according to that data and the DAG configuration corresponding to target data view 1. The data computation may be, for example, data de-duplication or data verification.
When DS2 does not obtain the data matching the first-layer data computation result within the execution time range, the DS2 corresponding to target data view 1 is re-executed according to the first-layer data computation result of target data view 1; once DS2 obtains the data matching the first-layer data computation result within the execution time range, data computation is performed according to the DAG configuration corresponding to target data view 1.
After the second-layer data of target data view 1 has been computed, the second-layer data computations of target data view 2 and target data view 3 are carried out in turn.
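The two-layer walkthrough for the three data views can be condensed into a small loop that, per layer, collects the DS calls, de-duplicates them, executes each distinct call once, and feeds the results to the next layer. This is a simplified sketch: the per-view processing step, timeouts, and concurrency are omitted, and the behaviour of DS1 to DS4 is invented purely so the example runs (the text does not say what these nodes return).

```python
def run_views(dag_by_view, ds_functions, first_input):
    """Run all views layer by layer; return final values and the nodes executed per layer."""
    inputs = {view: first_input for view in dag_by_view}
    executed = []
    depth = max(len(layers) for layers in dag_by_view.values())
    for layer in range(depth):
        # Collect this layer's (node, input) call for every view that has the layer.
        calls = {
            view: (layers[layer], inputs[view])
            for view, layers in dag_by_view.items()
            if layer < len(layers)
        }
        targets = list(dict.fromkeys(calls.values()))   # ordered de-duplication
        results = {call: ds_functions[call[0]](call[1]) for call in targets}
        executed.append([node for node, _ in targets])
        for view, call in calls.items():
            inputs[view] = results[call]   # becomes the next layer's input
    return inputs, executed


# Figs. 3 to 5: view 1 = DS4 -> DS2, view 2 = DS1 -> DS2, view 3 = DS1 -> DS3.
dag_by_view = {
    "view1": ["DS4", "DS2"],
    "view2": ["DS1", "DS2"],
    "view3": ["DS1", "DS3"],
}
# Hypothetical node behaviour; results are tuples so they can be compared and hashed.
ds_functions = {
    "DS1": lambda user_id: ("ip-a", "ip-b"),
    "DS4": lambda user_id: ("ip-a", "ip-b"),
    "DS2": lambda ips: len(ips),
    "DS3": lambda ips: ips[0],
}
final_results, executed_nodes = run_views(dag_by_view, ds_functions, "user-1")
```

In this sketch the DS2 calls of views 1 and 2 merge only because their first-layer results happen to be equal; with different inputs they would be executed separately.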
As shown in fig. 6, a kind of data computing engines, comprising:
Receiving unit 601, for receiving data computation requests, wherein include: several target datas in data computation requests The mark of view;
Determination unit 602 determines each for being configured according to preset directed acyclic graph DAG corresponding with Data View The current layer DS node of a target data view and current layer DS node enter ginseng;
Combining unit 603, for according to each current layer DS node and its enter ginseng, determine several current layer target DS node And its enter ginseng, wherein in several current layer target DS node be not present the first DS node and the second DS node, the first DS node with Second DS node is identical and the first DS node enter to participate in the second DS node enter to join it is identical;
Execution unit 604 executes each current layer target DS section for entering ginseng according to each current layer target DS node Point;
Computing unit 605, for according to each current layer target DS node implementing result and each target data view Corresponding DAG configuration, determines the data calculated result of the current layer of each target data view.
In one embodiment of the invention, computing unit 605, for the execution according to each current layer target DS node As a result the corresponding DAG configuration with each target data view, determines the corresponding implementing result of each target data view;According to each The corresponding implementing result of a target data view and DAG configuration carry out data calculating, obtain the current of each target data view The data calculated result of layer, wherein the corresponding data of different target Data View calculate serial execute.
In one embodiment of the invention, computing unit 605, for working as the corresponding current layer target of target data view When DS node obtains entering the data that ginseng matches with it in preset execution time range, according to data and target data view Corresponding DAG configuration carries out data calculating.
In one embodiment of the invention, computing unit 605 are further used for when target data view is corresponding current Layer target DS node when being executed between when not obtaining entering the data that ginseng matches with it in range, it is corresponding according to target data view Current layer target DS node enter ginseng, re-execute the corresponding current layer target DS node of target data view, work as number of targets According to the corresponding current layer target DS node of view when being executed between obtain in range entering the data that ginseng matches with it when, according to mesh It marks the corresponding DAG configuration of Data View and carries out data calculating.
In one embodiment of the invention, when the local environment is an online environment, the execution unit 604 is configured to call a TR service interface and to supply the input parameters of the current-layer target DS node to the TR service interface, so that the TR service interface obtains the data matching the input parameters of the current-layer target DS node.
In one embodiment of the invention, when the local environment is an offline environment, the execution unit 604 is configured to filter out, from an offline database, the data matching the input parameters of the current-layer target DS node.
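The two execution paths can be sketched as a single dispatch function. The sketch assumes that the two environments are an online one (served through the TR service interface) and an offline one; `tr_service` is a hypothetical callable standing in for the TR service interface, whose real signature the embodiment does not give, and the offline database is modeled as in-memory lists of rows keyed by DS node id.

```python
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]


def execute_ds_node(
    environment: str,
    node_id: str,
    params: Dict[str, Any],
    tr_service: Callable[[str, Dict[str, Any]], List[Row]],
    offline_tables: Dict[str, Iterable[Row]],
) -> List[Row]:
    """Fetch the data matching a DS node's input parameters."""
    if environment == "online":
        # Hand the input parameters to the service, which looks up the data.
        return tr_service(node_id, params)
    if environment == "offline":
        # Keep only the rows whose fields equal every input parameter.
        return [
            row
            for row in offline_tables[node_id]
            if all(row.get(name) == value for name, value in params.items())
        ]
    raise ValueError(f"unknown environment: {environment!r}")
```

Because both paths take the same node id and input parameters and return rows in the same shape, the rest of the engine does not need to know which environment produced the data.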
An embodiment of the invention provides a data calculation device, comprising a processor and a memory.
The memory is used to store execution instructions, and the processor is used to execute the execution instructions stored in the memory so as to implement the method of any one of the above embodiments.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, transistor, or switch) or a software improvement (an improvement to a method flow). With the development of technology, however, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented with a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program a digital system themselves to "integrate" it onto a single PLD, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of making integrated circuit chips by hand, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can readily be obtained merely by lightly logic-programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
A controller can be implemented in any suitable manner. For example, a controller can take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, the method steps can be logic-programmed so that the controller realizes the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the devices it contains for realizing various functions can be regarded as structures within the hardware component. Alternatively, the devices for realizing various functions can be regarded both as software modules implementing a method and as structures within the hardware component.
The systems, devices, modules, or units described in the above embodiments can be implemented by a computer chip or entity, or by a product having a certain function. A typical implementing device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above apparatus is described by dividing it into various units according to function. Of course, when implementing this application, the functions of the units can be implemented in one or more pieces of software and/or hardware.
Those skilled in the art should understand that embodiments of the invention may be provided as a method, a system, or a computer program product. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, article, or device that includes the element.
The application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
The above is merely an embodiment of this application and is not intended to limit it. Those skilled in the art may make various modifications and changes to this application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this application shall fall within the scope of its claims.

Claims (12)

1. A data calculation method, comprising:
receiving a data calculation request, wherein the data calculation request includes identifiers of several target data views;
determining, according to preset directed acyclic graph (DAG) configurations corresponding to data views, the current-layer data source (DS) nodes of each target data view and the input parameters of the current-layer DS nodes;
determining, according to each current-layer DS node and its input parameters, several current-layer target DS nodes and their input parameters, wherein the several current-layer target DS nodes do not contain a first DS node and a second DS node such that the first DS node is identical to the second DS node and the input parameters of the first DS node are identical to the input parameters of the second DS node;
executing each current-layer target DS node according to the input parameters of that current-layer target DS node; and
determining the current-layer data calculation result of each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view.
2. The data calculation method according to claim 1, wherein
determining the current-layer data calculation result of each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view comprises:
determining the execution result corresponding to each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view; and
performing data calculation according to the execution result and DAG configuration corresponding to each target data view to obtain the current-layer data calculation result of each target data view, wherein the data calculations corresponding to different target data views are executed serially.
3. The data calculation method according to claim 2, wherein
performing data calculation according to the execution result and DAG configuration corresponding to each target data view comprises:
when the current-layer target DS node corresponding to the target data view obtains, within a preset execution time range, data matching its input parameters, performing data calculation according to the data and the DAG configuration corresponding to the target data view.
4. The data calculation method according to claim 3, further comprising:
when the current-layer target DS node corresponding to the target data view does not obtain data matching its input parameters within the execution time range,
re-executing the current-layer target DS node corresponding to the target data view according to the input parameters of that current-layer target DS node; and
when the current-layer target DS node corresponding to the target data view obtains data matching its input parameters within the execution time range, performing data calculation according to the DAG configuration corresponding to the target data view.
5. The data calculation method according to claim 1, wherein,
when the local environment is an online environment,
executing each current-layer target DS node according to the input parameters of that current-layer target DS node comprises:
calling a TR service interface; and
supplying the input parameters of the current-layer target DS node to the TR service interface, so that the TR service interface obtains the data matching the input parameters of the current-layer target DS node.
6. The data calculation method according to any one of claims 1 to 5, wherein,
when the local environment is an offline environment,
executing each current-layer target DS node according to the input parameters of that current-layer target DS node comprises:
filtering out, from an offline database, the data matching the input parameters of the current-layer target DS node.
7. A data calculation engine, comprising:
a receiving unit, configured to receive a data calculation request, wherein the data calculation request includes identifiers of several target data views;
a determination unit, configured to determine, according to preset directed acyclic graph (DAG) configurations corresponding to data views, the current-layer data source (DS) nodes of each target data view and the input parameters of the current-layer DS nodes;
a combining unit, configured to determine, according to each current-layer DS node and its input parameters, several current-layer target DS nodes and their input parameters, wherein the several current-layer target DS nodes do not contain a first DS node and a second DS node such that the first DS node is identical to the second DS node and the input parameters of the first DS node are identical to the input parameters of the second DS node;
an execution unit, configured to execute each current-layer target DS node according to the input parameters of that current-layer target DS node; and
a computing unit, configured to determine the current-layer data calculation result of each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view.
8. The data calculation engine according to claim 7, wherein
the computing unit is configured to determine the execution result corresponding to each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view, and to perform data calculation according to the execution result and DAG configuration corresponding to each target data view to obtain the current-layer data calculation result of each target data view, wherein the data calculations corresponding to different target data views are executed serially.
9. The data calculation engine according to claim 8, wherein
the computing unit is configured to perform data calculation according to the data and the DAG configuration corresponding to the target data view when the current-layer target DS node corresponding to the target data view obtains, within a preset execution time range, data matching its input parameters.
10. The data calculation engine according to claim 9, wherein
the computing unit is further configured to: when the current-layer target DS node corresponding to the target data view does not obtain data matching its input parameters within the execution time range, re-execute the current-layer target DS node corresponding to the target data view according to the input parameters of that current-layer target DS node; and, when the current-layer target DS node corresponding to the target data view obtains data matching its input parameters within the execution time range, perform data calculation according to the DAG configuration corresponding to the target data view.
11. The data calculation engine according to claim 7, wherein,
when the local environment is an online environment,
the execution unit is configured to call a TR service interface and to supply the input parameters of the current-layer target DS node to the TR service interface, so that the TR service interface obtains the data matching the input parameters of the current-layer target DS node.
12. The data calculation engine according to any one of claims 7 to 11, wherein,
when the local environment is an offline environment,
the execution unit is configured to filter out, from an offline database, the data matching the input parameters of the current-layer target DS node.
CN201910125629.1A 2019-02-19 2019-02-19 Data calculation method and engine Active CN110020004B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201910125629.1A CN110020004B (en) 2019-02-19 2019-02-19 Data calculation method and engine
TW108132569A TWI723535B (en) 2019-02-19 2019-09-10 Data calculation method and engine
PCT/CN2020/073843 WO2020168901A1 (en) 2019-02-19 2020-01-22 Data calculation method and engine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910125629.1A CN110020004B (en) 2019-02-19 2019-02-19 Data calculation method and engine

Publications (2)

Publication Number Publication Date
CN110020004A true CN110020004A (en) 2019-07-16
CN110020004B CN110020004B (en) 2020-08-07

Family

ID=67189027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910125629.1A Active CN110020004B (en) 2019-02-19 2019-02-19 Data calculation method and engine

Country Status (3)

Country Link
CN (1) CN110020004B (en)
TW (1) TWI723535B (en)
WO (1) WO2020168901A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110781180A (en) * 2019-09-05 2020-02-11 腾讯科技(深圳)有限公司 Data screening method and data screening device
WO2020168901A1 (en) * 2019-02-19 2020-08-27 阿里巴巴集团控股有限公司 Data calculation method and engine

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI814481B (en) * 2021-07-20 2023-09-01 奧義智慧科技股份有限公司 Security event analysis system and related computer program product for auxiliary intrusion detection

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110191324A1 (en) * 2010-01-29 2011-08-04 Song Wang Transformation of directed acyclic graph query plans to linear query plans
CN102571752A (en) * 2011-12-03 2012-07-11 山东大学 Service-associative-index-map-based quality of service (QoS) perception Top-k service combination system
CN103123652A (en) * 2013-03-14 2013-05-29 曙光信息产业(北京)有限公司 Data query method and cluster database system
CN103150219A (en) * 2013-04-03 2013-06-12 重庆大学 Quick task allocation method avoiding deadlock on heterogeneous resource system
CN106815027A (en) * 2017-01-22 2017-06-09 山东鲁能软件技术有限公司 A kind of high resiliency calculating platform for power network multidimensional business composite computing
CN109063056A (en) * 2018-07-20 2018-12-21 阿里巴巴集团控股有限公司 A kind of data query method, system and terminal device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120158768A1 (en) * 2010-12-15 2012-06-21 Microsoft Corporation Decomposing and merging regular expressions
CN102541875B (en) * 2010-12-16 2014-04-16 北京大学 Access method, device and system for relational node data of directed acyclic graph
KR101621490B1 (en) * 2014-08-07 2016-05-17 (주)그루터 Query execution apparatus and method, and system for processing data employing the same
CN105677752A (en) * 2015-12-30 2016-06-15 深圳先进技术研究院 Streaming computing and batch computing combined processing system and method
CN106960004A (en) * 2017-02-15 2017-07-18 浙江大学 A kind of analysis method of multidimensional data
CN107133257A (en) * 2017-03-21 2017-09-05 华南师范大学 A kind of similar entities recognition methods and system based on center connected subgraph
CN110020004B (en) * 2019-02-19 2020-08-07 阿里巴巴集团控股有限公司 Data calculation method and engine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
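何昱泽 (He Yuze): "Research and Implementation of a Distributed Machine Learning Algorithm Orchestration System for Big Data Processing" (面向大数据处理的分布式机器学习算法编排系统的研究与实现), China Master's Theses Full-text Database, Information Science and Technology *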

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020168901A1 (en) * 2019-02-19 2020-08-27 阿里巴巴集团控股有限公司 Data calculation method and engine
CN110781180A (en) * 2019-09-05 2020-02-11 腾讯科技(深圳)有限公司 Data screening method and data screening device
CN110781180B (en) * 2019-09-05 2022-08-30 腾讯科技(深圳)有限公司 Data screening method and data screening device

Also Published As

Publication number Publication date
TW202032395A (en) 2020-09-01
WO2020168901A1 (en) 2020-08-27
CN110020004B (en) 2020-08-07
TWI723535B (en) 2021-04-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20201019

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201019

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Patentee before: Alibaba Group Holding Ltd.