CN104899209A - Optimization method and device for open type data processing service - Google Patents

Publication number
CN104899209A
Authority
CN
China
Prior art keywords
field
current work
associated job
data
optimization
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410078866.4A
Other languages
Chinese (zh)
Other versions
CN104899209B (en)
Inventor
王磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Tmall Technology Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Application filed by Alibaba Group Holding Ltd
Priority to CN201410078866.4A
Publication of CN104899209A
Application granted
Publication of CN104899209B
Legal status: Active

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an optimization method and device for open type data processing service, and belongs to the field of computer communication. The method comprises the following steps: obtaining the current operation of a user, and looking up a source field of the current operation; establishing upstream and downstream relationships of all pieces of data corresponding to all the fields in the current operation; carrying out calculation to obtain a normalization expression and the field calculation cost of each field in the current operation; utilizing the normalization expression of each field in the current operation to optimize each field in the current operation according to the field calculation cost of each field in the current operation so as to obtain the optimal normalization expression and the optimal field calculation cost of each field in the current operation; and optimizing a current operation code corresponding to the current operation to obtain an optimal operation code which can obtain the optimal normalization expression and the optimal field calculation cost. The optimization method can automatically realize global optimization and is low in operation cost.

Description

Optimization method and device for open data processing service
Technical field
The application relates to the field of computer communication, and in particular to an optimization method and device for an open data processing service.
Background
With the development of computer communication technology, open data processing services, which offer batch processing of PB-scale data where real-time requirements are relatively low, have become popular with users. Such services are mainly used in data analysis, statistics over massive data, data mining, business intelligence and similar fields. An open data processing service can be offered as a multi-tenant public cloud (aimed at small and medium-sized enterprises and developers), or deployed inside the customer's enterprise as a private cloud (aimed at large and medium-sized enterprises).
An open data processing service relies on an open, large-scale data warehouse. The purpose of a data warehouse is to build an integrated, analysis-oriented data environment that provides decision support (Decision Support) for the enterprise. From a technical standpoint, the data structure of a large-scale data warehouse can be divided into three parts: the staging area, the basic data warehouse and the data marts. Upper-layer applications of the data warehouse, such as BI reports, decision analysis products, ad hoc queries and data mining, depend on the aggregated data of the data marts (data mining also depends heavily on the basic data warehouse). The degree to which the aggregated data of the data marts is reused largely determines the efficiency of the data warehouse, and that degree of reuse depends on the following conditions: whether the data mart model, aggregation levels and other architecture always accurately meet the needs of the upper-layer business and can adapt to rapid changes in the business systems; whether there are tools and systems that help developers of upper-layer business systems accurately find and understand the data in the data marts; and whether the upstream business teams actively respond to and implement changes when the data marts are updated. At present, satisfying these conditions depends on the expertise of the data architect or architecture team. However, as the enterprise's business grows, the scale of data expands rapidly, particularly in platform-style public data warehouses, and maintaining the data marts in the traditional top-down manner becomes an impossible task, so that a large number of duplicated jobs and a large amount of duplicated storage appear in the data warehouse. Typically, data warehouse experts and business model experts periodically refactor the data warehouse, that is, perform global optimization for the open data processing service.
When performing global optimization, existing optimization methods for open data processing services require periodic refactoring led by experts who are proficient in both the business and data processing. Such methods cannot perform global optimization automatically, and the cost of optimization is high.
Summary of the invention
The technical problem to be solved by the application is to provide an optimization method and device for an open data processing service that requires no understanding of the business and is offered as a service by the platform provider. The user does not need to assign dedicated experts to the optimization; the user only needs to click to select the service. Global optimization is performed automatically, and the cost of optimization is far below that of training or hiring one expert or a team of experts.
To solve the above problem, the application discloses an optimization method for an open data processing service, the method comprising:
Obtaining the current job of a user, and looking up the source fields of the current job;
Taking the data corresponding to the source fields of the current job as the starting point and the field as the granularity, establishing the upstream-downstream relationships between all data corresponding to all fields in the current job, wherein, when the upstream-downstream relationships between data are established, second data that serves, directly or indirectly, as one of the inputs for computing first data is called upstream of the first data, and the first data is called downstream of the second data; and when the first data is the only downstream data of the second data and the second data is the only upstream data of the first data, the second data is called the direct upstream of the first data and the first data is called the direct downstream of the second data;
Computing the normalized expression of each field in the current job according to the upstream-downstream relationships between all data corresponding to all fields in the current job;
Computing the field computation cost of each field in the current job according to the normalized expression of each field in the current job;
Optimizing each field in the current job by using the normalized expression of each field in the current job and according to the field computation cost of each field in the current job, to obtain the optimal normalized expression and the optimal field computation cost of each field in the current job;
Optimizing the current job code corresponding to the current job into optimal job code that yields the optimal normalized expression and the optimal field computation cost.
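The upstream-downstream definitions in the second step can be illustrated with a small field-level lineage graph. This is only a sketch under stated assumptions: the class name `FieldLineage`, its methods and the example field names are hypothetical and do not come from the application, and "only upstream" / "only downstream" are interpreted here in terms of immediate edges.

```python
class FieldLineage:
    """Hypothetical field-level lineage graph.

    An edge (src, dst) means src is used as an input when computing dst.
    """

    def __init__(self):
        self.inputs = {}     # field -> set of immediate input fields
        self.consumers = {}  # field -> set of immediate consuming fields

    def add_edge(self, src, dst):
        self.inputs.setdefault(dst, set()).add(src)
        self.consumers.setdefault(src, set()).add(dst)

    def upstream(self, field):
        """All fields that feed `field`, directly or indirectly."""
        seen, stack = set(), list(self.inputs.get(field, ()))
        while stack:
            f = stack.pop()
            if f not in seen:
                seen.add(f)
                stack.extend(self.inputs.get(f, ()))
        return seen

    def direct_upstream(self, field):
        """Assumed reading of "direct upstream": `field` has exactly one
        immediate input, and that input has `field` as its only consumer."""
        ins = self.inputs.get(field, set())
        if len(ins) == 1:
            (src,) = ins
            if self.consumers.get(src) == {field}:
                return src
        return None


lineage = FieldLineage()
lineage.add_edge("orders.price", "daily.revenue")
lineage.add_edge("orders.qty", "daily.revenue")
lineage.add_edge("daily.revenue", "report.revenue")
```

Here `daily.revenue` is the direct upstream of `report.revenue`, while `daily.revenue` itself has two inputs and therefore no direct upstream.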
Further, looking up the source fields of the current job comprises:
Scanning the current job code corresponding to the current job;
Parsing the input and output paths of the data contained in the current job code;
According to the input and output paths of the data contained in the current job code, finding, in the current job code or in the data warehouse corresponding to the current job code, the source data units corresponding to the current job code, that is, data units that only output data and have no data input;
Taking the fields contained in the source data units as the source fields.
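The lookup steps above can be sketched as follows. The sketch assumes, purely for illustration, that job code is SQL-like and that tables are written with `INSERT OVERWRITE TABLE` / `INSERT INTO TABLE` and read with `FROM` / `JOIN`; the regular expressions, the function names and the `schema` dictionary are hypothetical and far simpler than a real parser.

```python
import re

# Assumed, simplified patterns for the output and input paths of a job.
WRITE_RE = re.compile(r"\binsert\s+(?:overwrite|into)\s+table\s+(\w+)", re.IGNORECASE)
READ_RE = re.compile(r"\b(?:from|join)\s+(\w+)", re.IGNORECASE)


def parse_io(job_code):
    """Return (tables read, tables written) for one piece of job code."""
    return set(READ_RE.findall(job_code)), set(WRITE_RE.findall(job_code))


def find_source_fields(job_codes, schema):
    """Source tables are read by some job but written by none;
    their columns are taken as the source fields."""
    read, written = set(), set()
    for code in job_codes:
        r, w = parse_io(code)
        read |= r
        written |= w
    source_tables = read - written
    return {f"{t}.{c}" for t in source_tables for c in schema.get(t, [])}


jobs = [
    "INSERT OVERWRITE TABLE daily SELECT user_id, SUM(price) FROM orders GROUP BY user_id",
    "INSERT OVERWRITE TABLE report SELECT d.user_id FROM daily d JOIN users u ON d.user_id = u.id",
]
schema = {"orders": ["user_id", "price"], "users": ["id"]}
source_fields = find_source_fields(jobs, schema)
```

In this example `daily` is both written and read, so it is not a source table; `orders` and `users` are.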
Further, obtaining the current job of the user also comprises:
Obtaining the optimization timing information corresponding to the current job, wherein the optimization timing information is either static optimization or dynamic optimization;
When the optimization timing information is static optimization, after the current job code corresponding to the current job has been optimized into optimal job code that yields the optimal normalized expression and the optimal field computation cost of each field in the current job, the method also comprises:
Returning the optimal job code to the user;
When the user confirms that the optimal job code should replace the current job code, replacing the current job code with the optimal job code;
When the optimization timing information is dynamic optimization, after the current job code corresponding to the current job has been optimized into optimal job code that yields the optimal normalized expression and the optimal field computation cost of each field in the current job, the method also comprises:
Replacing the current job code with the optimal job code.
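The difference between the two timing modes can be sketched as below. The `Timing` enum, the function name and the `confirm` callback are hypothetical names introduced only for illustration; the application does not specify how user confirmation is collected.

```python
from enum import Enum


class Timing(Enum):
    STATIC = "static"    # optimal code is shown to the user, who decides
    DYNAMIC = "dynamic"  # optimal code replaces the current code directly


def apply_optimized_code(current_code, optimal_code, timing, confirm=None):
    """Return the job code that should be in effect after optimization."""
    if timing is Timing.DYNAMIC:
        return optimal_code
    # Static optimization: replace only if the user confirms.
    if confirm is not None and confirm(optimal_code):
        return optimal_code
    return current_code
```

For example, `apply_optimized_code(old, new, Timing.STATIC, confirm=lambda code: False)` keeps the current code, whereas the dynamic mode never consults the callback.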
Further, obtaining the current job of the user also comprises:
Obtaining the optimization target information corresponding to the current job, wherein the optimization target information is either incremental optimization or full optimization;
When the optimization target information is incremental optimization, judging whether the data warehouse corresponding to the current job code contains an associated job related to the current job;
If the data warehouse contains an associated job related to the current job, judging whether each field in the associated job has a corresponding normalized expression and field computation cost;
If each field in the associated job has a corresponding normalized expression and field computation cost, performing the step of looking up the source fields of the current job;
When the optimization target information is full optimization, judging whether the data warehouse corresponding to the current job code contains an associated job related to the current job;
If the data warehouse contains an associated job related to the current job, judging whether the associated job has already been optimized;
If the associated job has already been optimized, performing the step of looking up the source fields of the current job.
Further, when the optimization target information is incremental optimization, after judging whether the data warehouse corresponding to the current job code contains an associated job related to the current job, the method also comprises:
If the data warehouse contains no associated job related to the current job, performing the step of looking up the source fields of the current job.
Further, when the optimization target information is incremental optimization, after judging whether each field in the associated job has a corresponding normalized expression and field computation cost, the method also comprises:
If the fields in the associated job do not have corresponding normalized expressions and field computation costs, looking up the source fields of the associated job;
Taking the data corresponding to the source fields of the associated job as the starting point and the field as the granularity, establishing the upstream-downstream relationships between all data corresponding to all fields in the associated job;
Computing the normalized expression of each field in the associated job according to the upstream-downstream relationships between all data corresponding to all fields in the associated job;
Computing the field computation cost of each field in the associated job according to the normalized expression of each field in the associated job, and then performing the step of looking up the source fields of the current job.
Further, when the optimization target information is full optimization, after judging whether the data warehouse corresponding to the current job code contains an associated job related to the current job, the method also comprises:
If the data warehouse contains no associated job related to the current job, performing the step of looking up the source fields of the current job.
Further, when the optimization target information is full optimization, after judging whether the associated job has already been optimized, the method also comprises:
If the associated job has not been optimized, judging whether each field in the associated job has a corresponding normalized expression and field computation cost;
If the fields in the associated job do not have corresponding normalized expressions and field computation costs, looking up the source fields of the associated job;
Taking the data corresponding to the source fields of the associated job as the starting point and the field as the granularity, establishing the upstream-downstream relationships between all data corresponding to all fields in the associated job;
Computing the normalized expression of each field in the associated job according to the upstream-downstream relationships between all data corresponding to all fields in the associated job;
Computing the field computation cost of each field in the associated job according to the normalized expression of each field in the associated job;
Optimizing each field in the associated job by using the normalized expression of each field in the associated job and according to the field computation cost of each field in the associated job, to obtain the optimal normalized expression and the optimal field computation cost of each field in the associated job;
Optimizing the associated job code corresponding to the associated job into optimal job code that yields the optimal normalized expression and the optimal field computation cost of each field in the associated job, and then performing the step of looking up the source fields of the current job.
Further, when the optimization target information is full optimization, after judging whether each field in the associated job has a corresponding normalized expression and field computation cost, the method also comprises:
If each field in the associated job has a corresponding normalized expression and field computation cost, optimizing each field in the associated job by using the normalized expression of each field in the associated job and according to the field computation cost of each field in the associated job, to obtain the optimal normalized expression and the optimal field computation cost of each field in the associated job;
Optimizing the associated job code corresponding to the associated job into optimal job code that yields the optimal normalized expression and the optimal field computation cost of each field in the associated job, and then performing the step of looking up the source fields of the current job.
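The branching described for incremental and full optimization can be summarized in a short sketch. The `Job` dataclass, its two flags and the step strings are hypothetical; `analyzed` stands for "every field has a normalized expression and a field computation cost", and the actual analysis and optimization work is reduced to setting a flag.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    analyzed: bool = False   # fields have normalized expressions and costs
    optimized: bool = False  # job code has already been optimized


def prepare_associated_jobs(mode, associated_jobs):
    """Return the ordered steps taken before the current job is processed."""
    steps = []
    for job in associated_jobs:
        if mode == "incremental":
            # Incremental: associated jobs only need expressions and costs.
            if not job.analyzed:
                job.analyzed = True
                steps.append(f"analyze {job.name}")
        elif mode == "full":
            # Full: associated jobs must themselves be optimized.
            if job.optimized:
                continue
            if not job.analyzed:
                job.analyzed = True
                steps.append(f"analyze {job.name}")
            job.optimized = True
            steps.append(f"optimize {job.name}")
        else:
            raise ValueError(f"unknown optimization target: {mode}")
    # Every branch ends with the source-field lookup of the current job.
    steps.append("look up source fields of current job")
    return steps
```

When there is no associated job, both modes go straight to the source-field lookup, which matches the "no associated job" branches above.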
To solve the above problem, the application also discloses an optimization device for an open data processing service, the device comprising:
A processing module, configured to obtain the current job of a user and look up the source fields of the current job;
An establishing module, configured to take the data corresponding to the source fields of the current job as the starting point and the field as the granularity, and establish the upstream-downstream relationships between all data corresponding to all fields in the current job, wherein, when the upstream-downstream relationships between data are established, second data that serves, directly or indirectly, as one of the inputs for computing first data is called upstream of the first data, and the first data is called downstream of the second data; and when the first data is the only downstream data of the second data and the second data is the only upstream data of the first data, the second data is called the direct upstream of the first data and the first data is called the direct downstream of the second data;
A first computing module, configured to compute the normalized expression of each field in the current job according to the upstream-downstream relationships between all data corresponding to all fields in the current job;
A second computing module, configured to compute the field computation cost of each field in the current job according to the normalized expression of each field in the current job;
A first optimization module, configured to optimize each field in the current job by using the normalized expression of each field in the current job and according to the field computation cost of each field in the current job, to obtain the optimal normalized expression and the optimal field computation cost of each field in the current job;
A second optimization module, configured to optimize the current job code corresponding to the current job into optimal job code that yields the optimal normalized expression and the optimal field computation cost.
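The text up to this point does not spell out how a normalized expression or a field computation cost is computed, so the following is only one assumed interpretation of what the two computing modules and the first optimization module might do. Each derived field is described by a hypothetical `(operator, inputs)` pair; normalization expands a field down to source fields and sorts the arguments of commutative operators; cost is simply the number of operator nodes; and a field whose normalized expression matches an already existing field is mapped to that field for reuse.

```python
COMMUTATIVE = {"add", "mul"}  # assumed set of order-insensitive operators


def normalize(field, defs):
    """Expand `field` into a canonical expression over source fields.

    `defs` maps a derived field to (operator, [input fields]);
    fields missing from `defs` are treated as source fields.
    """
    if field not in defs:
        return field
    op, args = defs[field]
    parts = [normalize(a, defs) for a in args]
    if op in COMMUTATIVE:
        parts.sort(key=repr)  # canonical argument order
    return (op, *parts)


def cost(expr):
    """Assumed cost model: one unit per operator node."""
    if isinstance(expr, str):
        return 0
    return 1 + sum(cost(p) for p in expr[1:])


def reuse_plan(current_fields, existing_fields, defs):
    """Map each current field to an existing field with the same
    normalized expression, so that it need not be recomputed."""
    by_expr = {normalize(f, defs): f for f in existing_fields}
    plan = {}
    for f in current_fields:
        expr = normalize(f, defs)
        if expr in by_expr:
            plan[f] = by_expr[expr]
    return plan


defs = {
    "a.total": ("mul", ["orders.price", "orders.qty"]),
    "b.amount": ("mul", ["orders.qty", "orders.price"]),
    "b.net": ("sub", ["b.amount", "orders.discount"]),
}
plan = reuse_plan(["b.amount", "b.net"], ["a.total"], defs)
```

In this example `b.amount` and `a.total` normalize to the same expression even though their arguments are written in a different order, so `b.amount` can reuse `a.total`; a real cost model would presumably also account for data volume.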
Further, the processing module comprises:
A scanning unit, configured to scan the current job code corresponding to the current job;
A parsing unit, configured to parse the input and output paths of the data contained in the current job code;
A first lookup unit, configured to find, according to the input and output paths of the data contained in the current job code, in the current job code or in the data warehouse corresponding to the current job code, the source data units corresponding to the current job code, that is, data units that only output data and have no data input;
A designating unit, configured to take the fields contained in the source data units as the source fields.
Further, the processing module also comprises:
A first acquiring unit, configured to obtain the optimization timing information corresponding to the current job, wherein the optimization timing information is either static optimization or dynamic optimization;
When the optimization timing information is static optimization, the second optimization module also comprises:
A returning unit, configured to return the optimal job code to the user after the current job code corresponding to the current job has been optimized into optimal job code that yields the optimal normalized expression and the optimal field computation cost of each field in the current job;
A first replacement unit, configured to replace the current job code with the optimal job code when the user confirms that the optimal job code should replace the current job code;
When the optimization timing information is dynamic optimization, the second optimization module also comprises:
A second replacement unit, configured to replace the current job code with the optimal job code.
Further, the processing module also comprises:
A second acquiring unit, configured to obtain the optimization target information corresponding to the current job, wherein the optimization target information is either incremental optimization or full optimization;
A first judging unit, configured to judge, when the optimization target information is incremental optimization, whether the data warehouse corresponding to the current job code contains an associated job related to the current job;
A second judging unit, configured to judge, if the data warehouse contains an associated job related to the current job, whether each field in the associated job has a corresponding normalized expression and field computation cost;
A first notification unit, configured to notify the processing module to perform the step of looking up the source fields of the current job if each field in the associated job has a corresponding normalized expression and field computation cost;
A third judging unit, configured to judge, when the optimization target information is full optimization, whether the data warehouse corresponding to the current job code contains an associated job related to the current job;
A fourth judging unit, configured to judge, if the data warehouse contains an associated job related to the current job, whether the associated job has already been optimized;
A second notification unit, configured to notify the processing module to perform the step of looking up the source fields of the current job if the associated job has already been optimized.
Further, when the optimization target information is incremental optimization, the processing module also comprises:
A third notification unit, configured to notify the processing module to perform the step of looking up the source fields of the current job if the data warehouse contains no associated job related to the current job.
Further, when the optimization target information is incremental optimization, the processing module also comprises:
A second lookup unit, configured to look up the source fields of the associated job if the fields in the associated job do not have corresponding normalized expressions and field computation costs;
A first establishing unit, configured to take the data corresponding to the source fields of the associated job as the starting point and the field as the granularity, and establish the upstream-downstream relationships between all data corresponding to all fields in the associated job;
A first computing unit, configured to compute the normalized expression of each field in the associated job according to the upstream-downstream relationships between all data corresponding to all fields in the associated job;
A second computing unit, configured to compute the field computation cost of each field in the associated job according to the normalized expression of each field in the associated job, and then notify the processing module to perform the step of looking up the source fields of the current job.
Further, when the optimization target information is full optimization, the processing module also comprises:
A fourth notification unit, configured to notify the processing module to perform the step of looking up the source fields of the current job if the data warehouse contains no associated job related to the current job.
Further, when the optimization target information is full optimization, the processing module also comprises:
A fifth judging unit, configured to judge, if the associated job has not been optimized, whether each field in the associated job has a corresponding normalized expression and field computation cost;
A third lookup unit, configured to look up the source fields of the associated job if the fields in the associated job do not have corresponding normalized expressions and field computation costs;
A second establishing unit, configured to take the data corresponding to the source fields of the associated job as the starting point and the field as the granularity, and establish the upstream-downstream relationships between all data corresponding to all fields in the associated job;
A third computing unit, configured to compute the normalized expression of each field in the associated job according to the upstream-downstream relationships between all data corresponding to all fields in the associated job;
A fourth computing unit, configured to compute the field computation cost of each field in the associated job according to the normalized expression of each field in the associated job;
A first optimization unit, configured to optimize each field in the associated job by using the normalized expression of each field in the associated job and according to the field computation cost of each field in the associated job, to obtain the optimal normalized expression and the optimal field computation cost of each field in the associated job;
A second optimization unit, configured to optimize the associated job code corresponding to the associated job into optimal job code that yields the optimal normalized expression and the optimal field computation cost of each field in the associated job, and then notify the processing module to perform the step of looking up the source fields of the current job.
Further, when the optimization target information is full optimization, the processing module also comprises:
A third optimization unit, configured to, if each field in the associated job has a corresponding normalized expression and field computation cost, optimize each field in the associated job by using the normalized expression of each field in the associated job and according to the field computation cost of each field in the associated job, to obtain the optimal normalized expression and the optimal field computation cost of each field in the associated job;
A fourth optimization unit, configured to optimize the associated job code corresponding to the associated job into optimal job code that yields the optimal normalized expression and the optimal field computation cost of each field in the associated job, and then notify the processing module to perform the step of looking up the source fields of the current job.
Compared with the prior art, the application can achieve the following technical effects:
No understanding of the business is required, and the optimization is offered as a service by the platform provider. The user does not need to assign dedicated experts to the optimization; the user only needs to click to select the service. Global optimization is performed automatically, and the cost of optimization is far below that of training or hiring one expert or a team of experts. The optimization method is iterative and runs automatically, which avoids large-scale periodic refactoring and saves the user the substantial labor costs and the time and opportunity costs that such refactoring entails.
Of course, a product implementing the application does not necessarily need to achieve all of the above technical effects at the same time.
Brief description of the drawings
The drawings described here provide a further understanding of the application and form a part of it. The illustrative embodiments of the application and their description serve to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of a first optimization method for an open data processing service according to an embodiment of the application;
Fig. 2 is a flowchart of a second optimization method for an open data processing service according to an embodiment of the application;
Fig. 3 is a flowchart of a third optimization method for an open data processing service according to an embodiment of the application;
Fig. 4 is a schematic diagram of an application of an optimization method for an open data processing service according to an embodiment of the application;
Fig. 5 is a structural schematic diagram of an optimization device for an open data processing service according to an embodiment of the application.
Detailed description
Embodiments of the application are described in detail below with reference to the drawings and examples, so that the way the application uses technical means to solve the technical problem and achieve the technical effects can be fully understood and implemented.
In a typical configuration, a computing device comprises one or more processors (CPUs), input/output interfaces, network interfaces and memory.
Memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined here, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
embodiment describes
Be described further with the realization of an embodiment to the application's method below.As shown in Figure 1, be a kind of optimization method process flow diagram for open type data process service of the embodiment of the present application, the method comprises:
S101: Obtain the current work of the user and look up the source field of the current work.
The user may be a developer, a modeling expert, or the like at a large, medium-sized, or small enterprise. A work (job) comprises the code that the user writes to solve an enterprise problem or the like, together with some data; the same applies elsewhere and is not repeated.
A source field is a data field that has no upstream within the set of all data corresponding to all jobs associated with the current work. The purpose of looking up the source field of the current work is to find a starting point for all data in the data warehouse that corresponds to all jobs associated with the current work. The search for source fields may end at the synchronization entry of the data warehouse or at the base data warehouse. The choice follows two principles: processing that takes place before an identified source field does not take part in global optimization; and the data corresponding to a source field must remain relatively stable, isolating changes in the upstream business data as far as possible. Based on these two principles, the base data warehouse is generally chosen as the end point of the search.
Specifically, in this embodiment, looking up the source field of the current work comprises:
Scanning the current work code corresponding to the current work;
Parsing the input and output paths of the data contained in the current work code;
According to the input and output paths of the data contained in the current work code, finding, from the current work code or from the data warehouse corresponding to the current work code, the source data unit corresponding to the current work code, that is, a data unit that has only data output and no data input;
Taking the fields contained in the source data unit as source fields.
The data contained in the current work code and the data in the data warehouse may be stored in forms such as tables. When the data is stored in tables, the source data unit that has only data output and no data input may be called a source data table.
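The lookup described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the input and output paths have already been parsed into table names, and the names `find_source_tables`, `source_fields`, `jobs`, and `schema` are hypothetical.

```python
def find_source_tables(jobs):
    """jobs maps a job name to (input_tables, output_tables).

    A source table is one that some job reads but no job writes,
    i.e. it only outputs data and receives none from the scanned jobs.
    """
    read, written = set(), set()
    for inputs, outputs in jobs.values():
        read.update(inputs)
        written.update(outputs)
    return read - written


def source_fields(source_tables, schema):
    """Every column of a source table is treated as a source field."""
    return sorted(f"{t}.{c}" for t in source_tables for c in schema[t])


# Hypothetical example: two jobs, two raw tables, two derived tables.
jobs = {
    "job_a": (["ods.orders"], ["dw.order_totals"]),
    "job_b": (["dw.order_totals", "ods.users"], ["dw.user_spend"]),
}
schema = {"ods.orders": ["id", "amount"], "ods.users": ["id"]}
```

Here `dw.order_totals` is written by `job_a`, so it is not a source table even though `job_b` reads it.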
S102: Taking the data corresponding to the source field of the current work as the starting point and the field as the granularity, establish the upstream-downstream relationships among all data corresponding to all fields in the current work.
When the upstream-downstream relationships between data are established, second data that serves, directly or indirectly, as one of the inputs for computing first data is called the upstream of the first data, and the first data is called the downstream of the second data. When the first data is the only downstream data of the second data and the second data is the only upstream data of the first data, the second data is called the direct upstream of the first data, and the first data is called the direct downstream of the second data.
Specifically, to establish these relationships, the current work code corresponding to the current work and the metadata may be taken as input, and, with the field as the granularity, the upstream-downstream relationships among all data corresponding to all fields in the current work are established. Furthermore, the upstream-downstream relationships may be established in the form of a graph, a table, or the like, and may be called data lineage or field-level lineage.
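A field-level lineage of this kind can be sketched as a small graph. The sketch below assumes the direct inputs of each field have already been extracted from the work code; `build_lineage`, `upstream_of`, and `field_inputs` are hypothetical names used only for illustration.

```python
from collections import defaultdict


def build_lineage(field_inputs):
    """field_inputs maps each field to the fields it is computed from directly.

    Returns a mapping from each field to its immediate consumers.
    """
    downstream = defaultdict(set)
    for field, inputs in field_inputs.items():
        for src in inputs:
            downstream[src].add(field)
    return downstream


def upstream_of(field, field_inputs):
    """All fields that are direct or indirect inputs of `field`."""
    seen, stack = set(), list(field_inputs.get(field, ()))
    while stack:
        f = stack.pop()
        if f not in seen:
            seen.add(f)
            stack.extend(field_inputs.get(f, ()))
    return seen


# Hypothetical fields: A is a source field, D depends on A and C.
field_inputs = {"A": [], "B": ["A"], "C": ["B"], "D": ["A", "C"]}
lineage = build_lineage(field_inputs)
```

Storing only the immediate edges and deriving indirect upstream fields by traversal keeps the graph small; a table of (upstream, downstream) pairs would serve equally well.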
S103: According to the upstream-downstream relationships among all data corresponding to all fields in the current work, calculate the normalization expression of each field in the current work.
A normalization expression is the expression by which a field in the current work is computed directly from the data corresponding to the source fields. The expression is normalized: as long as the computation logic is the same, the resulting normalization expressions should be the same. A normalization expression is composed of computing units, and a computing unit is the smallest unit in a normalization expression.
For example: the normalization expression of the source field A corresponding to the current work is A=A; the normalization expression of a first intermediate field B contained in the current work is B=A+1; and the normalization expression of a second intermediate field C contained in the current work is C=3(A+1).
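One way to obtain such expressions is to substitute intermediate fields until only source fields and constants remain, then put commutative operands in a canonical order so that equal logic yields equal expressions. The sketch below is an assumption about how this could be done, not the patent's algorithm; expressions are modeled as nested tuples and all names are hypothetical.

```python
COMMUTATIVE = {"+", "*"}  # operators whose operand order does not matter


def normalize(expr, definitions):
    """Expand `expr` down to source fields and constants.

    expr is a field name (str), a constant, or a tuple (op, arg, ...).
    definitions maps intermediate field names to their expressions.
    """
    if isinstance(expr, str):
        if expr in definitions:  # intermediate field: substitute its definition
            return normalize(definitions[expr], definitions)
        return expr              # source field
    if isinstance(expr, tuple):
        op, *args = expr
        args = [normalize(a, definitions) for a in args]
        if op in COMMUTATIVE:
            args.sort(key=repr)  # canonical operand order
        return (op, *args)
    return expr                  # constant


# B = A + 1, C = 3 * B, and C2 = (1 + A) * 3 written differently by another user.
definitions = {
    "B": ("+", "A", 1),
    "C": ("*", 3, "B"),
    "C2": ("*", ("+", 1, "A"), 3),
}
```

With these definitions, `C` and `C2` normalize to the same tuple, which is what allows duplicated logic to be detected later.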
S104: According to the normalization expression of each field in the current work, calculate the field calculation cost of each field in the current work.
The field calculation cost of a field in the current work is a calculation cost estimated from the normalization expression of that field; that is, the estimated cost that the data warehouse must expend to compute the field starting from the source fields along the calculation path defined by the normalization expression.
The field calculation cost may be calculated as follows:
(1) Modeling: build, for each computing unit, a model based on the input data volume;
(2) Using historical data as input, estimate the real cost of each computing unit with the model of that computing unit;
(3) Iteratively accumulate the costs of the computing units that compose the normalization expression to obtain the field calculation cost.
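The three steps above can be sketched as follows. The text does not specify the form of the model, so this sketch assumes a linear model (cost proportional to input rows) fitted by least squares through the origin; the function names and the sample history are hypothetical.

```python
def fit_unit_model(history):
    """Fit cost = k * rows from (rows, observed_cost) samples; returns k."""
    num = sum(rows * cost for rows, cost in history)
    den = sum(rows * rows for rows, _ in history)
    return num / den


def field_cost(expr, cost_per_row, rows):
    """Accumulate the cost of every computing unit in a normalization expression.

    Source fields and constants are assumed to cost nothing in this sketch.
    """
    if not isinstance(expr, tuple):
        return 0.0
    op, *args = expr
    own = cost_per_row[op] * rows
    return own + sum(field_cost(a, cost_per_row, rows) for a in args)


# Hypothetical historical measurements per computing unit.
history = {
    "+": [(100, 1.0), (200, 2.0)],
    "*": [(100, 2.0), (200, 4.0)],
}
cost_per_row = {op: fit_unit_model(samples) for op, samples in history.items()}
cost_c = field_cost(("*", 3, ("+", "A", 1)), cost_per_row, rows=1000)
```

A real cost model would likely account for data distribution, operator type, and cluster resources; the linear form is chosen here only to keep the accumulation step visible.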
S105: Using the normalization expression of each field in the current work, and according to the field calculation cost of each field in the current work, optimize each field in the current work to obtain the optimum normalization expression and the optimum field calculation cost of each field in the current work.
For example: the normalization expression of the source field A corresponding to the current work is A=A, the normalization expression of the first intermediate field B contained in the current work is B=A+1, and the normalization expression of the second intermediate field C contained in the current work is C=3(A+1). After optimization, the expression of A is still A=A and that of B is still B=A+1, while the expression of C becomes C=3B; that is, C reuses the result of B rather than recomputing A+1.
S106: Optimize the current work code corresponding to the current work into optimum operation code that yields the optimum normalization expression and the optimum field calculation cost of each field in the current work.
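The rewrite in this example amounts to replacing a subexpression by a field whose normalization expression is identical. A minimal sketch under that assumption follows; the tuple representation and the names `rewrite_with_reuse`, `count_units`, and `materialized` are hypothetical, and the number of computing units stands in for a real cost.

```python
def rewrite_with_reuse(expr, materialized):
    """Replace any subexpression equal to an already computed field by that field.

    materialized maps field names to their normalization expressions.
    The field being rewritten must not itself be listed in `materialized`.
    """
    for name, known in materialized.items():
        if expr == known:
            return name
    if isinstance(expr, tuple):
        op, *args = expr
        return (op, *(rewrite_with_reuse(a, materialized) for a in args))
    return expr


def count_units(expr):
    """Number of computing units in an expression (a crude cost proxy)."""
    if not isinstance(expr, tuple):
        return 0
    return 1 + sum(count_units(a) for a in expr[1:])


materialized = {"B": ("+", "A", 1)}   # B = A + 1 is already computed
c_before = ("*", 3, ("+", "A", 1))    # C = 3(A + 1)
c_after = rewrite_with_reuse(c_before, materialized)  # C = 3B
```

After the rewrite, `C` contains one computing unit instead of two, which mirrors the drop from C=3(A+1) to C=3B.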
Specifically, the optimization timing may also be offered to the user as a choice. That is, when the current work of the user is obtained in step S101, the method further comprises:
Obtaining the optimization timing information corresponding to the current work, wherein the optimization timing information comprises static optimization or dynamic optimization;
When the optimization timing information is static optimization, after the current work code corresponding to the current work is optimized into the optimum operation code that yields the optimum normalization expression and the optimum field calculation cost of each field in the current work, the method further comprises:
Returning the optimum operation code to the user;
When the user confirms that the optimum operation code is to replace the current work code, replacing the current work code with the optimum operation code;
When the optimization timing information is dynamic optimization, after the current work code corresponding to the current work is optimized into the optimum operation code that yields the optimum normalization expression and the optimum field calculation cost of each field in the current work, the method further comprises:
Replacing the current work code with the optimum operation code.
Specifically, the user may choose the optimization timing according to actual needs. As the above description shows, static optimization rewrites the work after the user submits it and resubmits it once the user has confirmed, so the user can see the optimized code. Dynamic optimization optimizes the work when it is scheduled and submits the optimized result to the scheduling system for actual execution, so the user sees only the pre-optimization code that the user wrote.
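The difference between the two timings can be sketched as a single function that returns both the code that runs and the code the user sees. This is only an illustration of the behaviour described above; `apply_optimization` and the `user_confirms` callback are hypothetical names.

```python
def apply_optimization(current_code, optimum_code, timing, user_confirms=None):
    """Return (code_to_run, code_visible_to_user) for the chosen timing."""
    if timing == "static":
        # The optimized code is returned to the user and used only if confirmed.
        if user_confirms is not None and user_confirms(optimum_code):
            return optimum_code, optimum_code
        return current_code, current_code
    if timing == "dynamic":
        # The optimized code is handed to the scheduler; the user keeps
        # seeing the code they wrote.
        return optimum_code, current_code
    raise ValueError(f"unknown optimization timing: {timing}")
```

For instance, `apply_optimization(old, new, "dynamic")` runs `new` while showing `old`, whereas in static mode a declined confirmation leaves `old` in both positions.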
Specifically, the optimization scope (object) may also be offered to the user as a choice. That is, when the current work of the user is obtained in step S101, the method further comprises:
Obtaining the optimization object information corresponding to the current work, wherein the optimization object information comprises incremental optimization or full optimization.
Incremental optimization optimizes only the currently submitted current work. Full optimization optimizes not only the currently submitted current work but also the associated jobs related to it. An associated job related to the current work is a job that is connected with the current work in data, in computation, or the like, for example a job submitted to solve the same problem.
Referring to Fig. 2, when the optimization object information is incremental optimization, after the optimization object information corresponding to the current work is obtained, the method specifically comprises:
S201: Judge whether the data warehouse corresponding to the current work code contains an associated job related to the current work. If it does, perform S202; otherwise, perform the step of looking up the source field of the current work in S101.
Specifically, if the data warehouse contains no associated job related to the current work, the current work may be the first job submitted by the user.
S202: Judge whether each field in the associated job has a corresponding normalization expression and field calculation cost. If so, perform the step of looking up the source field of the current work in S101; otherwise, perform S203.
Specifically, if, before globally optimizing the current work, the user did not choose to apply the method of this embodiment to globally optimize the associated job related to the current work, the system may be configured in either of two ways: not to calculate the normalization expression and field calculation cost of each field in the associated job after receiving the associated job; or to calculate them after receiving the associated job without globally optimizing the associated job.
When the user did not choose to globally optimize the associated job with the method of this embodiment, and the system is configured not to calculate the normalization expression and field calculation cost of each field in the associated job after receiving it, the judgment result of this step will be that the fields in the associated job do not have corresponding normalization expressions and field calculation costs.
S203: Look up the source field of the associated job.
S204: Taking the data corresponding to the source field of the associated job as the starting point and the field as the granularity, establish the upstream-downstream relationships among all data corresponding to all fields in the associated job.
S205: According to the upstream-downstream relationships among all data corresponding to all fields in the associated job, calculate the normalization expression of each field in the associated job.
S206: According to the normalization expression of each field in the associated job, calculate the field calculation cost of each field in the associated job, and then perform the step of looking up the source field of the current work in S101.
Referring to Fig. 3, when the optimization object information is full optimization, after the optimization object information corresponding to the current work is obtained, the method specifically comprises:
S301: Judge whether the data warehouse corresponding to the current work code contains an associated job related to the current work. If it does, perform S302; otherwise, perform the step of looking up the source field of the current work in S101.
S302: Judge whether the associated job has already been optimized. If it has, perform the step of looking up the source field of the current work in S101; otherwise, perform S303.
Whether the associated job has already been optimized means whether the associated job has been optimized with the global optimization method of this embodiment.
S303: Judge whether each field in the associated job has a corresponding normalization expression and field calculation cost. If not, perform S304; otherwise, perform S308.
S304: Look up the source field of the associated job.
S305: Taking the data corresponding to the source field of the associated job as the starting point and the field as the granularity, establish the upstream-downstream relationships among all data corresponding to all fields in the associated job.
S306: According to the upstream-downstream relationships among all data corresponding to all fields in the associated job, calculate the normalization expression of each field in the associated job.
S307: According to the normalization expression of each field in the associated job, calculate the field calculation cost of each field in the associated job.
S308: Using the normalization expression of each field in the associated job, and according to the field calculation cost of each field in the associated job, optimize each field in the associated job to obtain the optimum normalization expression and the optimum field calculation cost of each field in the associated job.
S309: Optimize the associated job code corresponding to the associated job into optimum operation code that yields the optimum normalization expression and the optimum field calculation cost of each field in the associated job, and then perform the step of looking up the source field of the current work in S101.
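The two decision flows of Fig. 2 and Fig. 3 can be summarized as a function that lists the steps executed for a given situation. This is a sketch of the control flow only; the function name and boolean parameters are hypothetical, and each step label stands for the corresponding step described above.

```python
def plan_steps(mode, has_associated_job, has_expr_and_cost=False, already_optimized=False):
    """Return the ordered step labels executed before and including S101."""
    if mode == "incremental":            # Fig. 2
        steps = ["S201"]
        if has_associated_job:
            steps.append("S202")
            if not has_expr_and_cost:
                # Compute lineage, expressions and costs for the associated job.
                steps += ["S203", "S204", "S205", "S206"]
    elif mode == "full":                 # Fig. 3
        steps = ["S301"]
        if has_associated_job:
            steps.append("S302")
            if not already_optimized:
                steps.append("S303")
                if not has_expr_and_cost:
                    steps += ["S304", "S305", "S306", "S307"]
                # The associated job itself is optimized in full mode.
                steps += ["S308", "S309"]
    else:
        raise ValueError(f"unknown optimization object: {mode}")
    return steps + ["S101"]              # every path ends by processing the current work
```

The only structural difference between the two modes is that full optimization adds the check in S302 and the optimization steps S308 and S309 for the associated job.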
It should be noted that a normalization field library may be established to store the normalization expression, optimum normalization expression, field calculation cost, and optimum field calculation cost of each field. Furthermore, to make fields easy to distinguish, a UUID may be assigned to each field so that the UUID uniquely identifies the field; when the data is in table form, the UUID may be composed of namespace + table name + column name. A global optimizer may also be established to perform the global optimization and to collect optimization information such as the optimization hit rate. The main optimization information includes, but is not limited to: cases in which some repeated fields are missed; cases in which a large number of repeated fields are missed; and model data that is never or seldom hit. Missed jobs may be collated and provided to experts for manual optimization, or used as input to an automated model optimization system. An optimization log may also be established to store the collected optimization information; based on the optimization information stored in the log, a data warehouse model can be built and provided to the user for reference.
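A normalization field library with hit-rate tracking might look like the sketch below. The storage layout, the dot-separated UUID format, and all class and method names are assumptions made for illustration; the text only states what is stored and that the UUID combines namespace, table name, and column name.

```python
from dataclasses import dataclass
from typing import Any, Optional


def field_uuid(namespace, table, column):
    """Identifier built from namespace + table name + column name (format assumed)."""
    return f"{namespace}.{table}.{column}"


@dataclass
class FieldRecord:
    expression: Any
    cost: float
    optimum_expression: Any = None
    optimum_cost: Optional[float] = None


class NormalizedFieldStore:
    """In-memory stand-in for the normalization field library."""

    def __init__(self):
        self._records = {}
        self.hits = 0
        self.misses = 0

    def put(self, uuid, record):
        self._records[uuid] = record

    def find_by_expression(self, expression):
        """Return the UUID of a stored field with the same expression, else None."""
        for uuid, record in self._records.items():
            if record.expression == expression:
                self.hits += 1
                return uuid
        self.misses += 1
        return None

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The hit and miss counters correspond to the optimization information the global optimizer is said to collect; a production store would index expressions rather than scan them.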
It should also be noted that existing optimization for open type data services is mainly statement-level optimization. The global optimization method of this embodiment and existing statement-level optimization are not mutually exclusive and do not form an either-or choice; they can be combined, and the global optimization method of this embodiment can complement existing statement-level optimization.
Fig. 4 is a schematic diagram of a specific application example of the method of this embodiment.
The optimization method of this embodiment can be applied to large-scale open type data processing services deployed in a public cloud or a private cloud. With this optimization method, considerable computing time and data storage cost can be saved for the users and/or the provider of the data processing service.
The optimization method for an open type data processing service described in this embodiment does not require an understanding of the business and is provided as a service by the platform service provider. The user does not need to dedicate experts to optimization; the user only needs to click to select the service, and global optimization is carried out automatically, at an optimization cost far lower than that of training and hiring one expert or a group of experts. The optimization method is iterative and runs automatically, avoiding large-scale periodic refactoring and thereby saving the user substantial labor cost and the opportunity cost of the time involved.
Fig. 5 is a structural diagram of an optimization device for an open type data processing service according to an embodiment of the application. The device comprises:
A processing module 401, for obtaining the current work of the user and looking up the source field of the current work;
An establishing module 402, for taking the data corresponding to the source field of the current work as the starting point and the field as the granularity, and establishing the upstream-downstream relationships among all data corresponding to all fields in the current work, wherein, when the upstream-downstream relationships between data are established, second data that serves, directly or indirectly, as one of the inputs for computing first data is called the upstream of the first data, and the first data is called the downstream of the second data; and when the first data is the only downstream data of the second data and the second data is the only upstream data of the first data, the second data is called the direct upstream of the first data, and the first data is called the direct downstream of the second data;
A first computing module 403, for calculating the normalization expression of each field in the current work according to the upstream-downstream relationships among all data corresponding to all fields in the current work;
A second computing module 404, for calculating the field calculation cost of each field in the current work according to the normalization expression of each field in the current work;
A first optimization module 405, for using the normalization expression of each field in the current work, and according to the field calculation cost of each field in the current work, optimizing each field in the current work to obtain the optimum normalization expression and the optimum field calculation cost of each field in the current work;
A second optimization module 406, for optimizing the current work code corresponding to the current work into optimum operation code that yields the optimum normalization expression and the optimum field calculation cost.
Preferably, the processing module 401 comprises:
A scanning unit, for scanning the current work code corresponding to the current work;
A parsing unit, for parsing the input and output paths of the data contained in the current work code;
A first lookup unit, for finding, according to the input and output paths of the data contained in the current work code, from the current work code or from the data warehouse corresponding to the current work code, the source data unit corresponding to the current work code that has only data output and no data input;
A designating unit, for taking the fields contained in the source data unit as source fields.
Preferably, the processing module 401 further comprises:
A first acquiring unit, for obtaining the optimization timing information corresponding to the current work, wherein the optimization timing information comprises static optimization or dynamic optimization;
When the optimization timing information is the static optimization, the second optimization module 406 further comprises:
A returning unit, for returning the optimum operation code to the user after the current work code corresponding to the current work is optimized into the optimum operation code that yields the optimum normalization expression and the optimum field calculation cost of each field in the current work;
A first replacing unit, for replacing the current work code with the optimum operation code when the user confirms that the optimum operation code is to replace the current work code;
When the optimization timing information is dynamic optimization, the second optimization module 406 further comprises:
A second replacing unit, for replacing the current work code with the optimum operation code.
Preferably, the processing module 401 further comprises:
A second acquiring unit, for obtaining the optimization object information corresponding to the current work, wherein the optimization object information comprises incremental optimization or full optimization;
A first judging unit, for judging, when the optimization object information is the incremental optimization, whether the data warehouse corresponding to the current work code contains an associated job related to the current work;
A second judging unit, for judging, if the data warehouse contains an associated job related to the current work, whether each field in the associated job has a corresponding normalization expression and field calculation cost;
A first notification unit, for notifying the processing module 401 to perform the step of looking up the source field of the current work if each field in the associated job has a corresponding normalization expression and field calculation cost;
A third judging unit, for judging, when the optimization object information is the full optimization, whether the data warehouse corresponding to the current work code contains an associated job related to the current work;
A fourth judging unit, for judging, if the data warehouse contains an associated job related to the current work, whether the associated job has already been optimized;
A second notification unit, for notifying the processing module 401 to perform the step of looking up the source field of the current work if the associated job has already been optimized.
Preferably, when the optimization object information is the incremental optimization, the processing module 401 further comprises:
A third notification unit, for notifying the processing module 401 to perform the step of looking up the source field of the current work if the data warehouse contains no associated job related to the current work.
Preferably, when the optimization object information is the incremental optimization, the processing module 401 further comprises:
A second lookup unit, for looking up the source field of the associated job if the fields in the associated job do not have corresponding normalization expressions and field calculation costs;
A first establishing unit, for taking the data corresponding to the source field of the associated job as the starting point and the field as the granularity, and establishing the upstream-downstream relationships among all data corresponding to all fields in the associated job;
A first computing unit, for calculating the normalization expression of each field in the associated job according to the upstream-downstream relationships among all data corresponding to all fields in the associated job;
A second computing unit, for calculating the field calculation cost of each field in the associated job according to the normalization expression of each field in the associated job, and then notifying the processing module 401 to perform the step of looking up the source field of the current work.
Preferably, when the optimization object information is the full optimization, the processing module 401 further comprises:
A fourth notification unit, for notifying the processing module 401 to perform the step of looking up the source field of the current work if the data warehouse contains no associated job related to the current work.
Preferably, when the optimization object information is the full optimization, the processing module 401 further comprises:
A fifth judging unit, for judging, if the associated job has not been optimized, whether each field in the associated job has a corresponding normalization expression and field calculation cost;
A third lookup unit, for looking up the source field of the associated job if the fields in the associated job do not have corresponding normalization expressions and field calculation costs;
A second establishing unit, for taking the data corresponding to the source field of the associated job as the starting point and the field as the granularity, and establishing the upstream-downstream relationships among all data corresponding to all fields in the associated job;
A third computing unit, for calculating the normalization expression of each field in the associated job according to the upstream-downstream relationships among all data corresponding to all fields in the associated job;
A fourth computing unit, for calculating the field calculation cost of each field in the associated job according to the normalization expression of each field in the associated job;
A first optimization unit, for using the normalization expression of each field in the associated job, and according to the field calculation cost of each field in the associated job, optimizing each field in the associated job to obtain the optimum normalization expression and the optimum field calculation cost of each field in the associated job;
A second optimization unit, for optimizing the associated job code corresponding to the associated job into optimum operation code that yields the optimum normalization expression and the optimum field calculation cost of each field in the associated job, and then notifying the processing module 401 to perform the step of looking up the source field of the current work.
Preferably, when the optimization object information is the full optimization, the processing module 401 further comprises:
A third optimization unit, for, if each field in the associated job has a corresponding normalization expression and field calculation cost, using the normalization expression of each field in the associated job, and according to the field calculation cost of each field in the associated job, optimizing each field in the associated job to obtain the optimum normalization expression and the optimum field calculation cost of each field in the associated job;
A fourth optimization unit, for optimizing the associated job code corresponding to the associated job into optimum operation code that yields the optimum normalization expression and the optimum field calculation cost of each field in the associated job, and then notifying the processing module 401 to perform the step of looking up the source field of the current work.
The optimization device for an open type data processing service described in this embodiment does not require an understanding of the business and is provided as a service by the platform service provider. The user does not need to dedicate experts to optimization; the user only needs to click to select the service, and global optimization is carried out automatically, at an optimization cost far lower than that of training and hiring one expert or a group of experts. The optimization is iterative and runs automatically, avoiding large-scale periodic refactoring and thereby saving the user substantial labor cost and the opportunity cost of the time involved.
The device corresponds to the method flow described above; for details not covered here, refer to the description of the method flow, which is not repeated.
The above description shows and describes several preferred embodiments of the application. As stated above, however, it should be understood that the application is not limited to the forms disclosed herein, which should not be regarded as excluding other embodiments; the application can be used in various other combinations, modifications, and environments, and can be modified within the scope of the inventive concept described herein through the above teachings or the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the application shall all fall within the protection scope of the appended claims of the application.

Claims (20)

1. An optimization method for an open type data processing service, characterized in that the method comprises:
Obtaining the current work of a user, and looking up the source field of the current work;
Taking the data corresponding to the source field of the current work as the starting point and the field as the granularity, establishing the upstream-downstream relationships among all data corresponding to all fields in the current work;
According to the upstream-downstream relationships among all data corresponding to all fields in the current work, calculating the normalization expression of each field in the current work;
According to the normalization expression of each field in the current work, calculating the field calculation cost of each field in the current work;
Using the normalization expression of each field in the current work, and according to the field calculation cost of each field in the current work, optimizing each field in the current work to obtain the optimum normalization expression and the optimum field calculation cost of each field in the current work;
Optimizing the current work code corresponding to the current work into optimum operation code that yields the optimum normalization expression and the optimum field calculation cost.
2. the method for claim 1, is characterized in that,
When setting up the upstream-downstream relationship between data, second data of one of the direct or indirect input as calculating first data are called the upstream of described first data, described first data are called the downstream of the second data, and when described first data are unique downstream datas of described second data, when described second data are unique upstream datas of described first data, described second data are called the direct upstream of described first data, described first data are called the direct downstream of described second data.
3. The method of claim 2, characterized in that looking up the source field of the current work comprises:
Scanning the current work code corresponding to the current work;
Parsing the input and output paths of the data contained in the current work code;
According to the input and output paths of the data contained in the current work code, finding, from the current work code or from the data warehouse corresponding to the current work code, the source data unit corresponding to the current work code that has only data output and no data input;
Taking the fields contained in the source data unit as source fields.
4. The method as claimed in claim 1 or 3, characterized in that, when obtaining the current job of the user, the method further comprises:
Obtaining the optimization timing information corresponding to the current job, wherein the optimization timing information comprises static optimization or dynamic optimization;
When the optimization timing information is static optimization, after the current job code corresponding to the current job is optimized into optimal job code that achieves the optimal normalized expression and optimal field calculation cost of each field in the current job, the method further comprises:
Returning the optimal job code to the user;
When the user confirms that the optimal job code is to be used in place of the current job code, replacing the current job code with the optimal job code;
When the optimization timing information is dynamic optimization, after the current job code corresponding to the current job is optimized into optimal job code that achieves the optimal normalized expression and optimal field calculation cost of each field in the current job, the method further comprises:
Replacing the current job code with the optimal job code.
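The difference between the two timing modes in claim 4 can be sketched as a small selection function. The `Timing` enum, the `user_confirms` callback, and the function name are hypothetical; the sketch assumes the optimal job code has already been produced.

```python
from enum import Enum


class Timing(Enum):
    STATIC = "static"
    DYNAMIC = "dynamic"


def select_job_code(timing, current_code, optimal_code, user_confirms=None):
    """Return the job code that should be in effect after optimization.

    Static optimization: the optimal code is shown to the user and replaces
    the current code only if the user confirms.
    Dynamic optimization: the optimal code replaces the current code directly.
    """
    if timing is Timing.DYNAMIC:
        return optimal_code
    if timing is Timing.STATIC:
        if user_confirms is None:
            raise ValueError("static optimization needs a confirmation callback")
        return optimal_code if user_confirms(optimal_code) else current_code
    raise ValueError(f"unknown timing: {timing!r}")
```

Passing the confirmation step as a callback keeps the sketch independent of how the optimal code is actually presented to the user.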
5. The method as claimed in claim 1 or 3, characterized in that, when obtaining the current job of the user, the method further comprises:
Obtaining the optimization target information corresponding to the current job, wherein the optimization target information comprises incremental optimization or full optimization;
When the optimization target information is incremental optimization, judging whether the data warehouse corresponding to the current job code contains an associated job related to the current job;
If the data warehouse contains an associated job related to the current job, judging whether each field in the associated job has a corresponding normalized expression and field calculation cost;
If each field in the associated job has a corresponding normalized expression and field calculation cost, performing the step of searching for the source field of the current job;
When the optimization target information is full optimization, judging whether the data warehouse corresponding to the current job code contains an associated job related to the current job;
If the data warehouse contains an associated job related to the current job, judging whether the associated job has already been optimized;
If the associated job has already been optimized, performing the step of searching for the source field of the current job.
6. The method as claimed in claim 5, characterized in that, when the optimization target information is incremental optimization, after judging whether the data warehouse corresponding to the current job code contains an associated job related to the current job, the method further comprises:
If the data warehouse does not contain an associated job related to the current job, performing the step of searching for the source field of the current job.
7. The method as claimed in claim 5, characterized in that, when the optimization target information is incremental optimization, after judging whether each field in the associated job has a corresponding normalized expression and field calculation cost, the method further comprises:
If the fields in the associated job do not have corresponding normalized expressions and field calculation costs, searching for the source field of the associated job;
Taking the data corresponding to the source field of the associated job as the starting point and the field as the granularity, establishing the upstream-downstream relationships between all data corresponding to all fields in the associated job;
According to the upstream-downstream relationships between all data corresponding to all fields in the associated job, calculating the normalized expression of each field in the associated job;
According to the normalized expression of each field in the associated job, calculating the field calculation cost of each field in the associated job, and then performing the step of searching for the source field of the current job.
8. The method as claimed in claim 5, characterized in that, when the optimization target information is full optimization, after judging whether the data warehouse corresponding to the current job code contains an associated job related to the current job, the method further comprises:
If the data warehouse does not contain an associated job related to the current job, performing the step of searching for the source field of the current job.
9. The method as claimed in claim 5, characterized in that, when the optimization target information is full optimization, after judging whether the associated job has already been optimized, the method further comprises:
If the associated job has not been optimized, judging whether each field in the associated job has a corresponding normalized expression and field calculation cost;
If the fields in the associated job do not have corresponding normalized expressions and field calculation costs, searching for the source field of the associated job;
Taking the data corresponding to the source field of the associated job as the starting point and the field as the granularity, establishing the upstream-downstream relationships between all data corresponding to all fields in the associated job;
According to the upstream-downstream relationships between all data corresponding to all fields in the associated job, calculating the normalized expression of each field in the associated job;
According to the normalized expression of each field in the associated job, calculating the field calculation cost of each field in the associated job;
Using the normalized expression of each field in the associated job, and according to the field calculation cost of each field in the associated job, optimizing each field in the associated job to obtain the optimal normalized expression and optimal field calculation cost of each field in the associated job;
Optimizing the associated job code corresponding to the associated job into optimal job code that achieves the optimal normalized expression and optimal field calculation cost of each field in the associated job, and then performing the step of searching for the source field of the current job.
10. The method as claimed in claim 9, characterized in that, when the optimization target information is full optimization, after judging whether each field in the associated job has a corresponding normalized expression and field calculation cost, the method further comprises:
If each field in the associated job has a corresponding normalized expression and field calculation cost, using the normalized expression of each field in the associated job, and according to the field calculation cost of each field in the associated job, optimizing each field in the associated job to obtain the optimal normalized expression and optimal field calculation cost of each field in the associated job;
Optimizing the associated job code corresponding to the associated job into optimal job code that achieves the optimal normalized expression and optimal field calculation cost of each field in the associated job, and then performing the step of searching for the source field of the current job.
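The branching described in claims 5 to 10 can be summarized in one decision routine. The sketch below only records which preparatory steps would run on the associated job before the source-field lookup of the current job begins; the `AssociatedJob` flags, the step labels, and the scope strings are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AssociatedJob:
    has_expressions_and_costs: bool = False  # every field already analyzed?
    optimized: bool = False                  # job code already optimized?


def prepare_associated_job(scope: str, job: Optional[AssociatedJob]) -> List[str]:
    """Return the ordered steps taken before optimizing the current job."""
    if scope not in ("incremental", "full"):
        raise ValueError(f"unknown optimization scope: {scope!r}")
    steps: List[str] = []
    if job is not None:
        if scope == "incremental":
            # Claim 7: only make sure expressions and costs exist.
            if not job.has_expressions_and_costs:
                steps.append("analyze_associated_job")
                job.has_expressions_and_costs = True
        elif not job.optimized:
            # Claims 9 and 10: analyze if needed, then optimize the job.
            if not job.has_expressions_and_costs:
                steps.append("analyze_associated_job")
                job.has_expressions_and_costs = True
            steps.append("optimize_associated_job")
            job.optimized = True
    # Claims 5, 6 and 8: every branch ends at the current job's source fields.
    steps.append("find_source_fields_of_current_job")
    return steps
```

For example, full optimization with an associated job that has never been analyzed yields the steps analyze, optimize, then the source-field lookup, whereas incremental optimization with no associated job goes straight to the lookup.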
11. An optimization device for an open type data processing service, characterized in that the device comprises:
Processing module, configured to obtain the current job of the user and search for the source field of the current job;
Establishing module, configured to, taking the data corresponding to the source field of the current job as the starting point and the field as the granularity, establish the upstream-downstream relationships between all data corresponding to all fields in the current job;
First computing module, configured to calculate the normalized expression of each field in the current job according to the upstream-downstream relationships between all data corresponding to all fields in the current job;
Second computing module, configured to calculate the field calculation cost of each field in the current job according to the normalized expression of each field in the current job;
First optimization module, configured to use the normalized expression of each field in the current job, and according to the field calculation cost of each field in the current job, optimize each field in the current job to obtain the optimal normalized expression and optimal field calculation cost of each field in the current job;
Second optimization module, configured to optimize the current job code corresponding to the current job into optimal job code that achieves the optimal normalized expression and the optimal field calculation cost.
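As a rough illustration of what the first and second computing modules produce, the sketch below expands each derived field into an expression over source fields only and assigns it a cost. The tuple encoding of expressions, the operand ordering used for canonical form, and the cost model (one unit per operator) are all assumptions made for this example and are not taken from the patent.

```python
COMMUTATIVE = {"add", "mul"}  # assumed set of order-insensitive operators


def normalize(expr, definitions):
    """Expand `expr` until it refers to source fields only.

    expr:        a field name, or a tuple (operator, operand, ...)
    definitions: derived field name -> expression defining it
    """
    if isinstance(expr, str):
        if expr in definitions:          # derived field: substitute upstream
            return normalize(definitions[expr], definitions)
        return expr                      # source field: nothing to expand
    op, *args = expr
    norm_args = [normalize(a, definitions) for a in args]
    if op in COMMUTATIVE:
        norm_args.sort(key=repr)         # canonical operand order
    return (op, *norm_args)


def cost(expr):
    """Assumed cost model: one unit per operator in the expression."""
    if isinstance(expr, str):
        return 0
    return 1 + sum(cost(a) for a in expr[1:])


# Hypothetical fields: total and total2 compute the same value.
definitions = {
    "total": ("add", "price", "tax"),
    "total2": ("add", "tax", "price"),
    "gross": ("mul", "total", "qty"),
}
same = normalize("total", definitions) == normalize("total2", definitions)
```

Fields whose normalized expressions are equal, such as `total` and `total2` here, are natural candidates for being computed once and reused, which is one plausible way an optimizer could lower the overall field calculation cost.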
12. The device as claimed in claim 11, characterized in that,
When establishing the upstream-downstream relationships between data, the establishing module calls second data that serves, directly or indirectly, as one of the inputs for calculating first data the upstream of the first data, and calls the first data the downstream of the second data; and when the first data is the only downstream data of the second data and the second data is the only upstream data of the first data, the second data is called the direct upstream of the first data, and the first data is called the direct downstream of the second data.
13. The device as claimed in claim 12, characterized in that the processing module comprises:
Scanning unit, configured to scan the current job code corresponding to the current job;
Parsing unit, configured to parse the input and output paths of the data contained in the current job code;
First searching unit, configured to find, according to the input and output paths of the data contained in the current job code, from the current job code or from the data warehouse corresponding to the current job code, the source data units that correspond to the current job code and that have only data output and no data input;
Designating unit, configured to take the fields contained in the source data units as the source fields.
14. The device as claimed in claim 11 or 13, characterized in that the processing module further comprises:
First acquiring unit, configured to obtain the optimization timing information corresponding to the current job, wherein the optimization timing information comprises static optimization or dynamic optimization;
When the optimization timing information is static optimization, the second optimization module further comprises:
Returning unit, configured to return the optimal job code to the user after the current job code corresponding to the current job is optimized into optimal job code that achieves the optimal normalized expression and optimal field calculation cost of each field in the current job;
First replacing unit, configured to replace the current job code with the optimal job code when the user confirms that the optimal job code is to be used in place of the current job code;
When the optimization timing information is dynamic optimization, the second optimization module further comprises:
Second replacing unit, configured to replace the current job code with the optimal job code.
15. The device as claimed in claim 11 or 13, characterized in that the processing module further comprises:
Second acquiring unit, configured to obtain the optimization target information corresponding to the current job, wherein the optimization target information comprises incremental optimization or full optimization;
First judging unit, configured to judge, when the optimization target information is incremental optimization, whether the data warehouse corresponding to the current job code contains an associated job related to the current job;
Second judging unit, configured to judge, if the data warehouse contains an associated job related to the current job, whether each field in the associated job has a corresponding normalized expression and field calculation cost;
First notification unit, configured to notify the processing module to perform the step of searching for the source field of the current job if each field in the associated job has a corresponding normalized expression and field calculation cost;
Third judging unit, configured to judge, when the optimization target information is full optimization, whether the data warehouse corresponding to the current job code contains an associated job related to the current job;
Fourth judging unit, configured to judge, if the data warehouse contains an associated job related to the current job, whether the associated job has already been optimized;
Second notification unit, configured to notify the processing module to perform the step of searching for the source field of the current job if the associated job has already been optimized.
16. The device as claimed in claim 15, characterized in that, when the optimization target information is incremental optimization, the processing module further comprises:
Third notification unit, configured to notify the processing module to perform the step of searching for the source field of the current job if the data warehouse does not contain an associated job related to the current job.
17. The device as claimed in claim 15, characterized in that, when the optimization target information is incremental optimization, the processing module further comprises:
Second searching unit, configured to search for the source field of the associated job if the fields in the associated job do not have corresponding normalized expressions and field calculation costs;
First establishing unit, configured to, taking the data corresponding to the source field of the associated job as the starting point and the field as the granularity, establish the upstream-downstream relationships between all data corresponding to all fields in the associated job;
First computing unit, configured to calculate the normalized expression of each field in the associated job according to the upstream-downstream relationships between all data corresponding to all fields in the associated job;
Second computing unit, configured to calculate the field calculation cost of each field in the associated job according to the normalized expression of each field in the associated job, and then notify the processing module to perform the step of searching for the source field of the current job.
18. The device as claimed in claim 15, characterized in that, when the optimization target information is full optimization, the processing module further comprises:
Fourth notification unit, configured to notify the processing module to perform the step of searching for the source field of the current job if the data warehouse does not contain an associated job related to the current job.
19. The device as claimed in claim 15, characterized in that, when the optimization target information is full optimization, the processing module further comprises:
Fifth judging unit, configured to judge, if the associated job has not been optimized, whether each field in the associated job has a corresponding normalized expression and field calculation cost;
Third searching unit, configured to search for the source field of the associated job if the fields in the associated job do not have corresponding normalized expressions and field calculation costs;
Second establishing unit, configured to, taking the data corresponding to the source field of the associated job as the starting point and the field as the granularity, establish the upstream-downstream relationships between all data corresponding to all fields in the associated job;
Third computing unit, configured to calculate the normalized expression of each field in the associated job according to the upstream-downstream relationships between all data corresponding to all fields in the associated job;
Fourth computing unit, configured to calculate the field calculation cost of each field in the associated job according to the normalized expression of each field in the associated job;
First optimization unit, configured to use the normalized expression of each field in the associated job, and according to the field calculation cost of each field in the associated job, optimize each field in the associated job to obtain the optimal normalized expression and optimal field calculation cost of each field in the associated job;
Second optimization unit, configured to optimize the associated job code corresponding to the associated job into optimal job code that achieves the optimal normalized expression and optimal field calculation cost of each field in the associated job, and then notify the processing module to perform the step of searching for the source field of the current job.
20. The device as claimed in claim 19, characterized in that, when the optimization target information is full optimization, the processing module further comprises:
Third optimization unit, configured to, if each field in the associated job has a corresponding normalized expression and field calculation cost, use the normalized expression of each field in the associated job, and according to the field calculation cost of each field in the associated job, optimize each field in the associated job to obtain the optimal normalized expression and optimal field calculation cost of each field in the associated job;
Fourth optimization unit, configured to optimize the associated job code corresponding to the associated job into optimal job code that achieves the optimal normalized expression and optimal field calculation cost of each field in the associated job, and then notify the processing module to perform the step of searching for the source field of the current job.
CN201410078866.4A 2014-03-05 2014-03-05 Optimization method and device for open type data processing service Active CN104899209B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410078866.4A CN104899209B (en) 2014-03-05 2014-03-05 Optimization method and device for open type data processing service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410078866.4A CN104899209B (en) 2014-03-05 2014-03-05 Optimization method and device for open type data processing service

Publications (2)

Publication Number Publication Date
CN104899209A true CN104899209A (en) 2015-09-09
CN104899209B CN104899209B (en) 2018-05-18

Family

ID=54031877

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410078866.4A Active CN104899209B (en) 2014-03-05 2014-03-05 Optimization method and device for open type data processing service

Country Status (1)

Country Link
CN (1) CN104899209B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017124959A1 (en) * 2016-01-21 2017-07-27 阿里巴巴集团控股有限公司 Method and device for use in analyzing data table
CN109324800A (en) * 2018-10-25 2019-02-12 金蝶软件(中国)有限公司 It is a kind of avoid endless loop calculate optimization method and optimization device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101408900A (en) * 2008-11-24 2009-04-15 中国科学院地理科学与资源研究所 Distributed space data enquiring and optimizing method under gridding calculation environment
CN101677286A (en) * 2008-09-19 2010-03-24 中国电信股份有限公司 Optimization method of carrier network
CN102081678A (en) * 2011-03-14 2011-06-01 华中科技大学 Method for searching optimal execution plan in database query
US20110289119A1 (en) * 2010-05-20 2011-11-24 Sybase, Inc. Methods and systems for monitoring server cloud topology and resources
CN103186406A (en) * 2011-12-30 2013-07-03 国际商业机器公司 Method and device for control flow analysis
CN103488609A (en) * 2013-09-03 2014-01-01 南京国电南自美卓控制系统有限公司 Compound expression intelligent analytic method based on operator variable recursion recognition technology

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017124959A1 (en) * 2016-01-21 2017-07-27 阿里巴巴集团控股有限公司 Method and device for use in analyzing data table
CN106991101A (en) * 2016-01-21 2017-07-28 阿里巴巴集团控股有限公司 A kind of method and apparatus of spreadsheet analysis processing
US10909481B2 (en) 2016-01-21 2021-02-02 Alibaba Group Holding Limited Method and apparatus for analyzing data table
CN106991101B (en) * 2016-01-21 2021-02-02 阿里巴巴集团控股有限公司 Data table analysis processing method and device
CN109324800A (en) * 2018-10-25 2019-02-12 金蝶软件(中国)有限公司 It is a kind of avoid endless loop calculate optimization method and optimization device

Also Published As

Publication number Publication date
CN104899209B (en) 2018-05-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211108

Address after: Room 507, floor 5, building 3, No. 969, Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Patentee after: ZHEJIANG TMALL TECHNOLOGY Co.,Ltd.

Address before: Fourth floor, P.O. Box 847, Capital Building, Grand Cayman, British Cayman Islands

Patentee before: ALIBABA GROUP HOLDING Ltd.