CN106708965A - Data processing method and apparatus - Google Patents


Info

Publication number
CN106708965A
Authority
CN
China
Prior art keywords
data
data processing
task
model
processing task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611090820.XA
Other languages
Chinese (zh)
Inventor
李铮
侯怀锋
高飞龙
郑超平
张超
郑扬
张娟娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201611090820.XA priority Critical patent/CN106708965A/en
Publication of CN106708965A publication Critical patent/CN106708965A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method and apparatus. The method comprises the steps of receiving a data processing task; according to the data processing task, determining a data processing model which finishes the data processing task, and reading to-be-processed data from a corresponding data source; and performing data processing on the to-be-processed data by utilizing the determined data processing model, thereby obtaining a data processing result. According to the technical scheme, personnel who do not understand code development can perform data processing on target data by utilizing the available data processing model, and do not need to write codes for each data processing task, so that the data processing efficiency is greatly improved.

Description

Data processing method and apparatus
Technical field
The present invention relates to the field of computer technology, and in particular to a data processing method and apparatus.
Background
In the prior art, performing statistical computation on business data and generating statistical reports is often very complex and requires technical staff to write code, so it is time-consuming and laborious when many statistical reports must be generated. In short, processing target data that is large in volume and complex to compute is often not simple to implement.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a data processing method and apparatus that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a data processing method is provided, comprising:
receiving a data processing task;
determining, according to the data processing task, a data processing model for completing the data processing task, and reading to-be-processed data from a corresponding data source;
performing data processing on the to-be-processed data using the determined data processing model to obtain a data processing result.
Optionally, receiving the data processing task comprises:
receiving a data processing task submitted through a front-end page, the data processing task including at least an input address;
reading the to-be-processed data from the corresponding data source comprises: reading the to-be-processed data from the input address.
Optionally, the to-be-processed data is formatted log data of a service, obtained by parsing the source log data of the service specified by the user.
Optionally, the data processing task includes a specified data processing model;
determining, according to the data processing task, the data processing model for completing the data processing task comprises: selecting the specified data processing model from a data processing model library.
Optionally, the data processing model library includes at least one of the following data processing models:
a new-user statistics model;
an active-user statistics model;
a retention statistics model.
Optionally, the data processing task further includes parameter information for the specified data processing model;
performing data processing on the to-be-processed data using the determined data processing model comprises: configuring the specified data processing model with the parameter information, and then performing data processing on the to-be-processed data using the configured data processing model.
Optionally, the specified data processing model is a streaming model;
the data processing task further includes at least one custom code snippet;
the parameter information includes the correspondence between each code snippet and a logical block of the streaming model.
Optionally, the data processing task includes the address of a custom data processing model;
determining, according to the data processing task, the data processing model for completing the data processing task comprises: reading the custom data processing model from the address.
Optionally, the method further comprises:
saving the custom data processing model to the data processing model library.
Optionally, the data processing task further includes an output address for the data processing result;
the method further comprises: outputting the data processing result to the output address.
According to another aspect of the present invention, a data processing apparatus is provided, comprising:
a receiving unit, adapted to receive a data processing task;
a preprocessing unit, adapted to determine, according to the data processing task, a data processing model for completing the data processing task, and to read to-be-processed data from a corresponding data source;
a data processing unit, adapted to perform data processing on the to-be-processed data using the determined data processing model to obtain a data processing result.
Optionally, the receiving unit is adapted to receive a data processing task submitted through a front-end page, the data processing task including at least an input address;
the preprocessing unit is adapted to read the to-be-processed data from the input address.
Optionally, the to-be-processed data is formatted log data of a service, obtained by parsing the source log data of the service specified by the user.
Optionally, the data processing task includes a specified data processing model;
the preprocessing unit is adapted to select the specified data processing model from a data processing model library.
Optionally, the data processing model library includes at least one of the following data processing models:
a new-user statistics model;
an active-user statistics model;
a retention statistics model.
Optionally, the data processing task further includes parameter information for the specified data processing model;
the data processing unit is adapted to configure the specified data processing model with the parameter information, and then to perform data processing on the to-be-processed data using the configured data processing model.
Optionally, the specified data processing model is a streaming model;
the data processing task further includes at least one custom code snippet;
the parameter information includes the correspondence between each code snippet and a logical block of the streaming model.
Optionally, the data processing task includes the address of a custom data processing model;
the preprocessing unit is adapted to read the custom data processing model from the address.
Optionally, the preprocessing unit is further adapted to save the custom data processing model to the data processing model library.
Optionally, the data processing task further includes an output address for the data processing result;
the data processing unit is further adapted to output the data processing result to the output address.
As can be seen from the above, the technical solution of the present invention determines, according to the received data processing task, a data processing model for completing that task, reads the to-be-processed data from the corresponding data source, and performs data processing on the to-be-processed data using the determined data processing model to obtain a data processing result. This technical solution allows people who do not understand code development to process target data using the available data processing models, without writing code anew for each data processing task, which greatly improves the efficiency of data processing.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and practiced according to the content of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a schematic diagram of the workflow of a log data platform;
Fig. 2 shows a schematic flowchart of a data processing method according to an embodiment of the present invention;
Fig. 3 shows a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope conveyed fully to those skilled in the art.
The technical solution of the present invention can be applied to a log data platform; Fig. 1 shows the workflow of such a platform. As shown in Fig. 1, the log data platform processes log data through ETL (Extract-Transform-Load) and saves the result data to a data warehouse; it also supports statistical computation on the log data, generating reports that are saved to a report database; and it provides a front-end page that lets users access the platform, follow the running status of tasks, and visualize report data. The whole platform also provides permission management and task scheduling functions to regulate the processing, statistics, and display of log data. The present invention focuses on the statistical computation part. However, the technical solution of the invention is not limited to statistical computation on log data and can also be used for other data processing.
Fig. 2 shows a schematic flowchart of a data processing method according to an embodiment of the present invention. As shown in Fig. 2, the method includes:
Step S210: receive a data processing task.
Step S220: according to the data processing task, determine a data processing model for completing the data processing task, and read to-be-processed data from a corresponding data source.
Step S230: perform data processing on the to-be-processed data using the determined data processing model to obtain a data processing result.
As can be seen, the method shown in Fig. 2 determines, according to the received data processing task, a data processing model for completing that task, reads the to-be-processed data from the corresponding data source, and performs data processing on the to-be-processed data using the determined data processing model to obtain a data processing result. This technical solution allows people who do not understand code development to process target data using the available data processing models, without writing code anew for each data processing task, which greatly improves the efficiency of data processing.
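The three steps of Fig. 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the task is taken to be a dictionary, and `MODEL_LIBRARY`, `DATA_SOURCES`, `handle_task`, and the `warehouse://demo/logs` address are hypothetical names that do not appear in the patent.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Model = Callable[[List[Record]], object]

# Hypothetical model library: model name -> callable that processes records.
MODEL_LIBRARY: Dict[str, Model] = {
    "count_records": lambda records: len(records),
}

# Hypothetical data sources keyed by input address.
DATA_SOURCES: Dict[str, List[Record]] = {
    "warehouse://demo/logs": [{"id": "1"}, {"id": "2"}, {"id": "3"}],
}


def handle_task(task: Dict[str, str]) -> object:
    """S210: receive the task; S220: resolve model and data; S230: run the model."""
    model = MODEL_LIBRARY[task["model"]]            # S220: determine the model
    records = DATA_SOURCES[task["input_address"]]   # S220: read to-be-processed data
    return model(records)                           # S230: obtain the result
```

The point of the design is that the caller only names a model and an input address; no processing code is written per task.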
In one embodiment of the present invention, in the above method, receiving the data processing task includes: receiving a data processing task submitted through a front-end page, the data processing task including at least an input address; reading the to-be-processed data from the corresponding data source includes: reading the to-be-processed data from the input address. For example, if the input address points to the specific address of a database, the to-be-processed data is read from the database at that input address.
In one embodiment of the present invention, in the above method, the to-be-processed data is the formatted log data of a service, obtained by parsing the source log data of the service specified by the user.
The following embodiments illustrate how the parsing of source log data is implemented, and also cover permission management:
In one embodiment of the present invention, the log data of at least one specified service is parsed, and the resulting formatted log data of the service is saved to the corresponding data mart in the data warehouse; permissions are configured by user group for each data mart in the data warehouse; a front-end page is provided, and the user's group information is determined from the login information sent by the front-end page; when a data mart view instruction sent by the front-end page is received, the information of the data marts that the user is permitted to view is shown to the user through the front-end page, according to the user's group information. Specifically, in one embodiment of the present invention, the above method further includes: configuring the correspondence between services and domain names, and classifying received log data by domain name; parsing the log data of at least one specified service includes: parsing the log data classified under the domain name that corresponds to that service.
In practice, taking an Internet company as an example, each service it operates is usually assigned a different domain name, so the log data produced also comes from different domain names, and classifying received log data by domain name is a fast and accurate way to classify it. Because distributed clusters are widely used, services are often deployed across different clusters: for example, the functional modules of one service may be deployed on several different clusters across the country, and conversely tasks from multiple service lines may run on the same cluster, so classifying log data by source or similar means is far less quick and convenient than classifying by domain name. In this example each domain name may also have multiple subdomains that correspond to sub-services of the service. Because such domain names are numerous, their correspondence with services can be saved as metadata and managed and used as a data dictionary. Of course, different permissions for viewing, modifying, and so on can also be assigned to the metadata by user group; for example, an administrator may have modification permission, while an ordinary user may only have permission to view part of the content.
In one embodiment of the present invention, the above method further includes: parsing an input sample log and outputting a formatted parsing result; after receiving the user's confirmation of the parsing result, recording the log parsing rule used to parse the sample log; parsing the log data of at least one specified service includes: parsing the source log data of the service specified by the user according to the recorded log parsing rule.
For example, for the sample log <id=123><sex=male><age=18>, the following log parsing rule can be obtained: the ID is the string of digits that follows "id="; the sex is the string of characters that follows "sex="; the age is the number that follows "age=". Applying this log parsing rule, source log data such as <id=1233><sex=male><age=8> and <id=12332><sex=male><age=28> can be parsed.
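The parsing rule described for this sample log can be written as a regular expression. The sketch below is one possible encoding, not the patent's own representation of a rule; `LOG_RULE` and `parse_log` are hypothetical names, and matching is made case-insensitive as an assumption so that `<Id=...>` and `<id=...>` are both accepted.

```python
import re
from typing import Dict, Optional

# One possible encoding of the rule: digits after "id=", characters after
# "sex=", digits after "age=". Case-insensitive matching is an assumption.
LOG_RULE = re.compile(
    r"<id=(?P<id>\d+)><sex=(?P<sex>[^<>]+)><age=(?P<age>\d+)>",
    re.IGNORECASE,
)


def parse_log(line: str) -> Optional[Dict[str, str]]:
    """Return the parsed fields, or None if the line does not follow the rule."""
    match = LOG_RULE.fullmatch(line.strip())
    return match.groupdict() if match else None
```

Once the user has confirmed the result for the sample, the same compiled rule can be applied to every later line from that service.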
Specifically, multiple log content recognition engines can be preset, each used to recognize log content of a different format and parse it into one or more fields; the sample log is input into the log content recognition engines in turn; the fields output by each log content recognition engine are collected to obtain the formatted parsing result.
The systems or servers used by different services may differ, and the formats of the log data produced also vary widely. Several example logs are given below:
1、http://mbs.hao.360.cn/index.php?id=1353332&sex=male&age=28&....
2、{"id":"13532232332","sex":"male","age":"28"}
3、<id=13532232332><sex=male><age=28>
4、id->13532232332;sex->male;age->28
It can be seen that these four logs have entirely different formats. In the above embodiment, the preset log recognition engines can be used to recognize log content of different formats. For example, JSON is a commonly used data format whose structure is well defined: symbols such as braces, colons, and quotation marks divide the log content into multiple fields (as in example 2 above), so a log recognition engine for the JSON format can parse the log content using these delimiters and obtain data for one or more fields. Specifically, the log content recognition engines may include one or more of the following: an IP address recognition engine; a timestamp recognition engine; an ID recognition engine; a channel recognition engine; a JSON content recognition engine. The format of an IP address is predictable (for example, xxx.xxx.xxx.xxx); an ID usually uses NAME, USER_ID, ID, or similar as its key; for a channel, the developer can set the corresponding key (such as channel); and the timestamp format is usually "YYYY-MM-DD HH:mm:SS". In particular, after recognizing an IP address, the IP address recognition engine can parse it further, and the IP address parsing result includes one or more of the following fields: country, province, city, operator. Of course, the result can be extended to a more detailed address as required, with extension fields such as district and street, but these are usually of little use for subsequent processing and waste some resources, so they can be configured as required.
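A minimal sketch of running a sample log through several recognition engines in turn and collecting their fields is shown below. The four engines are illustrative stand-ins for the four example formats and are assumptions, not the engines listed in the patent; `parse_sample` and the `example.com` URL used in testing are likewise hypothetical.

```python
import json
import re
from typing import Callable, Dict, List
from urllib.parse import parse_qsl, urlsplit

Fields = Dict[str, str]


def json_engine(line: str) -> Fields:
    """Recognize a JSON object and return its top-level keys as fields."""
    try:
        data = json.loads(line)
    except ValueError:
        return {}
    return {k: str(v) for k, v in data.items()} if isinstance(data, dict) else {}


def url_query_engine(line: str) -> Fields:
    """Recognize an HTTP(S) URL and return its query parameters as fields."""
    if not line.startswith(("http://", "https://")):
        return {}
    return dict(parse_qsl(urlsplit(line).query))


def bracket_engine(line: str) -> Fields:
    """Recognize <key=value> groups."""
    return dict(re.findall(r"<(\w+)=([^<>]*)>", line))


def arrow_engine(line: str) -> Fields:
    """Recognize key->value pairs separated by semicolons."""
    return dict(re.findall(r"(\w+)->([^;]+)", line))


ENGINES: List[Callable[[str], Fields]] = [
    json_engine, url_query_engine, bracket_engine, arrow_engine,
]


def parse_sample(line: str) -> Fields:
    """Feed the sample to each engine in turn and collect all output fields."""
    merged: Fields = {}
    for engine in ENGINES:
        merged.update(engine(line))
    return merged
```

Each engine returns an empty dictionary for content it does not recognize, so adding or removing an engine does not affect the others.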
As can be seen, the log recognition engines are also configurable; for example, different services may use different keys for the channel. Therefore, in one embodiment of the present invention, a log content recognition engine editing interface is provided; instructions to add, delete, or modify a log content recognition engine are received through this interface, and the corresponding add, delete, or modify operation on the log content recognition module is performed according to the instruction.
In practice, log formats are very diverse. Fortunately, most of them contain standard, recognizable structures that different log recognition engines can recognize. However, the log data produced by one service usually does not need every log recognition engine, and a service produces a large amount of log data; calling every log recognition engine each time would waste resources and be very inefficient. Therefore, in one embodiment of the present invention, in the above method, only the recognition and parsing rules of the log content recognition engines that produced output are summarized and recorded. The next time the log data of that service is parsed, no resources are wasted on log content recognition engines that would not actually be used.
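Recording only the engines that produced output for the sample could look like the following sketch. The two toy engines and the names `learn_rule` and `parse_with_rule` are hypothetical, and the recorded rule is simplified to a list of engine names.

```python
import re
from typing import Callable, Dict, List

Fields = Dict[str, str]
Engine = Callable[[str], Fields]

# Two toy engines (hypothetical); a real platform would register many more.
ENGINES: Dict[str, Engine] = {
    "bracket": lambda line: dict(re.findall(r"<(\w+)=([^<>]*)>", line)),
    "arrow": lambda line: dict(re.findall(r"(\w+)->([^;]+)", line)),
}


def learn_rule(sample: str) -> List[str]:
    """Return the names of the engines that produced output for the sample."""
    return [name for name, engine in ENGINES.items() if engine(sample)]


def parse_with_rule(line: str, rule: List[str]) -> Fields:
    """Parse a line using only the engines recorded in the rule."""
    fields: Fields = {}
    for name in rule:
        fields.update(ENGINES[name](line))
    return fields
```

After `learn_rule` has run once on the sample, later lines from the same service skip every engine that is not in the recorded list.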
However, not all log data is generated in such standard formats. The preset log recognition engines can handle most log data, but the parsing result may sometimes not match the original meaning of the log. Therefore, in one embodiment of the present invention, when the sample log contains content that none of the log content recognition engines can recognize, that content is output through a custom recognition interface; the recognition result and the corresponding recognition and parsing rule entered after manual recognition are received through the custom recognition interface; and the recognition and parsing rule entered after manual recognition is recorded as part of the log parsing rule used to parse the sample log. For example, common delimiters include colons, semicolons, and brackets; if the log data of a class of service contains an uncommon delimiter, the user needs to enter the recognition result and the corresponding recognition and parsing rule in the custom recognition interface.
In one embodiment of the present invention, the above method further includes: receiving, through the front-end page, input instructions that operate on the fields in the parsing result, and performing the corresponding operations; the instructions that operate on the fields in the parsing result include one or more of the following: an instruction to adjust the order of the fields; an instruction to rename a specified field; an instruction to delete a specified field.
For example, if the data of a field in the parsing result does not help subsequent statistical computation, the field can be deleted; if a field in the parsing result is named "USERNAME" and it is preferable to call it "user name" in subsequent processing, it can be renamed. These operations can all be performed in the parsing result editing interface.
The previous embodiment operates on whole fields. In one embodiment of the present invention, in the above method, the recognition and parsing rule of a log content recognition engine includes: setting a limit threshold on the parameter value of a specified field among the one or more fields recognized and parsed; and discarding log data whose parameter value exceeds the limit threshold. In this way, some unneeded data can be discarded while the logs are parsed, which reduces the work of discarding log data later.
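Discarding records at parse time can be sketched as a filter over parsed fields. The `LIMITS` table, its inclusive range semantics, and the assumption that limited fields hold integer text are all illustrative choices, not details from the patent.

```python
from typing import Dict, Iterable, Iterator, Tuple

Fields = Dict[str, str]

# Hypothetical limits: field name -> (minimum, maximum), both inclusive.
LIMITS: Dict[str, Tuple[int, int]] = {"age": (0, 120)}


def keep_record(fields: Fields) -> bool:
    """Return False when any limited field falls outside its range.

    Assumes limited fields contain integer text; fields without a
    configured limit, and records that lack the field, are kept.
    """
    for name, (low, high) in LIMITS.items():
        if name in fields and not low <= int(fields[name]) <= high:
            return False
    return True


def parse_stream(records: Iterable[Fields]) -> Iterator[Fields]:
    """Yield only the records that stay within the configured limits."""
    return (record for record in records if keep_record(record))
```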
The above embodiments describe parsing the log data and saving the resulting formatted log data of the service to the corresponding data mart in the data warehouse. Specifically, the column of the data warehouse that corresponds to each field needs to be determined from the field's attributes, and the data to be stored is then stored, field by field, in the corresponding columns of the data warehouse.
As noted above, the data warehouse may include fact tables and dimension tables. These data tables are usually built in advance to store the data received. Therefore, for the data received, the attributes can be used to determine which column of which data table it corresponds to. Because the columns contained in each table are usually different, this embodiment only needs to determine which column of the data warehouse the data corresponds to.
Specifically, determining the column of the data warehouse that corresponds to a field according to the field's attributes includes: reading the metadata of the data warehouse to obtain the attributes of each column in the data warehouse; and establishing, from the field attributes and the attributes of each column in the data warehouse, a mapping between each field of the data to be stored and the columns. The metadata includes the service attributes and/or data attributes of each column of the data warehouse, where the service attributes include at least one of the following: service name, service domain name, service description; and the data attributes include at least one of the following: column name, data format, data type. For example, which column the field "user name" corresponds to can be stored in the metadata of the data warehouse.
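The field-to-column mapping can be sketched as a lookup built from warehouse metadata. The metadata layout below (a list of dictionaries with `table`, `column`, `field`, and `data_type` keys) and the table and column names are assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical warehouse metadata: one entry per column.
COLUMN_METADATA: List[Dict[str, str]] = [
    {"table": "user_dim", "column": "user_name", "field": "USERNAME", "data_type": "string"},
    {"table": "user_dim", "column": "age", "field": "age", "data_type": "int"},
]


def build_mapping(metadata: List[Dict[str, str]]) -> Dict[str, str]:
    """Map each log field name to its fully qualified warehouse column."""
    return {m["field"]: f'{m["table"]}.{m["column"]}' for m in metadata}


def to_columns(record: Dict[str, str], mapping: Dict[str, str]) -> Dict[str, str]:
    """Re-key a parsed record by warehouse column, dropping unmapped fields."""
    return {mapping[name]: value for name, value in record.items() if name in mapping}
```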
In one embodiment of the present invention, the above method further includes: receiving, through the front-end page, an input instruction to process the data of a specified column, the instruction including at least one of the following: data decryption, data format conversion, data encoding conversion; and processing the data of the specified column accordingly, according to the instruction received.
For example, the data of the column is deserialized to make it readable, or times recorded in colon-separated form in the log are converted into timestamps, and so on.
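As one instance of data format conversion, the sketch below turns a "YYYY-MM-DD HH:mm:SS" string into a Unix timestamp. Treating the logged time as UTC is an assumption; a real platform would need to know the time zone in which the logs were written. `to_timestamp` is a hypothetical name.

```python
from datetime import datetime, timezone


def to_timestamp(text: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS' to seconds since the Unix epoch.

    The input is interpreted as UTC, which is an assumption of this sketch.
    """
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())
```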
In the foregoing embodiment, data can be screened out in the log recognition engine; one embodiment of the present invention also provides a method for screening out data in the data warehouse: receiving, through the front-end page, an input instruction that sets a limit threshold on the value of a specified column; and, according to the instruction, deleting every whole record whose column value exceeds the limit threshold. For example, every record whose access count is less than 3 is deleted.
Similarly, input instructions that operate on the columns of the data warehouse can be received through the front-end interface, and the corresponding operations performed; the instructions that operate on the columns of the data warehouse include one or more of the following: an instruction to adjust the order of the columns; an instruction to rename a specified column; an instruction to delete a specified column.
Note that in this embodiment the data in the data warehouse is adjusted directly, whereas in the foregoing embodiment, although the order of the fields can be adjusted and a specified field renamed or deleted, that data has not yet been stored in the data warehouse and is still in the data cache.
In one embodiment of the present invention, in the above method, a data mart includes at least one data table; configuring permissions by user group for each data mart in the data warehouse further includes: configuring view permission by user group for each data table of the data mart, and configuring view permission by user group for each column of the data table; showing the user, through the front-end page and according to the user's group information, the information of the data marts that the user is permitted to view includes: according to the user's group information, showing the user the data tables that the user is permitted to view, and/or showing the user the columns that the user is permitted to view within those data tables. For example, operations staff can view the data table of service transaction records, whereas technical staff cannot.
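Table-level and column-level view permissions by user group can be sketched as a nested lookup. The group names, table names, and the `VIEW_PERMISSIONS` structure are hypothetical and only mirror the operations-versus-technical example in the text.

```python
from typing import Dict, List, Set

# Hypothetical permission config: group -> table -> viewable columns.
VIEW_PERMISSIONS: Dict[str, Dict[str, Set[str]]] = {
    "operations": {"transactions": {"order_id", "amount"}, "visits": {"user_id"}},
    "engineering": {"visits": {"user_id", "ip"}},
}


def visible_tables(group: str) -> List[str]:
    """Tables that a user group is permitted to view, in sorted order."""
    return sorted(VIEW_PERMISSIONS.get(group, {}))


def visible_columns(group: str, table: str) -> List[str]:
    """Columns of a table that a user group is permitted to view."""
    return sorted(VIEW_PERMISSIONS.get(group, {}).get(table, set()))
```

An unknown group or table simply yields an empty list, so nothing is shown by default.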
The following embodiments describe report generation, that is, the specific implementation of statistical computation on the formatted log data.
In one embodiment of the present invention, the above method further includes: receiving and saving a report generation task submitted by the user through the front-end page; generating a report from the log data in the data mart specified in the report generation task, and saving it to the report database; where the permission of each column of the generated report is the same as the permission of the corresponding column of the data table in the data mart; and the permission of the report is determined from the permissions of its columns.
The report generation task in the above embodiment is one kind of data processing task.
Besides specifying the input address, the user can also restrict the report generation task with further conditions, for example by using only part of the data of a certain data table in the data mart. Therefore, in one embodiment of the present invention, the report generation task includes: a standard query statement entered by the user, or query parameters entered by the user; the method further includes: querying the corresponding data mart with the standard query statement entered by the user to obtain the log data in the data mart that the user specified; or generating a standard query statement from the query parameters entered by the user, and querying the corresponding data mart with the generated standard query statement to obtain the log data in the specified data mart. For example, the technical staff of an enterprise data center are comfortable writing standard query statements, but ordinary business staff may not be able to write them. This embodiment therefore provides a function that assembles the query conditions entered by the user into a query. To make the user's input easier to recognize, when querying the user can specify, in addition to the corresponding data mart, the character set used to recognize the input, so that the input is not misrecognized.
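Assembling query parameters into a query statement can be sketched as follows. The patent does not say which query language is used; SQL with `?` placeholders is assumed here, and `build_query` with its parameter names is hypothetical. Identifiers are validated and values are passed separately so that user input is never spliced into the statement text.

```python
import re
from typing import Dict, List, Tuple

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_query(table: str, columns: List[str], equals: Dict[str, str]) -> Tuple[str, List[str]]:
    """Assemble a parameterized SELECT from user-supplied query parameters.

    Returns the statement text and the list of values for its placeholders.
    """
    for name in [table, *columns, *equals]:
        if not IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if equals:
        sql += " WHERE " + " AND ".join(f"{column} = ?" for column in equals)
    return sql, list(equals.values())
```

A business user would only fill in the table, the columns, and the conditions in a form; the statement itself is generated.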
Because the volume of data in a data mart is very large, if the user does not set a corresponding restriction in the query conditions, such as a time condition, the amount of data requested may cause the data mart to crash. Therefore, in one embodiment of the invention, the above method further includes: setting statement filtering rules, and filtering the standard query statement entered by the user or the generated standard query statement. The above query method can be used not only to query specific log data but also to query information about the data mart.
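A statement filter of this kind might look like the sketch below. The two rules, the `day` column, and the name `check_statement` are assumptions made for illustration; the embodiment does not specify which rules are applied.

```python
import re

# Hypothetical rules. Each entry is (description, predicate); the predicate
# returns True when the statement is acceptable under that rule.
FILTER_RULES = [
    (
        "must restrict the time range",
        lambda s: re.search(r"\bwhere\b.*\bday\b", s, re.I | re.S) is not None,
    ),
    (
        "must be read-only",
        lambda s: re.match(r"\s*select\b", s, re.I) is not None,
    ),
]


def check_statement(statement):
    """Return the descriptions of every rule the statement violates."""
    return [desc for desc, ok in FILTER_RULES if not ok(statement)]
```

A statement is passed to the data mart only when `check_statement` returns an empty list; otherwise the violated rules can be reported back to the user.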
The data processing model needed to generate a report can be specified by the user, or selected by the user from the data processing models provided. In one embodiment of the invention, in the above method, the data processing task includes: a specified data processing model; and determining, according to the data processing task, the data processing model that completes the data processing task includes: selecting the specified data processing model from a data processing model library. The data processing model library includes at least one of the following data processing models: a new-user statistical model; an active-user statistical model; a retention statistical model. From the logs, these models can respectively count the number of users in a given day's log who have not appeared before (new users), the number of active users in a specified time period, and the retention of the users added on a given day. Of course, specific data models can be added or configured according to business needs, which is not limited here. When using these data processing models, users generally no longer need to write code; they only need to provide parameter information, such as which columns to use for the statistical calculation. That is, the report generation task further includes: parameter information for the specified data processing model; and performing data processing on the log data using the determined data processing model includes: configuring the specified data processing model with the parameter information, and processing the log data using the configured data processing model.
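As a rough illustration of what the three models compute, the sketch below treats the log as a list of `(day, user_id)` pairs. The function names and the log layout are assumptions; real models would read from the data mart and take their columns from the parameter information.

```python
def new_users(log, day):
    """Users seen on `day` who do not appear on any earlier day."""
    seen_before = {uid for d, uid in log if d < day}
    return {uid for d, uid in log if d == day} - seen_before


def active_users(log, start, end):
    """Users with at least one log record in the inclusive range."""
    return {uid for d, uid in log if start <= d <= end}


def retention(log, cohort_day, later_day):
    """Share of the users added on `cohort_day` who are active on `later_day`."""
    cohort = new_users(log, cohort_day)
    if not cohort:
        return 0.0
    returned = cohort & active_users(log, later_day, later_day)
    return len(returned) / len(cohort)


# Days are plain integers here purely to keep the example short.
log = [(1, "a"), (1, "b"), (2, "a"), (2, "c"), (3, "c")]
```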
Of course, such "foolproof" models cannot meet all user needs. Therefore, in one embodiment of the invention, the data processing model specified in the above method is a streaming model; the report generation task further includes: at least one custom code snippet; and the parameter information includes: the correspondence between each code snippet and a logical block of the streaming model.
For example, suppose the user wants to perform statistical calculations on log data using the Map-Reduce framework. In this embodiment, the user only needs to develop a Map code segment and a Reduce code segment (that is, the core logic of the calculation), without writing the complete code. When submitting the report task, the user only needs to fill the Map and Reduce code segments into the code input boxes on the front-end page that correspond to the Map and Reduce sections, and the complete code is assembled in the background. Such programs are easier to manage and modify, and the likelihood of coding errors is reduced.
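The back-end assembly of the two snippets can be sketched as below. This is an in-process toy, not the streaming framework itself: the convention that the snippets define functions named `mapper` and `reducer`, and the helper `assemble_job`, are assumptions made for the example.

```python
from itertools import groupby


def assemble_job(map_snippet, reduce_snippet):
    """Combine two user-written snippets into one runnable job.

    Assumption: the Map snippet defines `mapper(line)` yielding (key, value)
    pairs and the Reduce snippet defines `reducer(key, values)`.
    Note: exec() on untrusted code is unsafe; a real platform would sandbox it.
    """
    namespace = {}
    exec(map_snippet, namespace)
    exec(reduce_snippet, namespace)
    mapper, reducer = namespace["mapper"], namespace["reducer"]

    def run(lines):
        # Sorting stands in for the shuffle step: it brings equal keys together.
        pairs = sorted(kv for line in lines for kv in mapper(line))
        return {
            key: reducer(key, [value for _, value in group])
            for key, group in groupby(pairs, key=lambda kv: kv[0])
        }

    return run


MAP_SNIPPET = "def mapper(line):\n    yield line.split()[1], 1\n"
REDUCE_SNIPPET = "def reducer(key, values):\n    return sum(values)\n"

job = assemble_job(MAP_SNIPPET, REDUCE_SNIPPET)
counts = job(["u1 login", "u2 login", "u1 logout"])
```

The user supplies only the two short strings; everything else, including grouping by key, is provided by the platform.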
If none of the above models meets the user's needs, the user can also choose a custom model. Therefore, in the above method, the report generation task includes: the address of a custom data processing model; and determining the data processing model that completes the report generation task includes: reading the custom data processing model from that address. Alternatively, when the amount of code is small, it can be uploaded through the front-end page. A custom data processing model can also be saved to the data processing model library, and the uploader can assign permissions to it.
Besides generating reports from the log data in the data mart as described above, users can upload reports developed by other means to the report database for unified permission control and management. Therefore, in one embodiment of the invention, the above method further includes: receiving a report uploaded by the user through the front-end page, and/or obtaining the specified report from the report storage path submitted by the user through the front-end page, and saving it to the report database.
In one embodiment of the invention, the above method further includes: according to the user's user group information, showing the user the data marts and/or reports for which the user has edit permission; and receiving a permission edit instruction submitted by the user, and editing the permissions of the data mart and/or report accordingly.
This embodiment provides a method for managing and controlling permissions: a manager with higher permissions (such as a data center administrator) can edit the permissions of the data for which they hold edit permission. For example, a business director can ensure that the members of each group can only see the reports related to that group.
In one embodiment of the invention, the above method further includes: according to the user's user group information, showing the user, through the front-end page, the names of the reports the user has permission to view. The report generation task includes: a report specified by the user. Generating a report from the log data in the data mart specified in the report generation task and saving it to the report database includes: using the configuration information of the report specified by the user, generating a report from the log data in the data mart specified in the report generation task.
This embodiment provides a method for generating a new report from an existing one, which may be called "report cloning": the configuration information of an already developed report is used to generate a new report, so that the format and other properties of the new report are similar to those of the original.
The following embodiments describe specific implementations for visualizing report data.
In one embodiment of the invention, the above method further includes: when a report viewing instruction entered by the user is received, showing the user, through the front-end page, the data of the columns in the report that the user has permission to view.
Because the amount of data in a report is usually very large (it may include all data since the business began, for example several years of data), the foregoing embodiment shows only the column names and not the data in the columns. After the user enters a viewing instruction, the specific data is shown (subject to restrictive conditions, for example a time condition).
In one embodiment of the invention, the above method further includes: initializing the view count of each new report in the report database to zero; when a report viewing instruction entered by the user is received, incrementing the view count of the corresponding report by one; and setting a cleanup cycle for each report and, when a cleanup time point is reached, determining whether the report's view count is below a cleanup threshold and, if so, deleting the corresponding report generation task.
After a report generation task is created, log data continues to be generated, so the report data is also continuously updated, which consumes substantial resources. For reports with few views, or even zero, deleting the corresponding report generation task saves resources.
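The view counting and periodic cleanup can be sketched as follows. The class `ReportRegistry` and its method names are hypothetical, and the sketch keeps counts in memory rather than in a report database; whether counts are reset after each cycle is not stated in the embodiment and is not modeled here.

```python
class ReportRegistry:
    """Tracks how often each report is viewed and drops rarely viewed ones."""

    def __init__(self):
        self.view_counts = {}

    def add_report(self, name):
        # A new report starts with a view count of zero.
        self.view_counts[name] = 0

    def record_view(self, name):
        # Called once for every viewing instruction received for the report.
        self.view_counts[name] += 1

    def cleanup(self, threshold):
        """Run at a cleanup time point; returns the names that were removed.

        Removal here stands in for deleting the report generation task.
        """
        stale = [n for n, count in self.view_counts.items() if count < threshold]
        for name in stale:
            del self.view_counts[name]
        return stale


registry = ReportRegistry()
registry.add_report("daily_active")
registry.add_report("legacy_channel")
registry.record_view("daily_active")
registry.record_view("daily_active")
removed = registry.cleanup(threshold=1)
```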
In one embodiment of the invention, the above method further includes: according to the user's user group information, showing the user the chart models the user has permission to use, so that the front-end page generates the corresponding chart from the chart model selected by the user and the data of the displayed report.
Reports are usually presented as tables, which are not easy to read, whereas charts such as pie charts and bar charts are more intuitive. This embodiment therefore provides a method for generating charts from report data. Specifically, the chart model is any one of the following: a model that generates a chart from the data of the displayed report cached in the front-end page; a model that re-fetches from the data source the data of the columns in the report that the user has permission to view, and generates a chart; a chart model that edits the data source of the displayed report.
Among these models, the one that generates a chart from the data of the displayed report cached in the front-end page does not need to interact with the server, so even if the user sorts or groups the data in the front-end page, no time or resources are spent accessing the report database again. For reports that are sensitive to data freshness, the model that re-fetches from the data source the data of the columns the user has permission to view can be used. In addition, because users often want to modify a report, for example to rename a column, and accessing the report database is quite complicated, a chart model that edits the data source of the displayed report is also provided.
The following embodiments describe task scheduling, using the report generation task as an example. ETL processing of log data is of course also a kind of task, and its flow is similar to the methods described in the following embodiments.
Because report generation tasks consume resources, scheduling them sensibly is essential. In one embodiment of the invention, the above method further includes: generating and saving a corresponding task configuration file according to the report generation task; generating and saving, from the saved task configuration files, a task topology graph containing the dependencies between tasks; and completing task scheduling according to the generated task topology graph.
The task topology graph shows the dependencies between tasks; for example, task A can run only after task B has finished running. Specifically, completing task scheduling according to the generated task topology graph includes: when any task in the task topology graph satisfies its run conditions other than inter-task dependencies, determining from the task topology graph whether the task depends on other tasks; if it does not, reading the task's configuration file and running the task directly; if it does, reading the task's configuration file and running the task only after all the tasks it depends on have finished running.
Existing report generation task scheduling often estimates the run time of the tasks that a task depends on. For example, if task B is expected to finish at 2:00 p.m., the start time of task A might be set to 2:10 p.m. However, task run time depends on how busy the cluster running the tasks is. When the cluster is busy, task B may not finish until 2:30 p.m., and task A, started at 2:10 p.m., will run abnormally. When the cluster is idle, task B may finish at 1:30 p.m., but task A cannot start until 2:10 p.m., so 40 minutes of cluster resources are wasted. In this embodiment, once the other run conditions of a task are satisfied, the task can run as soon as the tasks it depends on have finished.
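Dependency-driven scheduling of this kind can be sketched with the `graphlib.TopologicalSorter` class from the Python standard library (Python 3.9 or later). The function `run_in_dependency_order` and the task names are illustrative assumptions, and the sketch runs tasks sequentially in one process rather than on a cluster.

```python
from graphlib import TopologicalSorter


def run_in_dependency_order(depends_on, run_task):
    """Run each task as soon as everything it depends on has finished.

    `depends_on` maps a task name to the set of task names it depends on.
    No start times are estimated; readiness is driven by completion alone.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    order = []
    while sorter.is_active():
        # get_ready() returns the tasks whose dependencies are all done.
        for task in sorted(sorter.get_ready()):
            run_task(task)
            order.append(task)
            sorter.done(task)  # unblocks the tasks that were waiting on it
    return order


executed = []
order = run_in_dependency_order(
    {"A": {"B"}, "B": set(), "C": {"A", "B"}},
    executed.append,
)
```

Because task A becomes ready the moment task B is marked done, neither the early-start failure nor the idle wait from the example above can occur.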
Specifically, the report generation task further includes any one of the following: basic parameters of the task; time conditions for running the task; the cluster on which the task runs; physical resource conditions for running the task; data resource conditions for running the task; the dependencies between the task and other tasks.
For example, the task specifies which cluster it runs on and what configuration the machines in the cluster must meet. A periodic task can be set to run in a fixed time period every day, while a temporary task can carry additional restrictions, such as running in a fixed daily time period only during a specified week. When a task is submitted to a cluster, at least one machine can be selected from the cluster to run the task according to load-balancing principles.
If the report generation task includes the dependencies between the task and other tasks, the task topology graph can be generated directly from those dependencies. The task topology graph can also be generated from the data resource conditions for running the tasks, where the data resource conditions include: the input address of the data needed to run the task, and/or the output address of the task's results.
For example, if the data needed by task A is the result produced by task B, then the output address of task B matches the input address of task A, from which it follows that task A depends on task B.
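Deriving dependencies by matching addresses can be sketched as below. The configuration layout (an `inputs` list and a single `output` per task), the addresses, and the function name `infer_dependencies` are assumptions for illustration.

```python
def infer_dependencies(tasks):
    """Map each task to the set of tasks whose output it reads as input.

    `tasks` maps a task name to a dict with an "inputs" list and an
    "output" address (hypothetical layout for a task configuration file).
    """
    # Index every output address by the task that produces it.
    producers = {
        cfg["output"]: name for name, cfg in tasks.items() if cfg.get("output")
    }
    deps = {}
    for name, cfg in tasks.items():
        deps[name] = {
            producers[address]
            for address in cfg.get("inputs", [])
            if address in producers and producers[address] != name
        }
    return deps


tasks = {
    "B": {"inputs": ["/logs/raw"], "output": "/mart/login"},
    "A": {"inputs": ["/mart/login"], "output": "/report/daily"},
}
dependencies = infer_dependencies(tasks)
```

The resulting mapping has the same shape as an explicitly declared dependency list, so either source can feed the task topology graph.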
Users can view, through the front-end page, the task topology graphs they have permission to view, and modify them. Therefore, in one embodiment of the invention, the method further includes: in response to a display instruction sent by the front-end page, returning multiple tasks and/or a task topology graph containing the dependencies among multiple tasks to the front-end page for display. Users can also generate a new task topology graph from the tasks they have permission to view, for example by adding task A and task B to a new task topology graph and specifying that task A depends on task B. Alternatively, the dependencies in an existing task topology graph can be modified. That is: receiving an instruction, sent by the front-end page, to add, modify, or delete a dependency between tasks, and generating or modifying the task topology graph accordingly. The front-end page can present the task topology graph visually; for example, when modifying the graph, the user only needs to drag tasks into or out of the graph as nodes and mark the dependency between two tasks with an arrow.
In one embodiment of the invention, modifying a report generation task can also cause the corresponding task topology graph to change. A modification instruction for the report generation task is received, and the task configuration file of the corresponding task is modified; whether the task topology graphs related to the task need to be modified is determined from the modification instruction, and if so, the related task topology graphs are modified according to the modified task configuration file. For example, changing the input address may mean that task A no longer depends on task B but depends on task C instead.
Because report generation tasks may be open to all users in the enterprise, the following method can be used to ensure stability: providing a report generation task submission interface for receiving report generation tasks; setting a corresponding alarm threshold for at least one running state parameter of the task scheduling server, and monitoring the current running state parameters of the task scheduling server; and, when any monitored running state parameter reaches its alarm threshold, performing the predetermined alarm operation corresponding to that threshold and setting the report generation task submission interface to unavailable. Thus, when the task scheduling server is running under high load, the submission interface can be suspended so that no new report generation tasks are accepted, and its availability is restored once the server is no longer under high load. That is, when the monitored running state parameter falls below the alarm threshold, the submission interface is set to available again; and a corresponding task configuration file is generated and saved according to the report generation task submitted through the submission interface.
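The threshold-controlled submission interface can be sketched as follows. The class `SubmitGate`, the metric names, and the threshold values are hypothetical; the alarm operation is reduced to appending to a list.

```python
class SubmitGate:
    """Opens or closes the task submission interface based on server load."""

    def __init__(self, thresholds):
        self.thresholds = thresholds  # metric name -> alarm threshold
        self.available = True
        self.alarms = []  # stands in for the predetermined alarm operation

    def observe(self, metrics):
        """Feed the latest running state parameters of the scheduling server."""
        breached = [
            name
            for name, limit in self.thresholds.items()
            if metrics.get(name, 0) >= limit
        ]
        self.alarms.extend(breached)
        # The interface is available only while no threshold is reached.
        self.available = not breached
        return self.available

    def submit(self, task):
        if not self.available:
            raise RuntimeError("submission interface is temporarily unavailable")
        # A real system would generate and save a task configuration file here.
        return {"task": task}


gate = SubmitGate({"cpu": 0.9, "queued_tasks": 100})
```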
As mentioned in the foregoing embodiments, a task needs to be submitted to the corresponding cluster to run. One embodiment of the invention further provides the following method: determining whether the cluster on which the task is to run satisfies the task submission conditions, and if so, submitting the corresponding task configuration file to that cluster. The task submission conditions include at least one of the following: the cluster can be accessed; the available resources of the cluster are not lower than a predetermined threshold; the cluster is not under maintenance.
That is, it is first determined whether the network to the cluster is reachable, whether the cluster is under maintenance, and whether it can still run the task. Otherwise the task cannot be submitted to the cluster normally, or, even if submitted, cannot run correctly.
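A pre-submission check along these lines might look like the sketch below. The embodiment requires at least one of the three conditions; this sketch checks all three together as one possible configuration. The `ClusterStatus` fields and the use of free slots as the resource measure are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ClusterStatus:
    reachable: bool       # the cluster can be accessed over the network
    free_slots: int       # hypothetical measure of available resources
    in_maintenance: bool  # the cluster is currently being maintained


def can_submit(status, min_free_slots):
    """Return True only if the cluster is fit to receive a new task."""
    return (
        status.reachable
        and not status.in_maintenance
        and status.free_slots >= min_free_slots
    )
```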
Users can also check the running status of tasks through the front-end page. In one embodiment of the invention, the method further includes: in response to a task selection instruction submitted by the front-end page, obtaining the running state information of the corresponding task on each cluster, and returning it to the front-end page for display. Users can thus check at any time the running state information of the tasks they have permission to view, such as: task run session information; task progress information; task remaining-time information; task run log information. The error log of a task can be read promptly and the task modified. Alternatively, where a task is divided into multiple stages, for example when statistical calculations in multiple stages are needed, the user can see which stage the task has reached. The progress and remaining time of a task can also be inferred from the task's workload and the cluster's resources.
Tasks do not always run normally on a cluster. In the prior art, the error information of a task must be investigated manually, which is time-consuming and laborious. In one embodiment of the invention, the above method further includes: receiving the task failure logs submitted by each cluster; and analyzing the task failure logs to obtain the failure information of the tasks. This saves the time otherwise spent manually reading task failure logs and investigating the cause of failure. Specifically, analyzing a task failure log to obtain the failure information of the task includes: presetting a failure sample library containing at least one failure model, each failure model including a task failure log matching rule and the failure information of the task; and matching the task failure log against the failure models in the failure sample library, and obtaining the failure information of the task from the matched failure model.
For example, if there is no data in the input path, the task cannot run, and a corresponding record appears in the log. If a failure model is set up for this kind of failure, then matching the task failure log against the failure models in the failure sample library quickly determines the task's failure information, which includes, for example: the cause of the failure, the task's error code, and the task's error type. The error type may be retryable or non-retryable. Taking an input path without data as an example, even if the task is retried, the input path still contains no data, so the task still will not run normally. If, however, the task merely failed to connect to the corresponding database, a retry may well succeed; this kind of error is retryable. The failure information of the task can therefore also include: a solution for the task failure. The method further includes: according to the solution for the task failure, resubmitting the task to the corresponding cluster, or performing alarm processing in a predetermined manner. For a retryable error, the task is resubmitted to the corresponding cluster for a retry; for a non-retryable error, alarm processing is performed in a predetermined manner, for example by sending an email or text message to maintenance personnel.
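Matching a failure log against a failure sample library can be sketched with regular expressions as the matching rules. The two failure models, their patterns and error codes, and the function names are hypothetical examples, not entries taken from the embodiment.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureModel:
    pattern: str     # task failure log matching rule (a regular expression)
    cause: str       # failure cause reported to the user
    error_code: str
    retryable: bool  # error type: retryable or non-retryable


# Hypothetical failure sample library.
FAILURE_LIBRARY = [
    FailureModel(
        r"input path .* (is empty|does not exist)",
        "no input data", "E_NO_INPUT", False,
    ),
    FailureModel(
        r"connection (refused|timed out)",
        "database unreachable", "E_CONN", True,
    ),
]


def diagnose(log_text):
    """Return the first failure model whose rule matches the log, else None."""
    for model in FAILURE_LIBRARY:
        if re.search(model.pattern, log_text, re.IGNORECASE):
            return model
    return None


def handle_failure(log_text):
    """Pick the follow-up action: resubmit retryable failures, alert otherwise."""
    model = diagnose(log_text)
    if model is not None and model.retryable:
        return "resubmit"
    return "alert"
```

Unmatched logs fall through to the alert path in this sketch, on the assumption that an unknown failure should reach maintenance personnel rather than be retried blindly.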
The following embodiments describe the monitoring and management of the whole platform.
In one embodiment of the invention, the above method further includes: recording every operation performed by a user, and saving it to a monitoring database together with the operation time and the user information.
Although the operations users perform are subject to strict permission management, they remain highly sensitive. Recording every operation performed by a user, as in this embodiment, facilitates later investigation and allows suspects to be identified quickly if confidential information is leaked.
In one embodiment of the invention, the above method further includes: setting, by operation type, operation alarm policies and the corresponding alarm operations to be performed; and, when the operation type of any operation performed by a user matches an operation alarm policy, performing the corresponding alarm operation.
For example, although a user may have permission to view a large number of reports, accessing them in bulk within a short period is likely to be an attempt to leak company secrets. A corresponding alarm operation therefore needs to be performed, so that the loss of confidential information can be minimized and remedied in time.
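One possible alarm policy for bulk report access is a sliding-window counter per user, sketched below. The class name `BulkViewAlarm`, the limit of views, and the window length are assumptions; the embodiment does not prescribe how "a large number in a short period" is measured.

```python
from collections import defaultdict, deque


class BulkViewAlarm:
    """Flags a user who views more than `max_views` reports within a window."""

    def __init__(self, max_views, window_seconds):
        self.max_views = max_views
        self.window = window_seconds
        self.events = defaultdict(deque)  # user -> timestamps of recent views

    def record(self, user, timestamp):
        """Record one view; return True if the alarm operation should run."""
        recent = self.events[user]
        recent.append(timestamp)
        # Drop views that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) > self.max_views


alarm = BulkViewAlarm(max_views=3, window_seconds=60)
flags = [alarm.record("u1", t) for t in (0, 10, 20, 30)]
```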
Fig. 3 shows a schematic structural diagram of a data processing apparatus according to an embodiment of the invention. As shown in Fig. 3, the data processing apparatus 300 includes:
Receiving unit 310, adapted to receive a data processing task.
Preprocessing unit 320, adapted to determine, according to the data processing task, the data processing model that completes the data processing task, and to read the data to be processed from the corresponding data source.
Data processing unit 330, adapted to process the data to be processed using the determined data processing model, obtaining a data processing result.
It can be seen that the apparatus shown in Fig. 3 determines, according to the received data processing task, the data processing model that completes the data processing task, reads the data to be processed from the corresponding data source, and processes that data using the determined data processing model to obtain a data processing result. This technical solution enables people who do not know how to write code to process target data using existing data processing models, without rewriting code for each data processing task, which greatly improves the efficiency of data processing.
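The division of work among the three units can be sketched as three small functions. Everything here is an illustrative assumption: the in-memory `MODEL_LIBRARY` and `DATA_SOURCES` stand in for a real model library and data source, and the task is a plain dictionary.

```python
# Hypothetical model library and data source; both stand in for real services.
MODEL_LIBRARY = {
    "count_distinct": lambda rows, column: len({row[column] for row in rows}),
}
DATA_SOURCES = {
    "mem://login_log": [{"uid": "a"}, {"uid": "b"}, {"uid": "a"}],
}


def receive(task):
    """Receiving unit: accept a task only if it names a model and an input."""
    missing = {"model", "input"} - set(task)
    if missing:
        raise ValueError(f"task is missing fields: {sorted(missing)}")
    return task


def preprocess(task):
    """Preprocessing unit: pick the model and read the data to be processed."""
    model = MODEL_LIBRARY[task["model"]]
    rows = DATA_SOURCES[task["input"]]
    return model, rows


def process(model, rows, params):
    """Data processing unit: apply the configured model to the data."""
    return model(rows, **params)


def handle(task):
    task = receive(task)
    model, rows = preprocess(task)
    return process(model, rows, task.get("params", {}))


result = handle(
    {"model": "count_distinct", "input": "mem://login_log", "params": {"column": "uid"}}
)
```

The submitter names a model and supplies parameters but writes no processing code, which is the point the paragraph above makes.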
In one embodiment of the invention, in the above apparatus, the receiving unit 310 is adapted to receive a data processing task submitted through a front-end page, the data processing task including at least an input address; and the preprocessing unit 320 is adapted to read the data to be processed from the input address.
In one embodiment of the invention, in the above apparatus, the data to be processed is the formatted log data of a business, obtained by parsing the source log data of the business specified by the user.
In one embodiment of the invention, in the above apparatus, the data processing task includes: a specified data processing model; and the preprocessing unit 320 is adapted to select the specified data processing model from a data processing model library.
In one embodiment of the invention, in the above apparatus, the data processing model library includes at least one of the following data processing models: a new-user statistical model; an active-user statistical model; a retention statistical model.
In one embodiment of the invention, in the above apparatus, the data processing task further includes: parameter information for the specified data processing model; and the data processing unit 330 is adapted to configure the specified data processing model with the parameter information and to process the data to be processed using the configured data processing model.
In one embodiment of the invention, in the above apparatus, the specified data processing model is a streaming model; the data processing task further includes: at least one custom code snippet; and the parameter information includes: the correspondence between each code snippet and a logical block of the streaming model.
In one embodiment of the invention, in the above apparatus, the data processing task includes: the address of a custom data processing model; and the preprocessing unit 320 is adapted to read the custom data processing model from that address.
In one embodiment of the invention, in the above apparatus, the preprocessing unit 320 is further adapted to save the custom data processing model to the data processing model library.
In one embodiment of the invention, in the above apparatus, the data processing task further includes: an output address for the data processing result; and the data processing unit 330 is further adapted to output the data processing result to the output address.
It should be noted that the specific implementations of the above apparatus embodiments are the same as those of the corresponding method embodiments and are not repeated here.
In summary, the technical solution of the invention determines, according to the received data processing task, the data processing model that completes the data processing task, reads the data to be processed from the corresponding data source, and processes that data using the determined data processing model to obtain a data processing result. This technical solution enables people who do not know how to write code to process target data using existing data processing models, without rewriting code for each data processing task, which greatly improves the efficiency of data processing.
It should be noted that:
The algorithms and displays provided herein are not inherently related to any particular computer, virtual device, or other equipment. Various general-purpose devices may also be used with the teachings herein. From the description above, the structure required to construct such devices is apparent. Moreover, the invention is not directed to any particular programming language. It should be understood that the content of the invention described herein can be implemented in a variety of programming languages, and that the description of a specific language above is given to disclose the best mode of the invention.
In the specification provided here, numerous specific details are set forth. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail, so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments of the invention. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, the inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components of the embodiments can be combined into one module or unit or component, and can furthermore be divided into multiple sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include some features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The component embodiments of the invention may be implemented in hardware, or as software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the data processing apparatus according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.
Embodiments of the invention disclose A1, a data processing method, wherein the method comprises:
Receiving a data processing task;
Determining, according to the data processing task, a data processing model for completing the data processing task, and reading the data to be processed from a corresponding data source;
Performing data processing on the data to be processed using the determined data processing model, to obtain a data processing result.
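The three steps of A1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described by the patent: the `Task` fields, the `MODELS` and `SOURCES` registries, and the toy `count_records` model are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Hypothetical types: the source does not define a task format or a model interface.
Record = Dict[str, str]
Model = Callable[[Iterable[Record]], Dict[str, int]]


@dataclass
class Task:
    model_name: str   # identifies the model that completes the task
    source_name: str  # identifies the data source holding the data to be processed


def count_records(records: Iterable[Record]) -> Dict[str, int]:
    """Toy model: count how many records were read."""
    return {"records": sum(1 for _ in records)}


# Assumed in-memory stand-ins for real models and real data sources.
MODELS: Dict[str, Model] = {"count": count_records}
SOURCES: Dict[str, List[Record]] = {
    "demo": [{"user": "u1"}, {"user": "u2"}, {"user": "u1"}],
}


def process(task: Task) -> Dict[str, int]:
    model = MODELS[task.model_name]    # determine the data processing model
    data = SOURCES[task.source_name]   # read the data to be processed
    return model(data)                 # apply the model to obtain the result


result = process(Task(model_name="count", source_name="demo"))
```

Keeping the model lookup and the data read separate from the processing step mirrors the order of the steps in A1, so either can be swapped without touching the other.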
A2. The method as described in A1, wherein receiving the data processing task comprises:
Receiving a data processing task submitted through a front-end page, the data processing task including at least an input address;
Reading the data to be processed from the corresponding data source comprises: reading the data to be processed from the input address.
A3. The method as described in A1, wherein the data to be processed is formatted log data of a business, obtained by parsing the source log data of the business specified by a user.
A4. The method as described in A1, wherein the data processing task includes a specified data processing model;
Determining, according to the data processing task, the data processing model for completing the data processing task comprises: selecting the specified data processing model from a data processing model library.
A5. The method as described in A4, wherein the data processing model library contains at least one of the following data processing models:
A new-addition statistical model;
An active statistical model;
A retention statistical model.
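A model library holding the three model types named in A5 might look like the sketch below. The text does not define how new-addition, active, or retention statistics are computed, so the definitions here are assumptions: "active" means users seen on a day, "new" means users seen on that day and on no earlier day, and "retention" is the share of a day's new users who are active on a later day. The `(day, user_id)` event format and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Event = Tuple[str, str]  # (ISO date, user_id); assumed formatted log record


def active_users(events: List[Event], day: str) -> Set[str]:
    """Users with at least one event on the given day."""
    return {user for d, user in events if d == day}


def new_users(events: List[Event], day: str) -> Set[str]:
    """Users active on the given day who never appeared on an earlier day."""
    seen_before = {user for d, user in events if d < day}  # ISO dates sort as strings
    return active_users(events, day) - seen_before


def retention(events: List[Event], start_day: str, later_day: str) -> float:
    """Fraction of start_day's new users who are active again on later_day."""
    cohort = new_users(events, start_day)
    if not cohort:
        return 0.0
    return len(cohort & active_users(events, later_day)) / len(cohort)


# The library maps a model name carried by the task to a callable model.
MODEL_LIBRARY: Dict[str, Callable] = {
    "new": new_users,
    "active": active_users,
    "retention": retention,
}


def select_model(name: str) -> Callable:
    """Select the specified model from the library, as in A4."""
    if name not in MODEL_LIBRARY:
        raise ValueError(f"unknown data processing model: {name}")
    return MODEL_LIBRARY[name]


events = [
    ("2016-12-01", "a"), ("2016-12-01", "b"),
    ("2016-12-02", "a"), ("2016-12-02", "c"),
]
new_on_day_two = select_model("new")(events, "2016-12-02")
day_one_retention = select_model("retention")(events, "2016-12-01", "2016-12-02")
```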
A6. The method as described in A4, wherein the data processing task further includes parameter information for the specified data processing model;
Performing data processing on the data to be processed using the determined data processing model comprises: configuring the specified data processing model with the parameter information, and performing data processing on the data to be processed using the configured data processing model.
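Configuring a specified model with parameter information from the task, as in A6, can be sketched with `functools.partial`. The `top_n` model, its parameters `n` and `min_count`, and the task dictionary layout are hypothetical; the source does not say which parameters a model accepts.

```python
from functools import partial
from typing import Callable, Dict, Iterable, List


def top_n(records: Iterable[str], n: int = 3, min_count: int = 1) -> List[str]:
    """Toy model: the n most frequent values that occur at least min_count times."""
    counts: Dict[str, int] = {}
    for record in records:
        counts[record] = counts.get(record, 0) + 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [value for value, count in ranked if count >= min_count][:n]


def configure(model: Callable, params: Dict[str, int]) -> Callable:
    """Bind the task's parameter information to the specified model."""
    return partial(model, **params)


# Assumed task shape: the specified model plus its parameter information.
task = {"model": top_n, "params": {"n": 2, "min_count": 2}}
configured = configure(task["model"], task["params"])
top = configured(["x", "y", "x", "z", "y", "x"])
```

Binding parameters up front yields a callable that takes only the data, so the processing step does not need to know which parameters a given model uses.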
A7. The method as described in A6, wherein the specified data processing model is a streaming model;
The data processing task further includes at least one user-defined code snippet;
The parameter information includes: the correspondence between each code snippet and a logical partition of the streaming model.
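One way to read A7 is that the streaming model exposes named logical partitions and the task's parameter information says which user-defined snippet fills each one. The sketch below assumes two partitions named `map` and `reduce` and assumes each snippet is Python source defining a function called `run`; none of these conventions come from the source.

```python
from typing import Callable, Dict, Iterable


def load_snippet(source: str) -> Callable:
    """Compile a user-defined snippet and return the `run` function it defines."""
    namespace: Dict[str, object] = {}
    # NOTE: exec runs arbitrary code; a real system would need to sandbox this.
    exec(source, namespace)
    return namespace["run"]


def configure_streaming(
    snippets: Dict[str, str], mapping: Dict[str, str]
) -> Callable[[Iterable[str]], object]:
    """Build a streaming model from snippets and a partition-to-snippet mapping."""
    stages = {
        partition: load_snippet(snippets[snippet_id])
        for partition, snippet_id in mapping.items()
    }

    def model(lines: Iterable[str]) -> object:
        mapped = [stages["map"](line) for line in lines]  # per-record partition
        return stages["reduce"](mapped)                   # aggregation partition

    return model


# Hypothetical task content: two snippets and their partition assignments.
snippets = {
    "s1": "def run(line):\n    return len(line.split())\n",
    "s2": "def run(values):\n    return sum(values)\n",
}
mapping = {"map": "s1", "reduce": "s2"}

streaming_model = configure_streaming(snippets, mapping)
total_words = streaming_model(["a b c", "d e"])
```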
A8. The method as described in A1, wherein the data processing task includes the address of a user-defined data processing model;
Determining, according to the data processing task, the data processing model for completing the data processing task comprises: reading the user-defined data processing model from the address.
A9. The method as described in A8, wherein the method further comprises:
Saving the user-defined data processing model into the data processing model library.
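Reading a user-defined model from an address and saving it into the model library (A8 and A9) can be sketched as below. The sketch assumes the address is a local path to a Python file that defines a `process` function; the patent does not specify the address scheme or the model format, and `MODEL_LIBRARY`, `load_custom_model`, and `register` are hypothetical names.

```python
import importlib.util
import os
import tempfile
from typing import Callable, Dict

MODEL_LIBRARY: Dict[str, Callable] = {}


def load_custom_model(address: str, name: str) -> Callable:
    """Load a Python file from `address` and return its `process` function."""
    spec = importlib.util.spec_from_file_location(name, address)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file; trust must be established first
    return module.process


def register(name: str, model: Callable) -> None:
    """Save the user-defined model into the library so later tasks can select it."""
    MODEL_LIBRARY[name] = model


with tempfile.TemporaryDirectory() as tmp:
    # Simulate a user uploading a model file to some address.
    address = os.path.join(tmp, "custom_model.py")
    with open(address, "w", encoding="utf-8") as f:
        f.write("def process(records):\n    return sorted(set(records))\n")
    register("dedupe", load_custom_model(address, "custom_model"))

# The model stays usable after the source file is gone, because it is held in memory.
deduped = MODEL_LIBRARY["dedupe"](["b", "a", "b"])
```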
A10. The method as described in any one of A1-A9, wherein the data processing task further includes an output address for the data processing result;
The method further comprises: outputting the data processing result to the output address.
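A task that carries both an input address and an output address could be handled as in the sketch below. It assumes both addresses are local file paths and that the result is written as JSON; the field names `input_address` and `output_address` and the line-counting stand-in model are hypothetical.

```python
import json
import os
import tempfile
from typing import Dict


def run_task(task: Dict[str, str]) -> Dict[str, int]:
    # Read the data to be processed from the task's input address.
    with open(task["input_address"], encoding="utf-8") as src:
        lines = [line for line in src if line.strip()]
    result = {"line_count": len(lines)}  # stand-in for a real data processing model
    # Output the data processing result to the task's output address.
    with open(task["output_address"], "w", encoding="utf-8") as dst:
        json.dump(result, dst)
    return result


with tempfile.TemporaryDirectory() as tmp:
    task = {
        "input_address": os.path.join(tmp, "input.log"),
        "output_address": os.path.join(tmp, "result.json"),
    }
    with open(task["input_address"], "w", encoding="utf-8") as f:
        f.write("login u1\nlogin u2\n\n")
    returned = run_task(task)
    # Read the result back to confirm it reached the output address.
    with open(task["output_address"], encoding="utf-8") as f:
        written = json.load(f)
```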
Embodiments of the invention also disclose B11, a data processing apparatus, wherein the apparatus comprises:
A receiving unit, adapted to receive a data processing task;
A preprocessing unit, adapted to determine, according to the data processing task, a data processing model for completing the data processing task, and to read the data to be processed from a corresponding data source;
A data processing unit, adapted to perform data processing on the data to be processed using the determined data processing model, to obtain a data processing result.
B12. The apparatus as described in B11, wherein
The receiving unit is adapted to receive a data processing task submitted through a front-end page, the data processing task including at least an input address;
The preprocessing unit is adapted to read the data to be processed from the input address.
B13. The apparatus as described in B11, wherein the data to be processed is formatted log data of a business, obtained by parsing the source log data of the business specified by a user.
B14. The apparatus as described in B11, wherein the data processing task includes a specified data processing model;
The preprocessing unit is adapted to select the specified data processing model from a data processing model library.
B15. The apparatus as described in B14, wherein the data processing model library contains at least one of the following data processing models:
A new-addition statistical model;
An active statistical model;
A retention statistical model.
B16. The apparatus as described in B14, wherein the data processing task further includes parameter information for the specified data processing model;
The data processing unit is adapted to configure the specified data processing model with the parameter information, and to perform data processing on the data to be processed using the configured data processing model.
B17. The apparatus as described in B16, wherein the specified data processing model is a streaming model;
The data processing task further includes at least one user-defined code snippet;
The parameter information includes: the correspondence between each code snippet and a logical partition of the streaming model.
B18. The apparatus as described in B11, wherein the data processing task includes the address of a user-defined data processing model;
The preprocessing unit is adapted to read the user-defined data processing model from the address.
B19. The apparatus as described in B18, wherein
The preprocessing unit is further adapted to save the user-defined data processing model into the data processing model library.
B20. The apparatus as described in any one of B11-B19, wherein the data processing task further includes an output address for the data processing result;
The data processing unit is further adapted to output the data processing result to the output address.
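The division of the apparatus into a receiving unit, a preprocessing unit, and a data processing unit can be sketched as three small classes wired together. The class names follow the unit names in B11, but the task fields (`model`, `input`), the in-memory library and sources, and the `Apparatus` wrapper are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Tuple


class ReceivingUnit:
    """Receives a submitted task and checks that the required fields are present."""

    def receive(self, submitted: Dict[str, str]) -> Dict[str, str]:
        missing = {"model", "input"} - set(submitted)
        if missing:
            raise ValueError(f"task is missing fields: {sorted(missing)}")
        return dict(submitted)


class PreprocessingUnit:
    """Determines the model for the task and reads the data to be processed."""

    def __init__(self, library: Dict[str, Callable], sources: Dict[str, List[str]]):
        self.library = library
        self.sources = sources

    def prepare(self, task: Dict[str, str]) -> Tuple[Callable, List[str]]:
        return self.library[task["model"]], list(self.sources[task["input"]])


class DataProcessingUnit:
    """Applies the determined model to the data to obtain the result."""

    def run(self, model: Callable, data: List[str]) -> object:
        return model(data)


class Apparatus:
    """Wires the three units together in the order given in B11."""

    def __init__(self, library: Dict[str, Callable], sources: Dict[str, List[str]]):
        self.receiving = ReceivingUnit()
        self.preprocessing = PreprocessingUnit(library, sources)
        self.processing = DataProcessingUnit()

    def handle(self, submitted: Dict[str, str]) -> object:
        task = self.receiving.receive(submitted)
        model, data = self.preprocessing.prepare(task)
        return self.processing.run(model, data)


apparatus = Apparatus(
    library={"distinct": lambda data: len(set(data))},
    sources={"logs": ["a", "b", "a"]},
)
distinct_count = apparatus.handle({"model": "distinct", "input": "logs"})
```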

Claims (10)

1. A data processing method, wherein the method comprises:
Receiving a data processing task;
Determining, according to the data processing task, a data processing model for completing the data processing task, and reading the data to be processed from a corresponding data source;
Performing data processing on the data to be processed using the determined data processing model, to obtain a data processing result.
2. The method as claimed in claim 1, wherein receiving the data processing task comprises:
Receiving a data processing task submitted through a front-end page, the data processing task including at least an input address;
Reading the data to be processed from the corresponding data source comprises: reading the data to be processed from the input address.
3. The method as claimed in claim 1, wherein the data to be processed is formatted log data of a business, obtained by parsing the source log data of the business specified by a user.
4. The method as claimed in claim 1, wherein the data processing task includes a specified data processing model;
Determining, according to the data processing task, the data processing model for completing the data processing task comprises: selecting the specified data processing model from a data processing model library.
5. The method as claimed in claim 4, wherein the data processing model library contains at least one of the following data processing models:
A new-addition statistical model;
An active statistical model;
A retention statistical model.
6. A data processing apparatus, wherein the apparatus comprises:
A receiving unit, adapted to receive a data processing task;
A preprocessing unit, adapted to determine, according to the data processing task, a data processing model for completing the data processing task, and to read the data to be processed from a corresponding data source;
A data processing unit, adapted to perform data processing on the data to be processed using the determined data processing model, to obtain a data processing result.
7. The apparatus as claimed in claim 6, wherein
The receiving unit is adapted to receive a data processing task submitted through a front-end page, the data processing task including at least an input address;
The preprocessing unit is adapted to read the data to be processed from the input address.
8. The apparatus as claimed in claim 6, wherein the data to be processed is formatted log data of a business, obtained by parsing the source log data of the business specified by a user.
9. The apparatus as claimed in claim 6, wherein the data processing task includes a specified data processing model;
The preprocessing unit is adapted to select the specified data processing model from a data processing model library.
10. The apparatus as claimed in claim 9, wherein the data processing model library contains at least one of the following data processing models:
A new-addition statistical model;
An active statistical model;
A retention statistical model.
CN201611090820.XA 2016-12-01 2016-12-01 Data processing method and apparatus Pending CN106708965A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611090820.XA CN106708965A (en) 2016-12-01 2016-12-01 Data processing method and apparatus

Publications (1)

Publication Number Publication Date
CN106708965A true CN106708965A (en) 2017-05-24

Family

ID=58934405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611090820.XA Pending CN106708965A (en) 2016-12-01 2016-12-01 Data processing method and apparatus

Country Status (1)

Country Link
CN (1) CN106708965A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104978256A (en) * 2014-04-10 2015-10-14 阿里巴巴集团控股有限公司 Log output method and equipment
CN105426292A (en) * 2015-10-29 2016-03-23 网易(杭州)网络有限公司 Game log real-time processing system and method
US20160232085A1 (en) * 2015-02-10 2016-08-11 Wipro Limited Method and device for improving software performance testing
CN106168909A (en) * 2016-06-30 2016-11-30 北京奇虎科技有限公司 A kind for the treatment of method and apparatus of daily record

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108200129A (en) * 2017-12-22 2018-06-22 北京智慧星光信息技术有限公司 A kind of internet statistical data acquisition methods and system
CN109144695A (en) * 2018-08-30 2019-01-04 百度在线网络技术(北京)有限公司 A kind of processing method, device, equipment and the medium of task topological relation
CN109144695B (en) * 2018-08-30 2021-08-10 百度在线网络技术(北京)有限公司 Method, device, equipment and medium for processing task topological relation
US11321122B2 (en) 2018-08-30 2022-05-03 Apollo Intelligent Driving Technology (Beijing) Co., Ltd. Method, apparatus, device and medium for processing topological relation of tasks
CN111008253A (en) * 2018-10-08 2020-04-14 阿里巴巴集团控股有限公司 Data model generation method, data warehouse generation device and electronic equipment
CN111008253B (en) * 2018-10-08 2023-04-28 阿里巴巴集团控股有限公司 Data model generation method, data warehouse generation method, data model generation device and electronic equipment
CN109408559A (en) * 2018-10-09 2019-03-01 北京易观智库网络科技有限公司 Retain the method, apparatus and storage medium of analysis
CN109299083A (en) * 2018-10-16 2019-02-01 全球能源互联网研究院有限公司 A kind of data governing system
CN110750727A (en) * 2019-10-28 2020-02-04 京东数字科技控股有限公司 Data processing method, device, system and computer readable storage medium
CN116303834A (en) * 2023-05-19 2023-06-23 北京弘维大数据技术有限公司 Data warehouse historical data storage and processing method, system and device
CN116303834B (en) * 2023-05-19 2024-03-08 北京弘维大数据技术有限公司 Data warehouse historical data storage and processing method, system and device

Similar Documents

Publication Publication Date Title
CN106682097A (en) Method and device for processing log data
CN106648859A (en) Task scheduling method and device
CN106682096A (en) Method and device for log data management
CN106708965A (en) Data processing method and apparatus
CN106681808A (en) Task scheduling method and device
Dijkman et al. Business process architectures: overview, comparison and framework
CN106682099A (en) Data storage method and device
Inel et al. Crowdtruth: Machine-human computation framework for harnessing disagreement in gathering annotated data
US9020907B2 (en) Method and system for ranking affinity degree among functional blocks
JP5306360B2 (en) Method and system for analysis of systems for matching data records
US8751216B2 (en) Table merging with row data reduction
CN103886376B (en) System and method for rule-based information filtering
US20170109676A1 (en) Generation of Candidate Sequences Using Links Between Nonconsecutively Performed Steps of a Business Process
US9875277B1 (en) Joining database tables
US20170109668A1 (en) Model for Linking Between Nonconsecutively Performed Steps in a Business Process
CN103473672A (en) System, method and platform for auditing metadata quality of enterprise-level data center
US20170109667A1 (en) Automaton-Based Identification of Executions of a Business Process
An et al. Methodology for automatic ontology generation using database schema information
US9037552B2 (en) Methods for analyzing a database and devices thereof
US20170109639A1 (en) General Model for Linking Between Nonconsecutively Performed Steps in Business Processes
US20170109638A1 (en) Ensemble-Based Identification of Executions of a Business Process
Utamachant et al. An analysis of high-value datasets: a case study of Thailand’s open government data
US20230368091A1 (en) Systems and methods for efficiently distributing alert messages
US11928100B2 (en) Method and system for creating a unified data repository
CN105335466A (en) Audio data retrieval method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170524
