CN101719168A - Algorithm configurability-based universal data loading method - Google Patents

Info

Publication number
CN101719168A
CN101719168A CN201010100470A CN201010100470A CN101719168A CN 101719168 A CN101719168 A CN 101719168A CN 201010100470 A CN201010100470 A CN 201010100470A CN 201010100470 A CN201010100470 A CN 201010100470A CN 101719168 A CN101719168 A CN 101719168A
Authority
CN
China
Prior art keywords
data
algorithm
name
file
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201010100470A
Other languages
Chinese (zh)
Inventor
祝乃国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Communication Information System Co Ltd
Original Assignee
Inspur Communication Information System Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Communication Information System Co Ltd
Priority to CN201010100470A
Publication of CN101719168A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a universal data loading method based on configurable algorithms. With this method, data from files of any data source can be loaded; data (data sources) can be mapped flexibly to fields (data tables); loading scripts are generated automatically; and SQL statements can be adjusted appropriately for different database platforms. The method can therefore serve as a data loading method for all database platforms, and is particularly well suited to processing massive data and to cases where the data to be loaded requires special processing.

Description

A universal data loading method based on configurable algorithms
Technical field
The present invention relates to the field of computer data processing, and specifically to a universal data loading method based on configurable algorithms.
Background art
A database is a common tool for working with data, and data must be loaded into the database before it can be used. During loading, the standard operations of the database platform are used: insert (to insert new records) and update (to update existing data). Alternatively, the platform-specific fast loading tools of different database platforms are used, such as the sqlload command of the Oracle platform or dbaccess of the Informix platform.
The simplest and most commonly used loading approach is to generate a concrete SQL statement directly from the data of a specific data source and to load the data with the database's insert operation (the loading status is judged from the return value of the SQL command, and update is used to update existing data). This approach is inflexible and hard to modify: when data is migrated to a different database platform, every SQL statement must be revised.
In addition, the ETL tools of some data warehouses (or similar tools) implement a general data loading function: the mapping between data sources and the fields of the target database tables is defined flexibly, which meets the need for flexible loading, and when porting to a different database platform only the database-specific part needs to be modified. However, such tools still fall well short of practical needs in terms of flexibility. For example, date formats differ between data files. Some can be converted with built-in functions (for example, from year-month-day format to day-month-year format), but more unusual ones (for example, a time represented as a long integer) cannot be converted with built-in operations, so the data cannot be loaded. In other cases, data in coded form must be loaded: the codes are converted to their content using a dictionary table in the target database, and the result is loaded into a new table in the target database. These functions are either not well implemented in such tools or are implemented inefficiently (for example, code-to-content conversion when loading into a new table is usually implemented by joining with the dictionary table, which is very inefficient and barely usable with massive data).
Summary of the invention
The object of the present invention is to provide a universal data loading method based on configurable algorithms that addresses the generality of data loading and the need for special processing of data during loading.
The object of the present invention is achieved in the following manner, which mainly comprises:
1) a universal data loading method with configurable algorithms (an algorithm is the correspondence between data source variables and target table fields; it may be a mixed arithmetic operation using the four basic operations, or a special processing of certain data);
2) a method for converting data using the data in a data dictionary table in the database and loading the converted data;
3) a method for handling data to be loaded that requires special processing (such as date conversion, string handling, or conditional value selection) before it can be loaded;
4) a general loading method that can be used with different database platforms.
The mapping between source data and target data is applied according to configuration, and special data can be handled;
The data source may be a database or a data file (a database is used directly, or exported to a file, according to the configured field correspondence);
Different source variables and target data tables can be selected, so functions are reusable. Users can develop functions for their own processing needs, or configure the corresponding value functions, and use them during loading;
Loading is based on standard SQL operations, and differences between database platforms are absorbed in the algorithm configuration, so the method adapts automatically to various database platforms.
A general loading system that adopts this method has, for example, the following advantages:
1) porting between database platforms is independent of the complexity of the data content and requires almost zero effort; neither algorithms nor programs need adjustment, which saves a great deal of implementation time;
2) the time needed to meet a new data processing requirement drops from a week or several working days with other approaches to a few hours, and no technician with programming skills is required, only configuration;
3) the invention is highly adaptable: data can be given special processing as needed, and dictionary data can be read directly from the database to produce the data required for loading. If a required special processing is not provided by the existing system, users can write their own processing function and configure it in the algorithm, so generality and extensibility are strong.
Data from files of any data source can be loaded.
The invention maps data (data sources) flexibly to fields (data tables), generates loading scripts automatically, and can adjust SQL statements appropriately for different database platforms, so it can serve as a data loading method for all database platforms. It is particularly well suited to processing massive data and to cases where the data to be loaded requires special processing.
Brief description of the drawings
Fig. 1 is a flow chart of the system implementation.
Detailed description of the embodiments
The method of the present invention is described in detail below with reference to the accompanying drawing.
Data processing mainly comprises obtaining data from the data source, parsing and loading the data, and processing the data in the database.
The present invention focuses on the loading function after the data has been parsed. In the data processing flow illustrated in Fig. 1, a data source corresponds to a number of data files (formed by collection), or a database is used directly. If the data to be loaded is a file, the required format is as follows:
variable name 1|variable name 2|variable name 3
value 1|value 2|value 3
(If the data to be loaded does not conform to this format, it is first converted to this format and written to a new file named as required, after which the loading function described in this method is used.)
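The file format above can be read with a few lines of code. The following is a minimal sketch in Python, assuming a single header line of variable names followed by one data row per line; the function name `read_data_file` and the sample variables `id`, `total`, and `count` are hypothetical and do not come from the patent.

```python
import io


def read_data_file(stream, sep="|"):
    """Yield one dict per data row, keyed by the variable names in the header line."""
    header = stream.readline().rstrip("\n")
    names = [n.strip() for n in header.split(sep)]
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        values = [v.strip() for v in line.split(sep)]
        if len(values) != len(names):
            raise ValueError(f"expected {len(names)} values, got {len(values)}: {line!r}")
        yield dict(zip(names, values))


# Hypothetical sample file: header line, then two data rows.
sample = io.StringIO("id|total|count\n7|10|4\n8|3|0\n")
rows = list(read_data_file(sample))
```

Values are kept as strings at this stage; converting them is left to the configured algorithms that run later in the flow.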
If the data to be loaded is a data table, the data file name corresponds to the table name, and the variable names in the data file correspond to the field names of the table. The processing is otherwise identical, so the file-based case is used for the explanation below. The system uses one main program and one set of configuration information: the main program parses the configuration information and calls the functions, while the configuration information describes the algorithms. The system implementation flow is as follows:
Algorithm description configuration information: comprises database table name, database table field, data file name, algorithm description, whether the field is a unique key, and default value;
Read the algorithm configuration file to form the program control data; dictionary data algorithm configuration information: comprises algorithm handle, data dictionary table name, condition field, and value field;
Parse the configuration file information;
Read the dictionary table data to form in-memory arrays;
Read the file to be loaded, and use the file name to look up the corresponding information in the algorithm data for calculation;
Parse the algorithm and calculate the value, which includes taking values from the in-memory arrays, mixed arithmetic operations, and calling external functions;
Generate the value of each field to be loaded according to the algorithm;
Form the SQL statement and execute the insert operation;
If the insert fails, update the business data.
● Configurable algorithm management
The algorithm configuration information used in the system has the format: algorithm version number; database table name; database table field; data block name; algorithm description; whether unique key; default value. When the program is called, the algorithm version number and the directory of the data to be loaded are specified. The invocation is as follows:
Dataload (program name) -ruleset (algorithm version number) 1 -datadir (directory of data to be processed) /data/u1
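As an illustration of how such configuration and invocation might be handled, here is a minimal Python sketch. It assumes that each configuration entry is one semicolon-separated line with the seven fields in the order listed above, and that the unique-key flag is written as Y or N; both are assumptions, as are the names `Rule`, `parse_rule`, `parse_args`, and the sample rule itself. Only the `-ruleset` and `-datadir` options are taken from the text.

```python
import argparse
from dataclasses import dataclass


@dataclass
class Rule:
    version: str        # algorithm version number
    table: str          # database table name
    field: str          # database table field
    data_block: str     # data block (data file) name
    algorithm: str      # algorithm description
    is_unique_key: bool
    default: str


def parse_rule(line):
    """Split one semicolon-separated configuration line into a Rule."""
    parts = [p.strip() for p in line.split(";")]
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}: {line!r}")
    version, table, field, block, algorithm, unique, default = parts
    # Assumption: the unique-key flag is encoded as Y/N.
    return Rule(version, table, field, block, algorithm, unique.upper() == "Y", default)


def parse_args(argv):
    """Parse the two options shown in the invocation example."""
    parser = argparse.ArgumentParser(prog="dataload")
    parser.add_argument("-ruleset", required=True, help="algorithm version number")
    parser.add_argument("-datadir", required=True, help="directory of data to be processed")
    return parser.parse_args(argv)


# Hypothetical rule: field drop_rate of table cell_stats is computed from two variables.
rule = parse_rule("1;cell_stats;drop_rate;cell_perf;drops / attempts;N;0")
args = parse_args(["-ruleset", "1", "-datadir", "/data/u1"])
```

A main program along these lines would then keep only the rules whose version matches `args.ruleset` and apply them to the files found under `args.datadir`.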
● Dictionary data conversion
Value lookup algorithms also have corresponding configuration information, which describes where the data comes from and by what rule it is obtained. The data source is usually a dictionary table in the database, although this can be extended to a static data dictionary file. The configuration information is described as follows:
algorithm handle; data dictionary table name; condition field; value field
For example:
OBJECT_ID_2_NAME;object;ID;NAME
From this configuration information, the main program knows that values are taken from the table object. It reads the contents of that table into an in-memory hash array named OBJECT_ID_2_NAME, whose keys are the contents of the ID field and whose values are the contents of the NAME field. When %OBJECT_ID_2_NAME (id) is encountered, the corresponding value is fetched from the array using the value of the variable in the data file. This achieves the effect of an SQL statement of the form:
Select name From object Where id=:id;
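The dictionary lookup described above can be sketched as follows. This is an illustrative Python example using the standard `sqlite3` module as a stand-in database; the helper names `parse_dictionary_rule` and `load_dictionary`, and the sample rows in the `object` table, are hypothetical. The point it shows is that the table is read once into memory, so each data row is resolved with a hash lookup rather than a per-row query or join.

```python
import sqlite3


def parse_dictionary_rule(line):
    """Split 'handle;table;condition field;value field' into its four parts."""
    handle, table, cond_field, value_field = [p.strip() for p in line.split(";")]
    return handle, table, cond_field, value_field


def load_dictionary(conn, table, cond_field, value_field):
    """Read the whole dictionary table once into an in-memory mapping."""
    # Identifiers come from trusted configuration, not from the data files.
    sql = f'SELECT "{cond_field}", "{value_field}" FROM "{table}"'
    return {str(key): value for key, value in conn.execute(sql)}


# Stand-in database with a hypothetical dictionary table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE object (ID INTEGER PRIMARY KEY, NAME TEXT)")
conn.executemany(
    "INSERT INTO object (ID, NAME) VALUES (?, ?)",
    [(1, "router-a"), (2, "switch-b")],
)

handle, table, cond_field, value_field = parse_dictionary_rule(
    "OBJECT_ID_2_NAME;object;ID;NAME"
)
dictionaries = {handle: load_dictionary(conn, table, cond_field, value_field)}

# Resolving %OBJECT_ID_2_NAME(id) for one data row is now a hash lookup.
row = {"id": "2"}
name = dictionaries["OBJECT_ID_2_NAME"].get(row["id"])
```

Keys are stored as strings because values read from the data file are strings as well.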
● Arithmetic algorithms
Mixed arithmetic with the four basic operations is comparatively simple and direct to implement: the algorithm description only needs to be converted into a statement of the development language and executed to obtain the result. The variables used in an arithmetic description must be present in the data file; otherwise the value cannot be calculated.
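One way to evaluate such an arithmetic description is sketched below in Python. Instead of executing the description directly as a language statement, this sketch walks the parsed expression tree and allows only the four basic operations, numeric constants, and variable names, which is a deliberately more restrictive substitute technique. The function name `evaluate` and the sample variables `total` and `failed` are hypothetical.

```python
import ast
import operator

# Only the four basic arithmetic operations are permitted.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression, variables):
    """Evaluate an arithmetic description using values taken from one data row."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            if node.id not in variables:
                # Mirrors the rule that every variable must exist in the data file.
                raise KeyError(f"variable {node.id!r} not found in data file")
            return float(variables[node.id])
        raise ValueError("unsupported element in arithmetic description")

    return walk(ast.parse(expression, mode="eval"))


result = evaluate("(total - failed) / total * 100", {"total": "200", "failed": "50"})
```

A description that refers to a variable missing from the data row raises `KeyError`, so the value is reported as not computable rather than silently defaulted.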
● Special processing
A special algorithm forms a new value from certain variables in the file through some conversion relationship. Because such special cases cannot be enumerated exhaustively, the invention provides only the method and does not list particular algorithms. Whatever development language is used (the programming language used for data processing), it can include a separate function file. In this function file, calculation functions can be defined for the different special requirements, taking variables as input parameters and returning the value. For example, BIN_2_OCT (id) can implement conversion from binary to decimal; the variable in the data file can then be passed to it directly in the algorithm configuration information, and the function converts it into the required value.
When a new requirement arises, it is only necessary to add a corresponding processing function to the function file and to reference the function name in the algorithm configuration information. For example, a special processing function that handles data whose denominator is zero can be written as an independently executable function that does not interfere with other special processing.
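A function file of this kind might be organised as a registry of named functions. The Python sketch below is purely illustrative and is not the listing from the patent: the names `safe_ratio`, `bin_to_decimal`, `FUNCTIONS`, and `call_function`, the handles `SAFE_RATIO` and `BIN_2_DEC`, and the sample row are all hypothetical. It shows a zero-denominator guard and a binary-to-decimal conversion that can each be called independently by name.

```python
def safe_ratio(numerator, denominator, default=0.0):
    """Return numerator / denominator, or `default` when the denominator is zero."""
    den = float(denominator)
    if den == 0:
        return default
    return float(numerator) / den


def bin_to_decimal(bits):
    """Convert a string of binary digits to its decimal integer value."""
    return int(bits, 2)


# Registry mapping the names used in the algorithm configuration to functions.
# Supporting a new requirement means adding one function and one entry here.
FUNCTIONS = {
    "SAFE_RATIO": safe_ratio,
    "BIN_2_DEC": bin_to_decimal,
}


def call_function(name, row, arg_names):
    """Look up a configured function and call it with variables from a data row."""
    func = FUNCTIONS[name]
    return func(*(row[a] for a in arg_names))


row = {"drops": "3", "attempts": "0", "flags": "1011"}
ratio = call_function("SAFE_RATIO", row, ["drops", "attempts"])
flags = call_function("BIN_2_DEC", row, ["flags"])
```

Because each function only sees the arguments passed to it, adding or changing one entry does not affect the others.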
● General loading method
Because all special processing of data is implemented in the algorithm configuration, differences between the built-in functions of different database platforms, and differences in the requirements for time fields, are all handled by configuration in algorithm management. The general loading function can therefore be implemented with the most basic data manipulation statements, insert and update.
First, the processed values are matched to their fields, and an insert statement is generated automatically and executed (a development language such as Perl can provide the same execution commands for different database platforms; differences between database platforms are handled when the environment is installed and prepared).
Second, for SQL statements that fail during the insert operation, the cause is checked; if the insert failed because of a duplicate unique key, an update operation is used instead (the condition fields of the update are defined in the algorithm configuration).
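The insert-then-update sequence can be sketched as follows, again in Python with the standard `sqlite3` module standing in for the target database. The function name `load_row`, the table `cell_stats`, and its columns are hypothetical. The sketch assumes that a failed insert surfaces as `sqlite3.IntegrityError`; a fuller implementation would inspect the error to confirm that a duplicate unique key was the cause before falling back to update.

```python
import sqlite3


def load_row(conn, table, values, key_fields):
    """Insert one row; if a unique key already exists, update the row instead."""
    columns = list(values)
    placeholders = ", ".join("?" for _ in columns)
    insert_sql = f'INSERT INTO "{table}" ({", ".join(columns)}) VALUES ({placeholders})'
    try:
        conn.execute(insert_sql, [values[c] for c in columns])
        return "inserted"
    except sqlite3.IntegrityError:
        # Assumption: the constraint failure is a duplicate unique key.
        set_cols = [c for c in columns if c not in key_fields]
        update_sql = (
            f'UPDATE "{table}" SET '
            + ", ".join(f"{c} = ?" for c in set_cols)
            + " WHERE "
            + " AND ".join(f"{k} = ?" for k in key_fields)
        )
        params = [values[c] for c in set_cols] + [values[k] for k in key_fields]
        conn.execute(update_sql, params)
        return "updated"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cell_stats (cell_id TEXT PRIMARY KEY, drop_rate REAL)")

first = load_row(conn, "cell_stats", {"cell_id": "c1", "drop_rate": 0.5}, ["cell_id"])
second = load_row(conn, "cell_stats", {"cell_id": "c1", "drop_rate": 0.25}, ["cell_id"])
stored = conn.execute(
    "SELECT drop_rate FROM cell_stats WHERE cell_id = 'c1'"
).fetchone()[0]
```

Only plain insert and update statements with bound parameters are used, which is what lets the same logic run against different database platforms through a common database interface.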

Claims (1)

1. A universal data loading method based on configurable algorithms, characterized in that, in the process of processing and loading data, the data source corresponds to the data to be loaded, which is formed by collection; if the data to be loaded is a file, the following format is required:
variable name 1|variable name 2|variable name 3
value 1|value 2|value 3
if the data to be loaded does not conform to this format, it is first converted to this format and written to a new file named as required, after which the loading function described above is used;
if the data to be loaded is a data table, the data file name corresponds to the data table name and the variable names in the data file correspond to the field names of the data table; the system uses one main program and one set of configuration information, the main program parsing the configuration information and calling the functions, and the configuration information describing the algorithms; the system implementation flow is as follows:
Algorithm description configuration information: comprises database table name, database table field, data file name, algorithm description, whether the field is a unique key, and default value;
Read the algorithm configuration file to form the program control data; dictionary data algorithm configuration information: comprises algorithm handle, data dictionary table name, condition field, and value field;
Parse the configuration file information;
Read the dictionary table data to form in-memory arrays;
Read the file to be loaded, and use the file name to look up the corresponding information in the algorithm data for calculation;
Parse the algorithm and calculate the value, which includes taking values from the in-memory arrays, mixed arithmetic operations, and calling external functions;
Generate the value of each field to be loaded according to the algorithm;
Form the SQL statement and execute the insert operation;
If the insert fails, update the business data.
CN201010100470A 2010-01-25 2010-01-25 Algorithm configurability-based universal data loading method Pending CN101719168A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010100470A CN101719168A (en) 2010-01-25 2010-01-25 Algorithm configurability-based universal data loading method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010100470A CN101719168A (en) 2010-01-25 2010-01-25 Algorithm configurability-based universal data loading method

Publications (1)

Publication Number Publication Date
CN101719168A true CN101719168A (en) 2010-06-02

Family

ID=42433742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010100470A Pending CN101719168A (en) 2010-01-25 2010-01-25 Algorithm configurability-based universal data loading method

Country Status (1)

Country Link
CN (1) CN101719168A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101866364A (en) * 2010-06-22 2010-10-20 用友软件股份有限公司 Data lead-in method and device
CN101866364B (en) * 2010-06-22 2011-12-07 用友软件股份有限公司 Data lead-in method and device
WO2012048555A1 (en) * 2010-10-13 2012-04-19 中兴通讯股份有限公司 Method and device for importing data into database
CN101980190A (en) * 2010-10-15 2011-02-23 中兴通讯股份有限公司 Method and device for quickly putting service data into base
CN102339317A (en) * 2011-10-20 2012-02-01 北京握奇数据系统有限公司 High-capacity database card and data communication method thereof
CN103345501A (en) * 2013-06-27 2013-10-09 华为技术有限公司 Method and device for updating database
CN105577425A (en) * 2015-12-07 2016-05-11 浪潮通信信息系统有限公司 Method and device for processing network management data
CN105608163A (en) * 2015-12-18 2016-05-25 北京金山安全软件有限公司 Database storage method and interface
CN106446064A (en) * 2016-09-05 2017-02-22 中国银行股份有限公司 Data conversion method and device
CN106446064B (en) * 2016-09-05 2020-07-21 中国银行股份有限公司 Data conversion method and device
CN109299103A (en) * 2018-10-23 2019-02-01 贵阳朗玛信息技术股份有限公司 A kind of database store process enters to join system and method
CN109299103B (en) * 2018-10-23 2022-02-18 贵阳朗玛信息技术股份有限公司 Database storage process parameter entering system and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20100602