CN104484441A - File batch processing and scheduling method - Google Patents
- Publication number
- CN104484441A (application CN201410816038.6A)
- Authority
- CN
- China
- Prior art keywords
- file
- external data
- data file
- files
- database
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a file batch processing and scheduling method. The method comprises the following steps: receiving an external data file issued by a download platform; and loading the external data file into a database. Each processing stage of the external data file is scheduled in a state-driven manner, so that files are processed efficiently and concurrently while resource usage stays under control. A state is set for each processing step of a file and recorded in the database; each file is processed as soon as it is received, its processing steps are invoked in sequence, and the processing stages of different files run with the greatest possible concurrency.
Description
Technical field
The invention relates to a file processing method, and in particular to a method for file batch processing and scheduling.
Background technology
At present, in data processing systems, checking, cleaning, and loading the external data files received from source systems is extremely important and is the foundation of data warehouse construction. For systems with very large data volumes in particular, meeting these requirements efficiently and stably is even more critical.
The prior art offers no dedicated tool or method for the concurrent batch processing and scheduling of large numbers of files. Take AIX (Advanced Interactive eXecutive) as an example: AIX is a UNIX-like operating system developed by IBM based on AT&T UNIX System V, and it runs on minicomputer hardware built around IBM's proprietary Power family of chips. It offers good security, manageability, and availability, and is widely used in fields such as banking and retail. For banks, however, the concurrent processing and scheduling of large numbers of files has always suffered from low efficiency and insufficient stability.
Summary of the invention
In view of the above problems in the prior art, the object of the present invention is to provide a file batch processing and scheduling method that batch-processes and schedules external data files from source systems efficiently and stably.
To achieve this object, the file batch processing and scheduling method provided by the invention comprises:
Receiving the external data file issued by the download platform;
Loading the external data file into a database.
Preferably, loading the external data file into the database comprises:
Connecting to the database;
Obtaining a load control file and loading the external data file into the database according to the load control file.
Preferably, after connecting to the database, the log file path is obtained first; after the external data file has been loaded into the database, the load log file is checked to determine whether the load succeeded, and if it did, the file state is updated and the database connection is closed.
Preferably, when obtaining the load control file, if the file is obtained successfully, the data in the current partition of the current table is deleted from the database and the method proceeds to the step of loading the external data file; otherwise, the load control file is written first and then obtained again.
Preferably, before the external data file is loaded into the database, it is determined whether the external data file is a current-period file. If so, the external data file is loaded into the database; otherwise, the external data file is compressed and saved, and it is decompressed when the period number reaches a preset value.
Preferably, before determining whether the external data file is a current-period file, the external data file is cleaned. This step comprises: checking the file control information; obtaining common information such as the file separator and the cleaning configuration file; cleaning the file line by line according to the cleaning rule of each field; writing the cleaned data line by line to the cleaned file; and calculating the cleaning error rate.
Preferably, before the external data file is cleaned, the external data file is checked, which comprises:
Connecting to the database;
Opening the external data file, reading the file control information, checking the file control information, and setting a different file state for each different check result.
Preferably, the file control information comprises the system name, the downloaded table name, the incremental/full flag, the file separator, the start date of the data content, and the end date of the data content.
Preferably, the external data file is decompressed before it is checked.
Compared with the prior art, the file batch processing and scheduling method of the present invention schedules each processing stage of the external data file in a state-driven manner, so that files are processed efficiently and concurrently while resource usage stays under control. A state is set for each processing step of a file and recorded in the database; each file is processed as soon as it arrives and its processing steps are invoked in sequence, so that the processing stages of different files run with the greatest possible concurrency.
Brief description of the drawings
Fig. 1 is the overall flow chart of the file batch processing and scheduling method of the present invention.
Fig. 2 is the overall flow chart of loading the external data file in the file batch processing and scheduling method of the present invention.
Fig. 3 is the overall flow chart of checking the external data file in the file batch processing and scheduling method of the present invention.
Detailed description of the embodiments
The technical solution of the present invention is described in further detail below with reference to the drawings and specific embodiments.
The file batch processing and scheduling method provided by the invention offers concurrent processing and scheduling of large numbers of files under the AIX system, and provides control for the basic data preparation stage of data warehouse construction. In its basic form it comprises: first receiving the external data file issued by the download platform, and then loading the external data file into the database with maximum concurrency. These two steps form the most basic embodiment of the technical solution. Fig. 1 shows another, more specific embodiment. As shown in Fig. 1, the method comprises:
S10, receive the external data file. Here, external data files generally refers to all data files coming from the download platform. In data processing systems, checking, cleaning, and loading the external data files received from source systems is extremely important and is the foundation of data warehouse construction; for systems with very large data volumes in particular, meeting these requirements efficiently and stably is even more critical.
S11, decompress the external data file. If the external data file sent by the download platform is in a compressed format, it must be decompressed here so that subsequent operations can be performed on it. In practice, this step can be carried out by calling gunzip.
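As an illustration of step S11, the following is a minimal Python sketch of what a gunzip call does to a single file. The function name `decompress_gz` is hypothetical and not part of the patent; unlike the gunzip command, this sketch leaves the compressed original in place.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src: Path) -> Path:
    """Decompress a .gz file next to itself and return the path of the result.

    Hypothetical helper: the patent only says that gunzip can be called.
    """
    if src.suffix != ".gz":
        raise ValueError(f"not a gzip file name: {src}")
    dst = src.with_suffix("")  # sample.dat.gz -> sample.dat
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream, so large files do not fill memory
    return dst
```

Streaming with `shutil.copyfileobj` keeps memory use flat, which matters when many large files are decompressed concurrently.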
S12, check the external data file. This step checks whether the file control information of each file is complete and reads from it the system name, downloaded table name, incremental/full flag, file separator, and the start and end dates of the data content. The database table name corresponding to the external data file is then obtained from the configuration in one database table, and this information is recorded in another table. For example, for an ODS (Operational Data Store) data set, the ODS table name corresponding to the file is obtained from the configuration in the SYS_TABNAMECHG table, and the information is recorded in the SYS_FTPFILECTL table. After the file check finishes, the file state is 3000.
Fig. 3 shows the overall flow chart of checking the external data file in the file batch processing and scheduling method of the present invention. As shown in Fig. 3, the file check comprises the following steps:
S31, connect to the database. If the connection succeeds, proceed to step S32.
S32, open the source file of the external data file to be checked. If the file opens successfully, proceed to step S33; if it fails to open, set the file state to 2005.
S33, read the file control information of the external data file. If the read succeeds, proceed to step S34; if reading the file control information fails, set the file state to 2001.
S34, check the external data file. If the check succeeds, proceed to step S35; if it fails, set a different file state for each different check result.
S35, update the file state. If the update succeeds, proceed to step S36; if it fails, set the file state to 2006.
S36, close the file.
S37, close the database connection.
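The check flow of Fig. 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the patent does not give the layout of the file control information, so the sketch assumes it is the first line of the file in a hypothetical `key=value;...` form with dates as `YYYYMMDD` and an incremental/full flag of `I` or `F`. The states 2005, 2001, and 3000 come from the text; `STATE_CHECK_FAILED = 2002` is an invented placeholder, and the database steps (S31, S35, S37) are omitted.

```python
from dataclasses import dataclass
from datetime import datetime

STATE_OPEN_FAILED = 2005      # from the text: the file could not be opened
STATE_READ_CTL_FAILED = 2001  # from the text: control information unreadable
STATE_CHECKED = 3000          # from the text: the check finished
STATE_CHECK_FAILED = 2002     # hypothetical: the text gives no code for this case

REQUIRED = ("system", "table", "mode", "sep", "start", "end")


@dataclass
class ControlInfo:
    system: str  # system name
    table: str   # downloaded table name
    mode: str    # incremental/full flag, assumed "I" or "F"
    sep: str     # file separator
    start: str   # start date of the data content, assumed YYYYMMDD
    end: str     # end date of the data content, assumed YYYYMMDD


def parse_control_line(line):
    """Parse the assumed 'key=value;...' header; raise ValueError if malformed."""
    fields = dict(part.split("=", 1) for part in line.strip().split(";") if part)
    missing = [key for key in REQUIRED if key not in fields]
    if missing:
        raise ValueError(f"missing control fields: {missing}")
    return ControlInfo(**{key: fields[key] for key in REQUIRED})


def check_file(path):
    """Return (state, control_info) for one external data file."""
    try:
        fh = open(path, encoding="utf-8")
    except OSError:
        return STATE_OPEN_FAILED, None
    with fh:
        try:
            info = parse_control_line(fh.readline())
        except ValueError:
            return STATE_READ_CTL_FAILED, None
    try:
        start = datetime.strptime(info.start, "%Y%m%d")
        end = datetime.strptime(info.end, "%Y%m%d")
    except ValueError:
        return STATE_CHECK_FAILED, info
    if info.mode not in ("I", "F") or start > end:
        return STATE_CHECK_FAILED, info
    return STATE_CHECKED, info
```

Returning the state instead of writing it directly keeps the check itself free of database code; a caller would record the returned state in the status table.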
S13, clean the external data file. This step checks the file control information, obtains common information such as the file separator and the cleaning configuration file, cleans the file line by line according to the cleaning rule of each field, writes the cleaned data line by line to the cleaned file, and calculates the cleaning error rate.
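The line-by-line cleaning in step S13 might look like the sketch below. The patent does not define the cleaning rules or what happens to a bad row, so two assumptions are made here: each rule is a hypothetical callable that returns the cleaned field or raises `ValueError`, and rows that fail are dropped and counted toward the error rate.

```python
def clean_file(src, dst, sep, rules):
    """Clean src line by line into dst and return the cleaning error rate.

    rules holds one callable per field (hypothetical rule format).
    A row with the wrong field count or a failing rule is skipped and
    counted as an error (assumed policy, not stated in the patent).
    """
    total = 0
    errors = 0
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        for line in fin:
            total += 1
            fields = line.rstrip("\n").split(sep)
            try:
                if len(fields) != len(rules):
                    raise ValueError("unexpected field count")
                cleaned = [rule(value) for rule, value in zip(rules, fields)]
            except ValueError:
                errors += 1
                continue
            fout.write(sep.join(cleaned) + "\n")
    return errors / total if total else 0.0
```

For example, `rules = [lambda s: str(int(s)), str.strip]` would require an integer in the first field and trim whitespace in the second. A caller could compare the returned rate with a threshold to decide whether the file should move on to loading.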
S14, before loading the external data file into the database, determine whether the external data file is a current-period file. If so, load the external data file into the database; otherwise, proceed to step S15.
S15, compress and save the external data file, and proceed to step S16 when the period number reaches the preset value.
S16, decompress the external data file.
S17, load the external data file into the database. This step loads the cleaned data file into its corresponding database table. Continuing the example above, the corresponding ODS table name is found through the SYS_TABNAMECHG table, and the cleaned external data file is then loaded into that ODS table by calling a program (for example, the sqlldr tool). Once the load succeeds, the file state is set to 6000.
In this step, as shown in Fig. 2, the external data file is loaded as follows:
S21, connect to the database.
S22, after connecting to the database, first obtain the log file path. If it is obtained successfully, proceed to step S23.
S23, obtain the load control file. If it is obtained successfully, proceed to step S25; otherwise, proceed to step S24.
S24, when obtaining the load control file fails in step S23, automatically write the load control file first, then obtain it again and proceed to step S25.
S25, delete the data in the current partition of the current table from the database, then proceed to the step of loading the external data file. For example, still taking the ODS data set, once the database is connected and the load control file has been obtained, this step deletes the data in the current partition of the current ODS table.
S26, load the external data file into the database.
S27, obtain the load control file and load the external data file into the database according to the load control file.
S28, update the file state.
S29, close the database connection.
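Since the text names sqlldr (Oracle SQL*Loader) as one possible loading program, the control-file and log-check parts of the flow above can be sketched as follows. The function names and the column list are hypothetical, and the patent does not show its actual control file. The log patterns assume the usual SQL*Loader summary lines ("N Rows successfully loaded." and "N Rows not loaded due to data errors."); verify them against the logs of the installed version before relying on them.

```python
import re


def build_control_file(data_path, table, sep, columns):
    """Return the text of a basic SQL*Loader control file (assumed layout).

    APPEND is used because the flow above deletes the current partition
    before loading.
    """
    column_list = ", ".join(columns)
    return (
        "LOAD DATA\n"
        f"INFILE '{data_path}'\n"
        "APPEND\n"
        f"INTO TABLE {table}\n"
        f"FIELDS TERMINATED BY '{sep}'\n"
        "TRAILING NULLCOLS\n"
        f"({column_list})\n"
    )


def sqlldr_command(userid, control_path, log_path):
    """Build the argument list for a sqlldr call, e.g. for subprocess.run."""
    return ["sqlldr", f"userid={userid}", f"control={control_path}", f"log={log_path}"]


def load_succeeded(log_text):
    """Judge success from the load log: rows were loaded and none were rejected."""
    loaded = re.search(r"(\d+) Rows? successfully loaded", log_text)
    rejected = re.search(r"(\d+) Rows? not loaded due to data errors", log_text)
    if loaded is None:
        return False
    return rejected is None or int(rejected.group(1)) == 0
```

A caller would write the control file to disk when it does not yet exist (step S24), run the command, read the log from the path obtained in step S22, and set the file state to 6000 only when `load_succeeded` returns true.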
In actual use, the corresponding programs are deployed directly in the file system, the relevant parameter tables are installed in the database, and the required directories are created.
The file batch processing and scheduling method of the present invention schedules each processing stage of the external data file in a state-driven manner, so that files are processed efficiently and concurrently while resource usage stays under control. A state is set for each processing step of a file and recorded in the database; each file is processed as soon as it arrives and its processing steps are invoked in sequence, so that the processing stages of different files run with the greatest possible concurrency. The method provides, for example, concurrent processing and scheduling of large numbers of files under the AIX system, and provides control for the basic data preparation stage of data warehouse construction.
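The state-driven scheduling idea can be sketched as below. This is a simplified illustration, not the patented implementation: an in-memory SQLite table stands in for the status table, a thread pool with a fixed worker count stands in for the resource limit, and the state codes 1000 and 4000 are invented (only 3000 and 6000 appear in the text). The stage handlers are supplied by the caller.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

# Only 3000 (checked) and 6000 (loaded) come from the text; the rest are hypothetical.
RECEIVED, CHECKED, CLEANED, LOADED = 1000, 3000, 4000, 6000

# Maps a file's current state to (stage to run, state after success).
PIPELINE = {
    RECEIVED: ("check", CHECKED),
    CHECKED: ("clean", CLEANED),
    CLEANED: ("load", LOADED),
}


class StateStore:
    """Records each file's state in a database table (SQLite as a stand-in)."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:", check_same_thread=False)
        self._lock = threading.Lock()
        self._conn.execute(
            "CREATE TABLE file_state (name TEXT PRIMARY KEY, state INTEGER)"
        )

    def set(self, name, state):
        with self._lock:
            self._conn.execute(
                "INSERT OR REPLACE INTO file_state VALUES (?, ?)", (name, state)
            )
            self._conn.commit()

    def get(self, name):
        with self._lock:
            row = self._conn.execute(
                "SELECT state FROM file_state WHERE name = ?", (name,)
            ).fetchone()
        return row[0]


def process_file(store, name, handlers):
    """Run one file's stages in order, recording the state after each stage."""
    while store.get(name) in PIPELINE:
        stage, next_state = PIPELINE[store.get(name)]
        handlers[stage](name)
        store.set(name, next_state)


def run_batch(names, handlers, max_workers=4):
    """Process files concurrently; max_workers bounds the resources in use."""
    store = StateStore()
    for name in names:
        store.set(name, RECEIVED)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(lambda n: process_file(store, n, handlers), names))
    return {name: store.get(name) for name in names}
```

Within one file the stages run strictly in sequence, while different files can be in different stages at the same moment. Because the state lives in a table, an interrupted run could in principle resume each file from its last recorded state.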
Of course, the above is a preferred embodiment of the present invention. It should be noted that those skilled in the art may make improvements and refinements without departing from the principles of the invention, and such improvements and refinements are also regarded as falling within the scope of protection of the invention.
Claims (8)
1. A file batch processing and scheduling method, characterized in that it comprises:
Receiving the external data file issued by the download platform;
Loading the external data file into a database.
2. The file batch processing and scheduling method according to claim 1, characterized in that loading the external data file into the database comprises:
Connecting to the database;
Obtaining a load control file and loading the external data file into the database according to the load control file.
3. The file batch processing and scheduling method according to claim 2, characterized in that, after connecting to the database, the log file path is obtained first; after the external data file has been loaded into the database, the load log file is checked to determine whether the load succeeded, and if it did, the file state is updated and the database connection is closed.
4. The file batch processing and scheduling method according to claim 2, characterized in that, when obtaining the load control file, if the file is obtained successfully, the data in the current partition of the current table is deleted from the database and the method proceeds to the step of loading the external data file; otherwise, the load control file is written first and then obtained again.
5. The file batch processing and scheduling method according to claim 1, characterized in that, before the external data file is loaded into the database, it is determined whether the external data file is a current-period file; if so, the external data file is loaded into the database; otherwise, the external data file is compressed and saved, and it is decompressed when the period number reaches a preset value.
6. The file batch processing and scheduling method according to claim 5, characterized in that, before determining whether the external data file is a current-period file, the external data file is cleaned, this step comprising: checking the file control information; obtaining the file separator and the cleaning configuration file; cleaning the file line by line according to the preset cleaning rule of each field; and then writing the cleaned data line by line to the cleaned file.
7. The file batch processing and scheduling method according to claim 6, characterized in that, before the external data file is cleaned, the external data file is checked, which comprises:
Connecting to the database;
Opening the external data file, reading the file control information, checking the file control information, and setting a different file state for each different check result.
8. The file batch processing and scheduling method according to claim 7, characterized in that the file control information comprises the system name, the downloaded table name, the incremental/full flag, the file separator, the start date of the data content, and the end date of the data content.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410816038.6A CN104484441A (en) | 2014-12-23 | 2014-12-23 | File batch processing and scheduling method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104484441A true CN104484441A (en) | 2015-04-01 |
Family
ID=52758982
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410816038.6A Pending CN104484441A (en) | 2014-12-23 | 2014-12-23 | File batch processing and scheduling method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104484441A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070294219A1 (en) * | 2004-01-22 | 2007-12-20 | International Business Machines Corporation | Shared scans utilizing query monitor during query execution to improve buffer cache utilization across multi-stream query environments |
CN101251861A (en) * | 2008-03-18 | 2008-08-27 | 北京锐安科技有限公司 | Method for loading and inquiring magnanimity data |
JP2008242677A (en) * | 2007-03-27 | 2008-10-09 | Hitachi Information Systems Ltd | Database construction-supporting system, database construction information-generating method, and program |
CN103077241A (en) * | 2013-01-10 | 2013-05-01 | 中国银行股份有限公司 | Method for loading data in parallel after splitting files |
Non-Patent Citations (5)
Title |
---|
夏阳等: "Oracle数据库的备份方法及策略", 《微型机与应用》 * |
李恒锐: "构建数据仓库的ETL系统研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 * |
李玉萍等: "局域网中WindowsNT、Oracle7服务器故障9例", 《医学信息》 * |
秦峰巍等: "基于SQL*Loader的海量数据装载方案优化", 《武汉理工大学学报信息与管理工程版》 * |
顾轶: "数据库控制文件丢失后的恢复", 《电脑报》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20150401 |