CN106407329A - Method for automatically importing incremental data from massive platform to hadoop platform - Google Patents


Info

Publication number
CN106407329A
Authority
CN
China
Prior art keywords
data
platform
magnanimity
hadoop
imports
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610801065.5A
Other languages
Chinese (zh)
Other versions
CN106407329B (en)
Inventor
刘飞
傅靖
王栋
李伟伦
蒋亮
江陈桢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Nantong Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Nantong Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Nantong Power Supply Co of State Grid Jiangsu Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN201610801065.5A priority Critical patent/CN106407329B/en
Publication of CN106407329A publication Critical patent/CN106407329A/en
Application granted granted Critical
Publication of CN106407329B publication Critical patent/CN106407329B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G06F16/2386 Bulk updating operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for automatically importing incremental data from a massive platform to a Hadoop platform. The massive platform serves as the source database and the Hadoop big data platform as the target database, and an automated incremental data import method developed in Java imports the incremental data in the source database into the target database. Compared with traditional Hadoop platform data import methods, the method greatly improves data import efficiency and the quality of the imported data, providing a sound foundation for the correctness of subsequent big data statistical analysis.

Description

Method for automatically importing incremental data from a massive platform to a Hadoop platform
Technical field
The present invention relates to a method for automatically importing incremental data from a massive platform to a Hadoop platform.
Background art
After decades of rapid development, and with construction of the next-generation intelligent power system under way across the board, China's power system has become the world's largest specialized Internet of Things, one that bears on the national economy and people's livelihood. To some extent, this production network, which runs through every link of production and operation, has built the largest big data computing platform in China and lays the foundation for large-scale allocation of energy resources across multiple dimensions such as time and space. For the power industry, electric power big data will run through every link of future power production and management and play a unique and substantial role. It is the key for China's power industry to cope with problems such as resource constraints and environmental pressure while building the next-generation power system, and to achieve well-founded, green and sustainable development.
In recent years, informatization of the power industry has also made considerable progress. Informatization of China's electric power enterprises began in the 1960s: from the initial automation of power production, to management informatization represented by computerized accounting in the 1980s, to the large-scale enterprise informatization of recent years. In particular, with the all-round construction of the next-generation smart grid and the wide application in the power industry of new-generation IT represented by electric power big data, power data resources have begun to grow sharply and have reached a certain scale. In the long run, as a "barometer" of China's economic and social development, power data, through its close and extensive links with economic development, will show unmatched positive externalities and become a stronger driving force for China's socio-economic development and even for the progress of human society.
Which technology should be used to process the mass data that the power industry continuously generates? Hadoop has become the first choice of many enterprises. Because of its wide applicability and ease of use in big data processing, Hadoop was quickly adopted in industry after its release in 2007 and has also attracted extensive attention and research from academia. Within a few short years, Hadoop became the most successful and most widely accepted mainstream big data processing technology and system platform to date, and a de facto industry standard for big data processing. It has been further developed and improved extensively by industry and is widely used across application sectors, especially the Internet industry.
With mass data and the corresponding Hadoop big data processing technology in place, the question becomes how to import this mass data into the Hadoop platform. Against this background, this patent proposes an efficient and stable solution.
Summary of the invention
The object of the present invention is to provide an efficient and stable method for automatically importing incremental data from a massive platform to a Hadoop platform.
The technical solution of the present invention is:
A method for automatically importing incremental data from a massive platform to a Hadoop platform, characterized in that: the massive platform is the source database, the Hadoop big data platform is the target database, and an automated incremental data import method developed in Java is used to import the incremental data in the source database into the target database;
In the Hadoop big data platform, MapReduce serves as the big data computing engine, the HDFS distributed file system stores unstructured and semi-structured data, and the HBase distributed database stores structured data; the Java automated incremental data import program needs to run once a day, and the functions of the Java program are implemented in two parts:
(1) The Java program collects incremental data from the massive platform. The massive platform provides a data query interface whose parameters include time range, table name and query condition; the Java program calls this interface with the corresponding parameters configured and obtains the corresponding data. Because the Java program obtains data from the massive platform by calling a remote interface, network abnormalities can occur. To handle this, before executing the method that obtains data, the Java program first checks whether the network connection is normal; if it is not, the program writes a log entry and records the corresponding date tag. Once the network environment is back to normal, the Java program reruns according to the recorded date tag and reacquires the data for that date. In addition, if a network abnormality occurs while the Java program is executing the data acquisition method, the data already obtained is not imported into the big data platform, and the log entry and tag are recorded in the same way;
(2) After the Java program has obtained the incremental data from the massive platform, it imports the data into the big data platform. Before doing so, it first verifies whether the obtained data is normal, and abnormal data is rejected. The data obtained by the Java program is imported into the big data platform by appending. Before importing, the Java program likewise first checks whether the network connection is normal; if the network is abnormal, it records a log entry and a date tag, and the import method is not executed. In addition, if a network abnormality occurs while the import method is executing, the program records a log entry, a date tag and a data label, so that it is known which data for the current date has already been imported into the big data platform; data already imported into the big data platform will not be imported again.
The step of first verifying whether the obtained data is normal consists of checking whether the data contains null values or negative values.
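As an illustration of this check, the following is a minimal sketch rather than the patented implementation. The `Reading` row shape and the class names are assumptions made for the example; the patent does not define a record schema.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical row shape: one measured value per row (not defined by the patent). */
final class Reading {
    final String id;
    final Double value; // null models a missing (null) value

    Reading(String id, Double value) {
        this.id = id;
        this.value = value;
    }
}

final class ReadingValidator {
    /** A row counts as normal only if its value is present and not negative. */
    static boolean isNormal(Reading r) {
        return r != null && r.value != null && r.value >= 0.0;
    }

    /** Returns only the normal rows, preserving order; abnormal rows are dropped. */
    static List<Reading> rejectAbnormal(List<Reading> rows) {
        List<Reading> kept = new ArrayList<>();
        for (Reading r : rows) {
            if (isNormal(r)) {
                kept.add(r);
            }
        }
        return kept;
    }
}
```

Calling `ReadingValidator.rejectAbnormal(rows)` before the append step keeps rows with null or negative values out of the target platform.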
The export function, the import function and the exception handling function are developed in the Java language using the Eclipse 3.6 development tool.
Implementing the export function requires confirming that the network port between the Java program and the massive platform is open and that the Java program can successfully call the query interface provided by the massive platform.
Implementing the import function requires confirming that the network port between the Java program and the big data platform is open and that the Java program can successfully use the commands provided by the Hadoop big data platform to append the obtained data to the HDFS file system.
Implementing the network abnormality detection and data verification functions: both when exporting data from the massive platform and when importing data into the big data platform, the program must check whether the network is abnormal and record the related log entry and date tag, so as to identify when the data ran into a problem and, once the network is normal, export or import the affected data again. When importing, the program checks whether the data is problematic; abnormal data with null or negative values is filtered out by the program itself, which improves the overall quality of the imported data and provides high-quality data for subsequent quality evaluation.
Compared with traditional Hadoop platform data import methods, the present method not only greatly improves data import efficiency but also improves the quality of the imported data, providing a good foundation for the accuracy of subsequent big data statistical analysis. It addresses the low efficiency, large errors and poor timeliness of traditional data import methods; the proposed method for automatically importing incremental data into the Hadoop platform greatly improves the efficiency of importing data into the Hadoop platform and ensures the timeliness and accuracy of the imported data.
Brief description of the drawings
The invention is further described below with reference to the accompanying drawing and an embodiment.
Fig. 1 is the system flow chart of the present invention.
Detailed description of the embodiments
A method for automatically importing incremental data from a massive platform to a Hadoop platform: the massive platform is the source database, the Hadoop big data platform is the target database, and an automated incremental data import method developed in Java is used to import the incremental data in the source database into the target database;
In the Hadoop big data platform, MapReduce serves as the big data computing engine, the HDFS distributed file system stores unstructured and semi-structured data, and the HBase distributed database stores structured data; the Java automated incremental data import program needs to run once a day, and because the massive platform is used less at night, the program's execution time is set to 24:00; the functions of the Java program are implemented in two parts:
(1) The Java program collects incremental data from the massive platform. The massive platform provides a data query interface whose parameters include time range, table name and query condition; the Java program calls this interface with the corresponding parameters configured and obtains the corresponding data. Because the Java program obtains data from the massive platform by calling a remote interface, network abnormalities can occur. To handle this, before executing the method that obtains data, the Java program first checks whether the network connection is normal; if it is not, the program writes a log entry and records the corresponding date tag. Once the network environment is back to normal, the Java program reruns according to the recorded date tag and reacquires the data for that date. In addition, if a network abnormality occurs while the Java program is executing the data acquisition method, the data already obtained is not imported into the big data platform, and the log entry and tag are recorded in the same way;
(2) After the Java program has obtained the incremental data from the massive platform, it imports the data into the big data platform. Before doing so, it first verifies whether the obtained data is normal, and abnormal data is rejected. The data obtained by the Java program is imported into the big data platform by appending. Before importing, the Java program likewise first checks whether the network connection is normal; if the network is abnormal, it records a log entry and a date tag, and the import method is not executed. In addition, if a network abnormality occurs while the import method is executing, the program records a log entry, a date tag and a data label, so that it is known which data for the current date has already been imported into the big data platform; data already imported into the big data platform will not be imported again.
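The use of data labels in step (2) to avoid repeated imports can be sketched as follows. This is only an illustration under stated assumptions: `AppendSink` is a hypothetical stand-in for the HDFS append call so that the example runs without Hadoop, each row is assumed to carry a unique label, and the labels are kept in memory, whereas a real job would persist them with the log.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the HDFS append call (assumption for this sketch). */
interface AppendSink {
    void append(String line) throws IOException;
}

final class ResumableImporter {
    // Data labels of rows already written for the current date.
    private final Set<String> importedLabels = new HashSet<>();

    /**
     * Appends every row whose label has not been recorded yet.
     * Each row is {label, payload}. Returns the number of rows written by this call.
     * If the sink fails part-way, the labels recorded so far are kept, so a rerun
     * skips the rows that already reached the target platform.
     */
    int importRows(List<String[]> rows, AppendSink sink) throws IOException {
        int written = 0;
        for (String[] row : rows) {
            String label = row[0];
            if (importedLabels.contains(label)) {
                continue; // already imported; do not import twice
            }
            sink.append(row[1]);
            importedLabels.add(label); // record the label only after a successful append
            written++;
        }
        return written;
    }
}
```

Recording the label only after the append succeeds means an interrupted row is retried on the next run rather than silently skipped.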
The step of first verifying whether the obtained data is normal consists of checking whether the data contains null values or negative values.
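The once-a-day run at 24:00 described in this embodiment could be scheduled along the following lines. This is a hedged sketch: the patent does not say how the program is triggered, and the class name and the use of `ScheduledExecutorService` are assumptions made for the example.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class DailyImportScheduler {
    /** Time remaining from 'now' until the next 00:00, i.e. 24:00 of the current day. */
    static Duration untilNextMidnight(LocalDateTime now) {
        LocalDateTime nextMidnight = now.toLocalDate().plusDays(1).atStartOfDay();
        return Duration.between(now, nextMidnight);
    }

    /** Runs 'job' at the next midnight and then every 24 hours. */
    static ScheduledExecutorService start(Runnable job) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        long initialDelayMs = untilNextMidnight(LocalDateTime.now()).toMillis();
        pool.scheduleAtFixedRate(job, initialDelayMs, TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
        return pool;
    }
}
```

A fixed 24-hour period is the simplest choice; a deployment in a time zone with daylight saving changes, or one that prefers an external scheduler such as cron, would need a different trigger.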
Deploy the hardware and software environment: the CentOS 6.5 operating system is installed on the cluster servers, after which the network environment needs to be configured. Because the Java program relies on the support of the Hadoop big data platform, the Hadoop-related software needs to be installed on the server cluster.
The export function, the import function and the exception handling function are developed in the Java language using the Eclipse 3.6 development tool.
Implementing the export function requires confirming that the network port between the Java program and the massive platform is open and that the Java program can successfully call the query interface provided by the massive platform.
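One way to confirm that a network port is open is to attempt a TCP connection with a timeout. The sketch below is an assumption about how such a check might look, since the patent does not specify the mechanism; host, port and timeout are supplied by the caller.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortCheck {
    /** Returns true if a TCP connection to host:port can be opened within timeoutMs. */
    static boolean isOpen(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // refused, unreachable, unresolved host, or timed out
        }
    }
}
```

An open port only shows that something is listening; whether the query interface itself responds correctly still has to be confirmed by an actual call.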
Implementing the import function requires confirming that the network port between the Java program and the big data platform is open and that the Java program can successfully use the commands provided by the Hadoop big data platform to append the obtained data to the HDFS file system.
The code is as follows:
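import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Target file in HDFS that the new data is appended to
String hdfs_path = "hdfs://mycluster/home/wyp/wyp.txt";
Configuration conf = new Configuration();
// Enable append support on the HDFS client
conf.setBoolean("dfs.support.append", true);
// Local file holding the data to append
String inpath = "/home/wyp/append.txt";
FileSystem fs = null;
try {
    fs = FileSystem.get(URI.create(hdfs_path), conf);
    // Input stream over the local file to be appended
    InputStream in = new BufferedInputStream(new FileInputStream(inpath));
    OutputStream out = fs.append(new Path(hdfs_path));
    // Copy with a 4096-byte buffer; the final 'true' closes both streams when done
    IOUtils.copyBytes(in, out, 4096, true);
} catch (IOException e) {
    e.printStackTrace();
}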
Implementing the network abnormality detection and data verification functions: both when exporting data from the massive platform and when importing data into the big data platform, the program must check whether the network is abnormal and record the related log entry and date tag, so as to identify when the data ran into a problem and, once the network is normal, export or import the affected data again. When importing, the program checks whether the data is problematic; abnormal data with null or negative values is filtered out by the program itself, which improves the overall quality of the imported data and provides high-quality data for subsequent quality evaluation.
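The date-tag mechanism described above can be sketched as follows. This is a minimal illustration, not the program from the patent: `networkOk` and `transfer` are hypothetical hooks for the connectivity check and for the export or import of one day's data, failures are signalled by a `RuntimeException`, and the pending date tags are held in memory rather than in a log file.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

final class DateTagRetry {
    // Date tags of days whose export or import has not completed yet.
    private final TreeSet<LocalDate> pending = new TreeSet<>();

    /**
     * Handles 'today' plus every previously tagged date, oldest first.
     * Returns the dates completed by this call; unfinished dates stay tagged.
     */
    List<LocalDate> run(LocalDate today, BooleanSupplier networkOk, Consumer<LocalDate> transfer) {
        pending.add(today);
        List<LocalDate> done = new ArrayList<>();
        List<LocalDate> snapshot = new ArrayList<>(pending);
        for (LocalDate date : snapshot) {
            if (!networkOk.getAsBoolean()) {
                break; // network abnormal: keep the remaining date tags for the next run
            }
            try {
                transfer.accept(date);
                pending.remove(date);
                done.add(date);
            } catch (RuntimeException e) {
                System.err.println("transfer failed, date tag kept: " + date);
            }
        }
        return done;
    }

    /** Dates still waiting to be exported or imported. */
    List<LocalDate> pendingDates() {
        return new ArrayList<>(pending);
    }
}
```

With this shape, a night on which the network is down leaves that date tagged, and the next successful run processes the tagged date before the current one.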

Claims (7)

1. A method for automatically importing incremental data from a massive platform to a Hadoop platform, characterized in that: the massive platform is the source database, the Hadoop big data platform is the target database, and an automated incremental data import method developed in Java is used to import the incremental data in the source database into the target database;
In the Hadoop big data platform, MapReduce serves as the big data computing engine, the HDFS distributed file system stores unstructured and semi-structured data, and the HBase distributed database stores structured data; the Java automated incremental data import program needs to run once a day, and the functions of the Java program are implemented in two parts:
(1) The Java program collects incremental data from the massive platform; the massive platform provides a data query interface whose parameters include time range, table name and query condition, and the Java program calls this interface with the corresponding parameters configured and obtains the corresponding data; because the Java program obtains data from the massive platform by calling a remote interface, network abnormalities can occur; to handle this, before executing the method that obtains data, the Java program first checks whether the network connection is normal, and if it is not, writes a log entry and records the corresponding date tag; once the network environment is back to normal, the Java program reruns according to the recorded date tag and reacquires the data for that date; in addition, if a network abnormality occurs while the Java program is executing the data acquisition method, the data already obtained is not imported into the big data platform, and the log entry and tag are recorded in the same way;
(2) After the Java program has obtained the incremental data from the massive platform, it imports the data into the big data platform; before doing so, it first verifies whether the obtained data is normal, and abnormal data is rejected; the data obtained by the Java program is imported into the big data platform by appending; before importing, the Java program likewise first checks whether the network connection is normal, and if the network is abnormal, records a log entry and a date tag, and the import method is not executed; in addition, if a network abnormality occurs while the import method is executing, the program records a log entry, a date tag and a data label, so that it is known which data for the current date has already been imported into the big data platform; data already imported into the big data platform will not be imported again.
2. The method for automatically importing incremental data from a massive platform to a Hadoop platform according to claim 1, characterized in that: the step of first verifying whether the obtained data is normal consists of checking whether the data contains null values or negative values.
3. The method for automatically importing incremental data from a massive platform to a Hadoop platform according to claim 1, characterized in that: the export function, the import function and the exception handling function are developed in the Java language using the Eclipse 3.6 development tool.
4. The method for automatically importing incremental data from a massive platform to a Hadoop platform according to claim 2, characterized in that: the export function, the import function and the exception handling function are developed in the Java language using the Eclipse 3.6 development tool.
5. The method for automatically importing incremental data from a massive platform to a Hadoop platform according to claim 1, characterized in that: implementing the export function requires confirming that the network port between the Java program and the massive platform is open and that the Java program can successfully call the query interface provided by the massive platform.
6. The method for automatically importing incremental data from a massive platform to a Hadoop platform according to claim 1, characterized in that: implementing the import function requires confirming that the network port between the Java program and the big data platform is open and that the Java program can successfully use the commands provided by the Hadoop big data platform to append the obtained data to the HDFS file system.
7. The method for automatically importing incremental data from a massive platform to a Hadoop platform according to claim 1, characterized in that: in implementing the network abnormality detection and data verification functions, both when exporting data from the massive platform and when importing data into the big data platform, the program must check whether the network is abnormal and record the related log entry and date tag, so as to identify when the data ran into a problem and, once the network is normal, export or import the affected data again; when importing, the program checks whether the data is problematic, and abnormal data with null or negative values is filtered out by the program itself, which improves the overall quality of the imported data and provides high-quality data for subsequent quality evaluation.
CN201610801065.5A 2016-09-05 2016-09-05 Method for automatically importing incremental data from massive platform to hadoop platform Active CN106407329B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610801065.5A CN106407329B (en) 2016-09-05 2016-09-05 Method for automatically importing incremental data from massive platform to hadoop platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610801065.5A CN106407329B (en) 2016-09-05 2016-09-05 Method for automatically importing incremental data from massive platform to hadoop platform

Publications (2)

Publication Number Publication Date
CN106407329A (en) 2017-02-15
CN106407329B (en) 2019-06-25

Family

ID=57999461

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610801065.5A Active CN106407329B (en) 2016-09-05 2016-09-05 Method for automatically importing incremental data from massive platform to hadoop platform

Country Status (1)

Country Link
CN (1) CN106407329B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085613A (en) * 2017-05-17 2017-08-22 广州四三九九信息科技有限公司 Enter the filter method and device of library file
CN108596686A (en) * 2018-05-09 2018-09-28 珠海横琴盛达兆业科技投资有限公司 A method of realizing that chain partner drugstore management system and Foshan food medicine prison system data are integrated
CN108647362A (en) * 2018-05-21 2018-10-12 珠海横琴盛达兆业科技投资有限公司 A method of realizing that Mono-drugstore management system and Foshan food medicine prison system data are integrated

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102546247A (en) * 2011-12-29 2012-07-04 华中科技大学 Massive data continuous analysis system suitable for stream processing
CN105243067A (en) * 2014-07-07 2016-01-13 北京明略软件系统有限公司 Method and apparatus for realizing real-time increment synchronization of data
US20160196311A1 (en) * 2013-09-26 2016-07-07 Shenzhen Audaque Data Technology Ltd Data quality measurement method and system based on a quartile graph

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102546247A (en) * 2011-12-29 2012-07-04 华中科技大学 Massive data continuous analysis system suitable for stream processing
US20160196311A1 (en) * 2013-09-26 2016-07-07 Shenzhen Audaque Data Technology Ltd Data quality measurement method and system based on a quartile graph
CN105243067A (en) * 2014-07-07 2016-01-13 北京明略软件系统有限公司 Method and apparatus for realizing real-time increment synchronization of data

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085613A (en) * 2017-05-17 2017-08-22 广州四三九九信息科技有限公司 Enter the filter method and device of library file
CN107085613B (en) * 2017-05-17 2020-07-28 广州四三九九信息科技有限公司 Method and device for filtering files to be put in storage
CN108596686A (en) * 2018-05-09 2018-09-28 珠海横琴盛达兆业科技投资有限公司 A method of realizing that chain partner drugstore management system and Foshan food medicine prison system data are integrated
CN108647362A (en) * 2018-05-21 2018-10-12 珠海横琴盛达兆业科技投资有限公司 A method of realizing that Mono-drugstore management system and Foshan food medicine prison system data are integrated

Also Published As

Publication number Publication date
CN106407329B (en) 2019-06-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant