CN108595156A - A kind of batch processing method and system based on Impala components - Google Patents

A kind of batch processing method and system based on Impala components

Info

Publication number
CN108595156A
Authority
CN
China
Prior art keywords
batch processing
impala
processing task
target batch
components
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810385610.6A
Other languages
Chinese (zh)
Inventor
翁宇哲
林强
陈翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank Of Ningbo Co Ltd
Original Assignee
Bank Of Ningbo Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank Of Ningbo Co Ltd
Priority to CN201810385610.6A
Publication of CN108595156A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/20 Software design
    • G06F8/24 Object-oriented
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/35 Creation or generation of source code model driven
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application provides a batch processing method and system based on the Impala component. A user only needs to enter, in a preset template, SQL statements that correspond to a target batch processing task and conform to the Impala component's format. A preset-format file with the SQL statements embedded is then generated and run. By calling at least one encapsulated function package that corresponds to the target batch processing task in the runtime environment of the preset-format file, the application programming interface of the Impala component is invoked and the SQL statements are sent to the Impala component, which executes the target batch processing task. This spares developers from writing large amounts of program code just to call the Impala component, avoids the errors introduced by writing a new program for every batch run, and improves the execution efficiency of batch processing.

Description

A batch processing method and system based on the Impala component
Technical Field
The present invention relates to the technical field of data processing, and in particular to a batch processing method and system based on the Impala component.
Background
In recent years, big data processing and analysis has become a global issue. As the level of informatization and automation of China's economy and society keeps rising, many fields such as government, public services, scientific research and commercial applications also face big data problems. A big data platform is a service-oriented big data analysis platform solution, built on the Internet and data centers, that was proposed to meet the growing demand for big data processing.
The Impala component is a real-time interactive SQL query engine that runs on existing big data platforms and enables real-time queries over data in the HDFS file system and in HBase databases. Many organizations currently try to use a big data platform and the Impala component for batch data processing. However, because systems and business requirements differ between organizations, existing big data platforms and the Impala component cannot meet the batch processing requirements of every organization as they stand. When an organization uses a big data platform and the Impala component for batch data processing, it usually has to write a large number of complex programs to call the Impala component, which imposes a heavy workload on developers and is error-prone.
Summary of the Invention
In view of this, the present invention provides a batch processing method and system based on the Impala component that can call the Impala component directly to perform batch processing, thereby improving batch processing efficiency.
To achieve the above object, the specific technical solution provided by the present invention is as follows:
A batch processing method based on the Impala component, comprising:
obtaining SQL statements that a user enters in a preset template, the SQL statements corresponding to a target batch processing task and conforming to the Impala component's format;
generating a preset-format file in which the SQL statements are embedded;
running the preset-format file and, by calling at least one encapsulated function package corresponding to the target batch processing task in the runtime environment of the preset-format file, invoking the application programming interface of the Impala component and sending the SQL statements to the Impala component, so that the Impala component executes the target batch processing task.
Preferably, after obtaining the SQL statements corresponding to the target batch processing task that the user enters in the preset template, the method further comprises:
checking whether the input parameters in the SQL statements are correct, and notifying the user of an input parameter error when one is detected.
Preferably, the method further comprises:
recording in a log the parameter information of the target batch processing task, the execution state of each script in the target batch processing task, and the execution state of each SQL statement.
Preferably, the method further comprises:
during execution of the target batch processing task, feeding back the execution state of each script and the execution state of each SQL statement.
Preferably, the method further comprises:
during execution of the target batch processing task, monitoring the execution state of each script in the target batch processing task and, when any script is found to execute abnormally, notifying the user that the script execution is abnormal.
Preferably, the method further comprises:
after the target batch processing task finishes executing, releasing the memory resources and CPU resources occupied by the target batch processing task.
A batch processing system based on the Impala component, comprising:
an SQL acquiring unit, configured to obtain SQL statements that a user enters in a preset template, the SQL statements corresponding to a target batch processing task and conforming to the Impala format;
a generation unit, configured to generate a preset-format file in which the SQL statements are embedded;
an Impala calling unit, configured to run the preset-format file and, by calling at least one encapsulated function package corresponding to the target batch processing task in the runtime environment of the preset-format file, invoke the application programming interface of the Impala component and send the SQL statements to the Impala component, so that the Impala component executes the target batch processing task.
Preferably, the system further comprises:
an input parameter detection unit, configured to check whether the input parameters in the SQL statements are correct and to notify the user of an input parameter error when one is detected.
Preferably, the system further comprises:
a logging unit, configured to record in a log the parameter information of the target batch processing task, the execution state of each script in the target batch processing task, and the execution state of each SQL statement.
Preferably, the system further comprises:
a state feedback unit, configured to feed back, during execution of the target batch processing task, the execution state of each script and the execution state of each SQL statement.
Preferably, the system further comprises:
an exception monitoring unit, configured to monitor, during execution of the target batch processing task, the execution state of each script in the target batch processing task and, when any script is found to execute abnormally, to notify the user that the script execution is abnormal.
Preferably, the system further comprises:
a resource releasing unit, configured to release, after the target batch processing task finishes executing, the memory resources and CPU resources occupied by the target batch processing task.
Compared with the prior art, the beneficial effects of the present invention are as follows:
In the batch processing method and system based on the Impala component provided by the present invention, a user only needs to enter, in a preset template, SQL statements that correspond to a target batch processing task and conform to the Impala component's format. A preset-format file with the SQL statements embedded is generated and run. By calling at least one encapsulated function package corresponding to the target batch processing task in the runtime environment of the preset-format file, the application programming interface of the Impala component is invoked and the SQL statements are sent to the Impala component, so that the Impala component executes the target batch processing task. This spares developers from writing large amounts of program code just to call the Impala component, avoids the errors introduced by writing a new program for every batch run, and improves the execution efficiency of batch processing.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a batch processing method based on the Impala component disclosed in an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a batch processing system based on the Impala component disclosed in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, which is a flowchart of a batch processing method based on the Impala component disclosed in this embodiment, the method specifically includes the following steps:
S101: Obtain SQL statements that a user enters in a preset template, the SQL statements corresponding to a target batch processing task and conforming to the Impala component's format;
The preset template is developed in advance; preferably, it is developed in the Python language.
Here the user may be a batch processing developer. The user enters the SQL statements corresponding to the target batch processing task according to the input requirements of the preset template. The SQL statements must conform to the Impala component's format, and there must be at least one SQL statement. The batch processing method disclosed in this embodiment supports both a single SQL statement and a batch of SQL statements.
The user does not need to be concerned with how the batch processing logic is implemented internally and only needs to focus on the SQL itself, which makes batch processing easier to carry out.
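A preset template of this kind might look like the following minimal sketch. The names TASK_NAME, SQL_STATEMENTS and collect_statements, the ${acct_date} placeholder and the table names are assumptions made for illustration; the patent does not disclose the actual template layout.

```python
"""Hypothetical preset template: the user edits only the two values below."""

# Name of the target batch processing task (illustrative).
TASK_NAME = "daily_balance_summary"

# One or more Impala-format SQL statements; a single string is also accepted.
# ${acct_date} is an assumed placeholder convention for the accounting date.
SQL_STATEMENTS = [
    "INSERT OVERWRITE TABLE rpt.balance_daily PARTITION (dt='${acct_date}') "
    "SELECT acct_id, SUM(balance) FROM ods.balance "
    "WHERE dt='${acct_date}' GROUP BY acct_id",
    "COMPUTE STATS rpt.balance_daily",
]


def collect_statements(raw):
    """Normalize single-SQL and batch-SQL input to a non-empty list of strings."""
    statements = [raw] if isinstance(raw, str) else list(raw)
    # Drop blank entries and trailing semicolons so each item is one statement.
    statements = [s.strip().rstrip(";") for s in statements if s.strip()]
    if not statements:
        raise ValueError("the template must contain at least one SQL statement")
    return statements
```

Accepting either a string or a list is one way to cover the single-SQL and batch-SQL modes with the same downstream code.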
Preferably, after obtaining the SQL statements corresponding to the target batch processing task that the user enters in the preset template, the method further comprises:
checking whether the input parameters in the SQL statements are correct, and notifying the user of an input parameter error when one is detected.
An input parameter error may be, for example, an input parameter whose format is invalid.
By default, the input parameter of a typical script is the accounting date. The accounting date refers to the date of the data in the target batch processing task.
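The format check on the accounting date could be sketched as follows. The function name check_accounting_date and the YYYYMMDD format are assumptions; the patent only states that a malformed input parameter should be reported to the user.

```python
from datetime import datetime


def check_accounting_date(value, fmt="%Y%m%d"):
    """Return the parsed date, or raise ValueError with a user-facing message."""
    try:
        parsed = datetime.strptime(value, fmt)
        # strptime tolerates unpadded fields such as "2018426", so require
        # that formatting the parsed value reproduces the input exactly.
        if parsed.strftime(fmt) != value:
            raise ValueError("not in canonical form")
    except (TypeError, ValueError):
        raise ValueError(
            f"input parameter error: accounting date {value!r} "
            f"does not match format {fmt}"
        ) from None
    return parsed.date()
```

Raising a single exception type with a readable message lets the caller forward the text to the user unchanged.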
S102: Generate a preset-format file in which the SQL statements are embedded;
When the system is developed in the Python language, the preset-format file is a Python file. Of course, when the system is developed in another language, the preset-format file is a file in the format of that development language.
Specifically, the preset-format file with the SQL statements embedded is generated from the SQL statements that the user entered in the preset template, which correspond to the target batch processing task and conform to the Impala component's format.
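One way to generate such a Python file is to substitute the statements into a script skeleton, as in the sketch below. The skeleton, the batch_runtime module and the run_batch function are hypothetical stand-ins for the encapsulated function packages; the patent does not specify how the file is produced.

```python
from pathlib import Path
from string import Template

# Skeleton of the generated script. batch_runtime and run_batch are
# hypothetical names; they are not defined by the patent.
SCRIPT_SKELETON = Template('''\
"""Generated batch script for task: $task_name"""
from batch_runtime import run_batch  # hypothetical encapsulated package

SQL_STATEMENTS = $sql_literal

if __name__ == "__main__":
    run_batch("$task_name", SQL_STATEMENTS)
''')


def generate_script(task_name, statements, out_dir):
    """Write a .py file with the SQL statements embedded as a list literal."""
    source = SCRIPT_SKELETON.substitute(
        task_name=task_name,
        sql_literal=repr(list(statements)),  # repr yields a valid Python literal
    )
    # Fail early if the generated source is not syntactically valid Python.
    compile(source, f"{task_name}.py", "exec")
    path = Path(out_dir) / f"{task_name}.py"
    path.write_text(source, encoding="utf-8")
    return path
```

Because Template.substitute makes a single pass, placeholders that appear inside the SQL text itself are written to the file untouched.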
S103: Run the preset-format file and, by calling at least one encapsulated function package corresponding to the target batch processing task in the runtime environment of the preset-format file, invoke the application programming interface of the Impala component and send the SQL statements to the Impala component, so that the Impala component executes the target batch processing task.
The system determines in advance the functions that may be needed during batch processing and encapsulates each of them, yielding encapsulated function packages. When the system is developed in the Python language, these encapsulated function packages are integrated into the Python runtime environment.
Batch processing tasks with different requirements or of different types may need different functions, so the encapsulated function packages required by the target batch processing task must be determined.
By calling the application programming interface of the Impala component that corresponds to each encapsulated function package, Impala is invoked to execute the target batch processing task.
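An encapsulated function that sends the statements to the query engine could be as small as the sketch below, written against the generic Python DB-API 2.0 interface. The function name execute_statements is an assumption, and sqlite3 is used only as a local stand-in so the sketch runs without a cluster; the patent does not name the client library it uses.

```python
import sqlite3


def execute_statements(connection, statements):
    """Send each SQL statement through a DB-API 2.0 connection, in order."""
    results = []
    cursor = connection.cursor()
    try:
        for sql in statements:
            cursor.execute(sql)
            # Only queries produce a result set; for other statements
            # cursor.description is None and there is nothing to fetch.
            rows = cursor.fetchall() if cursor.description else []
            results.append((sql, rows))
    finally:
        cursor.close()
    return results


if __name__ == "__main__":
    # sqlite3 is a stand-in here. Against Impala, a DB-API connection could
    # come from a client library such as impyla (impala.dbapi.connect);
    # host and port depend on the deployment.
    conn = sqlite3.connect(":memory:")
    print(execute_statements(conn, ["CREATE TABLE t (x INTEGER)",
                                    "INSERT INTO t VALUES (1)",
                                    "SELECT x FROM t"]))
    conn.close()
```

Keeping the wrapper independent of the driver means the generated scripts never touch connection details directly.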
Preferably, the method further comprises:
recording in a log the parameter information of the target batch processing task, the execution state of each script in the target batch processing task, and the execution state of each SQL statement.
The parameter information of the target batch processing task includes parameters such as the execution start time and the execution end time of the target batch processing task.
The scripts include each script in the preset-format file and each script in the system that runs the target batch processing task. The execution of the target batch processing task can be checked by consulting the log, down to the execution state of each script and of each SQL statement.
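The logging described above could be sketched with the standard logging module as follows. The function name run_with_log, the log message layout and the SUCCESS/FAILED labels are assumptions; execute is any callable that runs one SQL statement.

```python
import logging
from datetime import datetime


def run_with_log(task_name, statements, execute, logger=None):
    """Run each statement via execute(sql); log task parameters and states."""
    logger = logger or logging.getLogger("batch")
    started = datetime.now()
    logger.info("task=%s start=%s statements=%d",
                task_name, started.isoformat(), len(statements))
    states = []
    for index, sql in enumerate(statements, start=1):
        try:
            execute(sql)
        except Exception:
            states.append("FAILED")
            # logger.exception also records the traceback.
            logger.exception("task=%s sql#%d FAILED: %s", task_name, index, sql)
            break  # stop at the first failing statement
        states.append("SUCCESS")
        logger.info("task=%s sql#%d SUCCESS", task_name, index)
    finished = datetime.now()
    logger.info("task=%s end=%s elapsed=%s states=%s",
                task_name, finished.isoformat(), finished - started, states)
    return states
```

Stopping at the first failure is a design choice of this sketch, not something the patent prescribes.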
Preferably, the method further comprises:
during execution of the target batch processing task, feeding back the execution state of each script and the execution state of each SQL statement.
Feeding back the execution state of each script and of each SQL statement helps the user follow the execution of the target batch processing task.
Preferably, the method further comprises:
during execution of the target batch processing task, monitoring the execution state of each script in the target batch processing task and, when any script is found to execute abnormally, notifying the user that the script execution is abnormal.
Monitoring abnormal scripts in real time allows the user to handle abnormal conditions promptly.
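In its simplest form, state feedback and exception monitoring can be approximated by running each script as a child process and checking its exit status, as sketched below. The function name run_scripts, the notify callback and the SUCCESS/ABNORMAL labels are assumptions; a production monitor would likely also watch timeouts and partial output, which this sketch does not.

```python
import subprocess
import sys


def run_scripts(script_paths, notify=print):
    """Run each script in turn, report its state, and alert on abnormal exit."""
    states = {}
    for path in script_paths:
        proc = subprocess.run([sys.executable, str(path)],
                              capture_output=True, text=True)
        state = "SUCCESS" if proc.returncode == 0 else "ABNORMAL"
        states[str(path)] = state
        notify(f"{path}: {state}")  # state feedback for every script
        if state == "ABNORMAL":
            # Alert the user and stop; later scripts are not started.
            notify(f"script execution abnormal: {path}\n{proc.stderr.strip()}")
            break
    return states
```

Passing notify as a parameter keeps the alert channel (console, mail, messaging) outside the monitoring logic.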
Preferably, the method further comprises:
after the target batch processing task finishes executing, releasing the memory resources and CPU resources occupied by the target batch processing task.
After the target batch processing task finishes executing, the memory resources and CPU resources it occupied are released promptly, which improves resource utilization.
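On the client side, prompt release can be sketched with a context manager that always closes the connection, whether the task succeeds or fails. The name batch_session and the connect factory are assumptions; how the system in the patent frees memory and CPU resources is not disclosed.

```python
import gc
from contextlib import contextmanager


@contextmanager
def batch_session(connect):
    """Open a connection for one batch task and always release it afterwards."""
    connection = connect()
    try:
        yield connection
    finally:
        # Close the connection so the session it holds is released even if
        # the task raised, then let Python reclaim unreachable objects.
        connection.close()
        gc.collect()
```

A task would then run inside a block such as "with batch_session(make_connection) as conn:", where make_connection is whatever zero-argument factory the deployment provides.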
This embodiment discloses a batch processing method based on the Impala component. A user only needs to enter, in a preset template, SQL statements that correspond to a target batch processing task and conform to the Impala component's format. A preset-format file with the SQL statements embedded is generated and run. By calling at least one encapsulated function package corresponding to the target batch processing task in the runtime environment of the preset-format file, the application programming interface of the Impala component is invoked and the SQL statements are sent to the Impala component, so that the Impala component executes the target batch processing task. This spares developers from writing large amounts of program code just to call the Impala component, avoids the errors introduced by writing a new program for every batch run, and improves the execution efficiency of batch processing.
Based on the batch processing method based on the Impala component disclosed in the above embodiment, and referring to Fig. 2, this embodiment correspondingly discloses a batch processing system based on the Impala component, which specifically includes:
an SQL acquiring unit 101, configured to obtain SQL statements that a user enters in a preset template, the SQL statements corresponding to a target batch processing task and conforming to the Impala format;
a generation unit 102, configured to generate a preset-format file in which the SQL statements are embedded;
an Impala calling unit 103, configured to run the preset-format file and, by calling at least one encapsulated function package corresponding to the target batch processing task in the runtime environment of the preset-format file, invoke the application programming interface of the Impala component and send the SQL statements to the Impala component, so that the Impala component executes the target batch processing task.
Preferably, the batch processing system based on the Impala component is developed in the Python language. Python is an open-source, object-oriented, interpreted programming language that is easy to learn and highly portable, so the batch processing system based on the Impala component can be developed on the systems of any organization. Python also has a rich set of class libraries and supports the Impala component well.
Of course, the batch processing system based on the Impala component disclosed in this embodiment is not limited to this and may also be developed in other programming languages.
At run time, the batch processing system based on the Impala component connects directly to the Impala component of the big data platform to carry out the target batch processing task.
The Python runtime environment, the batch processing system based on the Impala component, and the Python scripts are stored on network-attached storage (NAS).
Preferably, the system further comprises:
an input parameter detection unit, configured to check whether the input parameters in the SQL statements are correct and to notify the user of an input parameter error when one is detected.
Preferably, the system further comprises:
a logging unit, configured to record in a log the parameter information of the target batch processing task, the execution state of each script in the target batch processing task, and the execution state of each SQL statement.
Preferably, the system further comprises:
a state feedback unit, configured to feed back, during execution of the target batch processing task, the execution state of each script and the execution state of each SQL statement.
Preferably, the system further comprises:
an exception monitoring unit, configured to monitor, during execution of the target batch processing task, the execution state of each script in the target batch processing task and, when any script is found to execute abnormally, to notify the user that the script execution is abnormal.
Preferably, the system further comprises:
a resource releasing unit, configured to release, after the target batch processing task finishes executing, the memory resources and CPU resources occupied by the target batch processing task.
This embodiment discloses a batch processing system based on the Impala component. A user only needs to enter, in a preset template, SQL statements that correspond to a target batch processing task and conform to the Impala component's format. A preset-format file with the SQL statements embedded is generated and run. By calling at least one encapsulated function package corresponding to the target batch processing task in the runtime environment of the preset-format file, the application programming interface of the Impala component is invoked and the SQL statements are sent to the Impala component, so that the Impala component executes the target batch processing task. This spares developers from writing large amounts of program code just to call the Impala component, avoids the errors introduced by writing a new program for every batch run, and improves the execution efficiency of batch processing.
The foregoing description of the disclosed embodiments enables a person skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

1. A batch processing method based on the Impala component, characterized by comprising:
obtaining SQL statements that a user enters in a preset template, the SQL statements corresponding to a target batch processing task and conforming to the Impala component's format;
generating a preset-format file in which the SQL statements are embedded;
running the preset-format file and, by calling at least one encapsulated function package corresponding to the target batch processing task in the runtime environment of the preset-format file, invoking the application programming interface of the Impala component and sending the SQL statements to the Impala component, so that the Impala component executes the target batch processing task.
2. The method according to claim 1, characterized in that, after obtaining the SQL statements corresponding to the target batch processing task that the user enters in the preset template, the method further comprises:
checking whether the input parameters in the SQL statements are correct, and notifying the user of an input parameter error when one is detected.
3. The method according to claim 1, characterized in that the method further comprises:
recording in a log the parameter information of the target batch processing task, the execution state of each script in the target batch processing task, and the execution state of each SQL statement.
4. The method according to claim 1, characterized in that the method further comprises:
during execution of the target batch processing task, feeding back the execution state of each script and the execution state of each SQL statement.
5. The method according to claim 1, characterized in that the method further comprises:
during execution of the target batch processing task, monitoring the execution state of each script in the target batch processing task and, when any script is found to execute abnormally, notifying the user that the script execution is abnormal.
6. The method according to claim 1, characterized in that the method further comprises:
after the target batch processing task finishes executing, releasing the memory resources and CPU resources occupied by the target batch processing task.
7. A batch processing system based on the Impala component, characterized by comprising:
an SQL acquiring unit, configured to obtain SQL statements that a user enters in a preset template, the SQL statements corresponding to a target batch processing task and conforming to the Impala format;
a generation unit, configured to generate a preset-format file in which the SQL statements are embedded;
an Impala calling unit, configured to run the preset-format file and, by calling at least one encapsulated function package corresponding to the target batch processing task in the runtime environment of the preset-format file, invoke the application programming interface of the Impala component and send the SQL statements to the Impala component, so that the Impala component executes the target batch processing task.
8. The system according to claim 7, characterized in that the system further comprises:
an input parameter detection unit, configured to check whether the input parameters in the SQL statements are correct and to notify the user of an input parameter error when one is detected.
9. The system according to claim 7, characterized in that the system further comprises:
a logging unit, configured to record in a log the parameter information of the target batch processing task, the execution state of each script in the target batch processing task, and the execution state of each SQL statement.
10. The system according to claim 7, characterized in that the system further comprises:
a state feedback unit, configured to feed back, during execution of the target batch processing task, the execution state of each script and the execution state of each SQL statement.
11. The system according to claim 7, characterized in that the system further comprises:
an exception monitoring unit, configured to monitor, during execution of the target batch processing task, the execution state of each script in the target batch processing task and, when any script is found to execute abnormally, to notify the user that the script execution is abnormal.
12. The system according to claim 7, characterized in that the system further comprises:
a resource releasing unit, configured to release, after the target batch processing task finishes executing, the memory resources and CPU resources occupied by the target batch processing task.
CN201810385610.6A 2018-04-26 2018-04-26 A kind of batch processing method and system based on Impala components Pending CN108595156A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810385610.6A CN108595156A (en) 2018-04-26 2018-04-26 A kind of batch processing method and system based on Impala components

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810385610.6A CN108595156A (en) 2018-04-26 2018-04-26 A kind of batch processing method and system based on Impala components

Publications (1)

Publication Number Publication Date
CN108595156A true CN108595156A (en) 2018-09-28

Family

ID=63610240

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810385610.6A Pending CN108595156A (en) 2018-04-26 2018-04-26 A kind of batch processing method and system based on Impala components

Country Status (1)

Country Link
CN (1) CN108595156A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102929585A (en) * 2012-09-25 2013-02-13 上海证券交易所 Batch processing method and system supporting multi-master distributed data processing
US20140280032A1 (en) * 2013-03-13 2014-09-18 Cloudera, Inc. Low latency query engine for apache hadoop
CN106777101A (en) * 2016-12-14 2017-05-31 深圳天源迪科信息技术股份有限公司 Data processing engine
CN107301057A (en) * 2017-07-28 2017-10-27 山东中创软件工程股份有限公司 A kind of big data batch processing method and device
CN107391719A (en) * 2017-07-31 2017-11-24 南京邮电大学 Distributed stream data processing method and system in a kind of cloud environment
CN107943555A (en) * 2017-10-17 2018-04-20 华南理工大学 Big data storage and processing platform and processing method under a kind of cloud computing environment

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112947846A (en) * 2019-12-11 2021-06-11 北京金山云网络技术有限公司 Batch processing task execution method and device of object storage system and electronic equipment
CN117056379A (en) * 2023-10-11 2023-11-14 宁波银行股份有限公司 Metadata caching method and device, electronic equipment and readable storage medium
CN117056379B (en) * 2023-10-11 2024-01-26 宁波银行股份有限公司 Metadata caching method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20180928)
RJ01 Rejection of invention patent application after publication