CN109086038B - Spark-based big data development method and device, and terminal - Google Patents
- Publication number: CN109086038B
- Application number: CN201810755408.8A
- Authority: CN (China)
- Prior art keywords: development, template, big data, data, spark
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F8/33: Intelligent editors (under G06F8/30, Creation or generation of source code)
- G06F8/20: Software design
- G06F8/41: Compilation (under G06F8/40, Transformation of program code)
- All of the above fall under G (Physics), G06 (Computing; calculating or counting), G06F (Electric digital data processing), G06F8/00 (Arrangements for software engineering)
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Stored Programmes (AREA)
Abstract
The invention relates to the technical field of big data development and provides a Spark-based big data development method, device and terminal. The method comprises: installing an integrated development environment so that a template project can be imported; downloading the latest template project, then compiling and packaging it to generate a software development kit; adding the software development kit to the integrated development environment to form a development template; and creating a new big data development project and carrying out big data development with the development template. Because the template-based approach provides not only encapsulated classes and methods but also directly runnable development templates, it improves development efficiency, lowers the barrier to entry and speeds up development in a simple and effective way.
Description
Technical Field
The invention belongs to the technical field of big data development, and particularly relates to a big data development method and device based on Spark, and a terminal.
Background
In recent years, a growing number of tool development kits have made development tasks far more convenient: a developer packages dependencies and utility methods into a self-contained library, which others then use by referencing it. This is currently the most common way of sharing technology and functionality, but it has drawbacks: it is unfriendly to beginners, it does not fully encapsulate Spark development, and most people cannot get started with it quickly.
Existing tool development kits provide only encapsulated methods or parent classes, which are used through inheritance and reference. A user must study the internal methods to some extent before using them well, and must learn Spark development from other sources before task development can really begin. This leads to a slow start and difficult development, and adds extra development cost.
Disclosure of Invention
Embodiments of the invention provide a Spark-based big data development method, device and terminal, aiming to solve the problem that prior-art development approaches are not fully encapsulated and cannot be put to use quickly.
A Spark-based big data development method comprises the following steps:
installing an integrated development environment so that a template project can be imported;
downloading the latest template project, then compiling and packaging it to generate a software development kit;
adding the software development kit to the integrated development environment to form a development template;
and creating a new big data development project and carrying out big data development with the development template.
Preferably, after the integrated development environment is installed, the method further includes: installing a Maven repository and the Maven plug-in of the IDE.
Preferably, the development template includes at least one of a general template, a data cleaning template and a Spark operator template.
Preferably, the development template covers the reading and normalization of input parameters, the input and output of data, and the selection of intermediate cleaning methods.
Preferably, the step of creating a new big data development project and carrying out big data development with the development template includes:
modifying the code of the development template as required to complete the big data development, or
continuing to extend the development template, thereby simplifying the development process and sharing a code architecture.
Preferably, the development template is runnable code with detailed comments, and the step of carrying out big data development with the development template includes:
selecting the required data source write method according to the comments, selecting a suitable RDD operator, and selecting the required data source input method;
modifying or deleting code as needed.
The invention also provides a Spark-based big data development device, comprising:
an installation unit for installing an integrated development environment so that a template project can be imported;
a compiling unit for downloading the latest template project, then compiling and packaging it to generate a software development kit;
an adding unit for adding the software development kit to the integrated development environment to form a development template;
and a development unit for creating a new big data development project and carrying out big data development with the development template.
Preferably, the installation unit is further used for installing a Maven repository and the Maven plug-in of the IDE.
The invention also provides a memory storing a computer program which, when executed by a processor, performs the following steps:
installing an integrated development environment so that a template project can be imported;
downloading the latest template project, then compiling and packaging it to generate a software development kit;
adding the software development kit to the integrated development environment to form a development template;
and creating a new big data development project and carrying out big data development with the development template.
The invention also provides a terminal comprising a memory, a processor and a computer program stored in the memory and runnable on the processor, wherein the processor executes the computer program to implement the following steps:
installing an integrated development environment so that a template project can be imported;
downloading the latest template project, then compiling and packaging it to generate a software development kit;
adding the software development kit to the integrated development environment to form a development template;
and creating a new big data development project and carrying out big data development with the development template.
In the embodiments of the invention, the template-based development approach provides not only encapsulated classes and methods but also directly runnable development templates, which improves development efficiency, lowers the barrier to entry and speeds up development in a simple and effective way.
Drawings
Fig. 1 is a flowchart of a Spark-based big data development method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a preferred mode of a Spark-based big data development method according to a first embodiment of the present invention;
Fig. 3 is a structural diagram of a Spark-based big data development device according to a second embodiment of the present invention;
Fig. 4 is a structural diagram of a terminal according to a third embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In an embodiment of the present invention, a Spark-based big data development method includes: installing an integrated development environment so that a template project can be imported; downloading the latest template project, then compiling and packaging it to generate a software development kit; adding the software development kit to the integrated development environment to form a development template; and creating a new big data development project and carrying out big data development with the development template.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Embodiment 1:
Fig. 1 is a flowchart of a Spark-based big data development method according to a first embodiment of the present invention. The method includes:
Step S1: installing an integrated development environment (IDE) so that a template project can be imported.
An IDE typically includes a code editor, a compiler, a debugger and graphical user interface tools. The IDE in this embodiment may be a development environment such as IDEA or Eclipse. In step S1, after the IDE is installed, a Maven repository and the Maven plug-in of the IDE also need to be installed so that the development tool can import the template project.
Step S2: downloading the latest template project, then compiling and packaging it to generate a software development kit (SDK).
The SDK in this embodiment supports multiple versions. It encapsulates the Spark dependencies for various versions together with general-purpose methods, such as access to various databases and preprocessing of data. Take developing a simple Spark task as an example: ordinarily, a developer must first acquire Spark knowledge, find the dependencies, set up a development environment and become familiar with the interfaces, then carry out custom development according to the Spark programming conventions, and must also know the existing RDD-based methods of Spark.
Step S3: adding the software development kit to the integrated development environment to form a development template.
Specifically, the development template includes at least one of a general template, a data cleaning template and a Spark operator template. The development template relies on the existing development tool and its related plug-ins: it must be used together with the dependency management plug-in, and it uses the template authoring function of the existing development tool.
All template-related dependencies are integrated into the SDK, so importing the SDK brings in every dependency of the template classes and completes the dependency configuration in a single step. The dependencies of the development template are thus supplied through the SDK, while the development template itself is not a compiled toolkit but runnable code with detailed comments. There are three types of development template: the general template, the data cleaning template and the Spark operator template. The general template integrates code for reading a data source, simple data processing code and data storage code; the data cleaning template builds on the general template by adding ETL steps such as filtering, de-duplication and merging; and the Spark operator template builds on the data cleaning template by adding usage examples of complex Spark operators such as aggregate.
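The ETL steps named for the data cleaning template (filtering, de-duplication and merging) can be illustrated with a minimal sketch. It uses plain Python lists as a stand-in for Spark datasets so that it runs without a cluster; the function names `merge`, `filter_valid` and `deduplicate` and the sample records are hypothetical and are not taken from the patent's SDK.

```python
# Illustrative stand-in for the cleaning steps; not the patent's actual template code.

def merge(*batches):
    """Merge several batches of records into one list, preserving order."""
    return [record for batch in batches for record in batch]

def filter_valid(records, required_fields):
    """Keep only records whose required fields are present and non-empty."""
    return [
        r for r in records
        if all(r.get(field) not in (None, "") for field in required_fields)
    ]

def deduplicate(records, key):
    """Keep the first record seen for each value of `key`."""
    first_seen = {}
    for r in records:
        first_seen.setdefault(r[key], r)
    return list(first_seen.values())

batch_a = [{"id": 1, "city": "Shenzhen"}, {"id": 2, "city": None}]
batch_b = [{"id": 1, "city": "Shenzhen"}, {"id": 3, "city": "Beijing"}]

# merge -> filter -> de-duplicate, in the order a cleaning template might apply them
cleaned = deduplicate(filter_valid(merge(batch_a, batch_b), ["id", "city"]), "id")
```

In an actual Spark job, each of these steps would typically map to an RDD or DataFrame operation; the sketch only shows the intended data flow.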
Step S4: creating a new big data development project and carrying out big data development with the development template.
Specifically, a new big data development project is created, and once the SDK has been imported and a development template created, all functions in the SDK are available. Templates can be further developed and contributed back as code, so a Spark task can be developed at minimal time cost. Importing the SDK completes the dependency setup, and importing the template class completes the deployment of the development template; these two steps are all that is needed to configure a Spark task development project, which is friendly to new users and can be compiled and run directly.
In this embodiment, the template-based development approach provides not only encapsulated classes and methods but also directly runnable development templates, which improves development efficiency, lowers the barrier to entry and speeds up development in a simple and effective way.
The overall big data development flow is shown in Table 1 below:
First, install the IDE, together with a Maven repository and the Maven plug-in of the IDE, so that the template project can be imported. Next, use the dependency management tool to download the latest template project, then compile and package it to generate the SDK. Then add the SDK to the IDE, create a new Spark task project, and create a development template after importing the SDK; development can then begin, with all functions in the SDK available. The various development templates provide runnable and varied program samples, so the developer can modify the template code directly to complete the development task, or continue to extend the development template, simplifying the development flow and sharing a code architecture.
Data source adaptation covers input from and output to data sources such as MongoDB, HDFS, Hive, HBase and MySQL. The general-purpose methods include time normalization helpers such as the day-start timestamp, the week-start timestamp and the five-minute-start timestamp, as well as regular-expression checks on strings, NULL checks and dynamic switching of data sources. Configuration management for dynamic parameters is also included, with support for local configuration, HDFS configuration and key-value (KV) store configuration. All three templates cover the reading and normalization of input parameters, the input and output of data, and the selection of intermediate cleaning methods.
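The time normalization helpers mentioned above (day-start, week-start and five-minute-start timestamps) might look like the following sketch. The patent does not specify a time zone, a unit or the first day of the week, so this example assumes epoch seconds, UTC and weeks starting on Monday; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def day_start(ts: int) -> int:
    """Epoch seconds of 00:00:00 UTC on the day containing `ts`."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    return int(midnight.timestamp())

def week_start(ts: int) -> int:
    """Epoch seconds of 00:00:00 UTC on the Monday of the week containing `ts`."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    # weekday() is 0 for Monday, so subtracting it lands on Monday
    return int((midnight - timedelta(days=dt.weekday())).timestamp())

def five_minute_start(ts: int) -> int:
    """Epoch seconds floored to the enclosing five-minute boundary."""
    return ts - ts % 300

sample = int(datetime(2018, 7, 12, 13, 47, 25, tzinfo=timezone.utc).timestamp())
```

Helpers of this kind are commonly used to bucket records into fixed windows before grouping or aggregation; a local-time variant would need an explicit time zone instead of UTC.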
The invention provides a template-based big data development approach that supplies not only encapsulated classes and methods but also directly runnable development templates, each customized for a different scenario. A user only needs to import a template to run it, can then modify the available methods by following the template sample, and can adjust parameters by following the comments. The development templates are directly usable and come with thoroughly commented code. Each time a Spark task is created, it can be created directly from a template, and the existing templates can also be improved and extended to create the user's own templates.
The method provided by this embodiment further improves developer efficiency, lowers the barrier to entry, sets up a Spark big data development environment in a one-stop manner, provides easy-to-use methods and adds support for a variety of data sources.
In a preferred implementation of this embodiment (see Fig. 2), the step of carrying out big data development with the development template includes:
Step S5: selecting the required data source write method according to the comments, selecting a suitable RDD operator, and selecting the required data source input method.
Take developing a simple Spark task as an example: ordinarily, a developer must first acquire Spark knowledge, find the dependencies, set up a development environment and become familiar with the interfaces, then carry out custom development according to the Spark programming conventions, and must also know the existing RDD-based methods of Spark. With the development method provided here, the developer only needs to create a Maven project, download and import the SDK, and import a template in order to start debugging a Spark task. The developer creates a template class from one of the three categories, selects the required data source write method according to the comments, selects a suitable RDD operator, and selects the required data source input method.
Step S6: modifying or deleting code as needed.
The developer does not need to worry about the details: all data source operations and RDD operator operations are presented as code in the template class, and the developer completes development simply by modifying or deleting code as needed. An experienced developer can choose the general template, saving the cost of establishing code conventions and writing data input and output code. Users can also customize their own development templates by building new templates on the three existing types, and when several people develop collaboratively they only need to share the template.
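The idea of a template class that already contains the full read, process and write flow, which the developer then modifies or trims, can be sketched as below. This is a hypothetical illustration in plain Python rather than the patent's actual template code: the class names, the `key=value` argument format and the comma-separated input are all assumptions made for the example.

```python
class GeneralTemplate:
    """Hypothetical general template: parse parameters, read, process, write."""

    def parse_args(self, argv):
        # Read and normalize input parameters given as key=value strings.
        params = {}
        for item in argv:
            key, sep, value = item.partition("=")
            if not sep or not key:
                raise ValueError(f"expected key=value, got {item!r}")
            params[key.strip()] = value.strip()
        return params

    def read(self, params):
        # Data source input: here, a comma-separated string stands in for a real source.
        return params["input"].split(",")

    def process(self, rows):
        # Intermediate processing: the step a developer is expected to replace.
        return rows

    def write(self, rows, params):
        # Data source output: here, the rows are simply returned to the caller.
        return list(rows)

    def run(self, argv):
        params = self.parse_args(argv)
        return self.write(self.process(self.read(params)), params)


class UppercaseJob(GeneralTemplate):
    """A developer's task: only the processing step is changed."""

    def process(self, rows):
        return [row.upper() for row in rows if row]


result = UppercaseJob().run(["input=spark,,rdd"])
```

The design choice illustrated here is that the surrounding flow stays fixed while individual steps are overridden or deleted, which matches the modify-or-delete workflow described above.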
The functions of the entire SDK are shown in Table 2 below:
The above is only a simple example. In practice, a high barrier to entry and repetitive work have long been major problems for development, and the template-based development approach addresses exactly these problems.
The big data development method of this embodiment builds on an existing IDE and only requires importing the SDK, so it is convenient to use. The template classes are presented as code and can be run directly, which makes them easy to modify and extend and lets users create their own template classes. The invention improves development efficiency, lowers the barrier to entry, suits both solo development and multi-person collaboration, and speeds up development in a simple and effective way. The big data development method and platform provided by the invention therefore have broad application prospects in big data development and related fields. It should be noted that implementing the invention requires the support of existing development tools. The invention covers a wide range of data sources, including MongoDB, HDFS, Hive, HBase, MySQL and Kafka, and the supported operators include all RDD operators listed on the official Spark website, with method types and usage examples.
Embodiment 2:
Fig. 3 is a structural diagram of a Spark-based big data development device according to a second embodiment of the present invention. The device includes an installation unit 1, a compiling unit 2 connected to the installation unit 1, an adding unit 3 connected to the compiling unit 2, and a development unit 4 connected to the adding unit 3, wherein:
The installation unit 1 is used for installing an integrated development environment so that a template project can be imported.
An IDE typically includes a code editor, a compiler, a debugger and graphical user interface tools. The IDE in this embodiment may be a development environment such as IDEA or Eclipse. After installing the integrated development environment, the installation unit 1 also installs a Maven repository and the Maven plug-in of the IDE so that the development tool can import the template project.
The compiling unit 2 is used for downloading the latest template project, then compiling and packaging it to generate a software development kit.
The SDK in this embodiment supports multiple versions. It encapsulates the Spark dependencies for various versions together with general-purpose methods, such as access to various databases and preprocessing of data. Take developing a simple Spark task as an example: ordinarily, a developer must first acquire Spark knowledge, find the dependencies, set up a development environment and become familiar with the interfaces, then carry out custom development according to the Spark programming conventions, and must also know the existing RDD-based methods of Spark.
The adding unit 3 is used for adding the software development kit to the integrated development environment to form a development template.
Specifically, the development template includes at least one of a general template, a data cleaning template and a Spark operator template. The development template relies on the existing development tool and its related plug-ins: it must be used together with the dependency management plug-in, and it uses the template authoring function of the existing development tool.
All template-related dependencies are integrated into the SDK, so importing the SDK brings in every dependency of the template classes and completes the dependency configuration in a single step. The dependencies of the development template are thus supplied through the SDK, while the development template itself is not a compiled toolkit but runnable code with detailed comments. There are three types of development template: the general template, the data cleaning template and the Spark operator template. The general template integrates code for reading a data source, simple data processing code and data storage code; the data cleaning template builds on the general template by adding ETL steps such as filtering, de-duplication and merging; and the Spark operator template builds on the data cleaning template by adding usage examples of complex Spark operators such as aggregate.
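To show why aggregate counts as one of the more complex operators, the sketch below emulates its semantics in plain Python: a zero value is folded into each partition with a sequence function, and the per-partition results are then combined with a combine function. This is an emulation for illustration only, with the hypothetical helper name `aggregate` and hand-made partitions; it is not Spark code and not taken from the patent's operator template.

```python
from functools import reduce

def aggregate(partitions, zero, seq_op, comb_op):
    """Emulate aggregate semantics over a list of partitions (lists)."""
    # Fold each partition separately, starting from the zero value.
    per_partition = [reduce(seq_op, part, zero) for part in partitions]
    # Combine the partition results, again starting from the zero value.
    return reduce(comb_op, per_partition, zero)

partitions = [[1, 2, 3], [4, 5], [6]]

# Accumulate (sum, count) so that a mean can be computed in a single pass.
total, count = aggregate(
    partitions,
    (0, 0),
    lambda acc, x: (acc[0] + x, acc[1] + 1),   # fold one element into an accumulator
    lambda a, b: (a[0] + b[0], a[1] + b[1]),   # merge two accumulators
)
mean = total / count
```

Because the zero value is applied once per partition and once more when combining, it should be neutral for both functions, as `(0, 0)` is here. In PySpark the corresponding call is `rdd.aggregate(zeroValue, seqOp, combOp)`.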
The development unit 4 is used for creating a new big data development project and carrying out big data development with the development template.
Specifically, a new big data development project is created, and once the SDK has been imported and a development template created, all functions in the SDK are available. Templates can be further developed and contributed back as code, so a Spark task can be developed at minimal time cost.
The SDK includes data source adaptation, general-purpose methods, configuration methods and templates. Data source adaptation covers input from and output to data sources such as MongoDB, HDFS, Hive, HBase and MySQL. The general-purpose methods include time normalization helpers such as the day-start timestamp, the week-start timestamp and the five-minute-start timestamp, as well as regular-expression checks on strings, NULL checks and dynamic switching of data sources. Configuration management for dynamic parameters is also included, with support for local configuration, HDFS configuration and key-value (KV) store configuration. All three templates cover the reading and normalization of input parameters, the input and output of data, and the selection of intermediate cleaning methods.
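Dynamic switching of data sources could be realized in many ways; one common pattern is a registry that maps a source name from the configuration to a reader function. The sketch below assumes that pattern. The registry, the decorator and the two toy sources (`memory` and `csv_text`) are hypothetical and stand in for real adapters such as MongoDB or HDFS, which would need their own client libraries.

```python
import csv
import io
from typing import Callable, Dict, List

Reader = Callable[[dict], List[dict]]
_READERS: Dict[str, Reader] = {}

def register(name: str):
    """Decorator that registers a reader function under a source name."""
    def decorator(fn: Reader) -> Reader:
        _READERS[name] = fn
        return fn
    return decorator

@register("memory")
def read_memory(conf: dict) -> List[dict]:
    # Toy source: rows are supplied directly in the configuration.
    return list(conf["rows"])

@register("csv_text")
def read_csv_text(conf: dict) -> List[dict]:
    # Toy source: rows are parsed from CSV text held in the configuration.
    return [dict(row) for row in csv.DictReader(io.StringIO(conf["text"]))]

def read(conf: dict) -> List[dict]:
    """Pick the reader named by conf['source'] at run time."""
    try:
        reader = _READERS[conf["source"]]
    except KeyError:
        raise ValueError(f"unknown data source: {conf.get('source')!r}")
    return reader(conf)

rows_a = read({"source": "memory", "rows": [{"a": 1}]})
rows_b = read({"source": "csv_text", "text": "a,b\n1,2\n"})
```

With this arrangement, changing the `source` value in the configuration switches the input without touching the task code, and a new adapter is added by registering one more function.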
Importing the SDK completes the dependency setup, and importing the template class completes the deployment of the development template; these two steps are all that is needed to configure a Spark task development project, which is friendly to new users and can be compiled and run directly.
The invention provides a template-based big data development approach that supplies not only encapsulated classes and methods but also directly runnable development templates, each customized for a different scenario. A user only needs to import a template to run it, can then modify the available methods by following the template sample, and can adjust parameters by following the comments. The development templates are directly usable and come with thoroughly commented code. Each time a Spark task is created, it can be created directly from a template, and the existing templates can also be improved and extended to create the user's own templates.
In this embodiment, the template-based development approach provides not only encapsulated classes and methods but also directly runnable development templates, which improves development efficiency, lowers the barrier to entry and speeds up development in a simple and effective way.
In a preferred implementation of this embodiment, the development unit 4 is further configured for:
selecting the required data source write method according to the comments, selecting a suitable RDD operator, and selecting the required data source input method;
modifying or deleting code as needed.
Take developing a simple Spark task as an example: ordinarily, a developer must first acquire Spark knowledge, find the dependencies, set up a development environment and become familiar with the interfaces, then carry out custom development according to the Spark programming conventions, and must also know the existing RDD-based methods of Spark. With the development method provided here, the developer only needs to create a Maven project, download and import the SDK, and import a template in order to start debugging a Spark task. The developer creates a template class from one of the three categories, selects the required data source write method according to the comments, selects a suitable RDD operator, and selects the required data source input method.
The developer does not need to worry about the details: all data source operations and RDD operator operations are presented as code in the template class, and the developer completes development simply by modifying or deleting code as needed. An experienced developer can choose the general template, saving the cost of establishing code conventions and writing data input and output code. Users can also customize their own development templates by building new templates on the three existing types, and when several people develop collaboratively they only need to share the template.
The big data development method of this embodiment builds on an existing IDE and only requires importing the SDK, so it is convenient to use. The template classes are presented as code and can be run directly, which makes them easy to modify and extend and lets users create their own template classes. The invention improves development efficiency, lowers the barrier to entry, suits both solo development and multi-person collaboration, and speeds up development in a simple and effective way. The big data development method and platform provided by the invention therefore have broad application prospects in big data development and related fields. It should be noted that implementing the invention requires the support of existing development tools. The invention covers a wide range of data sources, including MongoDB, HDFS, Hive, HBase, MySQL and Kafka, and the supported operators include all RDD operators listed on the official Spark website, with method types and usage examples. Although others have encapsulated method packages of this kind, none has so far provided templates, supported this many data sources, or offered a one-stop, template-based rapid development method like that of the invention.
Embodiment 3:
Fig. 4 shows a structural diagram of a terminal according to a third embodiment of the present invention. The terminal includes a memory 41, a processor 42 and a bus 43, and the processor 42 and the memory 41 communicate with each other via the bus 43.
The memory 41 is used for storing various data.
Specifically, the memory 41 stores various data, such as parameters and code used in the big data development process, without limitation here; the memory also stores a plurality of computer programs.
The processor 42 is configured to call the computer programs in the memory 41 to execute the Spark-based big data development method provided in the first embodiment, for example:
installing an integrated development environment so that a template project can be imported;
downloading the latest template project, then compiling and packaging it to generate a software development kit;
adding the software development kit to the integrated development environment to form a development template;
and creating a new big data development project and carrying out big data development with the development template.
The present invention further provides a memory storing a plurality of computer programs, which are called by a processor to execute the Spark-based big data development method of the first embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution.
Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (6)
1. A Spark-based big data development method, characterized by comprising the following steps:
installing an integrated development environment for importing a template project;
downloading the latest template project, compiling and packaging it, and generating a software development kit (SDK);
adding the software development kit to the integrated development environment to form a development template;
creating a new big data development project, and carrying out big data development by applying the development template;
wherein the step of creating a new big data development project and carrying out big data development by applying the development template comprises:
modifying the code of the development template as needed to complete the big data development;
continuing to extend the development template, so as to simplify the development process and share a code architecture;
wherein the development template is commented, runnable code, and the step of carrying out big data development by applying the development template further comprises:
selecting, according to the comments, a required data source write method, a suitable RDD operator, and a required data source input method;
modifying or deleting the code as required;
wherein the big data development further comprises: data source adaptation, namely regular-expression checking of character strings, NULL value checking, dynamic switching of data sources, configuration management of dynamic parameters, local configuration association, HDFS configuration association and KV store configuration association; the dependencies of the development template are provided in the form of an SDK, and the development template comprises at least one of a general template, a data cleaning template and a Spark operator template; the development template comprises reading and regular-expression checking of input parameters, data input and output, and selection of an intermediate cleaning method.
2. The big data development method according to claim 1, wherein after installing the integrated development environment, the method further comprises: installing a Maven repository and the Maven plug-in of the IDE.
3. A Spark-based big data development device, characterized by comprising:
an installation unit, configured to install an integrated development environment for importing a template project;
a compiling unit, configured to download the latest template project, compile and package it, and generate a software development kit (SDK);
an adding unit, configured to add the software development kit to the integrated development environment to form a development template;
a development unit, configured to create a new big data development project and carry out big data development by applying the development template;
wherein the development unit is further configured to:
modify the code of the development template as needed to complete the big data development;
continue to extend the development template, so as to simplify the development process and share a code architecture;
wherein the development template is commented, runnable code, and the development unit is further configured to:
select, according to the comments, a required data source write method, a suitable RDD operator, and a required data source input method;
modify or delete the code as required;
wherein the big data development further comprises: data source adaptation, namely regular-expression checking of character strings, NULL value checking, dynamic switching of data sources, configuration management of dynamic parameters, local configuration association, HDFS configuration association and KV store configuration association; the dependencies of the development template are provided in the form of an SDK, and the development template comprises at least one of a general template, a data cleaning template and a Spark operator template; the development template comprises reading and regular-expression checking of input parameters, data input and output, and selection of an intermediate cleaning method.
4. The big data development device according to claim 3, wherein the installation unit is further configured to install a Maven repository and the Maven plug-in of the IDE.
5. A memory for Spark-based big data development, the memory storing a computer program, the computer program being executable by a processor to perform the steps of:
installing an integrated development environment for importing a template project;
downloading the latest template project, compiling and packaging it, and generating a software development kit (SDK);
adding the software development kit to the integrated development environment to form a development template;
creating a new big data development project, and carrying out big data development by applying the development template;
wherein the step of creating a new big data development project and carrying out big data development by applying the development template comprises:
modifying the code of the development template as needed to complete the big data development;
continuing to extend the development template, so as to simplify the development process and share a code architecture;
wherein the development template is commented, runnable code, and the step of carrying out big data development by applying the development template further comprises:
selecting, according to the comments, a required data source write method, a suitable RDD operator, and a required data source input method;
modifying or deleting the code as required;
wherein the big data development further comprises: data source adaptation, namely regular-expression checking of character strings, NULL value checking, dynamic switching of data sources, configuration management of dynamic parameters, local configuration association, HDFS configuration association and KV store configuration association; the dependencies of the development template are provided in the form of an SDK, and the development template comprises at least one of a general template, a data cleaning template and a Spark operator template; the development template comprises reading and regular-expression checking of input parameters, data input and output, and selection of an intermediate cleaning method.
6. A terminal for Spark-based big data development, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the Spark-based big data development method according to claim 1 or 2.
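Claims 1, 3 and 5 list regular-expression checking of character strings, NULL value checking, dynamic switching of data sources, and local, HDFS and KV configuration association. One way such helpers might fit together is sketched below in plain Python. The NULL tokens, the reader registry, the configuration keys and the lookup order (KV over HDFS over local) are assumptions made for illustration, and dictionaries stand in for the HDFS and KV back ends.

```python
import re
from collections import ChainMap
from typing import Callable, Dict, List, Mapping, Optional

# Assumed spellings of "no value" in raw text data.
NULL_TOKENS = {"", "null", "NULL", "\\N"}


def is_null(value: Optional[str]) -> bool:
    """NULL check: None or any of the assumed NULL tokens."""
    return value is None or value.strip() in NULL_TOKENS


def matches_pattern(value: Optional[str], pattern: str) -> bool:
    """Regex check: True only if the whole trimmed string matches the pattern."""
    if is_null(value):
        return False
    return re.fullmatch(pattern, value.strip()) is not None


def read_local(path: str) -> List[str]:
    """Hypothetical local reader; returns a tagged placeholder record."""
    return [f"local:{path}"]


def read_hdfs(path: str) -> List[str]:
    """Hypothetical HDFS reader; returns a tagged placeholder record."""
    return [f"hdfs:{path}"]


# Registry used to switch data sources by configuration rather than by code.
READERS: Dict[str, Callable[[str], List[str]]] = {
    "local": read_local,
    "hdfs": read_hdfs,
}


def build_config(local: dict, hdfs: dict, kv: dict) -> ChainMap:
    """Associate the three configuration layers; earlier layers win on lookup."""
    return ChainMap(kv, hdfs, local)


def load(config: Mapping[str, str]) -> List[str]:
    """Pick the reader named by 'source.type' and read 'source.path'."""
    reader = READERS[config["source.type"]]
    return reader(config["source.path"])
```

Because `ChainMap` keeps references to the underlying dictionaries, setting `source.type` in the KV layer at run time changes which reader `load` selects on the next call, without touching the job code.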
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810755408.8A CN109086038B (en) | 2018-07-10 | 2018-07-10 | Spark-based big data development method and device, and terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810755408.8A CN109086038B (en) | 2018-07-10 | 2018-07-10 | Spark-based big data development method and device, and terminal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109086038A | 2018-12-25 |
CN109086038B | 2022-05-31 |
Family
ID=64837591
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810755408.8A (CN109086038B, Active) | Spark-based big data development method and device, and terminal | 2018-07-10 | 2018-07-10 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109086038B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110007900A (en) * | 2019-02-13 | 2019-07-12 | 平安科技(深圳)有限公司 | Tool-class call method, system, computer equipment and storage medium |
CN110928529B (en) * | 2019-11-06 | 2021-10-26 | 第四范式(北京)技术有限公司 | Method and system for assisting operator development |
CN114722161B (en) * | 2022-06-09 | 2022-10-11 | 易方信息科技股份有限公司 | Method and device for rapidly inquiring state of single task of adding PM on IDE interface |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103777944A (en) * | 2013-12-25 | 2014-05-07 | 中软信息系统工程有限公司 | MIPS platform integrated development environment based on Eclipse and implementation method thereof |
WO2017114188A1 (en) * | 2015-12-29 | 2017-07-06 | 口碑控股有限公司 | Printing apparatus and printing method |
CN106990965A (en) * | 2017-03-31 | 2017-07-28 | 合肥民众亿兴软件开发有限公司 | A kind of software platform and its development approach |
CN107924305A (en) * | 2015-09-02 | 2018-04-17 | 谷歌有限责任公司 | Software development and distribution platform |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100726614B1 (en) * | 2006-02-01 | 2007-06-11 | 에스케이 텔레콤주식회사 | System for surpporting a programing about an application based on virtual machine and a method the same |
US9218166B2 (en) * | 2008-02-20 | 2015-12-22 | Embarcadero Technologies, Inc. | Development system with improved methodology for creation and reuse of software assets |
CN103713896B (en) * | 2013-12-17 | 2017-01-04 | 北京京东尚科信息技术有限公司 | Method and device is generated for accessing the SDK of server |
CN106250987B (en) * | 2016-07-22 | 2019-03-01 | 无锡华云数据技术服务有限公司 | A kind of machine learning method, device and big data platform |
CN107632817A (en) * | 2017-09-28 | 2018-01-26 | 北京昆仑在线网络科技有限公司 | A kind of Mobile solution efficient iterative Spark frameworks |
CN107943485B (en) * | 2017-12-11 | 2021-07-20 | 北京奇虎科技有限公司 | Patch compiling platform and patch compiling method |
Also Published As
Publication number | Publication date |
---|---|
CN109086038A (en) | 2018-12-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||