CN107862038B - Data mining platform for decoupling WEB client and big data mining analysis and implementation method - Google Patents

Info

Publication number
CN107862038B
CN107862038B CN201711072922.3A CN201711072922A CN107862038B CN 107862038 B CN107862038 B CN 107862038B CN 201711072922 A CN201711072922 A CN 201711072922A CN 107862038 B CN107862038 B CN 107862038B
Authority
CN
China
Prior art keywords
data mining
web client
platform
service unit
development platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711072922.3A
Other languages
Chinese (zh)
Other versions
CN107862038A (en)
Inventor
陶源
李末岩
郭俸明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Third Research Institute of the Ministry of Public Security
Original Assignee
Third Research Institute of the Ministry of Public Security
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Third Research Institute of the Ministry of Public Security
Priority to CN201711072922.3A
Publication of CN107862038A
Application granted
Publication of CN107862038B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Abstract

The invention discloses a data mining platform that decouples a WEB client from big data mining analysis, together with an implementation method. The platform consists of a WEB client, a data mining service unit, and a data mining development platform working together. The data mining service unit is in data connection with both the WEB client and the data mining development platform; it receives and manages requests from the WEB client, senses the data mining algorithms developed and packaged on the data mining development platform, and builds a calling channel between the WEB client and the data mining development platform. The WEB client calls the data mining algorithms developed on the data mining development platform through the data mining service unit. The scheme decouples WEB client development technology from big data mining analysis technology and connects the work of front-end developers and analysts through the R data mining service, so that big data mining analysts can concentrate on implementing R functions.

Description

Data mining platform for decoupling WEB client and big data mining analysis and implementation method
Technical Field
The invention relates to the technical field of computer data, in particular to a computer data mining technology.
Background
Data mining is the process of extracting potential, valuable knowledge from large amounts of data. With the continuous development of the information age, the application of the data mining technology in various industries is more and more extensive.
Data mining uses a variety of techniques to discover knowledge in massive amounts of information, enabling industries to predict development trends and the like from the knowledge obtained.
The R language integrates a large number of open-source data mining algorithms, and industries use the algorithms that R provides to put this knowledge to work. Meanwhile, big data technologies keep emerging. Those represented by Hadoop and Spark are the current mainstream, and they increase computing and storage resources by scaling out the number of machines. SparkR is a Spark module that lets R code run against Spark, so that data mining can be performed on large data sets.
The results of data mining analysis need to be displayed on WEB client pages. At present, few developers understand both data mining analysis and WEB client technology, which greatly restricts the further development of data mining analysis.
Therefore, the field needs a data platform that organically combines the two technologies and lets the developers of each contribute their own expertise and cooperate to complete data mining.
Disclosure of Invention
In view of the problems that arise when data mining analysis technology and WEB client technology are developed together, a new scheme is needed that decouples the two.
Therefore, the invention aims to provide a data mining platform for decoupling a WEB client and big data mining analysis, and an implementation method.
To achieve the above object, the data mining platform for decoupling a WEB client and big data mining analysis provided by the invention comprises a WEB client, a data mining service unit, and a data mining development platform. The data mining service unit is in data connection with both the WEB client and the data mining development platform; it receives and manages requests from the WEB client, senses the data mining algorithms developed and packaged on the data mining development platform, and builds a calling channel between the WEB client and the data mining development platform. The WEB client calls the data mining algorithms developed on the data mining development platform through the data mining service unit.
Further, the WEB client supports the HTTP protocol.
Further, the data mining service unit provides a RESTful API interface, supports asynchronous execution and runs in a task mode, and the supported running environments are R and Spark.
Furthermore, the data mining development platform provides functions of code editing, running and packaging.
Further, the data mining development platform provides development testing and packaging environments of R and SparkR.
To achieve the above object, the implementation method for decoupling the WEB client and big data mining analysis provided by the invention comprises:
step 1: developing a data mining algorithm on the data mining development platform, then debugging, saving, packaging, and deploying it;
step 2: the data mining service unit automatically senses the packaged data mining algorithms of the data mining development platform;
step 3: the WEB client calls a data mining algorithm through the RESTful API and asynchronously waits for the data mining service unit to return the result.
This scheme for decoupling big data mining analysis from the WEB client lets big data mining analysts concentrate on implementing R functions or scripts, while WEB client developers call the analysis RESTful API using their preferred languages, tools, and WEB frameworks.
The data mining service in this scheme provides a bridge between the two sides, and neither side needs to impose additional restrictions on the other. Analysts focus on analytics; WEB developers focus on WEB technology and user experience. The decoupling keeps the application well organized and maintainable, and lets analysts and WEB developers each perform their own role while working closely together in one team.
Drawings
The invention is further described below in conjunction with the appended drawings and the detailed description.
Fig. 1 is a framework diagram of a data mining platform for decoupling WEB clients and big data mining analysis in an embodiment of the present invention.
Detailed Description
To make the technical means, inventive features, objectives, and effects of the invention easy to understand, the invention is further explained below with reference to the drawings.
The data mining platform provided by this embodiment decouples WEB client development technology from big data mining analysis technology and connects the work of front-end developers and analysts through the R data mining service. Big data mining analysts can thus concentrate on implementing R functions, while WEB client developers can concentrate on user experience and call the analysis RESTful API using their preferred languages, tools, and WEB frameworks.
Fig. 1 shows the framework of the data mining platform for decoupling WEB clients and big data mining analysis provided by this example, based on the principles above.
As the figure shows, the data mining platform 100 mainly consists of a client 110, a data mining service unit 120, and a data mining development platform 130 working together.
The client 110 is where WEB client developers develop WEB client technology. The client 110 only needs to support the HTTP protocol; it can, for example, be written in any language that supports HTTP, such as Java, C#, C++, Python, Go, or Scala.
Using any language, tool, or WEB framework, the client 110 can call the data mining service provided by the data mining service unit 120 through the RESTful API, and the data mining service returns the call result asynchronously.
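The asynchronous call from the client side can be illustrated with a minimal Python sketch. The patent does not specify endpoint paths, payload fields, or how the result is delivered, so the submit-then-poll pattern, the paths `/algorithms/<name>/tasks` and `/tasks/<id>`, the `state`/`result` fields, and the in-memory `FakeMiningService` stand-in below are all assumptions made for illustration. The HTTP transport is injected as two callables so the sketch runs without a network.

```python
import time
from typing import Any, Callable, Dict


def call_mining_algorithm(
    http_post: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    http_get: Callable[[str], Dict[str, Any]],
    algorithm: str,
    params: Dict[str, Any],
    poll_interval: float = 0.0,
    max_polls: int = 50,
) -> Any:
    """Submit one task, then poll until the service reports a final state."""
    # Hypothetical endpoint layout: POSTing here creates a task for the request.
    task = http_post(f"/algorithms/{algorithm}/tasks", params)
    task_url = f"/tasks/{task['task_id']}"
    for _ in range(max_polls):
        status = http_get(task_url)
        if status["state"] == "done":
            return status["result"]
        if status["state"] == "failed":
            raise RuntimeError(status.get("error", "task failed"))
        time.sleep(poll_interval)
    raise TimeoutError(f"task {task['task_id']} did not finish")


class FakeMiningService:
    """In-memory stand-in for the service unit (illustration only)."""

    def __init__(self) -> None:
        self.polls = 0
        self.body: Dict[str, Any] = {}

    def post(self, path: str, body: Dict[str, Any]) -> Dict[str, Any]:
        self.body = body
        return {"task_id": "t1"}

    def get(self, path: str) -> Dict[str, Any]:
        self.polls += 1
        if self.polls < 3:  # pretend the task is still running at first
            return {"state": "running"}
        values = self.body["values"]
        return {"state": "done", "result": sum(values) / len(values)}
```

In a real client the two callables would wrap whatever HTTP library the WEB developer prefers, which is the point of the decoupling: nothing in the client depends on R or Spark.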
The data mining service unit 120 is the provider of mining services and acts as a data mining gateway. The data mining service it provides supports R algorithms and distributed SparkR algorithms. The data mining service unit 120 generates a data mining task for each request and returns the result to the caller asynchronously.
Specifically, the data mining service unit 120 manages WEB front-end requests, resource allocation, I/O operations, and data mining algorithm execution; it asynchronously calls the RESTful API, executes the RESTful service, and asynchronously returns the result. The data mining service unit thus provides a RESTful API, supports asynchronous execution, and runs in a task mode; the runtime environments it supports are R and Spark (through SparkR).
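The task mode described above, one task per request with asynchronous execution, can be sketched as follows. This is a simplified illustration, not the patented implementation: the class `MiningTaskService`, its method names, and the use of a thread pool are assumptions, and plain Python callables stand in for the R and SparkR algorithms that the real service unit would execute.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, Optional


class MiningTaskService:
    """Creates one task per request and runs it in the background."""

    def __init__(self, max_workers: int = 4) -> None:
        # The worker pool bounds how many tasks run at once (resource allocation).
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._algorithms: Dict[str, Callable[..., Any]] = {}
        self._tasks: Dict[str, Future] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        """Make an algorithm callable by name (stand-in for a sensed package)."""
        self._algorithms[name] = func

    def submit(self, name: str, **params: Any) -> str:
        """Generate a task for the request and return its id immediately."""
        if name not in self._algorithms:
            raise KeyError(f"unknown algorithm: {name}")
        task_id = uuid.uuid4().hex
        self._tasks[task_id] = self._pool.submit(self._algorithms[name], **params)
        return task_id

    def status(self, task_id: str) -> Dict[str, Any]:
        """Report the task state without blocking the caller."""
        future = self._tasks[task_id]
        if not future.done():
            return {"state": "running"}
        error = future.exception()
        if error is not None:
            return {"state": "failed", "error": str(error)}
        return {"state": "done", "result": future.result()}

    def wait(self, task_id: str, timeout: Optional[float] = None) -> Any:
        """Block until the task finishes; used here only for demonstration."""
        return self._tasks[task_id].result(timeout=timeout)
```

A RESTful layer would map a POST request to `submit` and a GET request to `status`, so a request handler never blocks on a long-running algorithm.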
The data mining development platform 130 is where analysts develop big data mining analysis technology. It integrates development, testing, and packaging environments for data mining algorithms (e.g., R and SparkR) and provides code editing, running, and packaging functions.
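The patent does not describe the package format. As one possible reading of the packaging function, the sketch below bundles an R or SparkR script with a small manifest into a `.tar.gz` archive. The function name `package_algorithm`, the manifest fields, and the archive layout are hypothetical.

```python
import io
import json
import tarfile
from pathlib import Path

SUPPORTED_RUNTIMES = {"R", "SparkR"}  # the two runtimes named in the description


def package_algorithm(script: Path, name: str, runtime: str, out_dir: Path) -> Path:
    """Bundle a script and a manifest into <out_dir>/<name>.tar.gz."""
    if runtime not in SUPPORTED_RUNTIMES:
        raise ValueError(f"unsupported runtime: {runtime}")
    # Hypothetical manifest: enough for a service to know what to run and how.
    manifest = json.dumps(
        {"name": name, "runtime": runtime, "entry": script.name}
    ).encode("utf-8")
    package = out_dir / f"{name}.tar.gz"
    with tarfile.open(package, "w:gz") as tar:
        tar.add(script, arcname=script.name)
        info = tarfile.TarInfo("manifest.json")
        info.size = len(manifest)
        tar.addfile(info, io.BytesIO(manifest))
    return package
```

Keeping the runtime in the manifest would let the service unit decide between plain R and SparkR execution without inspecting the script itself.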
The data mining platform 100 thus decouples WEB client development from big data mining analysis. The specific data mining process comprises the following steps:
step 1: develop a data mining algorithm in the development, testing, and packaging environment for data mining algorithms (e.g., R and SparkR) provided by the data mining development platform 130, then debug, save, package, and deploy it;
step 2: the data mining service unit 120 automatically senses the packaged data mining algorithms available on the data mining development platform 130;
step 3: the WEB client 110 calls the RESTful API provided by the data mining service unit 120, invokes through it an available data mining algorithm sensed by the data mining service unit 120, and asynchronously waits for the data mining service to return the result.
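The patent does not state how the service unit senses newly packaged algorithms in step 2. One simple possibility, assumed here purely for illustration, is that packages are deployed into a shared directory that the service rescans periodically. The class `AlgorithmRegistry`, the `.tar.gz` suffix, and the directory convention are hypothetical.

```python
from pathlib import Path
from typing import Dict, Set


class AlgorithmRegistry:
    """Tracks which packaged algorithms are currently deployed."""

    def __init__(self, deploy_dir: Path, suffix: str = ".tar.gz") -> None:
        self.deploy_dir = Path(deploy_dir)
        self.suffix = suffix
        self.available: Dict[str, Path] = {}

    def refresh(self) -> Set[str]:
        """Rescan the deploy directory and return newly sensed algorithm names."""
        found = {
            path.name[: -len(self.suffix)]: path
            for path in self.deploy_dir.iterdir()
            if path.is_file() and path.name.endswith(self.suffix)
        }
        newly_sensed = set(found) - set(self.available)
        self.available = found  # packages removed from the directory drop out too
        return newly_sensed
```

Calling `refresh` on a timer, or from a file-system watcher, would make a newly deployed algorithm callable through the RESTful API without restarting the service or changing the WEB client.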
The platform therefore decouples data mining analysts from WEB client developers, so that analysts can concentrate on implementing R functions or scripts. In practical application, the decoupling keeps the application well organized and maintainable, and lets analysts and WEB developers each perform their own role while working closely together in one team.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above; the embodiments and the specification merely illustrate the principle of the invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, all of which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.

Claims (6)

1. A data mining platform for decoupling a WEB client and big data mining analysis, characterized by comprising a WEB client, a data mining service unit, and a data mining development platform, wherein the data mining service unit is in data connection with both the WEB client and the data mining development platform, receives and manages requests from the WEB client, senses the data mining algorithms developed and packaged on the data mining development platform, and builds a calling channel between the WEB client and the data mining development platform;
the data mining service unit provides a data mining service that supports R algorithms and distributed SparkR algorithms; the data mining service unit generates a data mining task for each request and asynchronously returns the result to the caller;
the WEB client calls the data mining algorithms developed on the data mining development platform through the data mining service unit;
the data mining service unit manages WEB front-end requests, resource allocation, I/O operations, and data mining algorithm execution, asynchronously calls the RESTful API, executes the RESTful service, and asynchronously returns results; and it generates a data mining task for each request of the WEB client and asynchronously returns the result to the caller.
2. The data mining platform for decoupling WEB clients from big data mining analytics as claimed in claim 1, wherein the WEB client supports the HTTP protocol.
3. The data mining platform for decoupling WEB clients and big data mining analysis according to claim 1, wherein the data mining service unit provides a RESTful API interface, supports asynchronous execution, and operates in a task manner, and the supported operating environments are R and Spark.
4. The data mining platform for decoupling WEB clients and big data mining analysis according to claim 1, wherein the data mining development platform provides code editing, running and packaging functions.
5. The data mining platform for decoupling WEB clients and big data mining analysis according to claim 1, wherein the data mining development platform provides development testing and packaging environments for R and SparkR.
6. An implementation method for decoupling a WEB client and big data mining analysis, characterized by comprising the following steps:
step 1: developing a data mining algorithm on a data mining development platform, then debugging, saving, packaging, and deploying it;
step 2: the data mining service unit automatically senses the packaged data mining algorithms of the data mining development platform;
step 3: the WEB client calls a data mining algorithm through the RESTful API and asynchronously waits for the data mining service unit to return the result.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711072922.3A CN107862038B (en) 2017-11-04 2017-11-04 Data mining platform for decoupling WEB client and big data mining analysis and implementation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711072922.3A CN107862038B (en) 2017-11-04 2017-11-04 Data mining platform for decoupling WEB client and big data mining analysis and implementation method

Publications (2)

Publication Number Publication Date
CN107862038A (en) 2018-03-30
CN107862038B (en) 2022-01-21

Family

ID=61700726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711072922.3A Active CN107862038B (en) 2017-11-04 2017-11-04 Data mining platform for decoupling WEB client and big data mining analysis and implementation method

Country Status (1)

Country Link
CN (1) CN107862038B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104954453A (en) * 2015-06-02 2015-09-30 浙江工业大学 Data mining REST service platform based on cloud computing
CN105391777A (en) * 2015-10-28 2016-03-09 卢星宇 Algorithm escrow PaaS platform for decoupling logic code and performance code
CN106980678A (en) * 2017-03-30 2017-07-25 温馨港网络信息科技(苏州)有限公司 Data analysing method and system based on zookeeper technologies

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9875494B2 (en) * 2013-04-16 2018-01-23 Sri International Using intents to analyze and personalize a user's dialog experience with a virtual personal assistant

Also Published As

Publication number Publication date
CN107862038A (en) 2018-03-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant