CN107547541B - spark-mllib calling method, storage medium, electronic device and system - Google Patents

spark-mllib calling method, storage medium, electronic device and system

Info

Publication number
CN107547541B
CN107547541B
Authority
CN
China
Prior art keywords
spark
learning request
mllib
program
akka
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710771009.6A
Other languages
Chinese (zh)
Other versions
CN107547541A (en)
Inventor
王毅
张文明
陈少杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Douyu Network Technology Co Ltd
Original Assignee
Wuhan Douyu Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Douyu Network Technology Co Ltd filed Critical Wuhan Douyu Network Technology Co Ltd
Priority to CN201710771009.6A priority Critical patent/CN107547541B/en
Publication of CN107547541A publication Critical patent/CN107547541A/en
Application granted granted Critical
Publication of CN107547541B publication Critical patent/CN107547541B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Computer And Data Communications (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a spark-mllib calling method, a storage medium, an electronic device and a spark-mllib calling system, and relates to the field of spark-mllib calling. The method comprises the following steps: when receiving a learning request, the server sends the learning request to at least 2 akka-http instances deployed on the server; when an akka-http instance receives the learning request, it compares the url of the learning request with the routes of the preconfigured spark-mllib programs, and if a route matching the learning request exists, calls the spark-mllib program corresponding to that route to run; after the spark-mllib program trains the model, a model prediction result is output and its format is converted into a vector. The method and the device can directly return the trained model prediction result to the user, thereby achieving millisecond-level response and remarkably improving the user experience.

Description

spark-mllib calling method, storage medium, electronic device and system
Technical Field
The invention relates to the field of spark-mllib (Spark's distributed machine learning library) calling, in particular to a spark-mllib calling method, a storage medium, an electronic device and a spark-mllib calling system.
Background
Currently, the typical way for a user to perform a machine learning operation with a spark-mllib program is as follows: the user initiates a learning request to the server, and the server, through a thread call, runs the spark-mllib program corresponding to the learning request to train a model and obtain a model prediction result.
The above method has the following disadvantages:
(1) The server runs the spark-mllib program through thread calls; when the number of learning requests is large, each learning request occupies one thread, so the memory occupancy rate and load of the server increase greatly and the working efficiency of the server decreases.
(2) After the spark-mllib program trains the model, the output model prediction result is an RDD (Resilient Distributed Dataset) or a DataFrame. Neither of these two formats can be read directly by the user, who has to rely on dedicated code or a third-party tool to convert the prediction result into a directly readable form, which degrades the user experience.
Disclosure of Invention
Aiming at the defects in the prior art, the technical problem solved by the invention is: how to obtain a model prediction result that the user can read directly. The method and the device can directly return the trained model prediction result to the user, thereby achieving millisecond-level response and remarkably improving the user experience.
In order to achieve the above object, the spark-mllib calling method provided by the invention comprises the following steps:
S1: when receiving a learning request, the server sends the learning request to at least 2 akka-http instances deployed on the server, and goes to S2;
S2: when the akka-http receives the learning request, the url of the learning request is compared with the routes of the preconfigured spark-mllib programs; if a route matching the learning request exists, the spark-mllib program corresponding to that route is called to run, and the method goes to S3;
S3: after the spark-mllib program trains the model, a model prediction result is output, and the format of the model prediction result is converted into a vector.
Based on the above technical solution, in S2, the ActorSystem service created in advance in the server calls the spark-mllib program corresponding to the route matching the learning request to run.
The storage medium provided by the invention is stored with a computer program, and the computer program realizes the spark-mllib calling method when being executed by a processor.
The electronic equipment provided by the invention comprises a memory and a processor, wherein a computer program running on the processor is stored in the memory, and the spark-mllib calling method is realized when the processor executes the computer program.
The spark-mllib calling system provided by the invention comprises a learning request forwarding module, at least 2 akka-http modules and a plurality of spark-mllib program training modules, wherein the learning request forwarding module is arranged in a server;
the learning request forwarding module is used for: when a learning request is received, the learning request is sent to each akka-http module;
the akka-http module is used to: when a learning request is received, compare the url of the learning request with the routes of the preconfigured spark-mllib programs, and, if a route matching the learning request exists, call the spark-mllib program training module corresponding to that route to run;
the spark-mllib program training module is used to: after the spark-mllib program is called to train the model, output a model prediction result and convert the format of the model prediction result into a vector.
On the basis of the above technical scheme, the akka-http module calls the spark-mllib program corresponding to the route matched with the learning request to run through an ActorSystem service pre-created in the server.
Compared with the prior art, the invention has the advantages that:
(1) As can be seen from S3, the model prediction result of the present invention is in vector format, and the user can directly obtain and read it without relying on dedicated code or a third-party tool. The trained model prediction result can therefore be returned to the user directly, achieving millisecond-level response (the time required for directly returning the model prediction result is measured in milliseconds) and remarkably improving the user experience.
(2) As can be seen from S2, the invention avoids the prior-art approach of calling and running the spark-mllib program through per-request threads and instead calls it through the more advanced ActorSystem service, so the memory occupancy rate and load of the server are significantly reduced and the working efficiency of the server is greatly improved.
Drawings
FIG. 1 is a flow chart of a spark-mllib calling method according to an embodiment of the present invention;
fig. 2 is a connection block diagram of an electronic device in an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Referring to fig. 1, the spark-mllib calling method in the embodiment of the present invention includes the following steps:
s1: when receiving the learning request, the server sends the learning request to at least 2 akka-http (a toolbox for generating and providing or consuming http-based network services) installed on the server in a polling manner, and goes to S2.
The server in S1 is nginx (engine x, high-performance HTTP and reverse proxy server), and nginx can adapt to HTTP learning requests with high concurrency, thereby improving the working performance and quality of the server.
The process of sending the learning request to at least 2 akka-http instances in S1 is as follows: an upstream node is added to the configuration file of nginx, and the proxy_pass (forwarding path) of nginx is formed from the unique identifier of each akka-http instance and the upstream node, i.e. the forwarding path takes the form http:// + akka-http unique identifier + upstream; the learning request is then sent according to proxy_pass.
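For illustration only, the following is a minimal nginx configuration sketch of this forwarding step; the upstream name, addresses and ports are hypothetical placeholders and do not come from the patent:

```nginx
# Hypothetical example: two akka-http instances behind one nginx upstream node.
upstream akka_http_pool {
    server 127.0.0.1:8081;   # akka-http instance 1
    server 127.0.0.1:8082;   # akka-http instance 2
    # nginx uses round-robin (polling) load balancing by default
}

server {
    listen 80;
    location / {
        # proxy_pass = "http://" + upstream identifier; learning requests are
        # forwarded to the akka-http instances along this path
        proxy_pass http://akka_http_pool;
    }
}
```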
S2: when an akka-http instance receives the learning request, it compares the url (uniform resource locator) of the learning request with the routes of the preconfigured spark-mllib programs (the server holds multiple kinds of spark-mllib programs, each corresponding to a different service requirement); if a route matching the learning request exists, it calls the spark-mllib program corresponding to that route to run through the ActorSystem service (the ActorSystem service itself is prior art, and its specific creation and calling process is not described here), and goes to S3.
So that akka-http can receive the learning request, S2 may further include the following step: the ip and port of the server are bound to akka-http through the ActorSystem service; akka-http listens on the server port and, when it detects a learning request, confirms that the learning request has been received.
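As a rough Scala sketch of S2 (not the patent's actual code), the fragment below creates an ActorSystem, defines one akka-http route corresponding to one spark-mllib program, and binds the server's ip and port; the path name, host and port are assumptions made for illustration:

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer

object MllibGateway extends App {
  // ActorSystem created in advance in the server; akka-http runs on top of it.
  implicit val system: ActorSystem = ActorSystem("mllib-gateway")
  implicit val materializer: ActorMaterializer = ActorMaterializer()

  // One route per preconfigured spark-mllib program; the url of the learning
  // request is matched against these routes.
  val route =
    path("train" / "logistic-regression") {   // hypothetical route
      get {
        // here the matching spark-mllib program would be called to run
        complete("logistic-regression training started")
      }
    }

  // Bind the server's ip and port so akka-http can listen for learning requests.
  Http().bindAndHandle(route, "0.0.0.0", 8081)
}
```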
As can be seen from S2, the embodiment of the present invention avoids the prior-art approach of calling and running the spark-mllib program through per-request threads and instead uses the more advanced ActorSystem service to call the spark-mllib program, so the memory occupancy rate and load of the server are significantly reduced and the working efficiency of the server is greatly improved.
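To illustrate the contrast with one-thread-per-request calling, here is a minimal sketch, under assumed class, message and path names, of an actor that a route like the one above could delegate to; the actor runs the spark-mllib training whenever it receives a message, instead of a new thread being created for every learning request:

```scala
import akka.actor.{Actor, ActorSystem, Props}
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.sql.SparkSession

// Hypothetical message carrying the location of the training data.
final case class Train(dataPath: String)

class TrainingActor extends Actor {
  // One long-lived SparkSession reused for every learning request handled by
  // this actor, instead of one thread per request.
  private val spark = SparkSession.builder()
    .appName("spark-mllib-trainer")
    .master("local[*]")          // assumption for a local sketch
    .getOrCreate()

  override def receive: Receive = {
    case Train(dataPath) =>
      val data  = spark.read.format("libsvm").load(dataPath)
      val model = new LogisticRegression().fit(data)   // train the model
      sender() ! model.transform(data)                  // reply with the predictions
  }
}

object TrainingActorExample extends App {
  val system  = ActorSystem("mllib-training")
  val trainer = system.actorOf(Props[TrainingActor], "trainer")
  trainer ! Train("data/sample_libsvm_data.txt")        // hypothetical path
}
```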
S3: after the spark-mllib program trains the model, a model prediction result is output, and the format of the model prediction result is converted into a vector (java.util.Vector, an object array in java that grows automatically).
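The conversion in S3 can be sketched with the assumed helper below (not the patent's code); it assumes the spark.ml default output column name "prediction" for the result of model.transform():

```scala
import java.util.{Vector => JVector}
import org.apache.spark.sql.DataFrame

// Copy the "prediction" column of the DataFrame produced by model.transform()
// into a java.util.Vector, the automatically growing object array returned to the user.
def predictionsToVector(predictions: DataFrame): JVector[java.lang.Double] = {
  val result = new JVector[java.lang.Double]()
  predictions.select("prediction")
    .collect()                                       // bring the predictions to the driver
    .foreach(row => result.add(Double.box(row.getDouble(0))))
  result
}
```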
As can be seen from S3, the model prediction result in the embodiment of the present invention is in vector format, and the user can directly acquire and read it without relying on dedicated code or a third-party tool. Therefore, the trained model prediction result can be directly returned to the user, millisecond-level response (the time required for directly returning the model prediction result is measured in milliseconds) is achieved, and the user experience is remarkably improved.
The embodiment of the invention also provides a storage medium on which a computer program is stored; when executed by a processor, the computer program implements the above spark-mllib calling method. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM (Read-Only Memory), a RAM (Random Access Memory), a magnetic disk, or an optical disk.
Referring to fig. 2, an embodiment of the present invention further provides an electronic device, which includes a memory and a processor, where the memory stores a computer program running on the processor, and the processor implements the spark-mllib calling method when executing the computer program.
The spark-mllib calling system in the embodiment of the invention comprises a learning request forwarding module, at least 2 akka-http modules and a plurality of spark-mllib program training modules, wherein the learning request forwarding module is arranged in an nginx server.
The learning request forwarding module is used to: when a learning request is received, send the learning request to each akka-http module. The specific flow is as follows: an upstream node is added to the configuration file of nginx, and the proxy_pass of nginx is formed according to the unique identifier of each akka-http module and the upstream node; the learning request is then sent according to proxy_pass.
The akka-http module is used to: when the learning request is received, compare the url of the learning request with the routes of the preconfigured spark-mllib programs, and, if a route matching the learning request exists, call the spark-mllib program training module corresponding to that route to run through the ActorSystem service pre-created in the server.
The spark-mllib program training module is used to: after the spark-mllib program is called to train the model, output a model prediction result and convert the format of the model prediction result into a vector.
It should be noted that: in the system provided by the embodiment of the present invention, the division into the above functional modules is only used as an example; in practical applications, the above functions may be distributed to different functional modules as needed, that is, the internal structure of the system may be divided into different functional modules to complete all or part of the functions described above.
Further, the present invention is not limited to the above-mentioned embodiments, and it will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the principle of the present invention, and these modifications and improvements are also considered to be within the scope of the present invention. Those not described in detail in this specification are within the skill of the art.

Claims (8)

1. A spark-mllib calling method, comprising the following steps:
S1: when receiving a learning request, the server sends the learning request to at least 2 akka-http instances deployed on the server, and goes to S2;
S2: when the akka-http receives the learning request, the url of the learning request is compared with the routes of the preconfigured spark-mllib programs; if a route matching the learning request exists, the spark-mllib program corresponding to the route matching the learning request is called to run, and the method goes to S3;
S3: after the spark-mllib program trains the model, a model prediction result is output, and the format of the model prediction result is converted into a vector;
in S2, the ActorSystem service created in advance in the server calls the spark-mllib program corresponding to the route matching the learning request to run.
2. The spark-mllib calling method as recited in claim 1, wherein: in S1, the server is nginx.
3. The spark-mllib calling method as recited in claim 2, wherein: the process of sending the learning request to at least 2 akka-http instances in S1 comprises: adding an upstream node to the configuration file of nginx, and forming the proxy_pass of nginx according to the unique identifier of each akka-http and the upstream node; and sending the learning request according to proxy_pass.
4. A computer-readable storage medium having a computer program stored thereon, characterized in that: the computer program, when executed by a processor, implements the method of any of claims 1 to 3.
5. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program for execution on the processor, the processor when executing the computer program implementing the method of any of claims 1 to 3.
6. A spark-mllib invocation system, characterized by: the system comprises a learning request forwarding module, at least 2 akka-http modules and a plurality of spark-mllib program training modules, wherein the learning request forwarding module is arranged in a server;
the learning request forwarding module is used for: when a learning request is received, the learning request is sent to each akka-http module;
the akka-http module is used to: when a learning request is received, compare the url of the learning request with the routes of the preconfigured spark-mllib programs, and, if a route matching the learning request exists, call the spark-mllib program corresponding to the route matching the learning request to run;
the spark-mllib program training module is used to: after the spark-mllib program is called to train the model, output a model prediction result, and convert the format of the model prediction result into a vector;
and the akka-http module calls the spark-mllib program corresponding to the route matched with the learning request to run through an ActorSystem service pre-created in the server.
7. The spark-mllib calling system as recited in claim 6, wherein: the server is nginx.
8. The spark-mllib calling system as recited in claim 7, wherein: the process in which the learning request forwarding module sends the learning request to each akka-http module comprises: adding an upstream node to the configuration file of nginx, and forming the proxy_pass of nginx according to the unique identifier of each akka-http module and the upstream node; and sending the learning request according to proxy_pass.
CN201710771009.6A 2017-08-31 2017-08-31 spark-mllib calling method, storage medium, electronic device and system Active CN107547541B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710771009.6A CN107547541B (en) 2017-08-31 2017-08-31 spark-mllib calling method, storage medium, electronic device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710771009.6A CN107547541B (en) 2017-08-31 2017-08-31 spark-mllib calling method, storage medium, electronic device and system

Publications (2)

Publication Number Publication Date
CN107547541A CN107547541A (en) 2018-01-05
CN107547541B (en) 2020-07-31

Family

ID=60959148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710771009.6A Active CN107547541B (en) 2017-08-31 2017-08-31 spark-mllib calling method, storage medium, electronic device and system

Country Status (1)

Country Link
CN (1) CN107547541B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160048408A1 (en) * 2014-08-13 2016-02-18 OneCloud Labs, Inc. Replication of virtualized infrastructure within distributed computing environments
CN105975907B (en) * 2016-04-27 2019-05-21 江苏华通晟云科技有限公司 SVM model pedestrian detection method based on distributed platform
CN106228389A (en) * 2016-07-14 2016-12-14 武汉斗鱼网络科技有限公司 Network potential usage mining method and system based on random forests algorithm

Also Published As

Publication number Publication date
CN107547541A (en) 2018-01-05

Similar Documents

Publication Publication Date Title
CN112751826B (en) Method and device for forwarding flow of computing force application
CN106470123B (en) Log collecting method, client, server and electronic equipment
CN108062243B (en) Execution plan generation method, task execution method and device
CN110336848B (en) Scheduling method, scheduling system and scheduling equipment for access request
CN111966289B (en) Partition optimization method and system based on Kafka cluster
CN112838940B (en) Network controller frame and data processing method
CN105260842B (en) Communication method and system between heterogeneous ERP systems
CN109857524B (en) Stream computing method, device, equipment and computer readable storage medium
US10320616B2 (en) Method and a system for sideband server management
CN103067486A (en) Big-data processing method based on platform-as-a-service (PaaS) platform
CN107547541B (en) spark-mllib calling method, storage medium, electronic device and system
CN112817539A (en) Industrial data storage method and system, electronic device and storage medium
EP3672203A1 (en) Distribution method for distributed data computing, device, server and storage medium
CN112019604B (en) Edge data transmission method and system
CN112534399A (en) Semantic-based Internet of things equipment data processing related application program installation method and device
CN117135060A (en) Business data processing method and system based on edge calculation
CN108572863B (en) Distributed task scheduling system and method
CN114095571A (en) Data processing method, data service bus, terminal and storage medium
CN102238505B (en) Method and system for processing multi-user parallel signalling tracking at client
CN116192849A (en) Heterogeneous accelerator card calculation method, device, equipment and medium
CN106254122B (en) Simple network management protocol agent implementation method based on EOC equipment
CN114501347A (en) Information interaction method, device and system between heterogeneous systems
CN114257623A (en) Internet of things equipment communication method based on streaming processing
CN111309467B (en) Task distribution method and device, electronic equipment and storage medium
CN112637288A (en) Streaming data distribution method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant