CN114490834B - Method and device for replacing big data calculation operation data source based on Kubernetes - Google Patents
- Publication number
- CN114490834B
- Authority
- CN
- China
- Prior art keywords
- data source
- character string
- serialization
- serialized
- configuration
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2474—Sequence data queries, e.g. querying versioned data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
Abstract
The invention provides a method and a device for replacing the data source of a big data computing job based on Kubernetes. The time spent looking up the data source to be connected among many data sources is saved, and the efficiency of querying that data source is greatly improved.
Description
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a method and a device for replacing a big data computing operation data source based on Kubernetes.
Background
At present, when a big data job algorithm connects to a data source, the connection information of the data source, such as the access host address, port, user name, password, certificate and path, has to be written into the algorithm. When the data source is replaced, the connection information of the new data source first has to be looked up among a large number of data sources, and then all of it (host address, port, user name, password, certificate, path and so on) has to be written into the algorithm again.
In actual use, a large number of big data job algorithms in a company need to replace their data sources continually. Finding the connection information of a new data source among so many data sources takes a long time, and the connection information has to be rewritten in the algorithm on every replacement. This creates a large amount of rewriting work in the algorithms and makes updating data sources inefficient.
Disclosure of Invention
The technical problem to be solved by the invention is how to improve the efficiency of connecting to a data source during big data jobs. To this end, the invention provides a method and a device for replacing the data source of a big data computing job based on Kubernetes.
In order to solve the above technical problem, the technical scheme adopted by the invention is as follows:
A method for replacing the data source of a big data computing job based on Kubernetes comprises the following steps:
step 1: for the access information of each data source to be connected, creating a serialized character string of a data source connection object, wherein the serialized character string is the character string generated by serializing a data source connection object created with a programming language;
step 2: editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and creating the configuration resource in the Kubernetes container orchestration software according to the edited configuration resource;
step 3: the big data algorithm obtains, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and extracts the serialized character string of the data source connection object from it;
step 4: deserializing the serialized character string of the data source connection object to obtain the connection object of the data source, which is used for the connection.
Further, the data source access information in step 1 is information required by the big data algorithm to access the data source.
Further, in step 2, editing the serialized character string into a configuration resource of the Kubernetes container orchestration software means:
designating the name key of the configuration resource as the name of the data source corresponding to the serialized character string;
designating the data key of the configuration resource as the serialized character string.
Further, in step 3, the method for obtaining the configuration resource corresponding to the data source according to the name of the data source to be connected is as follows:
the configuration resource corresponding to the data source to be connected is found by using a query command provided by the Kubernetes container orchestration software together with the name of the data source to be connected.
Further, the method for deserializing the serialized character string of the data source connection object in step 4 is as follows: the serialized character string of the data source connection object is restored to a data source connection object in memory by using the inverse of the programming language's serialization operation.
Further, the programming language is a programming language that provides an object serialization function.
Further, the configuration resources in the Kubernetes container orchestration software include: Secret resources and ConfigMap resources.
The invention also provides a device for replacing the data source of a big data computing job based on Kubernetes, which is characterized by comprising the following modules:
a serialized character string generation module: used for creating, for the access information of each data source to be connected, a serialized character string of a data source connection object, wherein the serialized character string is the character string generated by serializing a data source connection object created with a programming language;
a configuration resource creation module: used for editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and for creating the configuration resource in the Kubernetes container orchestration software according to the edited configuration resource;
a serialized character string extraction module: used by the big data algorithm to obtain, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and to extract the serialized character string of the data source connection object from it;
a connection module: used for deserializing the serialized character string of the data source connection object extracted by the serialized character string extraction module to obtain the connection object of the data source, which is used for the connection.
Further, in the connection module, the method for deserializing the serialized character string is as follows: the serialized character string of the data source connection object is restored to a data source connection object in memory by using the inverse of the programming language's serialization operation.
By adopting the technical scheme, the invention has the following beneficial effects:
According to the method and the device for replacing the data source of a big data computing job based on Kubernetes, the different data source connection objects are stored as character strings in configuration resources of the Kubernetes container orchestration software. When the number of data sources grows, the data source to be connected can be found quickly with the query commands provided by the Kubernetes container orchestration software. This saves the time spent looking up the data source to be connected among many data sources and greatly improves the efficiency of querying that data source.
In addition, when the big data algorithm replaces its data source, the connection information of the new data source (access host address, port, user name, password, certificate, path and so on) does not need to be rewritten in the algorithm; only the name of the new data source has to be written there. The serialized character string of the data source connection object is obtained through the Kubernetes container orchestration software and deserialized into the connection object of the new data source, through which the new data source can be accessed directly. This removes the work of rewriting all of the connection information in the algorithm, improves the efficiency with which the big data algorithm updates its data source, and thereby benefits the company.
Drawings
FIG. 1 is a flow chart of the system of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 shows an embodiment of the method of the present invention for replacing the data source of a big data computing job based on Kubernetes, comprising the following steps:
Step 1: A serialized character string of a data source connection object is created for each set of data source access information, where the serialized character string is the character string generated by serializing a data source connection object created with a programming language. In this embodiment, the data source access information is the information that the big data algorithm needs in order to access the data source, including the host address, port, user name, password, certificate and path. The programming language used is one that provides an object serialization function, such as Java, Python or Go.
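Step 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the `HdfsConnection` class, its field names and the `serialize_connection` helper are hypothetical, and JSON plus base64 is used here in place of a language-native object serialization format, which the embodiment does not specify.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class HdfsConnection:
    """Hypothetical connection object; the field names are illustrative."""
    host: str
    port: int
    user: str
    password: str
    path: str


def serialize_connection(conn: HdfsConnection) -> str:
    """Serialize the connection object into a single ASCII character string."""
    # JSON is an assumed wire format, chosen because it is easy to inspect.
    payload = json.dumps(asdict(conn), sort_keys=True).encode("utf-8")
    # base64 keeps the result free of quotes and line breaks.
    return base64.b64encode(payload).decode("ascii")


# One string is produced per data source to be connected.
serialized = serialize_connection(
    HdfsConnection("hdfs-a.example.internal", 8020, "etl", "change-me", "/warehouse")
)
```

The resulting string is what step 2 places under the data key of a configuration resource.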
Step 2: The serialized character string is edited into a configuration resource of the Kubernetes container orchestration software, and the configuration resource is created in the Kubernetes container orchestration software according to the edited configuration resource.
In this embodiment, the configuration resources in the Kubernetes container orchestration software include Secret resources and ConfigMap resources. Editing the serialized character string into a configuration resource means designating the name key of the configuration resource as the name of the data source corresponding to the serialized character string, and designating the data key of the configuration resource as the serialized character string. The content of the data key is a key-value pair structure. Methods for creating configuration resources in the Kubernetes container orchestration software include kubectl apply and the like.
For example, for a plurality of HDFS file system data sources, a Secret resource of the Kubernetes container orchestration software is edited with a document editor for each of them: the name key of the Secret resource is set to the name of the HDFS file system data source, and the content of the data key of the Secret resource is set to the serialized character string of the connection object created for the HDFS file system data source named by the Secret's name key. In this embodiment, serialized character strings of data source connection objects are created for the different sets of data source access information, and the different data source connection objects are stored as character strings in configuration resources of the Kubernetes container orchestration software. When the number of data sources grows, the configuration resource of the data source to be connected can be found quickly with the query commands provided by the Kubernetes container orchestration software, and the access information of that data source, which was edited into the configuration resource, can then be read from it. This saves the time spent looking up the data source to be connected among many data sources and greatly improves the efficiency of querying that data source.
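The Secret described above can be sketched as a manifest built in Python. The function name `build_secret_manifest`, the data key name `connection` and the `default` namespace are assumptions made for illustration; the embodiment only fixes that the name key carries the data source name and the data key carries the serialized string. Values under a Secret's `data` field must be base64-encoded, and a Secret name must be a valid DNS subdomain name, so data source names are assumed to be lowercase here.

```python
import base64
import json


def build_secret_manifest(source_name: str, serialized: str,
                          namespace: str = "default") -> dict:
    """Build a Secret manifest whose name is the data source name."""
    # Kubernetes expects every value under `data` to be base64-encoded.
    encoded = base64.b64encode(serialized.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": source_name, "namespace": namespace},
        "type": "Opaque",
        # "connection" is a hypothetical key name for the serialized string.
        "data": {"connection": encoded},
    }


manifest = build_secret_manifest("hdfs-a", '{"host": "hdfs-a.example.internal"}')
manifest_json = json.dumps(manifest, indent=2)
```

Writing `manifest_json` to a file and passing that file to `kubectl apply -f` would create the resource, since kubectl accepts JSON manifests as well as YAML.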
Step 3: The big data algorithm obtains, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and extracts the serialized character string of the data source connection object from it. In this embodiment, the configuration resource is obtained from the Kubernetes container orchestration software with kubectl get and the like, and the serialized character string containing the data source connection object information is extracted from the data key of the configuration resource. Because the name key of the configuration resource is the name of the data source to be connected, the corresponding configuration resource is found quickly with the query command provided by the Kubernetes container orchestration software, and the connection access information of the data source to be connected is then found in it.
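A minimal sketch of this lookup, assuming the serialized string was stored in a Secret under a hypothetical data key named `connection`. The helper names `fetch_serialized` and `_run_kubectl` are illustrative; the command runner is passed in as a parameter so that the lookup logic can be exercised without a cluster.

```python
import base64
import json
import subprocess
from typing import Callable, List


def _run_kubectl(args: List[str]) -> str:
    """Run kubectl and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(["kubectl", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout


def fetch_serialized(source_name: str, key: str = "connection",
                     run: Callable[[List[str]], str] = _run_kubectl) -> str:
    """Look up the Secret named after the data source and return its string."""
    # The Secret name is the data source name, so no search is needed.
    secret = json.loads(run(["get", "secret", source_name, "-o", "json"]))
    # Secret values come back base64-encoded under `data`.
    return base64.b64decode(secret["data"][key]).decode("utf-8")
```

With a reachable cluster, `fetch_serialized("hdfs-a")` would return the string stored for the data source named `hdfs-a`; a ConfigMap variant would read the value without the base64 decoding step.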
Step 4: The serialized character string of the data source connection object is deserialized to obtain the connection object of the data source, which is used for the connection.
In this embodiment, the serialized character string of the data source connection object is deserialized by using the inverse of the programming language's serialization operation, which restores the string to a data source connection object in memory. The program is then run and can connect to the data source immediately, without the connection information being rewritten in the algorithm. This removes a large amount of rewriting work in the algorithm and improves the efficiency of updating the data source.
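The inverse operation can be sketched as follows, assuming the string was produced as base64-encoded JSON. Both that format and the `HdfsConnection` class are illustrative assumptions, not something the embodiment prescribes.

```python
import base64
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class HdfsConnection:
    """Hypothetical connection object; the field names are illustrative."""
    host: str
    port: int
    user: str
    password: str
    path: str


def deserialize_connection(serialized: str) -> HdfsConnection:
    """Restore the in-memory connection object from its serialized string."""
    fields = json.loads(base64.b64decode(serialized).decode("utf-8"))
    # Unknown or missing fields raise TypeError instead of being ignored.
    return HdfsConnection(**fields)
```

A data-only format is used in this sketch because some language-native formats, such as Python's pickle, can execute code while loading and are therefore only suitable for trusted input.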
In this embodiment, when the big data algorithm replaces its HDFS file system data source, the Secret resource of the HDFS file system data source to be connected is obtained with the kubectl command provided by the Kubernetes container orchestration software, according to the name of the new HDFS file system data source, and the serialized character string of the connection object is read from the data key of that Secret resource. The serialized character string is then deserialized with the Java serialization module to obtain the HDFS file system data source connection object, through which the new HDFS file system data source is connected.
The invention also provides a device for replacing the data source of a big data computing job based on Kubernetes, which comprises the following modules:
a serialized character string generation module: used for creating, for the access information of each data source to be connected, a serialized character string of a data source connection object, wherein the serialized character string is the character string generated by serializing a data source connection object created with a programming language;
a configuration resource creation module: used for editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and for creating the configuration resource in the Kubernetes container orchestration software according to the edited configuration resource;
a serialized character string extraction module: used by the big data algorithm to obtain, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and to extract the serialized character string of the data source connection object from it;
a connection module: used for deserializing the serialized character string of the data source connection object extracted by the serialized character string extraction module to obtain the connection object of the data source, which is used for the connection. In the connection module of this embodiment, the serialized character string is deserialized by using the inverse of the programming language's serialization operation, which restores the string to a data source connection object in memory.
When the big data algorithm replaces its data source, the connection information of the new data source (access host address, port, user name, password, certificate, path and so on) does not need to be rewritten in the algorithm; only the name of the new data source to be connected has to be written there. The serialized character string of the data source connection object is obtained through the Kubernetes container orchestration software and deserialized into the connection object of the new data source, through which the new data source can be accessed directly. This removes the work of rewriting all of the connection information, improves the efficiency with which the big data algorithm updates its data source, and thereby benefits the company.
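The effect described above, where only the data source name changes in the algorithm, can be illustrated with the four modules wired together. In this sketch an in-memory dictionary stands in for the Kubernetes configuration resources, JSON stands in for the serialization format, and all function, key and host names are hypothetical.

```python
import json
from typing import Dict

# Stand-in for Kubernetes: resource name -> data key/value pairs.
ConfigStore = Dict[str, Dict[str, str]]


def generate_string(access_info: dict) -> str:
    """Serialized character string generation module (JSON is an assumption)."""
    return json.dumps(access_info, sort_keys=True)


def create_resource(store: ConfigStore, source_name: str, serialized: str) -> None:
    """Configuration resource creation module: name key plus data key."""
    store[source_name] = {"connection": serialized}


def extract_string(store: ConfigStore, source_name: str) -> str:
    """Serialized character string extraction module: lookup by name."""
    return store[source_name]["connection"]


def connect(serialized: str) -> dict:
    """Connection module: restore the connection information in memory."""
    return json.loads(serialized)


def job_input_url(store: ConfigStore, source_name: str) -> str:
    """The 'algorithm' knows only the data source name, nothing else."""
    conn = connect(extract_string(store, source_name))
    return f"hdfs://{conn['host']}:{conn['port']}{conn['path']}"


store: ConfigStore = {}
create_resource(store, "hdfs-old", generate_string(
    {"host": "old.example.internal", "port": 8020, "path": "/data"}))
create_resource(store, "hdfs-new", generate_string(
    {"host": "new.example.internal", "port": 8020, "path": "/data"}))

# Replacing the data source is a one-word change in the caller.
old_url = job_input_url(store, "hdfs-old")
new_url = job_input_url(store, "hdfs-new")
```

Here `job_input_url` never contains a host, port or path literal, which is the property the device is meant to provide.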
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (8)
1. A method for replacing a big data computing operation data source based on Kubernetes is characterized by comprising the following steps:
step 1: for the access information of each data source to be connected, creating a serialized character string of a data source connection object, wherein the serialized character string is the character string generated by serializing a data source connection object created with a programming language;
step 2: editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and creating the configuration resource in the Kubernetes container orchestration software according to the edited configuration resource;
step 3: the big data algorithm obtains, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and extracts the serialized character string of the data source connection object from it;
step 4: deserializing the serialized character string of the data source connection object to obtain the connection object of the data source, which is used for the connection;
wherein in step 2, editing the serialized character string into a configuration resource of the Kubernetes container orchestration software means:
designating the name key of the configuration resource as the name of the data source corresponding to the serialized character string;
and designating the data key of the configuration resource as the serialized character string.
2. The method according to claim 1, wherein the data source access information in step 1 is information required by a big data algorithm to access the data source.
3. The method according to claim 2, wherein in step 3, the method for obtaining the configuration resource corresponding to the data source according to the name of the data source to be connected is:
finding the configuration resource corresponding to the data source to be connected by using a query command provided by the Kubernetes container orchestration software together with the name of the data source to be connected.
4. The method of claim 1, wherein the deserializing of the serialized character string of the data source connection object in step 4 is performed by: restoring the serialized character string of the data source connection object to a data source connection object in memory by using the inverse of the programming language's serialization operation.
5. The method of any of claims 1 to 4, wherein the programming language is a programming language that provides serialized object functionality.
6. The method of any one of claims 1 to 4, wherein the configuration resources in the Kubernetes container orchestration software comprise: Secret resources and ConfigMap resources.
7. An apparatus for replacing big data computing operation data source based on Kubernetes, which is characterized by comprising the following modules:
a serialized character string generation module: used for creating, for the access information of each data source to be connected, a serialized character string of a data source connection object, wherein the serialized character string is the character string generated by serializing a data source connection object created with a programming language;
a configuration resource creation module: used for editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and for creating the configuration resource in the Kubernetes container orchestration software according to the edited configuration resource;
wherein editing the serialized character string into a configuration resource of the Kubernetes container orchestration software means:
designating the name key of the configuration resource as the name of the data source corresponding to the serialized character string;
designating the data key of the configuration resource as the serialized character string;
a serialized character string extraction module: used by the big data algorithm to obtain, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and to extract the serialized character string of the data source connection object from it;
a connection module: used for deserializing the serialized character string of the data source connection object extracted by the serialized character string extraction module to obtain the connection object of the data source, which is used for the connection.
8. The apparatus of claim 7, wherein the connection module deserializes the serialized character string by: restoring the serialized character string of the data source connection object to a data source connection object in memory by using the inverse of the programming language's serialization operation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210357263.2A CN114490834B (en) | 2022-04-07 | 2022-04-07 | Method and device for replacing big data calculation operation data source based on Kubernetes |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210357263.2A CN114490834B (en) | 2022-04-07 | 2022-04-07 | Method and device for replacing big data calculation operation data source based on Kubernetes |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114490834A (en) | 2022-05-13 |
CN114490834B (en) | 2022-06-21 |
Family
ID=81487371
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210357263.2A Active CN114490834B (en) | 2022-04-07 | 2022-04-07 | Method and device for replacing big data calculation operation data source based on Kubernetes |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114490834B (en) |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107766538A (en) * | 2017-10-28 | 2018-03-06 | 杭州安恒信息技术有限公司 | Data filtering processing module and synchronous, asynchronous filter method based on java |
US11314687B2 (en) * | 2020-09-24 | 2022-04-26 | Commvault Systems, Inc. | Container data mover for migrating data between distributed data storage systems integrated with application orchestrators |
CN112910991B (en) * | 2021-01-29 | 2022-10-04 | 杭州涂鸦信息技术有限公司 | Back-end application calling method and device, computer equipment and readable storage medium |
CN113190528B (en) * | 2021-04-21 | 2022-12-06 | 中国海洋大学 | Parallel distributed big data architecture construction method and system |
CN113741961B (en) * | 2021-11-08 | 2022-02-01 | 梯度云科技(北京)有限公司 | Method and device for submitting big data calculation operation based on Kubernetes container arrangement software |
Also Published As
Publication number | Publication date |
---|---|
CN114490834A (en) | 2022-05-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8832492B1 (en) | Systems and methods for managing applications | |
EP3933581A2 (en) | Evm-based transaction processing method and apparatus, device, program and medium | |
CN110955410A (en) | Automatic code generation method, device, equipment and medium | |
CN104750472A (en) | Resource bundle management method and device of terminal application | |
CN111737227A (en) | Data modification method and system | |
CN112528619A (en) | Page template file generation method and device, electronic equipment and storage medium | |
CN113378579A (en) | Method, system and electronic equipment for voice input of structured data | |
CN107479866A (en) | The method that open terminal applies data and function are realized based on reconfiguration technique | |
EP3893137B1 (en) | Evm-based transaction processing method and apparatus, device, program and medium | |
CN111367890A (en) | Data migration method and device, computer equipment and readable storage medium | |
CN111061743A (en) | Data processing method and device and electronic equipment | |
CN114490834B (en) | Method and device for replacing big data calculation operation data source based on Kubernetes | |
CN117033249A (en) | Test case generation method and device, computer equipment and storage medium | |
CN107423291A (en) | A kind of data translating method and client device | |
CN108198582B (en) | NAND Flash control method and device and SSD | |
EP3910877B1 (en) | Evm-based transaction processing method and apparatus, device, program and medium | |
WO2022099569A1 (en) | Application processing program dynamic loading method for brain-like computer operating system | |
CN115061916A (en) | Method for automatically generating interface test case and related equipment thereof | |
CN115145634A (en) | System management software self-adaption method, device and medium | |
CN111401032B (en) | Text processing method, device, computer equipment and storage medium | |
CN115309811A (en) | ETL script generation method, device, storage medium and equipment | |
CN113708971A (en) | Openstack cloud platform deployment method and related device | |
CN114546418A (en) | application platform of aPaaS all-in-one machine | |
CN109902085B (en) | Configuration storage structure optimization method and system | |
CN104598464A (en) | Information processing method and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||