CN114490834B - Method and device for replacing big data calculation operation data source based on Kubernetes - Google Patents


Info

Publication number
CN114490834B
Authority
CN
China
Prior art keywords
data source
character string
serialization
serialized
configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210357263.2A
Other languages
Chinese (zh)
Other versions
CN114490834A (en)
Inventor
王伟华
刘井山
樊宇
梅进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gradient Cloud Technology Beijing Co ltd
Original Assignee
Gradient Cloud Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gradient Cloud Technology Beijing Co ltd filed Critical Gradient Cloud Technology Beijing Co ltd
Priority to CN202210357263.2A priority Critical patent/CN114490834B/en
Publication of CN114490834A publication Critical patent/CN114490834A/en
Application granted granted Critical
Publication of CN114490834B publication Critical patent/CN114490834B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques

Abstract

The invention provides a method and a device for replacing the data source of a big data computing job based on Kubernetes. The time needed to look up the data source to be connected among a plurality of data sources is saved, and the efficiency of locating that data source is greatly improved.

Description

Method and device for replacing big data calculation operation data source based on Kubernetes
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a method and a device for replacing a big data computing operation data source based on Kubernetes.
Background
At present, when a big data algorithm connects to a data source, connection information such as the host address, port, user name, password, certificate, and path of the data source must be written into the algorithm. When the data source is replaced, the connection information of the new data source must first be looked up among a large number of data sources, and then all of that connection information (host address, port, user name, password, certificate, path, and so on) must be written into the algorithm again.
In practice, a company runs a large number of big data algorithms whose data sources must be replaced continually. Finding the connection information of a new data source among many data sources takes considerable time, and the connection information must be rewritten in the algorithm on every replacement. The resulting workload is large, and updating a data source is inefficient.
Disclosure of Invention
The invention aims to solve the technical problem of how to improve the efficiency of connecting a data source during big data operation, and provides a method and a device for replacing a big data calculation operation data source based on Kubernetes.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
A method for replacing the data source of a big data computing job based on Kubernetes comprises the following steps:
step 1: for the access information of each data source to be connected, creating a serialized character string of a data source connection object, wherein the serialized character string is generated by serializing a data source connection object created in a programming language;
step 2: editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and creating the configuration resource in the Kubernetes container orchestration software from the edited configuration resource;
step 3: the big data algorithm obtains, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and extracts the serialized character string of the data source connection object from that configuration resource;
step 4: deserializing the serialized character string of the data source connection object to obtain the connection object of the data source, which is used for the connection.
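The four steps can be sketched end to end in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `HdfsConnection` class, the source names `hdfs-a` and `hdfs-b`, the data key `connection`, and the in-memory `config_resources` dict (standing in for Kubernetes configuration resources) are all hypothetical, and JSON is used as one possible serialization format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class HdfsConnection:
    """Hypothetical connection object holding data source access information."""
    host: str
    port: int
    user: str
    path: str


def serialize(conn: HdfsConnection) -> str:
    # Step 1: turn the connection object into a serialized character string.
    return json.dumps(asdict(conn))


def deserialize(text: str) -> HdfsConnection:
    # Step 4: restore the in-memory connection object from the string.
    return HdfsConnection(**json.loads(text))


# Step 2: one configuration resource per data source, keyed by data source name.
# A plain dict stands in for the Kubernetes API in this sketch.
config_resources = {
    "hdfs-a": {"connection": serialize(HdfsConnection("10.0.0.1", 8020, "etl", "/a"))},
    "hdfs-b": {"connection": serialize(HdfsConnection("10.0.0.2", 8020, "etl", "/b"))},
}


def connect(source_name: str) -> HdfsConnection:
    # Step 3: look up the resource by name and extract the serialized string.
    serialized = config_resources[source_name]["connection"]
    return deserialize(serialized)


# Replacing the data source only requires changing the name passed in.
conn = connect("hdfs-b")
```

In this sketch the algorithm code never contains a host, port, or credential; switching from `hdfs-a` to `hdfs-b` changes a single string.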
Further, the data source access information in step 1 is the information required by the big data algorithm to access the data source.
Further, in step 2, editing the serialized character string into a configuration resource of the Kubernetes container orchestration software means:
designating the name key of the configuration resource as the name of the data source corresponding to the serialized character string;
designating the data key of the configuration resource as the serialized character string.
Further, in step 3, the configuration resource corresponding to the data source is obtained according to the name of the data source to be connected as follows:
the configuration resource corresponding to the data source to be connected is queried and found using a query command provided by the Kubernetes container orchestration software together with the name of the data source to be connected.
Further, the serialized character string of the data source connection object is deserialized in step 4 as follows: the serialized character string is restored to a data source connection object in memory using the inverse of the programming language's serialization operation.
Further, the programming language is a programming language that provides an object serialization function.
Further, the configuration resources in the Kubernetes container orchestration software include: Secret resources and ConfigMap resources.
The invention also provides a device for replacing the data source of a big data computing job based on Kubernetes, which comprises the following modules:
a serialized character string generation module: used for creating, for the access information of each data source to be connected, a serialized character string of a data source connection object, wherein the serialized character string is generated by serializing a data source connection object created in a programming language;
a configuration resource creation module: used for editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and creating the configuration resource in the Kubernetes container orchestration software from the edited configuration resource;
a serialized character string extraction module: used by the big data algorithm to obtain, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and to extract the serialized character string of the data source connection object from that configuration resource;
a connection module: used for deserializing the serialized character string of the data source connection object extracted by the serialized character string extraction module to obtain the connection object of the data source, which is used for the connection.
Further, in the connection module, the serialized character string is deserialized as follows: the serialized character string of the data source connection object is restored to a data source connection object in memory using the inverse of the programming language's serialization operation.
By adopting the technical scheme, the invention has the following beneficial effects:
According to the method and the device for replacing the data source of a big data computing job based on Kubernetes, the different data source connection objects are stored as character strings in configuration resources of the Kubernetes container orchestration software, so that when the number of data sources grows, the data source to be connected can be found quickly through the query commands provided by the Kubernetes container orchestration software. The time needed to look up the data source to be connected among a plurality of data sources is saved, and the efficiency of locating that data source is greatly improved.
In addition, when the big data algorithm replaces its data source, the connection information of the new data source (host address, port, user name, password, certificate, path, and so on) does not need to be rewritten in the algorithm; only the name of the new data source needs to be written there. The serialized character string of the data source connection object is obtained through the Kubernetes container orchestration software and deserialized to obtain the connection object of the new data source, through which the new data source can be accessed directly. This removes the work of rewriting all the connection information in the algorithm, improves the efficiency with which the big data algorithm updates its data source, and thereby benefits the company.
Drawings
FIG. 1 is a flow chart of the system of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 shows an embodiment of the method of the present invention for replacing the data source of a big data computing job based on Kubernetes, comprising the following steps:
Step 1: for each set of data source access information, creating a serialized character string of a data source connection object, wherein the serialized character string is generated by serializing a data source connection object created in a programming language. In this embodiment, the data source access information is the information required by the big data algorithm to access the data source, including the host address, port, user name, password, certificate, path, and similar information. The programming language used is one that provides an object serialization function, for example Java, Python, or Go.
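As a hedged illustration of step 1, the following Python sketch serializes a connection object into a printable string. The `DataSourceAccess` class, its field names, and the sample values are hypothetical, and JSON followed by Base64 is an assumed encoding; the source only requires that the language provide object serialization.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DataSourceAccess:
    """Hypothetical container for the access information listed above."""
    host: str
    port: int
    username: str
    password: str
    certificate: str
    path: str


def to_serialized_string(access: DataSourceAccess) -> str:
    """Serialize the object to JSON bytes, then Base64-encode them to a plain string."""
    raw = json.dumps(asdict(access), sort_keys=True).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


# Placeholder values for illustration only.
serialized = to_serialized_string(
    DataSourceAccess("namenode.example.internal", 8020, "etl", "s3cret", "", "/warehouse")
)
```

The Base64 step keeps the result free of characters that would need escaping when the string is later pasted into a configuration resource.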
Step 2: editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and creating the configuration resource in the Kubernetes container orchestration software from the edited configuration resource.
In this embodiment, the configuration resources in the Kubernetes container orchestration software include Secret resources and ConfigMap resources. Editing the serialized character string into a configuration resource of the Kubernetes container orchestration software means: designating the name key of the configuration resource as the name of the data source corresponding to the serialized character string, and designating the data key of the configuration resource as the serialized character string. The content of the data key is a key-value pair structure. Configuration resources are created in the Kubernetes container orchestration software with kubectl apply or a similar command.
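A configuration resource of the kind described in step 2 might be assembled as follows. This is a sketch, not the patent's own manifest: the data key `connection`, the source name `hdfs-sales`, and the sample string are hypothetical. It relies on two properties of Kubernetes Secrets: the resource name lives under `metadata.name`, and values under `data` must be Base64-encoded.

```python
import base64
import json


def build_secret_manifest(source_name: str, serialized: str) -> dict:
    """Build a Secret manifest whose name key is the data source name."""
    encoded = base64.b64encode(serialized.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": source_name},
        "type": "Opaque",
        # The data key holds a key-value structure; "connection" is an assumed key.
        "data": {"connection": encoded},
    }


manifest = build_secret_manifest("hdfs-sales", '{"host": "10.0.0.1", "port": 8020}')
# JSON manifests are accepted by `kubectl apply -f <file>`, so this text could be saved to a file.
manifest_json = json.dumps(manifest, indent=2)
```

Because the resource name doubles as the data source name, data source names in such a scheme have to satisfy Kubernetes naming rules (lowercase alphanumerics, `-`, and `.`).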
For example, for a plurality of HDFS file system data sources, a Secret resource of the Kubernetes container orchestration software is edited for each data source with a document editor: the name key of the Secret resource is designated as the name of the HDFS file system data source, and the content of the data key of the Secret resource is designated as the serialized character string of the connection object created for the HDFS file system data source named by the name key. In this embodiment, a serialized character string of a data source connection object is created for each set of data source access information, and the different data source connection objects are stored as character strings in configuration resources of the Kubernetes container orchestration software. When the number of data sources grows, the configuration resource of the data source to be connected can be found quickly through the query commands provided by the Kubernetes container orchestration software, and the access information edited into that configuration resource can then be read from it. The time needed to look up the data source to be connected among a plurality of data sources is saved, and the efficiency of locating that data source is greatly improved.
Step 3: the big data algorithm obtains, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and extracts the serialized character string of the data source connection object from that configuration resource. In this embodiment, the configuration resource is obtained from the Kubernetes container orchestration software with kubectl get or a similar command, and the serialized character string containing the data source connection object information is extracted from the data key of the configuration resource. Because the name key of the configuration resource is the name of the data source to be connected, the corresponding configuration resource, and with it the connection access information of the data source to be connected, is found quickly through the query command provided by the Kubernetes container orchestration software.
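The lookup in step 3 can be sketched as building a `kubectl get secret <name> -o json` command and decoding the data key from its JSON output. The helper names, the data key `connection`, and the hand-written `sample_output` are assumptions for illustration; the sketch parses a sample string rather than calling a real cluster.

```python
import base64
import json
from typing import List


def kubectl_get_command(source_name: str) -> List[str]:
    """Command line that fetches the Secret named after the data source."""
    return ["kubectl", "get", "secret", source_name, "-o", "json"]


def extract_serialized(secret_json: str, key: str = "connection") -> str:
    """Decode the Base64 value stored under the Secret's data key."""
    secret = json.loads(secret_json)
    return base64.b64decode(secret["data"][key]).decode("utf-8")


# Hand-written stand-in for the JSON that the command would print.
sample_output = json.dumps({
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "hdfs-sales"},
    "data": {"connection": base64.b64encode(b'{"host": "10.0.0.1"}').decode("ascii")},
})
serialized = extract_serialized(sample_output)
```

In a real job the command list could be passed to `subprocess.run(..., capture_output=True, text=True)` and its standard output handed to `extract_serialized`.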
Step 4: deserializing the serialized character string of the data source connection object to obtain the connection object of the data source, which is used for the connection.
In this embodiment, the serialized character string of the data source connection object is deserialized with the inverse of the programming language's serialization operation, which restores the string to a data source connection object in memory. When the program then runs, it can connect to the data source directly, without the connection information being rewritten into the algorithm, which removes a large amount of rewriting work and improves the efficiency of updating the data source.
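Step 4 can be illustrated with a small Python sketch that restores a connection object from its serialized string. The `DataSourceConnection` class and its fields are hypothetical, and JSON is an assumed format. Native object serializers such as Python's `pickle` can execute code while loading, so this sketch validates a plain data format instead; that is a design choice of the example, not something the source prescribes.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class DataSourceConnection:
    """Hypothetical in-memory connection object."""
    host: str
    port: int
    username: str
    path: str


def from_serialized_string(serialized: str) -> DataSourceConnection:
    """Inverse of the serialization step: rebuild the object from its string form."""
    payload = json.loads(serialized)
    expected = {f.name for f in fields(DataSourceConnection)}
    if set(payload) != expected:
        # Reject strings that do not describe exactly the expected fields.
        raise ValueError(f"unexpected fields: {sorted(set(payload) ^ expected)}")
    return DataSourceConnection(**payload)


conn = from_serialized_string(
    '{"host": "10.0.0.1", "port": 8020, "username": "etl", "path": "/warehouse"}'
)
```

The returned object is what the algorithm would use to open the connection; no connection details appear in the algorithm itself.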
In this embodiment, when the big data algorithm replaces an HDFS file system data source, the Secret resource of the HDFS file system data source to be connected is obtained, according to the name of the new data source, with the kubectl command provided by the Kubernetes container orchestration software, and the serialized character string of the connection object of that data source is read from the data key of the Secret resource. The serialized character string is then deserialized with the Java serialization module to obtain the HDFS file system data source connection object, through which the new HDFS file system data source is connected.
The invention also provides a device for replacing the data source of a big data computing job based on Kubernetes, which comprises the following modules:
a serialized character string generation module: used for creating, for the access information of each data source to be connected, a serialized character string of a data source connection object, wherein the serialized character string is generated by serializing a data source connection object created in a programming language;
a configuration resource creation module: used for editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and creating the configuration resource in the Kubernetes container orchestration software from the edited configuration resource;
a serialized character string extraction module: used by the big data algorithm to obtain, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and to extract the serialized character string of the data source connection object from that configuration resource;
a connection module: used for deserializing the serialized character string of the data source connection object extracted by the serialized character string extraction module to obtain the connection object of the data source, which is used for the connection. In the connection module of this embodiment, the serialized character string is deserialized as follows: the serialized character string of the data source connection object is restored to a data source connection object in memory using the inverse of the programming language's serialization operation.
When the big data algorithm replaces its data source, the connection information of the new data source (host address, port, user name, password, certificate, path, and so on) does not need to be rewritten in the algorithm; only the name of the new data source to be connected needs to be written there. The serialized character string of the data source connection object is obtained through the Kubernetes container orchestration software and deserialized to obtain the connection object of the new data source, through which the new data source can be accessed directly. This removes the work of rewriting all the connection information, improves the efficiency with which the big data algorithm updates its data source, and thereby benefits the company.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A method for replacing the data source of a big data computing job based on Kubernetes, characterized by comprising the following steps:
step 1: for the access information of each data source to be connected, creating a serialized character string of a data source connection object, wherein the serialized character string is generated by serializing a data source connection object created in a programming language;
step 2: editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and creating the configuration resource in the Kubernetes container orchestration software from the edited configuration resource;
step 3: the big data algorithm obtains, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and extracts the serialized character string of the data source connection object from that configuration resource;
step 4: deserializing the serialized character string of the data source connection object to obtain the connection object of the data source, which is used for the connection;
wherein, in step 2, editing the serialized character string into a configuration resource of the Kubernetes container orchestration software means:
designating the name key of the configuration resource as the name of the data source corresponding to the serialized character string;
and designating the data key of the configuration resource as the serialized character string.
2. The method according to claim 1, wherein the data source access information in step 1 is the information required by the big data algorithm to access the data source.
3. The method according to claim 2, wherein in step 3, the configuration resource corresponding to the data source is obtained according to the name of the data source to be connected as follows:
the configuration resource corresponding to the data source to be connected is queried and found using a query command provided by the Kubernetes container orchestration software together with the name of the data source to be connected.
4. The method of claim 1, wherein the serialized character string of the data source connection object is deserialized in step 4 as follows: the serialized character string is restored to a data source connection object in memory using the inverse of the programming language's serialization operation.
5. The method of any of claims 1 to 4, wherein the programming language is a programming language that provides an object serialization function.
6. The method of any one of claims 1 to 4, wherein the configuration resources in the Kubernetes container orchestration software include: Secret resources and ConfigMap resources.
7. A device for replacing the data source of a big data computing job based on Kubernetes, characterized by comprising the following modules:
a serialized character string generation module: used for creating, for the access information of each data source to be connected, a serialized character string of a data source connection object, wherein the serialized character string is generated by serializing a data source connection object created in a programming language;
a configuration resource creation module: used for editing the serialized character string into a configuration resource of the Kubernetes container orchestration software, and creating the configuration resource in the Kubernetes container orchestration software from the edited configuration resource;
wherein editing the serialized character string into a configuration resource of the Kubernetes container orchestration software means:
designating the name key of the configuration resource as the name of the data source corresponding to the serialized character string;
designating the data key of the configuration resource as the serialized character string;
a serialized character string extraction module: used by the big data algorithm to obtain, according to the name of the data source to be connected, the configuration resource corresponding to that data source from the plurality of created configuration resources, and to extract the serialized character string of the data source connection object from that configuration resource;
a connection module: used for deserializing the serialized character string of the data source connection object extracted by the serialized character string extraction module to obtain the connection object of the data source, which is used for the connection.
8. The device of claim 7, wherein the connection module deserializes the serialized character string as follows: the serialized character string of the data source connection object is restored to a data source connection object in memory using the inverse of the programming language's serialization operation.
CN202210357263.2A 2022-04-07 2022-04-07 Method and device for replacing big data calculation operation data source based on Kubernetes Active CN114490834B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210357263.2A CN114490834B (en) 2022-04-07 2022-04-07 Method and device for replacing big data calculation operation data source based on Kubernetes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210357263.2A CN114490834B (en) 2022-04-07 2022-04-07 Method and device for replacing big data calculation operation data source based on Kubernetes

Publications (2)

Publication Number Publication Date
CN114490834A (en) 2022-05-13
CN114490834B (en) 2022-06-21

Family

ID=81487371

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210357263.2A Active CN114490834B (en) 2022-04-07 2022-04-07 Method and device for replacing big data calculation operation data source based on Kubernetes

Country Status (1)

Country Link
CN (1) CN114490834B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766538A (en) * 2017-10-28 2018-03-06 杭州安恒信息技术有限公司 Data filtering processing module and synchronous, asynchronous filter method based on java
US11314687B2 (en) * 2020-09-24 2022-04-26 Commvault Systems, Inc. Container data mover for migrating data between distributed data storage systems integrated with application orchestrators
CN112910991B (en) * 2021-01-29 2022-10-04 杭州涂鸦信息技术有限公司 Back-end application calling method and device, computer equipment and readable storage medium
CN113190528B (en) * 2021-04-21 2022-12-06 中国海洋大学 Parallel distributed big data architecture construction method and system
CN113741961B (en) * 2021-11-08 2022-02-01 梯度云科技(北京)有限公司 Method and device for submitting big data calculation operation based on Kubernetes container arrangement software

Also Published As

Publication number Publication date
CN114490834A (en) 2022-05-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant