WO2021013057A1

WO2021013057A1 - Data management method and apparatus, and device and computer-readable storage medium

Info

Publication number: WO2021013057A1
Application number: PCT/CN2020/102540
Authority: WO
Inventors: 王和平; 尹强; 刘有; 黄山; 杨峙岳; 邸帅; 卢道和
Original assignee: 深圳前海微众银行股份有限公司
Priority date: 2019-07-19
Filing date: 2020-07-17
Publication date: 2021-01-28
Also published as: CN110362630A; CN110362630B

Abstract

A data management method, a data management apparatus, a device and a computer-readable storage medium. The method comprises: when a data importing request is detected, reading data content of first data corresponding to the data importing request, and determining, on the basis of the data content, a first data type corresponding to the first data (S10); determining, on the basis of the first data type, a first data source corresponding to the first data (S20); determining a first conversion format and a first column of information of the first data, and generating, on the basis of the first conversion format and the first column of information, a first data set from the first data (S30); and importing the first data set into the first data source (S40).

Description

Data management method, device, equipment and computer readable storage medium

This application claims priority to Chinese patent application No. 201910655646.6, filed on July 19, 2019 and entitled "Data management method, device, equipment and computer-readable storage medium", the entire contents of which are incorporated herein by reference.

Technical field

This application relates to the technical field of financial technology (Fintech), in particular to data management methods, devices, equipment and computer-readable storage media.

Background technique

In recent years, with the continuous development of Fintech, especially Internet finance, big data technology has been introduced into the daily business of financial institutions such as banks. In the daily service process of such institutions, personnel in data analysis or data warehouse positions need to export data from a database for analysis; business personnel need to export data to a file and send that file to a customer in response to the customer's needs; or business personnel who have obtained data need to import it into a database for storage. Importing and exporting data is therefore a data management task that banks and other financial institutions cannot avoid.

However, existing data management methods are not connected to every database, and each database has its own logic language. Such methods generally target a single database and can only export data from that database to a local file or import local data into that database. The import and export options are therefore limited, and the data cannot be processed during import or export. As a result, import and export are rigid and data cannot be managed intelligently.

Technical solutions

The main purpose of this application is to propose a data management method, device, equipment and computer-readable storage medium, aiming to realize intelligent management of data.

In order to achieve the above objective, the present application provides a data management method. The data management method includes the following steps:

When a data import request is detected, read the data content of the first data corresponding to the data import request, and determine the first data type corresponding to the first data based on the data content;

Determine the first data source corresponding to the first data based on the first data type;

Determining a first conversion format and first column information of the first data, and generating a first data set from the first data based on the first conversion format and the first column information;

Import the first data set into the first data source.

In an embodiment, the step of, when a data import request is detected, reading the data content of the first data corresponding to the data import request and determining the first data type corresponding to the first data based on the data content includes:

When a data import request is detected, read the data content of the preset number of rows of first data corresponding to the data import request, and determine the second data type to which the column information of each column in the data content belongs;

Count the number of occurrences of each data type among the second data types, and determine the data type that occurs most often as the first data type corresponding to the first data.

In an embodiment, the step of determining a first data source corresponding to the first data based on the first data type includes:

Based on the first data type, determine the first data source corresponding to the first data, and return the first data source to the client corresponding to the data import request;

If a second data source sent by the user terminal based on the first data source is received, the second data source is used as the first data source corresponding to the first data.

In an embodiment, the step of importing the first data set into the first data source includes:

Determining the writing type of the first data set;

Import the first data set into the first data source according to the write type.

In an embodiment, the data management method further includes:

When a data export request is detected, obtain configuration information of the data export request, where the configuration information includes a third data source, query statement, file format, and output path;

Obtaining second data corresponding to the data export request based on the third data source and the query sentence;

Based on the file format, generating a second data set from the second data, and determining a file writing object corresponding to the second data set;

The second data set is written into the file write-out object, and the file write-out object is exported to a terminal corresponding to the output path.

In an embodiment, the file format includes second column information, and a second conversion format and a file format type corresponding to the second column information, and the step of generating a second data set from the second data based on the file format and determining the file writing object corresponding to the second data set includes:

Generating a second data set from the second data based on the second conversion format and the second column information;

Based on the file format type, determine the file writing object corresponding to the second data set.

In an embodiment, the step of writing the second data set into the file write-out object includes:

Traverse the partitions of the second data set, and write the second data set into the file write-out object according to a write mode of one partition at a time.

In addition, in order to achieve the above objective, this application also provides a data management device, the data management device including:

A reading module, configured to read the data content of the first data corresponding to the data import request when a data import request is detected, and determine the first data type corresponding to the first data based on the data content;

A determining module, configured to determine a first data source corresponding to the first data based on the first data type;

A generating module, configured to determine a first conversion format and first column information of the first data, and generate a first data set from the first data based on the first conversion format and the first column information;

The import module is used to import the first data set into the first data source.

In an embodiment, the reading module is further used for:

In an embodiment, the determining module is further used for:

In an embodiment, the import module is also used to:

Determining the writing type of the first data set;

In an embodiment, the data management device further includes:

The obtaining module is configured to obtain configuration information of the data export request when the data export request is detected, the configuration information including the third data source, query statement, file format and output path;

The obtaining module is further configured to obtain second data corresponding to the data export request based on the third data source and the query sentence;

The generating module is further configured to generate a second data set from the second data based on the file format, and determine a file writing object corresponding to the second data set;

The export module is configured to write the second data set into the file write-out object, and export the file write-out object to the terminal corresponding to the output path.

In an embodiment, the file format includes a second column of information, a second conversion format and file format type corresponding to the second column of information, and the generating module is further configured to:

In an embodiment, the export module is further used to:

In addition, in order to achieve the above objective, this application also provides a data management device. The data management device includes a memory, a processor, and a data management program that is stored in the memory and executable on the processor; when the data management program is executed by the processor, the steps of the data management method described above are implemented.

In addition, in order to achieve the above objective, the present application also provides a computer-readable storage medium on which a data management program is stored; when the data management program is executed by a processor, the steps of the data management method described above are implemented.

In the data management method proposed in this application, when a data import request is detected, the data content of the first data corresponding to the data import request is read, and the first data type corresponding to the first data is determined based on the data content; the first data source corresponding to the first data is determined based on the first data type; the first conversion format and first column information of the first data are determined, and a first data set is generated from the first data based on the first conversion format and the first column information; and the first data set is imported into the first data source. When a data import request is detected, this application processes the data corresponding to the request, determines the corresponding data source, and imports the processed data into that data source, thereby realizing intelligent data management.

Description of the drawings

FIG. 1 is a schematic diagram of a device structure of a hardware operating environment involved in a solution of an embodiment of the present application;

FIG. 2 is a schematic flowchart of a first embodiment of the data management method of this application;

FIG. 3 is a schematic flowchart of a second embodiment of the data management method of this application.

The realization, functional characteristics, and advantages of the purpose of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Embodiments of the invention

It should be understood that the specific embodiments described here are only used to explain the application, and are not used to limit the application.

As shown in FIG. 1, FIG. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solution of the embodiment of the present application.

The device in the embodiment of this application may be a PC or a server device.

As shown in FIG. 1, the device may include a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The memory 1005 may be a high-speed RAM memory, or a stable memory (non-volatile memory), such as a magnetic disk memory. Optionally, the memory 1005 may also be a storage device independent of the foregoing processor 1001.

Those skilled in the art can understand that the structure of the device shown in FIG. 1 does not constitute a limitation on the device, and may include more or fewer components than those shown in the figure, or a combination of certain components, or different component arrangements.

As shown in Fig. 1, a memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a data management program.

The operating system is a program that manages and controls the hardware and software resources of the data management device, and supports the operation of the network communication module, the user interface module, the data management program, and other programs or software; the network communication module is used to manage and control the network interface 1004; and the user interface module is used to manage and control the user interface 1003.

In the data management device shown in FIG. 1, the data management device calls the data management program stored in the memory 1005 through the processor 1001, and executes the operations in the following embodiments of the data management method.

Based on the above hardware structure, an embodiment of the data management method of this application is proposed.

Referring to Fig. 2, Fig. 2 is a schematic flowchart of a first embodiment of a data management method according to this application, and the method includes:

Step S10, when a data import request is detected, read the data content of the first data corresponding to the data import request, and determine the first data type corresponding to the first data based on the data content;

Step S20: Determine a first data source corresponding to the first data based on the first data type;

Step S30: Determine a first conversion format and first column information of the first data, and generate a first data set from the first data based on the first conversion format and the first column information;

Step S40: Import the first data set into the first data source.

The data management method of this embodiment is applied to the data management equipment of financial institutions such as banks. For ease of description, the data management equipment is hereinafter referred to as the management device, and the management device may be a terminal or a PC. In this embodiment, Spark (a fast, general-purpose computing engine designed for large-scale data processing) is built into the management device. With Spark support, the management device can import and export multiple types of files, such as Excel, CSV and JSON, and can import into and export from various types of data storage components, such as Hive, Mysql, Oracle, HDFS, Hbase and Mongodb.

Specifically, data storage component classes are added through the DataSource API provided by Spark, with the specific program segments edited according to actual needs, so that the DataSource API supports connecting to multiple data sources. The implementation of this embodiment relies on Spark's distributed computing capabilities and on the DataSource API (data source call interface), which supports connecting to multiple data sources. It should be noted that Spark's native DataSource is a framework that connects external data sources to the Spark engine. Its main purpose is to give the Spark framework the ability to read external data quickly, and different data formats can easily be registered as Spark tables through the DataSource API. The native DataSource already supports file formats such as JSON, ORC and Parquet, but the set of supported formats is limited and does not meet actual needs.

On this basis, the embodiment of this application adds support for Excel (for example, xls of version 03 and xlsx of version 07 and later), CSV, TXT and other file formats through the DataSource API provided by Spark, with the specific program segments edited according to the file format. That is, by adding supported file formats to the management device, the management device can import and export multiple types of files.

When the management device of this embodiment detects a data import request, it processes the data corresponding to the request, determines the corresponding data source, and imports the processed data into that data source, thereby realizing intelligent data management.

Each step will be described in detail below:

In this embodiment, when the relevant business personnel of a financial institution such as a bank (that is, the users) obtain data through various channels and need to import it into the financial institution's data source, they only need to transfer the data to the management device, and the management device completes the import.

Specifically, when the management device detects the data import request, it reads the data content of the first data corresponding to the data import request and identifies the first data type of the first data from that content. In other words, before the first data is imported into a data source, its first data type is determined so that it can subsequently be imported into the correct data source.

It is understandable that this embodiment involves multiple data sources, that is, the management device supports import into and export from multiple data storage components, such as Hive, Mysql, Oracle, HDFS, Hbase and Mongodb. To import the first data accurately into the data source the user intends, the first data type of the first data must be determined first.

Further, step S10 includes:

Step a: When a data import request is detected, read the data content of the preset number of rows of first data corresponding to the data import request, and determine the second data type to which the column information of each column in the data content belongs;

In this step, when the management device detects the data import request, the file reading object (Reader) in the management device reads the first data corresponding to the data import request. To determine the first data type quickly, the number of rows to read can be preset, so that the Reader only reads the data content of the preset number of rows and the first data type is judged from that content. The preset number of rows may refer to the first preset number of rows of the first data. The second data type to which the column information of each column in the read data content belongs is then determined: if the column information of the current column is a number, the data type of the current column is determined to be a number type; if the column information of the current column is a character, the data type of the current column is determined to be a character type; and so on.

Step b: Count the number of occurrences of each data type in the second data type, and determine the data type with the largest number of times as the first data type corresponding to the first data.

In this step, based on the second data type to which the column information of each column in the read data content belongs, the number of occurrences of each data type is counted, and the data type that occurs most often is determined as the first data type corresponding to the first data. If the number type occurs most often, the first data is determined to be of the number type; if the character type occurs most often, the first data is determined to be of the character type; and so on.

In a specific implementation, the preset number of rows is preferably 10. That is, the Reader reads the data content of the first 10 rows of the first data, and the data type inferrer of the management device determines the data type of each column in those rows by judging which type appears most often, for example user: String, orderId: Int. To improve the accuracy of the judgment, the result, that is, the first data type, can be returned to the user terminal corresponding to the data import request for the user to view and confirm. If the management device then receives a modification instruction sent by the user terminal based on the first data type, it changes the data type of the first data as the user requests.
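The sampling and voting described in steps a and b can be illustrated with a minimal Python sketch. It is not the implementation of this application: the function names, the type labels "Int", "Double" and "String", and the sample rows are assumptions made only for illustration, and plain Python lists stand in for the Reader.

```python
from collections import Counter
from itertools import islice

PRESET_ROWS = 10  # the preferred preset number of rows named in the text


def classify(value: str) -> str:
    """Classify one cell; the type labels are illustrative assumptions."""
    try:
        int(value)
        return "Int"
    except ValueError:
        pass
    try:
        float(value)
        return "Double"
    except ValueError:
        return "String"


def infer_column_types(rows, header):
    """Step a: vote per column over the first PRESET_ROWS rows."""
    sample = list(islice(rows, PRESET_ROWS))
    if not sample:
        raise ValueError("no rows available for type inference")
    column_types = {}
    for index, name in enumerate(header):
        votes = Counter(classify(row[index]) for row in sample)
        column_types[name] = votes.most_common(1)[0][0]
    return column_types


def infer_data_type(column_types):
    """Step b: the most frequent column type becomes the first data type."""
    return Counter(column_types.values()).most_common(1)[0][0]


header = ["user", "orderId", "city"]
rows = [
    ["alice", "1001", "Shenzhen"],
    ["bob", "1002", "Beijing"],
    ["carol", "1003", "n/a"],
]
column_types = infer_column_types(rows, header)
first_data_type = infer_data_type(column_types)
```

Because only a sample is inspected, a column whose later rows differ from the sampled rows can be misclassified, which is why the text returns the result to the user for confirmation.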

Step S20: Determine a first data source corresponding to the first data based on the first data type.

In this embodiment, the management device determines the first data source corresponding to the first data based on the determined first data type, that is, determines the first data source into which the first data is to be imported. Specifically, the data type and the data source are mapped in advance to obtain the data type-data source mapping table. When the first data type of the first data is determined, the data type-data source mapping table can be used to determine the first data corresponding The first data source.

Further, step S20 includes:

Step c: Determine the first data source corresponding to the first data based on the first data type, and return the first data source to the client corresponding to the data import request;

In this step, the management device determines the first data source corresponding to the first data type based on the first data type, and returns the first data source to the client corresponding to the data import request for confirmation by the user of the client.

Step d, if a second data source sent by the user terminal based on the first data source is received, the second data source is used as the first data source corresponding to the first data.

In this step, the user can confirm through the user terminal whether the inference of the management device is correct. If the inference is incorrect, the user terminal can send a corresponding modification instruction so that the management device can correct it. Specifically, if the management device receives a second data source sent by the user terminal based on the first data source, it uses the second data source as the first data source corresponding to the first data; if it does not receive a second data source sent by the user terminal based on the first data source, or if it receives a confirmation instruction based on the first data source, the first data source is determined to be the data source corresponding to the first data.

It is understandable that, since the data type of the first data is inferred from the data content of only the preset number of rows of the first data, the inference is not always accurate. To improve the accuracy of judging the data type of the first data, after the management device determines the first data type of the first data, it returns the first data type to the user for confirmation. When the management device modifies the data type of the first data, it saves the modified data type, so that the next time it encounters data content identical to the first data it can obtain the data type accurately.
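Steps c and d amount to a table lookup followed by an optional user override. The following Python sketch shows one way this could look; the entries of the mapping table and the function names are hypothetical, since the text does not state which data type maps to which data source.

```python
from typing import Optional

# Hypothetical data type -> data source mapping table; the text only says
# that such a table is built in advance, not what it contains.
TYPE_TO_SOURCE = {"String": "Hive", "Int": "Mysql", "Double": "Oracle"}


def suggest_data_source(first_data_type: str) -> str:
    """Step c: look up the first data source for the inferred data type."""
    try:
        return TYPE_TO_SOURCE[first_data_type]
    except KeyError:
        raise ValueError(f"no data source mapped for type {first_data_type!r}") from None


def resolve_data_source(suggested: str, user_reply: Optional[str]) -> str:
    """Step d: a second data source sent by the user replaces the suggestion.

    `user_reply` is None when the user confirms or sends nothing.
    """
    return user_reply if user_reply else suggested


suggested = suggest_data_source("Int")
confirmed = resolve_data_source(suggested, None)
overridden = resolve_data_source(suggested, "Hbase")
```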

Step S30: Determine a first conversion format and first column information of the first data, and generate a first data set from the first data based on the first conversion format and the first column information.

In this embodiment, the management device determines the first conversion format and the first column information of the first data, so as to convert the first column information according to the first conversion format. The conversion processing includes data desensitization and data type conversion. Data desensitization means transforming certain sensitive information according to desensitization rules so that sensitive private data is reliably protected. Where customer security data or commercially sensitive data is involved, the real data is modified, without violating system rules, before it is provided for testing; personal information such as ID card numbers, mobile phone numbers, card numbers and customer numbers all needs to be desensitized. Data type conversion is, for example, converting a Word file to a PDF file.

The first conversion format and the first column information of the first data can be user-defined. That is, when the user initiates a data import request, the user defines the first conversion format and the first column information of the first data, for example decrypting the user information in the first data.

Specifically, the first column information of the first data is processed according to the first conversion format, so as to generate the first data set from the first data. It should be noted that the first data may be data in multiple files. If the user wants to import the data in import file A, import file B and import file C, then the first data in this embodiment is the data in import files A, B and C. When the first data is converted, import files A, B and C are each processed and the results are finally merged into the first data set. The first data set is specifically a DataFrame (a tabular data structure containing a group of ordered columns, each of which can have a different value type; a distributed data set organized in named columns).
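As an illustration of a desensitization rule, the Python sketch below masks the middle of a value while keeping a few leading and trailing characters. The rule itself (keep three leading and four trailing characters) and the function name are assumptions; the text does not specify which desensitization rules are used.

```python
def mask_middle(value: str, keep_head: int = 3, keep_tail: int = 4, fill: str = "*") -> str:
    """Replace the middle of `value` with `fill`, keeping both ends.

    Values too short to keep both ends are masked completely, so that a
    short value never leaks in full.
    """
    if len(value) <= keep_head + keep_tail:
        return fill * len(value)
    hidden = len(value) - keep_head - keep_tail
    tail = value[len(value) - keep_tail:]
    return value[:keep_head] + fill * hidden + tail


masked_phone = mask_middle("13812345678")
```

With these defaults an 11-digit mobile phone number keeps its first three and last four digits; other fields such as card numbers would likely need their own rules.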
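The conversion and merge described above can be sketched in Python as follows. A list of dictionaries stands in for the DataFrame, in-memory CSV text stands in for import files A and B, and the `conversions` mapping from column name to conversion function is an assumed representation of the first column information and first conversion format.

```python
import csv
import io


def build_data_set(files, conversions):
    """Convert the named columns of every file and merge all rows.

    `files` is an iterable of open text files in CSV format with a header
    row; `conversions` maps a column name to a function applied to each
    value of that column. Both shapes are assumptions for illustration.
    """
    data_set = []
    for handle in files:
        for record in csv.DictReader(handle):
            for column, convert in conversions.items():
                if column in record:
                    record[column] = convert(record[column])
            data_set.append(record)
    return data_set


file_a = io.StringIO("user,orderId\nalice,1001\n")
file_b = io.StringIO("user,orderId\nbob,1002\n")
conversions = {"orderId": int, "user": str.upper}
first_data_set = build_data_set([file_a, file_b], conversions)
```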

Step S40: Import the first data set into the first data source.

In this embodiment, based on the determined first data source, the corresponding calling interface (DataSource API) is called, and the first data set (DataFrame) is imported into the first data source through the calling interface, for example into the user order table in the Mysql library. The calling interface is an interface reserved by the data management device based on Spark technology, through which data transmission to and from distributed data sources can be realized.

It should be noted that each data source corresponds to a dedicated calling interface. That is, after the first data source corresponding to the first data set is determined, the calling interface corresponding to the first data source needs to be determined, and the first data set is imported into the first data source through that calling interface.

Of course, to import quickly and reduce the time spent calling interfaces, the calling interfaces of different data sources can be integrated into one general calling interface, with the specific program segments edited according to actual needs. Data transmission with different data sources can then be realized through the general calling interface; that is, whichever data source the data is imported into, it is imported through the common calling interface.
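One way to picture a general calling interface is a registry that dispatches to a per-source writer. The Python sketch below is only an analogy: it does not use Spark's DataSource API, the in-memory `STORE` dictionary stands in for real data sources, and all names are hypothetical.

```python
from typing import Callable, Dict, List

Writer = Callable[[List[dict]], int]

_WRITERS: Dict[str, Writer] = {}
STORE: Dict[str, List[dict]] = {"Mysql": [], "Hive": []}  # in-memory stand-ins


def register(source_name: str):
    """Register the dedicated writer of one data source."""
    def decorator(writer: Writer) -> Writer:
        _WRITERS[source_name] = writer
        return writer
    return decorator


@register("Mysql")
def write_mysql(rows: List[dict]) -> int:
    STORE["Mysql"].extend(rows)
    return len(rows)


@register("Hive")
def write_hive(rows: List[dict]) -> int:
    STORE["Hive"].extend(rows)
    return len(rows)


def import_data_set(source_name: str, rows: List[dict]) -> int:
    """General calling interface: one entry point for every data source."""
    try:
        writer = _WRITERS[source_name]
    except KeyError:
        raise ValueError(f"unsupported data source: {source_name}") from None
    return writer(rows)


written = import_data_set("Mysql", [{"orderId": 1001}])
```

Callers only name the data source; adding a new source means registering one more writer rather than changing the entry point.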

Further, step S40 includes:

Step e, determining the writing type of the first data set;

In this step, the user can also customize the write type of the first data set. The write types include creating new data, overwriting existing data and appending data. For example, the user selects the user order table and chooses to append data. The management device thereby determines the write type of the first data set for the subsequent write.

Step f: Import the first data set into the first data source according to the write type.

In this step, the management device calls the corresponding calling interface based on the determined first data source, and imports the first data set into the first data source through the calling interface according to the determined writing type.
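The three write types can be illustrated with the following Python sketch, in which a dictionary of lists stands in for the tables of the first data source. The enum and function names are hypothetical, and the behaviour chosen for the "new" type (refusing to write if the table already exists) is an assumption, since the text does not define it.

```python
from enum import Enum
from typing import Dict, List


class WriteType(Enum):
    NEW = "new"
    OVERWRITE = "overwrite"
    APPEND = "append"


def write_table(tables: Dict[str, List[dict]], name: str,
                rows: List[dict], write_type: WriteType) -> int:
    """Write `rows` into `tables[name]` and return the resulting row count."""
    if write_type is WriteType.NEW:
        # Assumption: creating new data must not clobber an existing table.
        if name in tables:
            raise ValueError(f"table {name!r} already exists")
        tables[name] = list(rows)
    elif write_type is WriteType.OVERWRITE:
        tables[name] = list(rows)
    elif write_type is WriteType.APPEND:
        tables.setdefault(name, []).extend(rows)
    else:
        raise ValueError(f"unknown write type: {write_type!r}")
    return len(tables[name])


tables = {"user_orders": [{"orderId": 1001}]}
total = write_table(tables, "user_orders", [{"orderId": 1002}], WriteType.APPEND)
```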

In this embodiment, when a data import request is detected, the data content of the first data corresponding to the data import request is read, and the first data type corresponding to the first data is determined based on the data content; the first data source corresponding to the first data is determined based on the first data type; the first conversion format and first column information of the first data are determined, and a first data set is generated from the first data based on the first conversion format and the first column information; and the first data set is imported into the first data source. When a data import request is detected, this application processes the data corresponding to the request, determines the corresponding data source, and imports the processed data into that data source, thereby realizing intelligent data management.

Further, based on the first embodiment of the data management method of the present application, a second embodiment of the data management method of the present application is proposed.

The difference between the second embodiment of the data management method and the first embodiment of the data management method is that, referring to FIG. 3, the data management method further includes:

Step S50: When a data export request is detected, obtain configuration information of the data export request, where the configuration information includes a third data source, query sentences, file format, and output path;

Step S60: Acquire second data corresponding to the data export request based on the third data source and the query sentence;

Step S70: Generate a second data set from the second data based on the file format, and determine a file writing object corresponding to the second data set;

Step S80: Write the second data set into the file write-out object, and export the file write-out object to a terminal corresponding to the output path.

In this embodiment, when a data export request is detected, the corresponding second data is determined, the second data is processed into a second data set, and the second data set is written into the corresponding file write-out object to export, Realize the intelligent management of data.

Each step will be described in detail below:

Step S50: When a data export request is detected, configuration information of the data export request is obtained, where the configuration information includes a third data source, a query sentence, a file format, and an output path.

In this embodiment, when the management device detects the data export request, it obtains the configuration information of the data export request. The configuration information is configured by the user and includes the third data source, the query statement, the file format and the output path. That is, when exporting data, the user can select the corresponding data source and the data table to be exported, such as the user order table in the Mysql library, define the query statement for the data to be exported from that table, and define the data conversion for the column information of specified columns, for example: export the orders of the last six months and desensitize the user information. The user then selects the file format and output path of the export, for example: export the user order table to Excel at the path /home/username/orders.xlsx. The management device can determine the corresponding parameters from the user's configuration information.
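The configuration information could be modelled as a small structured object. The Python sketch below is one possible shape, not the format used by this application: the class and field names, the conversion label "desensitize", and the query text are all hypothetical, while the data source name and output path repeat the example in the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FileFormat:
    columns: List[str]           # second column information
    conversions: Dict[str, str]  # column name -> conversion label
    format_type: str             # e.g. "excel", "csv", "json"


@dataclass(frozen=True)
class ExportConfig:
    data_source: str
    query: str
    file_format: FileFormat
    output_path: str

    def validate(self) -> None:
        """Reject configurations that cannot be executed."""
        unknown = set(self.file_format.conversions) - set(self.file_format.columns)
        if unknown:
            raise ValueError(f"conversion defined for unknown columns: {sorted(unknown)}")
        if not self.query.strip():
            raise ValueError("query statement must not be empty")


config = ExportConfig(
    data_source="Mysql",
    query="SELECT user_name, order_id FROM user_orders",  # hypothetical query
    file_format=FileFormat(
        columns=["user_name", "order_id"],
        conversions={"user_name": "desensitize"},
        format_type="excel",
    ),
    output_path="/home/username/orders.xlsx",
)
config.validate()
```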

Step S60: Acquire second data corresponding to the data export request based on the third data source and the query statement.

In this embodiment, the management device obtains the second data corresponding to the data export request based on the third data source and the query statement. Specifically, it obtains the corresponding data table from the third data source and uses the query statement to extract from that table the second data corresponding to the data export request. The second data may be the data of a single file or of multiple files.
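As an illustration of step S60, the sketch below runs a query statement against a data source and returns the column names together with the matching rows. It assumes an in-memory SQLite database as a stand-in for the third data source (the application's example is a MySQL library); `fetch_second_data` and the `user_orders` table are hypothetical names.

```python
import sqlite3
from typing import List, Sequence, Tuple


def fetch_second_data(
    connection: sqlite3.Connection, query: str, params: Sequence = ()
) -> Tuple[List[str], List[tuple]]:
    """Run the query statement and return (column names, rows)."""
    cursor = connection.execute(query, params)
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


# Stand-in data source with an illustrative user order table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_orders (order_id INTEGER, user_name TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO user_orders VALUES (?, ?, ?)",
    [(1, "alice", 10.0), (2, "bob", 25.5), (3, "carol", 7.25)],
)

# The query statement selects only the rows and columns to be exported.
columns, rows = fetch_second_data(
    conn,
    "SELECT order_id, user_name FROM user_orders WHERE amount > ? ORDER BY order_id",
    (9.0,),
)
```

Passing filter values as parameters rather than formatting them into the query string is a choice of this sketch and is not stated in the application.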

Step S70: Based on the file format, generate a second data set from the second data, and determine a file write-out object corresponding to the second data set.

In this embodiment, the management device processes the determined second data according to the file format configured by the user (for example, by desensitization) to generate a second data set, which is likewise a DataFrame, and determines the file write-out object corresponding to the second data set.

Specifically, the file format includes second column information, and a second conversion format and a file format type corresponding to the second column information, and step S70 includes:

Step g: Generate a second data set from the second data based on the second conversion format and the second column information;

In this step, the management device generates the second data set from the second data according to the second column information and the second conversion format corresponding to it. Specifically, the second column information is extracted from the second data and converted according to the second conversion format (for example, decrypted), and the second data set is generated from the result.
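Step g can be sketched as selecting the configured columns from each record and applying a per-column conversion where one is defined. This is an illustrative sketch on plain Python dictionaries rather than a Spark DataFrame; `build_data_set` and `mask_name` are hypothetical helpers, and the masking rule is only one possible form of desensitization.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def mask_name(value: object) -> str:
    """Example conversion: keep the first character and mask the rest."""
    text = str(value)
    return text[:1] + "*" * (len(text) - 1)


def build_data_set(
    rows: List[Row],
    columns: List[str],
    conversions: Dict[str, Callable[[object], object]],
) -> List[Row]:
    """Keep only the configured columns and convert those that have a rule."""
    data_set = []
    for row in rows:
        converted = {}
        for column in columns:
            convert = conversions.get(column)
            value = row[column]
            converted[column] = convert(value) if convert else value
        data_set.append(converted)
    return data_set


second_data = [
    {"order_id": 1, "user_name": "alice", "amount": 10.0},
    {"order_id": 2, "user_name": "bob", "amount": 25.5},
]
second_data_set = build_data_set(
    second_data,
    columns=["order_id", "user_name"],
    conversions={"user_name": mask_name},
)
```

Columns without a conversion rule pass through unchanged, so the conversion table only needs entries for the columns the user chose to transform.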

Step h: Determine a file write-out object corresponding to the second data set based on the file format type.

In this step, the management device determines the file write-out object corresponding to the second data set based on the file format type. Specifically, a mapping table between file format types and file write-out objects can be established in advance, so that once the file format type selected by the user is determined, the corresponding file write-out object can be determined; for example, a Spark Excel file write-out object is supported. The write module (Writer) in the management device supports multiple file format types, such as Excel, CSV, and JSON.
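The pre-established mapping table of step h can be pictured as a dictionary from file format type to writer class. The sketch below is an assumption-laden illustration: `CsvWriter`, `JsonWriter`, `WRITERS`, and `resolve_writer` are hypothetical names, and an Excel writer is left out because the Python standard library has none.

```python
import csv
import io
import json
from typing import Dict, List, TextIO

Row = Dict[str, object]


class CsvWriter:
    """Writes a header line followed by one comma-separated line per record."""

    def write(self, rows: List[Row], columns: List[str], stream: TextIO) -> None:
        writer = csv.writer(stream, lineterminator="\n")
        writer.writerow(columns)
        for row in rows:
            writer.writerow([row[column] for column in columns])


class JsonWriter:
    """Writes one JSON object per line."""

    def write(self, rows: List[Row], columns: List[str], stream: TextIO) -> None:
        for row in rows:
            stream.write(json.dumps({c: row[c] for c in columns}) + "\n")


# Mapping table between file format type and file write-out object.
WRITERS = {"csv": CsvWriter, "json": JsonWriter}


def resolve_writer(file_format_type: str):
    """Look up the write-out object for the format type selected by the user."""
    try:
        return WRITERS[file_format_type.lower()]()
    except KeyError:
        raise ValueError(f"unsupported file format type: {file_format_type}") from None


buffer = io.StringIO()
resolve_writer("CSV").write([{"a": 1, "b": "x"}], ["a", "b"], buffer)
```

Supporting a further format then only requires registering another class in the mapping table, which is the property the mapping-table approach is meant to provide.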

In this embodiment, after determining the file write-out object of the second data set, the management device writes the second data set into the file write-out object and exports the file write-out object to the terminal corresponding to the output path configured by the user, for example the output path /home/username/orders.xlsx.

Further, the step of writing the second data set into the file write-out object includes:

Step i: Traverse the partitions of the second data set, and write the second data set into the file write-out object according to a write mode of one partition at a time.

In this step, the management device traverses the partitions of the second data set. It can be understood that the second data set, that is, the DataFrame, has multiple partitions, each of which stores data. The partitions are defined by the user in advance; for example, the DataFrame is divided into multiple areas based on a hash rule, so that data with a hash value of A is placed in area a and data with a hash value of B is placed in area b. The management device writes the second data set into the file write-out object one partition at a time.

This embodiment is intended to prevent memory overflow during writing. To that end, the Writer part is modified to call Spark's toLocalIterator to traverse the partitions of the DataFrame and collect data one partition at a time, which provides a general writing scheme that can write to both HDFS (Hadoop Distributed File System, an implementation of Hadoop's abstract file system as a distributed file system) and the local file system.
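The idea of writing one partition at a time can be illustrated without Spark. The sketch below simulates hash partitioning with plain Python lists and writes each partition before moving to the next, so only one partition's records need to be held by the writer at any moment. It is not the application's Writer; `hash_partition`, `iter_partitions`, and `write_by_partition` are hypothetical names, and the generator merely stands in for the role the text assigns to toLocalIterator.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List, TextIO

Row = Dict[str, object]


def hash_partition(rows: List[Row], key: str, num_partitions: int) -> List[List[Row]]:
    """Place each record in the partition selected by the hash of its key.

    Integer keys are used in the example because Python randomizes string hashes.
    """
    partitions: List[List[Row]] = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions


def iter_partitions(partitions: List[List[Row]]) -> Iterator[List[Row]]:
    """Yield one partition at a time (stand-in for a lazy, distributed fetch)."""
    for partition in partitions:
        yield partition


def write_by_partition(
    partitions: Iterable[List[Row]], columns: List[str], stream: TextIO
) -> int:
    """Write the header once, then each partition in turn; return the record count."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(columns)
    written = 0
    for partition in partitions:
        for row in partition:
            writer.writerow([row[column] for column in columns])
        written += len(partition)
    return written


orders = [{"order_id": n} for n in range(1, 6)]
parts = hash_partition(orders, key="order_id", num_partitions=2)
out = io.StringIO()
count = write_by_partition(iter_partitions(parts), ["order_id"], out)
```

Because the writer only ever receives a text stream, the same routine could target a local file or any other file-like object, which mirrors the general-purpose intent described above.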

In this embodiment, when a data export request is detected, configuration information of the data export request is acquired, where the configuration information includes a third data source, a query statement, a file format, and an output path; second data corresponding to the data export request is obtained based on the third data source and the query statement; a second data set is generated from the second data based on the file format, and a file write-out object corresponding to the second data set is determined; and the second data set is written into the file write-out object, and the file write-out object is exported to the terminal corresponding to the output path. Thus, when a data export request is detected, the corresponding second data is determined and processed into a second data set, and the second data set is written into the corresponding file write-out object for export, realizing intelligent management of data.

The application also provides a data management device. The data management device of this application includes:

Further, the reading module is also used for:

Further, the determining module is also used for:

Further, the import module is also used for:

Determining the writing type of the first data set;

Further, the data management device further includes:

Further, the file format includes a second column of information, a second conversion format and file format type corresponding to the second column of information, and the generating module is further configured to:

Further, the export module is also used for:

The application also provides a computer-readable storage medium.

The computer-readable storage medium of the present application stores a data management program, and when the data management program is executed by a processor, the steps of the data management method described above are realized.

For the method implemented when the data management program running on the processor is executed, please refer to the various embodiments of the data management method of the present application, which will not be repeated here.

It should be noted that, in this document, the terms "include", "comprise", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article, or system. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, article, or system that includes the element.

The serial numbers of the foregoing embodiments of the present application are for description only, and do not represent the superiority of the embodiments.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform. They can of course also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of this application.

The above are only preferred embodiments of this application and do not limit its scope. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of this application, or applied directly or indirectly in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims

A data management method, wherein the data management method includes the following steps:

When a data import request is detected, read the data content of the first data corresponding to the data import request, and determine the first data type corresponding to the first data based on the data content;

Determine the first data source corresponding to the first data based on the first data type;

Determining a first conversion format and first column information of the first data, and generating a first data set from the first data based on the first conversion format and the first column information;

Import the first data set into the first data source.
The data management method according to claim 1, wherein the step of, when a data import request is detected, reading the data content of the first data corresponding to the data import request and determining the first data type corresponding to the first data based on the data content comprises:

When a data import request is detected, read the data content of a preset number of rows of the first data corresponding to the data import request, and determine the second data type to which the column information of each column in the data content belongs;

Count the number of occurrences of each data type among the second data types, and determine the most frequently occurring data type as the first data type corresponding to the first data.
The data management method according to claim 1, wherein the step of determining the first data source corresponding to the first data based on the first data type comprises:

Based on the first data type, determine the first data source corresponding to the first data, and return the first data source to the client corresponding to the data import request;

If a second data source sent by the client based on the first data source is received, the second data source is used as the first data source corresponding to the first data.
The data management method according to claim 1, wherein the step of importing the first data set into the first data source comprises:

Determining the writing type of the first data set;

Import the first data set into the first data source according to the write type.
The data management method according to any one of claims 1 to 4, wherein the data management method further comprises:

When a data export request is detected, obtain configuration information of the data export request, where the configuration information includes a third data source, query statement, file format, and output path;

Obtaining second data corresponding to the data export request based on the third data source and the query sentence;

Based on the file format, generating a second data set from the second data, and determining a file writing object corresponding to the second data set;

The second data set is written into the file write-out object, and the file write-out object is exported to a terminal corresponding to the output path.
The data management method according to claim 5, wherein the file format includes second column information, and a second conversion format and a file format type corresponding to the second column information, and the step of generating a second data set from the second data based on the file format and determining a file write-out object corresponding to the second data set comprises:

Generating a second data set from the second data based on the second conversion format and the second column information;

Based on the file format type, determine the file write-out object corresponding to the second data set.
The data management method according to claim 5, wherein the step of writing the second data set into the file write-out object comprises:

Traverse the partitions of the second data set, and write the second data set into the file write-out object according to a write mode of one partition at a time.
A data management device, wherein the data management device includes:

A reading module, configured to read the data content of the first data corresponding to the data import request when a data import request is detected, and determine the first data type corresponding to the first data based on the data content;

A determining module, configured to determine a first data source corresponding to the first data based on the first data type;

A generating module, configured to determine a first conversion format and first column information of the first data, and generate a first data set from the first data based on the first conversion format and the first column information;

The import module is used to import the first data set into the first data source.
A data management device, wherein the data management device includes: a memory, a processor, and a data management program stored in the memory and capable of running on the processor, and when the data management program is executed by the processor, the steps of the data management method according to any one of claims 1 to 7 are realized.
A computer-readable storage medium, wherein a data management program is stored on the computer-readable storage medium, and when the data management program is executed by a processor, the steps of the data management method according to any one of claims 1 to 7 are realized.