CN117370289A

CN117370289A - Data acquisition method and device and electronic equipment

Info

Publication number: CN117370289A
Application number: CN202311338999.6A
Authority: CN
Inventors: 张旭
Original assignee: China Construction Bank Corp; CCB Finetech Co Ltd
Current assignee: China Construction Bank Corp; CCB Finetech Co Ltd
Priority date: 2023-10-16
Filing date: 2023-10-16
Publication date: 2024-01-09

Abstract

The application provides a data acquisition method, a data acquisition device and electronic equipment, relating to the field of data processing. The data acquisition method comprises the following steps: receiving an acquisition instruction sent by a server, wherein the acquisition instruction carries acquisition parameters; according to the acquisition parameters, asynchronously acquiring multiple batches of target data at different times from an intermediate database, wherein the intermediate database stores data generated by a service system executing related services; generating a compressed file and a first file digest of the compressed file based on the multiple batches of target data; and sending the compressed file and the first file digest to the server, wherein the server decompresses the compressed file according to the first file digest to obtain the target data. Because the multiple batches of target data are aggregated and packaged before being sent to the server, the transmission failures caused by network fluctuation and the low transmission efficiency associated with real-time data transmission can be avoided.

Description

Data acquisition method and device and electronic equipment

Technical Field

The present disclosure relates to the field of data processing, and in particular, to a data acquisition method, apparatus, and electronic device.

Background

Currently, a client collects data by running paged queries against a database and then transmits the data to a server in real time through an HTTP (Hypertext Transfer Protocol) interface. This data acquisition mode has the following defects: 1) the paged query is executed in a loop many times and frequently occupies database connections; if the connections are not released in time, the database connection pool easily fills up and the database becomes unresponsive; 2) the data is transmitted in real time through HTTP, so transmission failures caused by network fluctuation are more likely to occur; 3) the data acquisition efficiency is low; 4) the data is collected in real time, which puts high pressure on the server.

In view of the foregoing, there is a need for a data acquisition method that can solve the above-mentioned problems.

Disclosure of Invention

The embodiment of the application provides a data acquisition method, a data acquisition device and electronic equipment, so as to solve the problems of the existing data acquisition method.

The first aspect of the present application provides a data acquisition method, applied to a client, the data acquisition method including: receiving an acquisition instruction sent by a server, wherein the acquisition instruction carries acquisition parameters; according to the acquisition parameters, asynchronously acquiring multiple batches of target data at different times from an intermediate database, wherein the intermediate database stores data generated by a service system executing related services; generating a compressed file and a first file digest of the compressed file based on the multiple batches of target data; and sending the compressed file and the first file digest to the server, wherein the server decompresses the compressed file according to the first file digest to obtain the target data.

In one embodiment of the present application, generating a compressed file and a first file digest of the compressed file based on a plurality of batches of target data includes: encrypting a plurality of batches of target data by adopting a preset encryption algorithm to obtain encrypted data; writing the encrypted data into a file to obtain a target file; compressing the target file to obtain a compressed file; based on the compressed file, a first file digest is determined.

In one embodiment of the present application, the acquisition parameters include: acquisition scope and/or acquisition data type.

The second aspect of the present application provides a data acquisition method, applied to a server, the data acquisition method including:

sending an acquisition instruction to the client, wherein the acquisition instruction is used for instructing the client to acquire target data according to the data acquisition method of any one of the first aspect.

In an alternative embodiment, after sending the acquisition instruction to the client, the method further includes: receiving a compressed file and a first file digest sent by the client; determining a second file digest of the compressed file; and decompressing the compressed file to obtain the target data when the first file digest and the second file digest are consistent.

In an alternative embodiment, decompressing the compressed file to obtain the target data includes: decompressing the compressed file to obtain a decompressed file; decrypting the decompressed file by adopting a preset decryption algorithm to obtain decrypted data; and checking the decrypted data, and determining the decrypted data as target data under the condition that the decrypted data passes the check.

In an alternative embodiment, after decompressing the compressed file to obtain the target data, the method further includes: storing the target data to a preset big data cluster.

In an alternative embodiment, the method further includes: storing the check result to the preset big data cluster when the decrypted data fails the check.

A third aspect of the present application provides a data acquisition device, applied to a client, the data acquisition device comprising:

the receiving module is used for receiving an acquisition instruction sent by the server, wherein the acquisition instruction carries acquisition parameters;

the acquisition module is used for asynchronously acquiring multiple batches of target data at different times from the intermediate database according to the acquisition parameters, wherein the intermediate database stores data generated by the execution of related services by the service system;

the generation module is used for generating a compressed file and a first file digest of the compressed file based on the multiple batches of target data;

the sending module is used for sending the compressed file and the first file digest to the server, wherein the server decompresses the compressed file according to the first file digest to obtain the target data.

In one embodiment of the present application, the generating module is specifically configured to: encrypt the multiple batches of target data by adopting a preset encryption algorithm to obtain encrypted data; write the encrypted data into a file to obtain a target file; compress the target file to obtain a compressed file; and determine a first file digest based on the compressed file.

A fourth aspect of the present application provides a data acquisition device, applied to a server, the data acquisition device comprising:

the sending module is used for sending an acquisition instruction to the client, and the acquisition instruction is used for instructing the client to acquire target data according to the data acquisition method of any one of the first aspects.

In an alternative embodiment, the device further includes a processing module, which is used for, after the acquisition instruction is sent to the client: receiving the compressed file and the first file digest sent by the client; determining a second file digest of the compressed file; and decompressing the compressed file to obtain the target data when the first file digest and the second file digest are consistent.

In an alternative embodiment, the processing module is specifically configured to, when decompressing the compressed file to obtain the target data: decompressing the compressed file to obtain a decompressed file; decrypting the decompressed file by adopting a preset decryption algorithm to obtain decrypted data; and checking the decrypted data, and determining the decrypted data as target data under the condition that the decrypted data passes the check.

In an alternative embodiment, the device further includes a storage module, configured to store the target data to a preset big data cluster after the compressed file is decompressed to obtain the target data.

In an alternative embodiment, the storage module is further configured to store the check result to the preset big data cluster when the decrypted data fails the check.

A fifth aspect of the present application provides an electronic device comprising a memory and a processor; wherein,

the memory is used for storing program codes;

the processor is used for calling program codes to realize the data acquisition method of any one of the above.

A sixth aspect of the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, causes an electronic device to perform the data acquisition method of any one of the above.

A seventh aspect of the present application provides a computer program product having a computer program stored thereon, which, when executed by a processor, causes an electronic device to perform the data acquisition method of any of the above.

According to the above technical scheme, the embodiment of the application comprises the following steps: receiving an acquisition instruction sent by a server, wherein the acquisition instruction carries acquisition parameters; according to the acquisition parameters, asynchronously acquiring multiple batches of target data at different times from an intermediate database, wherein the intermediate database stores data generated by a service system executing related services; generating a compressed file and a first file digest of the compressed file based on the multiple batches of target data; and sending the compressed file and the first file digest to the server, wherein the server decompresses the compressed file according to the first file digest to obtain the target data. Because the multiple batches of target data are aggregated and packaged before being sent to the server, the transmission failures caused by network fluctuation and the low transmission efficiency associated with real-time data transmission can be avoided.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic view of an application scenario of a data acquisition method provided in the present application;

FIG. 2 is a flow chart of steps of a data acquisition method according to an embodiment of the present application;

FIG. 3 is a flow chart of steps of a data acquisition method according to another embodiment of the present application;

FIG. 4 is a block diagram of a data acquisition device according to an embodiment of the present application;

FIG. 5 is a block diagram of a data acquisition device according to another embodiment of the present application;

FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order to better understand the solution of the present application, the following description will clearly and completely describe the solution of the embodiment of the present application with reference to the accompanying drawings in the embodiment of the present application, and it is obvious that the described embodiment is only a part of the embodiment of the present application, not all the embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.

The principles and spirit of the present application are explained in detail below with reference to several representative embodiments thereof.

The inventor finds that, in the prior art, a service system generates service data when executing a service and stores the generated service data in an intermediate database. A client actively collects data from the intermediate database in real time and then transmits the collected data to a server in real time. As a result, the server has to receive the data collected by the client at regular intervals, which puts high pressure on the server. In addition, if the network fluctuates, data in transit may be lost, and the client must frequently occupy connections to the intermediate database to collect data in real time, so the data collection efficiency is low.

Based on the above problems, the application provides a data acquisition method in which the client, upon receiving an acquisition instruction, acquires data asynchronously according to the acquisition parameters. This avoids the frequent occupation of intermediate database connections that occurs when the client actively acquires data in real time.

Having described the basic principles of the present application, various non-limiting embodiments of the present application are specifically described below.

Referring to fig. 1, an application scenario diagram of a data collection method provided in the present application includes a server and a plurality of clients, where each client corresponds to an intermediate database (e.g., in fig. 1, a client ai corresponds to an intermediate database bi, i takes an integer between 1 and n), each intermediate database corresponds to a service system, such as a transaction system, and data generated when the service system executes a service is stored in the corresponding intermediate database, and the data includes transaction amount, transaction time, and a transaction user. In addition, a plurality of clients can collect data of different areas and gather the data to a server for processing, so that the server gathers the data generated by a plurality of service systems.

The following describes the technical scheme of the present application in detail through specific embodiments. It should be noted that the following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.

Referring to fig. 2, a flowchart of steps of a data collection method provided in an embodiment of the present application is applied to a client, and the data collection method specifically includes the following steps:

S201, receiving an acquisition instruction sent by a server.

The acquisition instruction carries acquisition parameters. The acquisition parameters include an acquisition range and/or an acquisition data type, and guide the client in acquiring the target data. Further, the acquisition range includes an acquisition time range, for example, acquiring the data generated from 10:00 to 13:00. The acquisition data type is, for example, transaction amount, transaction time or transaction user.
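The acquisition instruction of S201 can be pictured as a small message object. The following is a minimal sketch, not the format used by the application: the class name `AcquisitionInstruction`, its field names and the `validate` method are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class AcquisitionInstruction:
    """Hypothetical message the server sends to a client (S201)."""
    # Acquisition time range as (start, end); None means "not specified".
    time_range: Optional[Tuple[datetime, datetime]] = None
    # Data types to collect, e.g. transaction amount or transaction time.
    data_types: Tuple[str, ...] = ()

    def validate(self) -> None:
        # The text allows a range and/or a type, so at least one is required.
        if self.time_range is None and not self.data_types:
            raise ValueError("instruction carries no acquisition parameters")
        if self.time_range is not None:
            start, end = self.time_range
            if start >= end:
                raise ValueError("acquisition range must have start < end")


instruction = AcquisitionInstruction(
    time_range=(datetime(2023, 10, 16, 10), datetime(2023, 10, 16, 13)),
    data_types=("transaction_amount", "transaction_time", "transaction_user"),
)
instruction.validate()
```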

S202, asynchronously acquiring multiple batches of target data at different times from an intermediate database according to acquisition parameters.

The intermediate database stores data generated by the service system executing related services.

In the embodiment of the present application, asynchronous means that the target data is acquired in multiple passes, one batch of target data per pass.

For example, at 11:00 the data generated from 10:00 to 11:00 is acquired as one batch of target data; at 12:00 the data from 11:00 to 12:00 is acquired as another batch; and at 13:00 the data from 12:00 to 13:00 is acquired as a third batch. Together, the data from 10:00 to 13:00 forms the multiple batches of target data.
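The batching in this example can be sketched as splitting the acquisition time range into consecutive windows, one per batch. This is only an illustration under assumptions: the helper name `batch_windows` and the choice of a fixed step are hypothetical, and the application does not prescribe how window boundaries are computed.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def batch_windows(start: datetime, end: datetime,
                  step: timedelta) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (window_start, window_end) pairs covering start..end."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        # The last window is clipped so it never runs past the range end.
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


windows = list(batch_windows(datetime(2023, 10, 16, 10),
                             datetime(2023, 10, 16, 13),
                             timedelta(hours=1)))
for begin, finish in windows:
    print(f"{begin:%H:%M}-{finish:%H:%M}")
```

With a one-hour step the range from 10:00 to 13:00 yields three windows; calling the same helper with `timedelta(minutes=5)` yields 36 windows, which corresponds to the polling frequency of the real-time approach.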

In the prior art, the client acquires data from the intermediate database in real time and transmits each acquired portion to the server immediately, and this cycle repeats. For example, to collect the data generated from 10:00 to 13:00, a client that acquires data from the intermediate database once every 5 minutes needs to query the intermediate database 36 times (180 minutes divided by 5 minutes) and transmit data to the server 36 times, so the data acquisition efficiency is low. In the embodiment of the present application, by contrast, the client only needs to acquire data from the intermediate database 3 times and transmit data to the server once.

S203, generating a compressed file and a first file digest of the compressed file based on the plurality of batches of target data.

Decompressing the compressed file yields the multiple batches of target data. The first file digest is generated based on the compressed file. Further, the first file digest may be generated by using the MD5 Message-Digest Algorithm, a widely used cryptographic hash function. MD5 processes the input in 512-bit blocks, each of which is divided into sixteen 32-bit sub-blocks; after a series of processing steps, the output of the algorithm consists of four 32-bit words, which are concatenated into a 128-bit hash value. Comparing this hash value before and after transmission verifies that the transmitted information is completely consistent.
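As an illustration of how a file digest of this kind might be computed, the sketch below uses `hashlib.md5` from the Python standard library. The function name `file_md5` and the chunk size are assumptions made for the example; the application does not specify an implementation.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 digest of a file as 32 hexadecimal characters (128 bits)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a large compressed file need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The client would call such a function on the compressed file to obtain the first file digest, and the server would call the same function on the file it received to obtain the second file digest.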

S204, sending the compressed file and the first file digest to the server, wherein the server decompresses the compressed file according to the first file digest to obtain the target data.

The method can package a plurality of batches of target data into a compressed file and send the compressed file to the server, so that the integrity of the target data in the transmission process can be ensured, and the data loss is avoided.

In the embodiment of the application, the client receives an acquisition instruction sent by the server, wherein the acquisition instruction carries acquisition parameters; according to the acquisition parameters, asynchronously acquires multiple batches of target data at different times from an intermediate database, wherein the intermediate database stores data generated by a service system executing related services; generates a compressed file and a first file digest of the compressed file based on the multiple batches of target data; and sends the compressed file and the first file digest to the server, which decompresses the compressed file according to the first file digest to obtain the target data. Because the multiple batches of target data are aggregated and packaged before being sent to the server, the transmission failures caused by network fluctuation and the low transmission efficiency associated with real-time data transmission can be avoided.

Referring to fig. 3, a flowchart of steps of another data collection method according to an embodiment of the present application is provided, where the data collection method specifically includes the following steps:

S301, the business system generates data and stores the data in an intermediate database.

Wherein the business system is a transaction system, and the data is transaction data. When the business system generates transaction, the business system generates transaction data, and the generated transaction data is stored in a corresponding intermediate database.

S302, the server sends an acquisition instruction to the client.

The server is provided with acquisition parameters, and the server sends acquisition instructions to the client according to the acquisition parameters.

S303, after receiving the acquisition instruction, the client asynchronously acquires multiple batches of target data with different times from the intermediate database.

For the specific implementation of this step, refer to S201; details are not repeated here.

S304, the client encrypts the target data in batches by adopting a preset encryption algorithm to obtain encrypted data.

The preset encryption algorithm is, for example, the SM4 algorithm. SM4 is a symmetric encryption algorithm that was published along with the WAPI standard (the national standard for wireless local area network security) and has an encryption strength of 128 bits. It is a block cipher used in wireless LAN products, with a block length of 128 bits and a key length of 128 bits. Both the encryption algorithm and the key expansion algorithm adopt a 32-round nonlinear iterative structure. The decryption algorithm has the same structure as the encryption algorithm, except that the round keys are used in reverse order.
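The property that decryption reuses the encryption routine with the round keys reversed can be demonstrated with a toy cipher. The sketch below is NOT SM4: it copies only the four-word, 32-round iterative layout and the final word reversal, while the round function and the round keys are placeholders invented for this example (real SM4 uses a fixed S-box and a defined key expansion). It must not be used to protect data.

```python
from typing import List, Sequence

MASK32 = 0xFFFFFFFF


def _rotl(value: int, bits: int) -> int:
    """Rotate a 32-bit word left by `bits` (1..31)."""
    return ((value << bits) | (value >> (32 - bits))) & MASK32


def _toy_round_function(word: int) -> int:
    # Placeholder nonlinear step; real SM4 applies an S-box here instead.
    word = (word * 0x9E3779B1 + 0x7F4A7C15) & MASK32
    return word ^ _rotl(word, 2) ^ _rotl(word, 10) ^ _rotl(word, 18) ^ _rotl(word, 24)


def toy_crypt(block: Sequence[int], round_keys: Sequence[int]) -> List[int]:
    """Run the four-word iterative structure over one 128-bit block."""
    x = list(block)
    for rk in round_keys:
        new_word = x[0] ^ _toy_round_function(x[1] ^ x[2] ^ x[3] ^ rk)
        x = [x[1], x[2], x[3], new_word]
    return x[::-1]  # final reversal of the four words


# Hypothetical round keys for the demo; not derived by SM4 key expansion.
round_keys = [(0x01234567 * (i + 1)) & MASK32 for i in range(32)]
plaintext = [0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210]
ciphertext = toy_crypt(plaintext, round_keys)
# Decryption is the same routine with the round keys in reverse order.
recovered = toy_crypt(ciphertext, round_keys[::-1])
```

Because each round only XORs a function of three words into the fourth, the structure is invertible regardless of the round function, which is why a single routine serves for both directions.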

In the embodiment of the application, each time the client acquires a batch of target data, the target data is encrypted.

S305, the client writes the encrypted data into the file to obtain the target file.

And the client writes the encrypted target data into the target file.

S306, compressing the target file by the client to obtain a compressed file.

S307, the client determines a first file digest based on the compressed file.

Wherein the first file digest may be generated based on an MD5 algorithm.
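Steps S304 to S307 can be sketched end to end as follows. This is a minimal illustration under several assumptions: the function names, the file name `target.dat`, the length-prefixed layout and the use of gzip are all hypothetical choices, and `placeholder_encrypt` is an insecure XOR stand-in for the preset encryption algorithm (the application names SM4 as an example).

```python
import gzip
import hashlib
import os
import tempfile
from typing import Callable, Iterable, Tuple


def package_batches(batches: Iterable[bytes],
                    encrypt: Callable[[bytes], bytes],
                    out_dir: str) -> Tuple[str, str]:
    """Encrypt each batch, write it to a target file, compress, and digest."""
    target_path = os.path.join(out_dir, "target.dat")
    with open(target_path, "wb") as target:
        for batch in batches:
            encrypted = encrypt(batch)                          # S304
            # Length prefix so the receiver can split the batches again.
            target.write(len(encrypted).to_bytes(4, "big"))     # S305
            target.write(encrypted)
    compressed_path = target_path + ".gz"
    with open(target_path, "rb") as src, gzip.open(compressed_path, "wb") as dst:
        dst.write(src.read())                                   # S306
    with open(compressed_path, "rb") as handle:
        first_digest = hashlib.md5(handle.read()).hexdigest()   # S307
    return compressed_path, first_digest


def placeholder_encrypt(data: bytes) -> bytes:
    # Stand-in only: XOR is NOT secure; a real client would call SM4 here.
    return bytes(b ^ 0x5A for b in data)


with tempfile.TemporaryDirectory() as tmp:
    path, digest = package_batches([b"batch-1", b"batch-2"],
                                   placeholder_encrypt, tmp)
    print(os.path.basename(path), len(digest))
```

Encrypting each batch as it arrives matches the statement that the client encrypts target data every time a batch is acquired, while compression and digest calculation happen once over the whole target file.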

S308, the client informs the server to pull the data file.

The client sends a notification to the server, which is used for notifying the server to pull the compressed file from the client.

S309, after receiving the notification to pull the data file, the server acquires the compressed file and the first file digest from the client.

S310, the server determines a second file digest of the compressed file.

Wherein the server calculates a second file digest of the received compressed file using an MD5 algorithm.

S311, the server decompresses the compressed file to obtain the target data when the first file digest and the second file digest are consistent.

The server compares the first file digest with the second file digest. If they are consistent, the server determines that the compressed file lost no data during transmission. If they are inconsistent, the server pulls the latest compressed file from the client again and performs S311 again; if the digests remain inconsistent after several attempts, the server returns an exception and terminates the pulling operation.
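The compare-and-retry behaviour described above might look like the following sketch. The names `pull_verified` and `DigestMismatchError`, the `pull` callback and the default of three attempts are assumptions for illustration; the application only says the pull is abandoned after "a plurality of" mismatches.

```python
import hashlib
from typing import Callable, Tuple


class DigestMismatchError(Exception):
    """Raised when the digests still differ after all pull attempts."""


def pull_verified(pull: Callable[[], Tuple[bytes, str]],
                  max_attempts: int = 3) -> bytes:
    """Pull (compressed_bytes, first_digest) until the two digests agree."""
    for _ in range(max_attempts):
        compressed, first_digest = pull()                      # S309
        second_digest = hashlib.md5(compressed).hexdigest()    # S310
        if second_digest == first_digest:                      # S311
            return compressed
    raise DigestMismatchError(f"digest mismatch after {max_attempts} attempts")


# Simulated transport: the first pull is corrupted, the second is intact.
payload = b"compressed-bytes"
good_digest = hashlib.md5(payload).hexdigest()
responses = iter([(b"corrupted", good_digest), (payload, good_digest)])
verified = pull_verified(lambda: next(responses))
```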

Decompressing the compressed file to obtain target data, wherein the method comprises the following steps: decompressing the compressed file to obtain a decompressed file; decrypting the decompressed file by adopting a preset decryption algorithm to obtain decrypted data; and checking the decrypted data, and determining the decrypted data as target data under the condition that the decrypted data passes the check.

The preset decryption algorithm is a decryption algorithm corresponding to the preset encryption algorithm.

Further, when the decrypted data fails the check, the check result is stored to the preset big data cluster.

Specifically, checking the decrypted data includes checking its integrity, timeliness, validity, consistency, uniqueness and accuracy. The checks are performed at four levels: single-column, cross-column, cross-row and cross-table. Single-column checks include non-null constraints, syntax constraints, format specifications, length constraints, value range constraints and factual reference standards. Cross-column checks include null-value constraints, storage time constraints, single-table equal-value consistency constraints and single-table logical consistency constraints. Cross-row checks include record uniqueness constraints and hierarchy consistency constraints. Cross-table checks include external association constraints, cross-table equal-value consistency constraints and cross-table logical consistency constraints.
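A few of these check levels can be illustrated on transaction records. The sketch below is a hypothetical example only: the field names (`id`, `user`, `amount`, `traded_at`, `stored_at`) and the specific rules are assumptions, and cross-table checks are omitted because they would need a second data set.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Record = Dict[str, str]


def check_records(records: List[Record]) -> Tuple[List[Record], List[str]]:
    """Split records into those that pass and a list of error messages."""
    passed, errors, seen_ids = [], [], set()
    for index, record in enumerate(records):
        problems = []
        # Single-column: non-null and value range constraints.
        if not record.get("user"):
            problems.append("user is empty")
        try:
            if float(record.get("amount", "")) <= 0:
                problems.append("amount must be positive")
        except ValueError:
            problems.append("amount is not a number")
        # Single-column format check, then cross-column: stored_at >= traded_at.
        try:
            traded = datetime.fromisoformat(record.get("traded_at", ""))
            stored = datetime.fromisoformat(record.get("stored_at", ""))
            if stored < traded:
                problems.append("stored_at precedes traded_at")
        except (ValueError, TypeError):
            problems.append("timestamp format is invalid")
        # Cross-row: record uniqueness.
        if record.get("id") in seen_ids:
            problems.append("duplicate id")
        seen_ids.add(record.get("id"))
        if problems:
            errors.append(f"record {index}: " + "; ".join(problems))
        else:
            passed.append(record)
    return passed, errors
```

The first return value corresponds to the decrypted data that is treated as target data, and the second to a check result that could be stored for later analysis.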

S312, storing the target data to a preset big data cluster.

The decrypted data that passes the check is uploaded to the preset big data cluster as target data, and a check result is generated for decrypted data that fails the check. The check result includes error logs, data supervision clues and the like, and can be used for subsequent analysis of the reasons for the acquisition failure.

Further, the big data cluster stores target data for subsequent data processing, analysis, statistics and the like.

In an embodiment of the present application, the big data cluster includes a Hadoop cluster. Hadoop is an open source distributed computing platform. With HDFS (Hadoop Distributed File System) and MapReduce (a programming model) as its core, Hadoop provides users with a distributed infrastructure whose low-level system details are transparent. Hadoop 2.0 added YARN, a resource scheduling framework that can manage and schedule tasks at fine granularity and can support other computing frameworks such as Spark (a computing engine). The high fault tolerance, high scalability and high efficiency of HDFS allow users to deploy Hadoop on low-cost hardware to form a distributed system.

Furthermore, Hadoop has a mature and extensive open source ecosystem: HDFS provides file storage and YARN provides resource management, and on this basis various kinds of processing can be performed, including MapReduce, Tez (a computing engine), Spark and Storm (an open source distributed real-time computing system), to meet the data usage scenarios of different requirements. HDFS adopts a master-slave structure model: an HDFS cluster is composed of a NameNode (name node) and a plurality of DataNodes (data nodes). The NameNode serves as the master server and manages the namespace of the file system and client access to files, while the DataNodes are responsible for managing the stored data. At the bottom layer, HDFS data is cut into a plurality of blocks, and the blocks are replicated and stored on different DataNodes to achieve fault tolerance and disaster tolerance.
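The idea of cutting data into blocks and storing replicas on different DataNodes can be shown with a toy placement function. This sketch is an assumption-laden simplification: it assigns replicas round-robin, whereas real HDFS placement is decided by the NameNode using its own policy, and the names `place_blocks`, `dn1` to `dn4` and the tiny block size are invented for the example.

```python
from typing import Dict, List


def place_blocks(data: bytes, block_size: int, datanodes: List[str],
                 replication: int = 3) -> Dict[int, List[str]]:
    """Map each block index to the DataNodes holding a replica (round-robin)."""
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    block_count = (len(data) + block_size - 1) // block_size  # ceiling division
    placement = {}
    for block in range(block_count):
        # Pick `replication` distinct nodes, starting at a rotating offset.
        placement[block] = [datanodes[(block + r) % len(datanodes)]
                            for r in range(replication)]
    return placement


layout = place_blocks(b"x" * 300, block_size=128,
                      datanodes=["dn1", "dn2", "dn3", "dn4"])
```

In this layout every block lives on three different nodes, so losing any single DataNode still leaves at least two replicas of each block available.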

Further, MapReduce abstracts the complex parallel computing process running on a large-scale cluster into two functions: Map and Reduce. The map function takes key/value pairs as input and produces another series of key/value pairs as intermediate output, which is written to the local disk. The MapReduce framework automatically groups the intermediate data by key, and the data with the same key is delivered together to the reduce function for processing. The reduce function takes a key and the corresponding list of values as input, merges the values that share the same key, and produces another series of key/value pairs as the final output, which is written to HDFS.
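The map, group-by-key and reduce phases can be imitated in a single process to make the data flow concrete. The sketch below is plain Python rather than the Hadoop API: the function names and the `user,amount` line format are assumptions, and nothing here is distributed or written to disk.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(_key: int, record: str) -> Iterator[Tuple[str, float]]:
    """Map: parse one 'user,amount' line into a (user, amount) pair."""
    user, amount = record.split(",")
    yield user, float(amount)


def reduce_fn(key: str, values: List[float]) -> Iterator[Tuple[str, float]]:
    """Reduce: merge all values that share a key into one total."""
    yield key, sum(values)


def run_mapreduce(lines: Iterable[str]) -> Dict[str, float]:
    grouped = defaultdict(list)
    for offset, line in enumerate(lines):
        for key, value in map_fn(offset, line):
            grouped[key].append(value)  # shuffle: group intermediate pairs by key
    result = {}
    for key in sorted(grouped):
        for out_key, out_value in reduce_fn(key, grouped[key]):
            result[out_key] = out_value
    return result


totals = run_mapreduce(["alice,10", "bob,5", "alice,2.5"])
```

Here the two `alice` records are grouped under one key before the reduce step, which sums them into a single output pair.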

Hive is a data warehouse tool based on Hadoop. It can map a structured data file to a database table, provides a simple SQL query function, and converts SQL statements into MapReduce tasks for execution. HBase is the database of Hadoop: a distributed, scalable big data store. Hive itself does not store or compute data; it relies entirely on HDFS and MapReduce, and the tables in Hive are purely logical. Hive uses HDFS to store files and the MapReduce framework for computation, so Hive may be considered a wrapper for MapReduce. The purpose of Hive is to convert easy-to-write Hive SQL into MapReduce programs that would otherwise be complicated and hard to write.

The application relates to improving the quality and efficiency of data collection, and in particular, the application can help collect, integrate and analyze mass data by adopting a preset large data cluster. Therefore, by combining the technology, the problems of incomplete data, unreliable data source, data incapability of tracing and the like in the prior art can be solved, and the quality of the acquired data is finally improved. In addition, the client can collect data in real time, meanwhile, the problem of data transmission failure caused by network fluctuation is solved, and the performance pressure on the server in the collection process is reduced.

The following are device embodiments of the present application, which may be used to perform method embodiments of the present application. For details not disclosed in the device embodiments of the present application, please refer to the method embodiments of the present application.

Referring to fig. 4, a schematic diagram of a data acquisition device provided in an embodiment of the present application is applied to a client, where the data acquisition device 400 includes: a receiving module 401, an acquiring module 402, a generating module 403 and a transmitting module 404; wherein:

the receiving module 401 is configured to receive an acquisition instruction sent by the server, where the acquisition instruction carries an acquisition parameter;

the acquiring module 402 is configured to asynchronously acquire multiple batches of target data at different times from an intermediate database according to the acquisition parameters, where the intermediate database stores data generated by a service system executing related services;

a generating module 403, configured to generate a compressed file and a first file digest of the compressed file based on the plurality of batches of target data;

the sending module 404 is configured to send the compressed file and the first file digest to a server, where the server is configured to decompress the compressed file according to the first file digest to obtain the target data.

In one embodiment of the present application, the generating module 403 is specifically configured to: encrypt the multiple batches of target data by adopting a preset encryption algorithm to obtain encrypted data; write the encrypted data into a file to obtain a target file; compress the target file to obtain a compressed file; and determine a first file digest based on the compressed file.

Referring to fig. 5, the present application further provides a data acquisition device 500, applied to a server, where the data acquisition device includes:

the sending module 501 is configured to send an acquisition instruction to the client, where the acquisition instruction is used to instruct the client to acquire target data according to any one of the data acquisition methods described above.

In an alternative embodiment, the device further includes a processing module (not shown), which is used for, after the acquisition instruction is sent to the client: receiving the compressed file and the first file digest sent by the client; determining a second file digest of the compressed file; and decompressing the compressed file to obtain the target data when the first file digest and the second file digest are consistent.

In an alternative embodiment, the device further includes a storage module (not shown) configured to store the target data to a preset big data cluster after the compressed file is decompressed to obtain the target data.

For details of the data acquisition device, refer to the embodiments of the data acquisition method; they are not repeated here.

It should be understood that the division of the above apparatus into modules is merely a division of logical functions; in an actual implementation, the modules may be fully or partially integrated into one physical entity or may be physically separate. The modules may all be implemented as software called by a processing element, or all in hardware; alternatively, some modules may be implemented as software called by a processing element and the others in hardware. For example, the processing module may be a separately provided processing element, may be integrated in a chip of the above apparatus, or may be stored in a memory of the above apparatus in the form of program code that a processing element of the apparatus calls to perform the functions of the processing module. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element here may be an integrated circuit with signal processing capabilities. In implementation, each step of the above method or each of the above modules may be carried out by an integrated logic circuit of hardware in the processor element or by instructions in the form of software.

For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more application-specific integrated circuits (ASIC), one or more digital signal processors (DSP), or one or more field programmable gate arrays (FPGA). For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor that can invoke the program code. For another example, the modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tape), optical media (e.g., DVD), or semiconductor media (e.g., solid state disk (SSD)), among others.

Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in Fig. 6, the electronic device may include a processor 61, a memory 62, a communication interface 63, and a system bus 64. The memory 62 and the communication interface 63 are connected to the processor 61 through the system bus 64 and communicate with each other over it. The memory 62 stores computer-executable instructions, and the communication interface 63 is used for communicating with other devices. The scheme of the above embodiments is implemented when the processor 61 executes the computer-executable instructions.

The system bus referred to in Fig. 6 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The system bus may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, the bus is drawn as a single bold line in the figure, but this does not mean that there is only one bus or only one type of bus. The communication interface is used for realizing communication between the database access device and other devices (such as a user side, a read-write library and a read-only library). The memory may include random access memory (RAM) and may also include non-volatile memory, such as at least one disk memory.

The processor may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

Optionally, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes an electronic device to perform the method of the above embodiments.

Optionally, an embodiment of the present application further provides a computer program product, on which a computer program is stored, which when executed by a processor causes an electronic device to perform the method of the above embodiment.

In the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: A alone, both A and B, or B alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it; in a formula, the character "/" indicates a "division" relationship between them. "At least one of" or a similar expression means any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or plural.

It will be appreciated that the various numerals used in the embodiments of the present application are merely for ease of description and are not intended to limit the scope of the embodiments of the present application. In the embodiments of the present application, the sequence number of each process does not imply an order of execution; the order in which the processes are executed should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims

1. A data acquisition method, characterized in that it is applied to a client, the data acquisition method comprising:

receiving an acquisition instruction sent by a server, wherein the acquisition instruction carries acquisition parameters;

according to the acquisition parameters, asynchronously acquiring multiple batches of target data at different times from an intermediate database, wherein the intermediate database stores data generated by a service system executing related services;

generating a compressed file and a first file digest of the compressed file based on the plurality of batches of target data;

and sending the compressed file and the first file digest to the server, wherein the server is used for decompressing the compressed file according to the first file digest to obtain the target data.

2. The method of claim 1, wherein generating a compressed file and a first file digest of the compressed file based on the plurality of batches of target data comprises:

encrypting the plurality of batches of target data by adopting a preset encryption algorithm to obtain encrypted data;

writing the encrypted data into a file to obtain a target file;

compressing the target file to obtain a compressed file;

and determining the first file digest based on the compressed file.

3. The data acquisition method according to claim 1 or 2, wherein the acquisition parameters include: acquisition scope and/or acquisition data type.

4. A data acquisition method, characterized in that it is applied to a server, the data acquisition method comprising:

sending an acquisition instruction to a client, the acquisition instruction being for instructing the client to acquire target data according to the data acquisition method of any one of claims 1 to 3.

5. The data acquisition method according to claim 4, further comprising, after the sending of the acquisition instruction to the client:

receiving a compressed file and a first file digest sent by the client;

determining a second file digest of the compressed file;

and decompressing the compressed file to obtain the target data under the condition that the first file digest is consistent with the second file digest.

6. The method of claim 5, wherein decompressing the compressed file to obtain the target data comprises:

decompressing the compressed file to obtain a decompressed file;

decrypting the decompressed file by adopting a preset decryption algorithm to obtain decrypted data;

and checking the decrypted data, and determining the decrypted data as the target data under the condition that the decrypted data passes the check.

7. The method of claim 6, wherein after decompressing the compressed file to obtain the target data, further comprising:

and storing the target data to a preset big data cluster.

8. The data acquisition method of claim 6, further comprising: under the condition that the decrypted data does not pass the check, storing the check result to a preset big data cluster.

9. A data acquisition device, characterized by being applied to a client, the data acquisition device comprising:

the acquisition module is used for asynchronously acquiring multiple batches of target data at different times from an intermediate database according to the acquisition parameters, wherein the intermediate database stores data generated by a service system executing related services;

the generation module is used for generating a compressed file and a first file digest of the compressed file based on the plurality of batches of target data;

and the sending module is used for sending the compressed file and the first file digest to the server, and the server is used for decompressing the compressed file according to the first file digest to obtain the target data.

10. An electronic device comprising a memory and a processor; wherein,

the memory is used for storing program codes;

the processor is configured to invoke the program code to implement the data acquisition method of any of claims 1 to 8.

11. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, causes an electronic device to perform the data acquisition method according to any one of claims 1 to 8.

12. A computer program product having a computer program stored thereon, which, when executed by a processor, causes an electronic device to perform the data acquisition method according to any one of claims 1 to 8.