CN115391328A - Data acquisition method, data acquisition device, electronic equipment and medium - Google Patents

Data acquisition method, data acquisition device, electronic equipment and medium

Publication number
CN115391328A
Authority
CN
China
Prior art keywords
data
scanned
data table
scanning tool
acquiring
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110570068.3A
Other languages
Chinese (zh)
Inventor
袁赫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110570068.3A priority Critical patent/CN115391328A/en
Publication of CN115391328A publication Critical patent/CN115391328A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a data acquisition method, a data acquisition device, electronic equipment and a medium, which are applied to the field of cloud technology and may in particular relate to the field of big data. The method comprises: obtaining structural information of a data table to be scanned, where the structural information comprises a field name and a field type of the data table to be scanned; obtaining, from a data scanning library, a configuration file corresponding to the field type of the data table to be scanned; generating a data scanning tool based on the field name of the data table to be scanned and the configuration file; and obtaining target data from the data table to be scanned through the data scanning tool. The embodiments of the present application can improve the efficiency of data acquisition. The present application may also relate to blockchain technology; for example, the obtained target data may be written to a blockchain.

Description

Data acquisition method, data acquisition device, electronic equipment and medium
Technical Field
The present application relates to the field of cloud technologies, and in particular, to a data acquisition method and apparatus, an electronic device, and a medium.
Background
At present, scanning a data table to acquire data usually requires a developer to manually write a complete data scanning scheme in order to acquire the data in the data table. However, the inventor has recognized that, because data tables have widely varying table structures, writing a data scanning scheme for each different data table can consume a great deal of time and resources, and errors introduced while writing the scheme may even make the finally scanned data unusable, resulting in inefficient data acquisition. How to improve the efficiency of data acquisition when scanning a data table has therefore become an urgent problem.
Disclosure of Invention
The embodiment of the application provides a data acquisition method, a data acquisition device, electronic equipment and a medium, which are beneficial to improving the data acquisition efficiency.
In one aspect, an embodiment of the present application provides a data obtaining method, where the method includes:
acquiring structural information of a data table to be scanned; the structure information comprises a field name and a field type of the data table to be scanned;
acquiring a configuration file corresponding to the field type from a data scanning library according to the field type of the data table to be scanned, and generating a data scanning tool based on the field name of the data table to be scanned and the configuration file;
and acquiring target data from the data table to be scanned through the data scanning tool.
In one aspect, an embodiment of the present application provides a data obtaining apparatus, where the apparatus includes:
the acquisition module is used for acquiring the structural information of the data table to be scanned; the structure information comprises a field name and a field type of the data table to be scanned;
the processing module is used for acquiring a configuration file corresponding to the field type from a data scanning library according to the field type of the data table to be scanned and generating a data scanning tool based on the field name of the data table to be scanned and the configuration file;
the acquisition module is further configured to acquire target data from the to-be-scanned data table through the data scanning tool.
In one aspect, an embodiment of the present application provides an electronic device, which includes a processor and a memory, where the memory is used to store a computer program, and the computer program includes program instructions, and the processor is configured to call the program instructions to perform some or all of the steps in the above method.
In one aspect, the present application provides a computer-readable storage medium, which stores a computer program, where the computer program includes program instructions, and the program instructions, when executed by a processor, are used to perform some or all of the steps of the above method.
According to the embodiment of the application, the configuration file corresponding to the field type can be obtained from the data scanning library according to the field type of the data table to be scanned by obtaining the structural information including the field name and the field type of the data table to be scanned, the data scanning tool is generated based on the field name and the configuration file of the data table to be scanned, and the target data is obtained from the data table to be scanned through the data scanning tool. By implementing the scheme, the efficiency of acquiring data is improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1 is a schematic diagram of an application architecture according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a data acquisition method according to an embodiment of the present application;
Fig. 3 is a schematic view of an application scenario of generating a data scanning tool according to an embodiment of the present application;
Fig. 4 is a schematic flowchart of another data acquisition method according to an embodiment of the present application;
Fig. 5 is a schematic view of an application scenario for acquiring data according to an embodiment of the present application;
Fig. 6 is a schematic flowchart of a data acquisition method according to an embodiment of the present application;
Fig. 7 is a schematic view of an application scenario of scanning data tables in parallel according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a data acquisition apparatus according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
The data acquisition method provided by the embodiment of the application can be realized in electronic equipment, and the electronic equipment can be a server or terminal equipment. The server can be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, and can also be a cloud server for providing basic cloud computing services such as cloud service, a cloud database, cloud computing, cloud functions, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN (content delivery network), big data and artificial intelligence platforms and the like. The terminal device may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, and the like.
The embodiments of the present application may relate to cloud technology, and specifically to the field of big data. Big data refers to a data set that cannot be captured, managed and processed by conventional software tools within a certain time range; it is a massive, fast-growing and diversified information asset that requires new processing modes in order to provide stronger decision-making power, insight discovery and process optimization capability. With the advent of the cloud era, big data has attracted increasing attention, and special techniques are needed to process large amounts of data effectively within a tolerable elapsed time. Technologies applicable to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems. By executing the technical solution of the present application, data tables stored in a database can be scanned on a large scale, in parallel or serially, to acquire data, which improves the efficiency of both data table scanning and data acquisition.
Referring to fig. 1, fig. 1 is a schematic view of an application architecture provided in an embodiment of the present application, through which a data acquisition method provided in the present application can be executed. As shown in FIG. 1, FIG. 1 may include an electronic device, and a database storing data tables. The electronic device can generate a data scanning tool by executing the technical scheme of the application, and scan the data table by using the data scanning tool so as to acquire data from the data table. Optionally, the database storing the data table may include a storage client and a storage server, the data table may be stored in the storage server, and the storage client may be configured with a data acquisition interface, and the data in the data table may be acquired from the storage server through the data acquisition interface and may be sent to the electronic device. The electronic device receives data sent by the storage client through the data scanning tool.
It should be understood that fig. 1 merely represents an application architecture of the present disclosure by way of example, and does not limit the specific architecture of the present disclosure, that is, the present disclosure may also provide other forms of application architectures.
Optionally, in some embodiments, the electronic device may execute the data acquisition method according to an actual service requirement, so as to improve data acquisition efficiency.
For example, the technical scheme of the application can be applied to a data migration scenario, for example, in a user data migration scenario of surcharge payment, the electronic device can generate a data scanning tool for a data table storing user data through the technical scheme of the application, and scan the data table through the data scanning tool to obtain the user data, so that the user data can be processed after being obtained, for example, the user data is transmitted to a specified position for data storage to implement data migration. It can be understood that in an application scenario of data migration, data to be migrated can be acquired through the technical scheme of the application.
For another example, the technical solution of the present application may also be applied to a data aggregation scenario, for example, if a product a and a product B are merged into a same product, user authorization data of the product a and user authorization data of the product B need to be aggregated, the electronic device may generate a data scanning tool for storing a data table of the user authorization data of the product a and a data table of the user authorization data of the product B through the technical solution of the present application, and scan the corresponding data tables through the data scanning tool to obtain the user authorization data of the product a and the user authorization data of the product B, so that the two user authorization data are merged to implement data aggregation. It can be understood that in an application scenario of data aggregation, data needing to be aggregated can be acquired through the technical scheme of the application.
Optionally, the data related to the present application, such as a data table to be scanned, data obtained by scanning the data table, and the like, may be stored in a database, or may be stored in a block chain, such as being stored by a block chain distributed system, which is not limited in the present application. For example, a server, such as a storage server, may be a node in a blockchain, and a client, such as a storage client, may be a node in a blockchain, or both a server and a client may be nodes in a blockchain. Further alternatively, the electronic device to which the present application relates, such as a server for processing target data, may be a node in a blockchain.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism and an encryption algorithm. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product services layer, and an application services layer.
The blockchain underlying platform may include processing modules such as user management, basic service, smart contract, and operation monitoring. The user management module is responsible for identity information management of all blockchain participants, including maintenance of public and private key generation (account management), key management, and maintenance of the correspondence between a user's real identity and blockchain address (authority management); where authorized, it supervises and audits the transactions of certain real identities and provides rule configuration for risk control (risk-control audit). The basic service module is deployed on all blockchain node devices and is used to verify the validity of a service request and to record a valid request to storage after consensus on it is reached; for a new service request, the basic service first performs interface adaptation, parsing and authentication (interface adaptation), then encrypts the service information through a consensus algorithm (consensus management), transmits the encrypted information completely and consistently to the shared ledger (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering, and contract execution; developers can define contract logic in a programming language and publish it to the blockchain (contract registration), and the logic is executed according to the contract terms when triggered by a call, a key or another event; the module also provides contract upgrade and cancellation functions. The operation monitoring module is mainly responsible for deployment, configuration modification, contract setting and cloud adaptation during product release, and for visual output of real-time status during product operation, for example alarms, monitoring of network conditions, and monitoring of node device health.
The platform product service layer provides basic capability and an implementation framework of typical application, and developers can complete block chain implementation of business logic based on the basic capability and the characteristics of the superposed business. The application service layer provides the application service based on the block chain scheme for the business participants to use. For example, in the present application, a data storage function may be provided by a blockchain, and stored data may be provided to a developer or the like.
It is to be understood that the foregoing scenarios are only examples, and do not constitute a limitation on application scenarios of the technical solutions provided in the embodiments of the present application, and the technical solutions of the present application may also be applied to other scenarios. For example, as a person having ordinary skill in the art can know, with the evolution of the system architecture and the emergence of new service scenarios, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
Based on the above description, the present application embodiment proposes a data acquisition method, which may be executed by the above-mentioned electronic device. As shown in fig. 2, a flow of the data obtaining method according to the embodiment of the present application may include the following steps:
S201, acquiring structural information of a data table to be scanned; the structure information includes field names and field types of the data table to be scanned.
The data table to be scanned can be stored in the database, namely the data table can be obtained from the database and used as the data table to be scanned; alternatively, the data table to be scanned may also be a data table transmitted by another device, or a data table stored locally, and the like, which is not limited in this application. Optionally, the data table to be scanned may be any designated data table, or may be multiple data tables having the same feature information, where the same feature information may refer to that the structural information is the same and of the same type. For example, the types of the data tables are multiple data tables for storing user payment information, and the structural information of the multiple data tables is the same, that is, the field names and the field types of the contained fields are the same.
In a possible implementation manner, the structural information of the data table to be scanned may be obtained by a developer who analyzes the data table to be scanned to determine the field names of the plurality of (two or more) fields constituting the table and the field type of each field. For example, suppose the data table to be scanned stores user information and the acquired structural information includes 3 fields with the field names [user ID], [user name] and [user mobile phone number]: the field type of [user ID] is string, the field type of [user name] is varchar (variable-length character type), and the field type of [user mobile phone number] is int (integer type).
Optionally, analyzing the data table to be scanned to obtain the structural information can also yield the data size of the data in the table, that is, the amount of storage the data occupies. Once the electronic device has obtained the data size of the data table to be scanned, a hard disk of suitable capacity can be allocated to the electronic device in advance to store the data obtained by scanning.
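As an illustration of step S201, the structural information can be represented as an ordered list of (field name, field type) pairs. The following is a minimal sketch only; the class names `Field` and `TableStructure` and the English field names are assumptions introduced here and do not appear in the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str  # field name, e.g. "user_id"
    type: str  # field type, e.g. "string", "varchar", "int"


@dataclass(frozen=True)
class TableStructure:
    fields: tuple  # ordered Field entries describing the table

    def field_names(self):
        return [f.name for f in self.fields]

    def field_types(self):
        return [f.type for f in self.fields]


# The three-field user table from the example above (names are hypothetical).
user_table = TableStructure(fields=(
    Field("user_id", "string"),
    Field("user_name", "varchar"),
    Field("user_mobile", "int"),
))
```

Keeping the fields ordered matters for the later steps, because a row is serialized and deserialized field by field in that order.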
S202, according to the field type of the data table to be scanned, a configuration file corresponding to the field type is obtained from the data scanning library, and a data scanning tool is generated based on the field name and the configuration file of the data table to be scanned.
In one possible embodiment, the data scanning library encapsulates a plurality of configuration files for different field types. Each configuration file is configured with a protocol that supports type conversion of data of the corresponding field type. The protocol may be protobuf, also called Protocol Buffers (PB); based on protobuf's description language and reflection mechanism, data can be serialized and deserialized.
Serialization converts the original data acquired from the data table to be scanned into binary data, and deserialization converts the binary data into data in a specified format. The data in the specified format may take the form of a data table, in which case the data table obtained after conversion has the same structure as the data table to be scanned before conversion. Protobuf's description language and reflection mechanism are what allow the serialized data to be deserialized into valid data information.
Optionally, because the protobuf has different processing modes for performing serialization processing and deserialization processing on data corresponding to different field types, a plurality of PB files for different field types may be generated in advance based on protobuf definition, and the PB files may be encapsulated as configuration files and stored in the data scanning library. The data scanning library can be deployed in the electronic device, and can also be deployed in other devices capable of storing data, and the electronic device can acquire a configuration file corresponding to a field type from the data scanning library according to the field type of the data table to be scanned. For example, the configuration file 1 may perform serialization processing and deserialization processing on field data whose field type is int type.
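The lookup of a configuration file by field type can be sketched as a registry keyed by type name. This sketch does not use protobuf: as a stand-in, each "configuration file" is modeled as a pair of encode and decode functions built on Python's standard `struct` module. The names `SCAN_LIBRARY` and `get_config` and the byte layouts are assumptions chosen for illustration.

```python
import struct


def _encode_text(value):
    raw = value.encode("utf-8")
    return struct.pack(">I", len(raw)) + raw  # 4-byte length prefix + bytes


def _decode_text(buf, offset):
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    return buf[start:start + length].decode("utf-8"), start + length


def _encode_int(value):
    return struct.pack(">q", value)  # 8-byte signed big-endian integer


def _decode_int(buf, offset):
    (value,) = struct.unpack_from(">q", buf, offset)
    return value, offset + 8


# Hypothetical data scanning library: field type -> (encoder, decoder).
SCAN_LIBRARY = {
    "string": (_encode_text, _decode_text),
    "varchar": (_encode_text, _decode_text),
    "int": (_encode_int, _decode_int),
}


def get_config(field_type):
    """Return the (encoder, decoder) pair registered for a field type."""
    try:
        return SCAN_LIBRARY[field_type]
    except KeyError:
        raise ValueError(f"no configuration file for field type {field_type!r}")
```

Each decoder returns the decoded value together with the next read offset, so several fields can be read back from one byte sequence in turn.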
For example, referring to fig. 3, fig. 3 is a schematic view of an application scenario for generating a data scanning tool provided in the present application, where when an electronic device acquires structure information of a to-be-scanned data table sent by a developer, a configuration file corresponding to a field type of the to-be-scanned data table is acquired from a data scanning library, and the data scanning tool is generated by compiling the configuration file based on a field name of the to-be-scanned data table.
In one possible implementation, the electronic device may obtain a corresponding configuration file from the data scanning library according to the field type of the data table to be scanned, and generate a data scanning tool based on the field name and the configuration file of the data table to be scanned. The data scanning tool has the capability of identifying the structure of a data table to be scanned, and also has the capability of performing serialization processing and deserialization processing on data corresponding to a plurality of fields in the scanned data according to different field types.
Illustratively, the electronic device obtains a row of data from the data table to be scanned through the data scanning tool and uses the tool to identify the structure of the row, obtaining the field names of the fields included in the row, the data corresponding to each field, and the field type of each field; the data is then serialized by the data scanning tool. For example, if the row includes [field 1] of type 1 and [field 2] of type 2, the data scanning tool serializes the data of [field 1] and then the data of [field 2], in order, to obtain a byte sequence corresponding to the row, where the byte sequence contains the row data and the types of the fields it includes.
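The two-field example above can be sketched as follows. The sketch generates a "data scanning tool" as a closure over the field names and field types and serializes one row field by field, prefixing each value with a one-byte type tag so that the byte sequence carries both the data and the field types. The tag values, the function names and the use of `struct` in place of protobuf are all assumptions made for illustration.

```python
import struct

# Hypothetical one-byte tags identifying each field type in the byte sequence.
TYPE_TAGS = {"int": 1, "varchar": 2}


def serialize_value(field_type, value):
    """Serialize one field value as: type tag, then the encoded value."""
    tag = bytes([TYPE_TAGS[field_type]])
    if field_type == "int":
        return tag + struct.pack(">q", value)
    raw = value.encode("utf-8")
    return tag + struct.pack(">I", len(raw)) + raw


def make_scan_tool(field_names, field_types):
    """Build a row serializer from the table's field names and field types."""
    schema = list(zip(field_names, field_types))

    def serialize_row(row):
        # Fields are serialized in schema order: field 1 first, then field 2.
        return b"".join(serialize_value(ftype, row[name]) for name, ftype in schema)

    return serialize_row


serialize_row = make_scan_tool(["field_1", "field_2"], ["int", "varchar"])
encoded = serialize_row({"field_1": 7, "field_2": "ab"})
```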
S203, acquiring target data from the data table to be scanned through the data scanning tool.
The target data may be data that has been serialized by the data scanning tool, or data that has been deserialized by the data scanning tool. Alternatively, the target data may be part or all of the data to be scanned in the form of a data table.
In one possible implementation, the electronic device may scan the data table to be scanned by the data scanning tool at a row level. The data table to be scanned may include N rows of data, where the N rows of data are data of a first type, and the data of the first type is data in a table structure form. And the N rows of data comprise the ith row of data, N is a positive integer, and i is a positive integer less than or equal to N.
The electronic device may obtain the target data from the data table to be scanned through the data scanning tool as follows: scan the ith row of the data table to be scanned through the data scanning tool to acquire the ith row of data, convert the ith row of data into data of a second type based on the configuration file in the data scanning tool, and determine the target data according to the second-type data corresponding to the N rows. The second-type data is the data obtained by serializing the ith row of data; for example, it may be binary data.
Optionally, in a possible embodiment, the second-type data corresponding to the N rows may be used directly as the target data. The electronic device may then process the target data, for example by storing the second-type target data directly, which reduces the storage space the data occupies; if the second-type target data is transmitted, data transmission efficiency can also be improved.
Optionally, the target data may instead be determined by deserializing the second-type data corresponding to the N rows to obtain data of a specified type, which may be the first type, and using that data as the target data. The electronic device may then process the target data, for example by storing or migrating it; the target data is restored to the form of a data table, so the data table to be scanned can be presented visually. The deserialization may be performed based on the configuration file in the data scanning tool: the data scanning tool can identify which part of the second-type data corresponding to the N rows is the serialized data of a given field, and the field type of that field; it deserializes the second-type data of the N rows in order and restores the deserialized data to the form of a data table using the field names defined in the data scanning tool.
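The row-level scan and the two ways of determining the target data can be sketched as a round trip: each of the N rows (first-type data) is serialized into binary (second-type data), and the binary rows can either be kept as the target data or deserialized back into table form. The schema, the sample rows and the `struct`-based encoding are assumptions for illustration and stand in for the protobuf-based configuration files.

```python
import struct

SCHEMA = [("user_id", "int"), ("user_name", "varchar")]  # hypothetical schema


def serialize_row(row):
    """Convert one table row (first type) into bytes (second type)."""
    out = bytearray()
    for name, ftype in SCHEMA:
        if ftype == "int":
            out += struct.pack(">q", row[name])
        else:
            raw = row[name].encode("utf-8")
            out += struct.pack(">I", len(raw)) + raw
    return bytes(out)


def deserialize_row(buf):
    """Restore one row from bytes, using the field names in the schema."""
    row, offset = {}, 0
    for name, ftype in SCHEMA:
        if ftype == "int":
            (row[name],) = struct.unpack_from(">q", buf, offset)
            offset += 8
        else:
            (length,) = struct.unpack_from(">I", buf, offset)
            offset += 4
            row[name] = buf[offset:offset + length].decode("utf-8")
            offset += length
    return row


def scan_table(table):
    """Scan rows 1..N in order and return the second-type data for each row."""
    return [serialize_row(row) for row in table]


table = [{"user_id": 1, "user_name": "ann"}, {"user_id": 2, "user_name": "bob"}]
binary_rows = scan_table(table)                      # option 1: keep as target data
restored = [deserialize_row(b) for b in binary_rows]  # option 2: restore table form
```

Because the same schema drives both directions, the restored rows have the same structure as the scanned table.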
In the embodiment of the application, the electronic device obtains structural information of a data table to be scanned, the structural information includes a field name and a field type of the data table to be scanned, a configuration file corresponding to the field type is obtained from a data scanning library according to the field type of the data table to be scanned, a data scanning tool is generated based on the field name and the configuration file of the data table to be scanned, and target data is obtained from the data table to be scanned through the data scanning tool. By implementing the method, the data scanning tool can be generated through the structural information of the data table to be scanned, and the data scanned by the data scanning tool can be accurately available, so that the efficiency of scanning the data table to obtain the data and the reliability of data acquisition are improved.
Referring to fig. 4, fig. 4 is a schematic flowchart illustrating another data acquisition method according to an embodiment of the present application, where the method can be executed by the above-mentioned electronic device. As shown in fig. 4, the flow of the data obtaining method in the embodiment of the present application may include the following steps:
S401, acquiring structural information of a data table to be scanned; the structure information includes field names and field types of the data table to be scanned.
S402, according to the field type of the data table to be scanned, obtaining a configuration file corresponding to the field type from the data scanning library, and generating a data scanning tool based on the field name and the configuration file of the data table to be scanned.
For specific implementation of steps S401 to S402, reference may be made to the related description of steps S201 to S202 in the foregoing embodiment, and details are not described here again.
S403, acquiring storage information of the data table to be scanned; the storage information comprises the table name of the data table to be scanned and the storage address of the data table to be scanned.
In a possible implementation manner, the acquired storage information of the data table to be scanned may specifically consist of the table name of the data table to be scanned and the storage address of the data table to be scanned. For example, the table name and the storage address may be obtained by a developer by analyzing the data table to be scanned, or from a management module that stores information about the data table. Because the database storing the data table to be scanned contains a large number of data tables, the electronic device can use the table name and the storage address to quickly locate the data table to be scanned in the database and then scan it.
Optionally, the storage address of the data table to be scanned may include an identifier of the storage device where the data table to be scanned is located and a module name of the storage module where the data table to be scanned is located. For example, if the database storing the data table to be scanned is a distributed storage system, because the distributed storage system includes a plurality of storage devices and different storage modules are used in the storage devices to store the data table, the target storage device where the data table to be scanned is located can be quickly found from the plurality of storage devices through the identifier of the storage device, and the target storage module where the data table to be scanned is located can be quickly found from the different storage modules through the module names of the storage modules. Therefore, the identification of the storage device where the data table to be scanned is located and the module name of the storage module where the data table to be scanned is located can be used as the storage address of the data table to be scanned.
Further optionally, the distributed storage system may specifically be a distributed key-value pair storage system (also referred to as a distributed Key-Value database or distributed KV database). The data table to be scanned may be a key-value pair data table (Key-Value Data Table), i.e., the data table to be scanned includes key data (Key) and corresponding value data (Value). When a designated row of data is to be acquired from the data table to be scanned, the corresponding value data can be found through the key data of the designated row, and the designated row of data is obtained from the key data and the corresponding value data. For example, the data table to be scanned is a data table storing user information, and it is assumed that the acquired structure information of the data table to be scanned includes 3 fields with the field names [user ID], [user name] and [user mobile phone number]. The data corresponding to the field [user ID] can then be used as the key, and the data corresponding to the other fields can be used as the value corresponding to that key.
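The user-information example above can be sketched as follows. The identifiers `FIELD_NAMES`, `KEY_FIELD`, `split_row` and `get_row`, and the use of an in-memory dictionary in place of the distributed KV database, are assumptions made purely for illustration.

```python
# Field names taken from the example above; the English identifiers are illustrative.
FIELD_NAMES = ["user_id", "user_name", "user_phone"]
KEY_FIELD = "user_id"

def split_row(row):
    """Split one table row into (key data, value data)."""
    key = row[KEY_FIELD]
    value = {name: row[name] for name in FIELD_NAMES if name != KEY_FIELD}
    return key, value

def get_row(kv_table, key):
    """Look up the value data by key data and rebuild the designated row."""
    value = kv_table[key]
    return {KEY_FIELD: key, **value}

# A dict stands in for the key-value pair data table.
rows = [
    {"user_id": 1001, "user_name": "Alice", "user_phone": "13800000001"},
    {"user_id": 1002, "user_name": "Bob", "user_phone": "13800000002"},
]
kv_table = dict(split_row(r) for r in rows)
```

Calling `get_row(kv_table, 1002)` rebuilds the second row from its key data and value data.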
S404, acquiring target data from the data table to be scanned through a data scanning tool according to the table name of the data table to be scanned and the storage address of the data table to be scanned.
The electronic equipment can search the data table to be scanned according to the table name and the storage address, scan the data table to be scanned according to the row level through a data scanning tool, and acquire target data based on a scanning result.
In one possible embodiment, the data table to be scanned comprises N rows of data, where the N rows of data include the ith row of data, N is a positive integer, i is a positive integer less than or equal to N, and the N rows of data are data of the first type. The electronic device may scan the data table to be scanned at the row level through the data scanning tool and acquire the target data based on the scanning result as follows: the ith row of the data table to be scanned is scanned through the data scanning tool; if the ith row meets a data acquisition condition, the ith row of data is acquired and converted into data of the second type based on the data scanning tool; and the target data is determined according to the second type of data corresponding to the R rows that meet the data acquisition condition, where R is a positive integer less than or equal to N.
Converting the ith data into second type data based on a data scanning tool, namely, performing serialization processing on the ith data; determining the target data according to the second type of data corresponding to the R rows that satisfy the data acquisition condition may be directly taking the second type of data corresponding to the R rows that satisfy the data acquisition condition as the target data, or may be taking data obtained by deserializing the second type of data corresponding to the R rows that satisfy the data acquisition condition as the target data.
Optionally, when the data table to be scanned is a key value pair data table, and N rows of data of the data table to be scanned include N key data, the ith row of data includes ith key data, and i is a positive integer, the data obtaining condition may be that the ith key data is designated key data, so that the ith row of the data table to be scanned is scanned in the data table to be scanned by the data scanning tool, and if the ith row satisfies the data obtaining condition, the obtaining of the ith row of data may specifically be: when an ith row of the data table to be scanned is scanned in the data table to be scanned through a data scanning tool, ith key data corresponding to the ith row is obtained, if the ith key data corresponding to the ith row is designated key data, value data corresponding to the ith key data is obtained, and the ith row of data is obtained according to the ith key data and the corresponding value data. The required target data can be quickly acquired by defining the data acquisition conditions, and the data acquisition efficiency is accelerated.
In a possible embodiment, the specific implementation of the electronic device scanning the data table to be scanned by a data scanning tool at a row level and acquiring the target data based on the scanning result may further be that: scanning the ith row of the data table to be scanned in the data table to be scanned through a data scanning tool, acquiring data meeting data acquisition conditions from the ith row of data, converting the data meeting the data acquisition conditions acquired from the ith row of data into second type of data based on the data scanning tool, and determining target data according to the second type of data corresponding to the data meeting the data acquisition conditions in the N rows of data.
Converting data meeting the data acquisition condition in the ith row of data into second type of data based on a data scanning tool, namely performing serialization processing on the data meeting the data acquisition condition in the ith row of data; determining the target data according to the second type of data corresponding to the data satisfying the data acquisition condition in the N rows of data may directly use the second type of data corresponding to the data satisfying the data acquisition condition in the N rows of data as the target data, or use the data after deserializing the second type of data corresponding to the data satisfying the data acquisition condition in the N rows of data as the target data.
Optionally, the data obtaining condition may be to obtain data corresponding to a specified field in the ith row of data, where the specified field may be one or more fields in the data table to be scanned, the scanning, by using a data scanning tool, of the ith row of the data table to be scanned, and the obtaining, from the ith row of data, data that meets the data obtaining condition specifically may be: scanning the ith row of the data table to be scanned in the data table to be scanned through a data scanning tool to obtain ith row of data, identifying the structure of the ith row of data through the data scanning tool, and extracting data corresponding to the designated field in the ith row of data. It can be understood that, when the data table to be scanned is a key-value pair data table, the obtaining of the ith row of data may specifically be obtaining value data corresponding to the ith key data according to the ith key data in the ith row of data, and obtaining the ith row of data according to the ith key data and the corresponding value data.
Optionally, the data obtaining condition may also be that the data corresponding to the designated field in the ith row of data is obtained when the ith key data corresponding to the ith row of data is the designated key data. That is, the specific way of acquiring the target data from the data table to be scanned by the data scanning tool may be: when the ith row of the data table to be scanned is scanned through the data scanning tool, if the ith key data corresponding to the ith row is the designated key data, the ith row of data is acquired; the data corresponding to the designated field is extracted from the ith row of data through the data scanning tool and converted into data of the second type based on the data scanning tool; and the target data is determined according to the second type of data corresponding to the data extracted from the R rows of data that meet the data acquisition condition, where R is a positive integer less than or equal to N.
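The row-level scanning variants described above (filtering by designated key data, extracting designated fields, or both) can be sketched together as follows. JSON is used here as a stand-in for the second type of data, and the names `scan_table`, `restore`, `designated_keys` and `designated_fields` are hypothetical; the application does not prescribe a serialization format or an interface.

```python
import json

def scan_table(kv_table, designated_keys=None, designated_fields=None,
               key_field="user_id"):
    """Scan a key-value table row by row and serialize the rows that qualify.

    designated_keys:   if given, only rows whose key data is in this set are kept.
    designated_fields: if given, only these fields are extracted from each row.
    """
    serialized = []
    for key in sorted(kv_table):                      # one key per row
        if designated_keys is not None and key not in designated_keys:
            continue                                  # row fails the condition
        row = {key_field: key, **kv_table[key]}       # key data + value data
        if designated_fields is not None:
            row = {name: row[name] for name in designated_fields}
        # Convert first-type data (a row) into second-type data (bytes).
        serialized.append(json.dumps(row, sort_keys=True).encode("utf-8"))
    return serialized

def restore(serialized):
    """Deserialize the second-type data back into rows."""
    return [json.loads(item.decode("utf-8")) for item in serialized]
```

With N rows in the table, the length of the returned list corresponds to R, the number of rows that satisfied the data acquisition condition.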
Optionally, in a possible implementation manner, the data table to be scanned is stored in a storage server (kv server) in the distributed key-value pair storage system, the distributed key-value pair storage system further includes a storage client (kv client), the storage client is configured with a data acquisition interface for acquiring data from the storage server, the storage server can provide data query service, and the electronic device can request the data query service provided by the storage server to query and acquire data by calling the data acquisition interface in the storage client. The specific way for the electronic device to acquire the target data from the to-be-scanned data table through the data scanning tool may be: the method comprises the steps of generating a data acquisition instruction aiming at a data table to be scanned through a data scanning tool, sending the data acquisition instruction to a storage client, receiving data in the data table to be scanned sent by the storage client, and determining target data according to the data in the data table to be scanned sent by the storage client.
The data acquisition instruction can be used for instructing the storage client to acquire the data in the data table to be scanned at the storage server according to the data acquisition interface corresponding to the data acquisition instruction. That is, the data acquisition instruction can be used to instruct the storage client to search for the corresponding data acquisition interface according to the data acquisition instruction, and call the data acquisition interface to acquire the data in the data table to be scanned at the storage server.
In addition, when the electronic device generates the data scanning tool, a protocol file for the storage client is also encapsulated in the data scanning tool, and the protocol file enables the data scanning tool to generate a data acquisition instruction that can be parsed and executed by the storage client. The data acquisition instruction carries the table name and the storage address of the data table to be scanned; after receiving the data acquisition instruction, the storage client obtains the corresponding data acquisition interface according to the table name and the storage address of the data table to be scanned, and calls the data acquisition interface to request the data query service from the storage server. Through the data query service, the storage client can query the value data according to the key data, and each time it has acquired the key data and value data of one row of the data table to be scanned, it returns them to the electronic device. Optionally, the storage client may request the data query service from the storage server by calling the data acquisition interface in the manner of a Remote Procedure Call (RPC), where a remote procedure call may be understood as a manner in which one node requests a service provided by another node; that is, the storage client requests the data query service provided by the storage server.
Based on this, if the electronic device receives a data acquisition condition for the data table to be scanned, and the data acquisition condition is that the ith key data included in the ith row of data is the designated key data, the data acquisition instruction generated by the data scanning tool may carry that data acquisition condition. When the ith row of data is acquired through the data acquisition interface of the storage client, the value data is queried according to the key data and the acquired ith row of data is returned only when the ith key data included in the ith row of data is the designated key data; the ith row of data includes the ith key data and the corresponding value data. When the electronic device receives, through the data scanning tool, the ith row of data sent by the storage client, the ith row of data can be serialized based on the data scanning tool, and the target data can be determined based on the serialized data corresponding to the R rows of data sent by the storage client. The R rows of data may be part or all of the data in the data table to be scanned, and R is a positive integer less than or equal to N.
Optionally, the data acquisition instruction generated by the data scanning tool may not carry the data acquisition condition that the ith key data included in the ith row of data is the designated key data. In that case, when the ith row of data returned by the storage client is received, the electronic device identifies, through the data scanning tool, whether the ith key data included in the received ith row of data is the designated key data; if it is, the ith row of data is serialized, and if it is not, the ith row of data is discarded.
Optionally, if the data obtaining condition is to obtain data corresponding to the specified field in the ith row of data, the data obtaining instruction generated by the data scanning tool may carry the data obtaining condition to obtain the data corresponding to the specified field in the ith row of data. When the ith row of data is acquired through a data acquisition interface of the storage client, inquiring corresponding value data according to the ith key data included in the ith row, acquiring data corresponding to the designated field from the ith row of data, and returning the data corresponding to the designated field acquired from the ith row of data. Or the data acquisition instruction generated by the data scanning tool may not carry a data acquisition condition for acquiring data corresponding to the designated field in the ith row of data, when the ith row of data returned by the storage client is received, the electronic device identifies the data corresponding to the designated field in the received ith row of data by the data scanning tool, performs serialization processing on the data corresponding to the designated field in the ith row of data by using the data scanning tool, and determines target data based on the serialized data corresponding to the designated field in the R row of data sent by the storage client.
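The two placements of the data acquisition condition described above, carried in the data acquisition instruction or applied by the data scanning tool after the rows are returned, can be contrasted in a short sketch. The classes `ScanInstruction` and `InMemoryStorageClient`, the function `scan`, and the `push_down` flag are hypothetical names introduced for illustration; a real storage client would issue remote procedure calls to the storage server rather than read a local dictionary.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ScanInstruction:
    """Hypothetical data acquisition instruction sent to the storage client."""
    table_name: str
    storage_address: str
    designated_keys: Optional[frozenset] = None   # None: no condition carried

class InMemoryStorageClient:
    """Stand-in for the kv client; tables are keyed by (address, table name)."""
    def __init__(self, tables):
        self._tables = tables

    def execute(self, instruction):
        table = self._tables[(instruction.storage_address, instruction.table_name)]
        for key, value in table.items():
            carried = instruction.designated_keys
            if carried is None or key in carried:
                yield key, value                  # one row: key data + value data

def scan(client, table_name, storage_address, designated_keys, push_down):
    """Return the rows whose key data is designated.

    push_down=True : the condition travels inside the instruction.
    push_down=False: every row is returned and the tool filters locally.
    """
    wanted = frozenset(designated_keys)
    instruction = ScanInstruction(table_name, storage_address,
                                  wanted if push_down else None)
    rows = []
    for key, value in client.execute(instruction):
        if key in wanted:          # redundant when pushed down, the filter otherwise
            rows.append((key, value))
    return rows
```

Both modes yield the same target data; carrying the condition in the instruction simply reduces how many rows travel back to the electronic device.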
In a possible embodiment, while the data scanning tool is scanning the data table to be scanned to acquire the target data, if a data scanning end instruction for the data scanning tool is received, the electronic device stops the scanning operation on the data table to be scanned. The data scanning end instruction may be generated by triggering a control; for example, the electronic device provides a button, and the data scanning end instruction is generated when it is detected that the button is triggered. The data scanning end instruction may also be generated by a command; for example, it is generated when the command "kill -9 process name" is input, where the process name is the name of the process corresponding to the scanning operation on the data table to be scanned.
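One way to honor a data scanning end instruction inside the scan loop is a shared flag that is checked before each row, sketched below with `threading.Event`. The function `scan_rows` and its parameters are hypothetical. This sketch covers only the cooperative case, such as the button; a process ended with `kill -9` receives SIGKILL, which cannot be caught, so no such check runs in that case.

```python
import threading

def scan_rows(rows, stop_event, on_row):
    """Scan rows in order until they run out or stop_event is set.

    Returns the number of rows actually scanned.
    """
    scanned = 0
    for row in rows:
        if stop_event.is_set():      # end instruction received: stop scanning
            break
        on_row(row)
        scanned += 1
    return scanned
```

A button handler, or any other thread, would call `stop_event.set()` to deliver the end instruction.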
For example, referring to fig. 5, fig. 5 is a schematic view of an application scenario for acquiring data provided by the present application, and the process includes: s1, a developer analyzes a data table to be scanned in a database to obtain structural information and storage information of the data table to be scanned; s2, the electronic equipment receives the structural information, the storage information and the data acquisition conditions of the data table to be scanned, which are sent by a developer, and generates a data scanning tool; s3, the electronic equipment runs a data scanning tool and scans a data table to be scanned to obtain target data (also called a result set) meeting data acquisition conditions; and S4, the electronic equipment performs data processing on the target data, such as data migration or data combination.
In the embodiment of the application, the electronic device acquires the structural information of the data table to be scanned, where the structural information includes the field name and the field type of the data table to be scanned; acquires the configuration file corresponding to the field type from the data scanning library according to the field type of the data table to be scanned; generates a data scanning tool based on the field name of the data table to be scanned and the configuration file; acquires the storage information of the data table to be scanned, where the storage information includes the table name and the storage address of the data table to be scanned; and acquires the target data from the data table to be scanned through the data scanning tool according to the table name and the storage address. By implementing the method, the data scanning tool can be generated from the structural information of the data table to be scanned, the data obtained by scanning with the data scanning tool is accurate and usable, and the required target data can be obtained according to the data acquisition condition. This helps improve the efficiency of acquiring data from the data table, the reliability of data acquisition, and the efficiency of data processing. For example, in a data aggregation scenario, the data to be aggregated can be accurately acquired, aggregation can be performed directly on the serialized data, and the aggregated data can be quickly and stably restored to table-structured data by using the deserialization function of the data scanning tool.
Referring to fig. 6, fig. 6 is a flowchart illustrating a data obtaining method according to an embodiment of the present application, where the method can be executed by the electronic device mentioned above and can be combined with the above embodiment to obtain data from a data table through a data scanning tool. In this embodiment of the application, the data table to be scanned may include at least two data sub-tables to be scanned. As shown in fig. 6, the flow of the data obtaining method in the embodiment of the present application may include the following steps:
S601, acquiring the quantity of the data sub-tables to be scanned and the quantity of the designated threads.
In a possible embodiment, the data table to be scanned includes at least two data sub-tables to be scanned; that is, when the amount of data in the data table to be scanned is too large, the data is stored across at least two data sub-tables to be scanned, and these sub-tables have the same structure. Scanning the data table to be scanned can therefore be understood as scanning the at least two data sub-tables to be scanned. Each of the at least two data sub-tables to be scanned corresponds to a unique table ID, and when the electronic device has acquired the table IDs of the at least two data sub-tables to be scanned, it can use the data scanning tool to scan the sub-tables sequentially in ascending order of table ID to acquire data.
Optionally, to further improve data acquisition efficiency, when the electronic device has obtained the number of data sub-tables to be scanned, the table IDs of the data sub-tables to be scanned, and the number of designated threads, it starts multiple threads to scan multiple data sub-tables to be scanned concurrently. The number of designated threads may be set by a developer according to experience, or the developer may analyze the database to obtain the concurrency limit of a key and set the number of designated threads based on that limit. The concurrency limit of a key refers to the number of concurrent calls allowed to the data query interface corresponding to the key. Within one thread, scanning a data sub-table to be scanned means obtaining the corresponding value for the key of each row in turn; specifically, the value may be obtained by calling the data query interface corresponding to the key of each row, and one data query interface may correspond to one or more keys. Therefore, when multiple data sub-tables to be scanned are scanned in multiple threads, one data query interface may be called concurrently to query the values corresponding to keys, so the number of designated threads is determined according to the concurrency limit of the key, which makes the number of created threads more reasonable. For example, if a thread queries the value corresponding to a key once per second, that is, calls the data query interface once per second, and the concurrency limit of the key is 5 calls/key/second, the number of designated threads may be 5.
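The arithmetic in the example above can be written out as a small helper. The function name and parameters are hypothetical, and two rules in it are assumptions of this sketch rather than statements of the application: the limit is divided by the per-thread call rate, and the result is capped at the number of sub-tables so that no thread is left without work.

```python
def designated_thread_count(limit_calls_per_second, calls_per_thread_per_second,
                            sub_table_count):
    """Largest thread count that stays within the key's concurrency limit."""
    if calls_per_thread_per_second <= 0:
        raise ValueError("per-thread call rate must be positive")
    by_limit = limit_calls_per_second // calls_per_thread_per_second
    # At least one thread, and never more threads than sub-tables to scan.
    return max(1, min(by_limit, sub_table_count))
```

With a limit of 5 calls/key/second, one call per thread per second and 10 sub-tables, the helper returns 5, matching the example in the text.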
S602, creating threads with specified thread quantity, and distributing at least one data sub-table to be scanned for each thread in the threads with the specified thread quantity.
In a possible implementation manner, allocating at least one data sub-table to be scanned to each of the designated number of threads may specifically be allocating at least one data sub-table to be scanned to each thread in sequence according to the table ID of each data sub-table to be scanned. For example, if there are 10 data sub-tables to be scanned with table IDs from 01 to 10 and the number of designated threads is 3, the data sub-tables to be scanned with table IDs 01 to 03 can be allocated to the first thread, those with table IDs 04 to 07 to the second thread, and those with table IDs 08 to 10 to the third thread.
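A contiguous allocation by table ID can be sketched as below. The application does not state the rule that produces the 3/4/3 split in its example; the rounded-boundary rule used here is an assumption chosen because it reproduces that split, and the function name `allocate_sub_tables` is hypothetical.

```python
def allocate_sub_tables(table_ids, thread_count):
    """Split sorted table IDs into thread_count contiguous, non-empty groups."""
    ordered = sorted(table_ids)
    n = len(ordered)
    if not 1 <= thread_count <= n:
        raise ValueError("need between 1 and len(table_ids) threads")
    # Boundary i is i * n / thread_count rounded to the nearest integer,
    # computed with integer arithmetic only.
    bounds = [(i * n + thread_count // 2) // thread_count
              for i in range(thread_count + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(thread_count)]
```

For the table IDs "01" to "10" and 3 threads, the groups are 01 to 03, 04 to 07 and 08 to 10, as in the example above.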
And S603, respectively acquiring target data from at least one sub-table of data to be scanned which is correspondingly distributed in each thread of the threads with the specified thread quantity by using a data scanning tool.
In a possible implementation manner, when each thread obtains data by scanning the data sub-tables to be scanned that are allocated to it, the data obtained by the multiple threads needs to be summarized; specifically, the data is summarized in ascending order of the table IDs of the data sub-tables to be scanned allocated to each thread.
For example, referring to fig. 7, fig. 7 is a schematic view of an application scenario of scanning data tables in parallel provided by the present application. The table IDs of the data sub-tables to be scanned allocated to the first thread are 01-03, those allocated to the second thread are 04-07, and those allocated to the third thread are 08-10. The electronic device may therefore use the data scanning tool to scan the data sub-tables to be scanned with table IDs 01-03 in the first thread, those with table IDs 04-07 in the second thread, and those with table IDs 08-10 in the third thread. If the data obtained by scanning in the first thread is data 1, the data obtained in the second thread is data 2, and the data obtained in the third thread is data 3, the order in which the data is summarized is data 1, data 2, data 3.
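The concurrent scan and ordered summarization can be sketched with a thread pool from the Python standard library. The function names and the dictionary that stands in for the sub-tables are hypothetical. `ThreadPoolExecutor.map` returns results in the order of its inputs regardless of which thread finishes first, which gives the data 1, data 2, data 3 ordering as long as the assignments are listed in ascending table-ID order.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_sub_table(table_id, sub_tables):
    """Stand-in for scanning one sub-table with the data scanning tool."""
    return [f"{table_id}:{key}" for key in sorted(sub_tables[table_id])]

def scan_assigned(table_ids, sub_tables):
    """Work done inside one thread: scan its sub-tables in table-ID order."""
    result = []
    for table_id in table_ids:
        result.extend(scan_sub_table(table_id, sub_tables))
    return result

def parallel_scan(assignments, sub_tables):
    """Scan each assignment in its own thread and summarize in input order."""
    with ThreadPoolExecutor(max_workers=len(assignments)) as pool:
        per_thread = list(
            pool.map(lambda ids: scan_assigned(ids, sub_tables), assignments)
        )
    return [item for chunk in per_thread for item in chunk]
```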
In one possible embodiment, to keep the load of the electronic device reasonable while the data sub-tables to be scanned are being scanned, load information of the electronic device may be obtained in real time, where the load information may include the usage rate of a central processing unit (CPU). At least one thread is suspended when the CPU usage rate is greater than or equal to a threshold, where the threshold can be set by a developer according to an empirical value. Based on this, the electronic device may acquire the target data in each of the designated number of threads as follows: in each non-suspended thread, first data is acquired by the data scanning tool from the data sub-tables to be scanned allocated to that thread; after the first data has been acquired, the suspended thread is restarted, and in the restarted thread, second data is acquired by the data scanning tool from the data sub-tables to be scanned allocated to the restarted thread. It is understood that the target data includes the first data and the second data.
Optionally, the load information may further include the occupancy rate of the hard disk used to store the data. When the occupancy rate of the hard disk is greater than or equal to a maximum occupancy rate, at least one thread may be suspended and the data in the hard disk may be processed; for example, data in the hard disk that does not belong to the data sub-tables to be scanned may be deleted or transferred to another location, after which the at least one thread is restarted. The maximum occupancy rate can be set by a developer according to an empirical value.
Further optionally, the load information may include both the CPU usage rate and the hard disk occupancy rate, and when the CPU usage rate is greater than or equal to the threshold or the hard disk occupancy rate is greater than or equal to the maximum occupancy rate, suspend at least one thread and generate an alarm prompt.
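The suspension rule above can be sketched as a predicate plus a gate that scanning threads wait on between rows. The names `should_suspend`, `PauseGate` and `apply_load_policy`, as well as the default thresholds of 0.9 and 0.8, are hypothetical; the application leaves the threshold values to the developer, and the load readings are passed in as plain numbers here rather than measured.

```python
import threading

def should_suspend(cpu_usage, disk_occupancy, cpu_threshold, max_disk_occupancy):
    """True when either load indicator reaches its limit."""
    return cpu_usage >= cpu_threshold or disk_occupancy >= max_disk_occupancy

class PauseGate:
    """Gate a scanning thread checks between rows; blocks while suspended."""
    def __init__(self):
        self._running = threading.Event()
        self._running.set()                   # threads start in the running state

    def suspend(self):
        self._running.clear()

    def resume(self):
        self._running.set()

    def wait_until_running(self, timeout=None):
        """Block until resumed; returns False if the timeout expires first."""
        return self._running.wait(timeout)

def apply_load_policy(gate, cpu_usage, disk_occupancy,
                      cpu_threshold=0.9, max_disk_occupancy=0.8):
    """Suspend or resume the gated thread according to the current load."""
    if should_suspend(cpu_usage, disk_occupancy, cpu_threshold, max_disk_occupancy):
        gate.suspend()
        return "suspended"                    # an alarm prompt could be raised here
    gate.resume()
    return "running"
```

A scanning thread would call `gate.wait_until_running()` before fetching each row, so a suspended thread simply stops making progress until the load drops and the gate is resumed.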
The load information of the electronic device may be acquired at any time during the scanning process of the sub-table of data to be scanned by the electronic device, and the specific manner of acquiring the load information of the electronic device may be to acquire the load information when the electronic device detects a load information acquisition operation, or to automatically detect the load of the electronic device itself, and when the load of the electronic device exceeds a reasonable range, acquire the load information and generate an alarm prompt.
In the embodiment of the application, if the data table to be scanned comprises at least two data sub-tables to be scanned, the electronic device obtains the number of the data sub-tables to be scanned and the number of the designated threads, creates the threads with the designated thread number, allocates at least one data sub-table to be scanned to each thread in the threads with the designated thread number, and obtains target data from the at least one data sub-table to be scanned, which is correspondingly allocated, by using a data scanning tool in each thread of the threads with the designated thread number. By implementing the method, the multiple data sub-tables to be scanned can be scanned based on multiple threads concurrently, the data acquisition efficiency is improved, and reasonable load of the electronic equipment can be kept in the process of scanning the data sub-tables to be scanned, so that the data acquisition process can be carried out stably.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a data acquisition apparatus provided in the present application. It should be noted that the data acquisition apparatus shown in fig. 8 is used for executing the methods of the embodiments shown in fig. 2, fig. 4 and fig. 6 of the present application. For convenience of description, only the portions related to the embodiment of the present application are shown; for specific technical details that are not disclosed here, refer to the embodiments shown in fig. 2, fig. 4 and fig. 6 of the present application. The data acquisition device 800 may include: an obtaining module 801 and a processing module 802. Wherein:
an obtaining module 801, configured to obtain structure information of a data table to be scanned; the structure information comprises a field name and a field type of the data table to be scanned;
a processing module 802, configured to obtain a configuration file corresponding to a field type from a data scanning library according to the field type of the to-be-scanned data table, and generate a data scanning tool based on the field name of the to-be-scanned data table and the configuration file;
the obtaining module 801 is further configured to obtain target data from the data table to be scanned through the data scanning tool.
In a possible implementation, the obtaining module 801 is further configured to:
acquiring storage information of the data table to be scanned; the storage information comprises a table name of the data table to be scanned and a storage address of the data table to be scanned;
when obtaining target data from the data table to be scanned through the data scanning tool, the obtaining module 801 is specifically configured to:
acquiring the target data from the data table to be scanned through the data scanning tool according to the table name of the data table to be scanned and the storage address of the data table to be scanned.
In one possible embodiment, the data table to be scanned comprises N rows of data, where the N rows of data include the ith row of data, N is a positive integer, and i is a positive integer less than or equal to N; the N rows of data are data of a first type;
the obtaining module 801, when configured to obtain target data from the data table to be scanned through the data scanning tool, is further configured to:
scanning the ith row of the data table to be scanned through the data scanning tool and, if the ith row meets a data acquisition condition, acquiring the ith row of data and converting the ith row of data into data of a second type based on the data scanning tool;
determining the target data according to the data of the second type corresponding to the R rows meeting the data acquisition condition; R is a positive integer less than or equal to N.
In one possible embodiment, the N rows of data include N key data, and the ith row of data includes ith key data; the data acquisition condition includes that the ith key data is designated key data;
when scanning the ith row of the data table to be scanned through the data scanning tool and acquiring the ith row of data if the ith row meets the data acquisition condition, the obtaining module 801 is specifically configured to:
acquiring the ith row of data if, when the ith row of the data table to be scanned is scanned through the data scanning tool, the ith key data corresponding to the ith row is the designated key data.
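The row scanning described above can be illustrated with a short Python sketch. It rests on several assumptions that are not stated in this application: each row is represented as a (key, raw value) pair, the first type is JSON text, the second type is a Python dictionary, and the function name `scan_rows` is hypothetical.

```python
import json


def scan_rows(rows, designated_keys, convert=json.loads):
    """Return converted data for the rows whose key is a designated key.

    rows: iterable of (key, raw) pairs, where raw is data of the first type
          (assumed here to be JSON text).
    convert: turns first-type data into second-type data (here, a dict).
    """
    target = []
    for key, raw in rows:
        # Data acquisition condition: the row's key data is designated key data.
        if key in designated_keys:
            target.append(convert(raw))
    return target


rows = [("u1", '{"age": 30}'), ("u2", '{"age": 41}'), ("u3", '{"age": 25}')]
selected = scan_rows(rows, {"u1", "u3"})
```

In this example N is 3 and R is 2, since two of the three rows meet the data acquisition condition.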
In a possible embodiment, the data table to be scanned is stored in a storage server of a storage system, and the storage system further includes a storage client configured with a data acquisition interface for acquiring data from the storage server;
when obtaining target data from the data table to be scanned through the data scanning tool, the obtaining module 801 is specifically configured to:
generating a data acquisition instruction for the data table to be scanned through the data scanning tool, and sending the data acquisition instruction to the storage client, wherein the data acquisition instruction is used for instructing the storage client to acquire data in the data table to be scanned at the storage server according to a data acquisition interface corresponding to the data acquisition instruction;
receiving data in the data table to be scanned, which is sent by the storage client;
and determining the target data according to the data in the data table to be scanned, which is sent by the storage client.
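The interaction with the storage client described above might look like the following Python sketch. Everything named here is hypothetical: the `AcquireInstruction` structure, the `InMemoryStorageClient` class (which keeps the "server" tables in a dictionary instead of contacting a real storage server), the `"scan"` interface name, and the `acquire` function are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcquireInstruction:
    interface: str  # which data acquisition interface the client should use
    table: str      # name of the data table to be scanned


class InMemoryStorageClient:
    """Stand-in storage client; the storage server is simulated by a dict."""

    def __init__(self, server_tables):
        self._tables = server_tables
        # Map interface names to the client's data acquisition interfaces.
        self._interfaces = {"scan": self._scan}

    def _scan(self, table):
        return list(self._tables[table])

    def execute(self, instruction):
        # Pick the interface that corresponds to the instruction.
        handler = self._interfaces[instruction.interface]
        return handler(instruction.table)


def acquire(client, table, keep=lambda row: True):
    """Send an instruction, receive the table data, and determine the target data."""
    instruction = AcquireInstruction(interface="scan", table=table)
    rows = client.execute(instruction)
    return [row for row in rows if keep(row)]
```

The `keep` predicate stands in for whatever rule determines the target data from the data returned by the storage client.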
In one possible embodiment, the data table to be scanned comprises at least two data sub-tables to be scanned; the obtaining module 801 is further configured to:
acquiring the number of data sub-tables to be scanned and the specified number of threads;
creating the specified number of threads, and allocating at least one data sub-table to be scanned to each of the threads;
when obtaining target data from the data table to be scanned through the data scanning tool, the obtaining module 801 is specifically configured to:
in each of the specified number of threads, acquiring the target data, by using the data scanning tool, from the at least one data sub-table to be scanned allocated to that thread.
In one possible embodiment, the obtaining module 801 is further configured to:
acquiring load information; the load information comprises a central processing unit (CPU) utilization rate;
if the CPU utilization rate is greater than or equal to a threshold value, suspending at least one thread;
when acquiring, in each of the specified number of threads, the target data from the at least one allocated data sub-table to be scanned by using the data scanning tool, the obtaining module 801 is specifically configured to:
in each thread that is not suspended, acquiring first data, by using the data scanning tool, from the data sub-table to be scanned allocated to that thread;
after the first data is acquired, restarting the suspended thread and, in the restarted thread, acquiring second data, by using the data scanning tool, from the data sub-table to be scanned allocated to the restarted thread; the target data includes the first data and the second data.
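The load-based suspension described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation of this application: the CPU utilization rate is supplied by an injected `cpu_usage` callable, the thread chosen for suspension is simply the last one, suspension is modeled with a `threading.Event` gate, and the function name `scan_with_load_control` is hypothetical.

```python
import threading


def scan_with_load_control(assignments, scan, cpu_usage, threshold):
    """assignments: one list of sub-tables per thread; scan: the scanning tool stand-in."""
    count = len(assignments)
    results = [[] for _ in range(count)]
    gates = [threading.Event() for _ in range(count)]

    # Suspend one thread (arbitrarily, the last) when the CPU is at or above the threshold.
    suspended = {count - 1} if count and cpu_usage() >= threshold else set()
    for index, gate in enumerate(gates):
        if index not in suspended:
            gate.set()

    def worker(index):
        gates[index].wait()  # a suspended thread blocks here until restarted
        for sub_table in assignments[index]:
            results[index].extend(scan(sub_table))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for thread in threads:
        thread.start()

    # First data: produced by the threads that were not suspended.
    for index, thread in enumerate(threads):
        if index not in suspended:
            thread.join()
    first = [row for i in range(count) if i not in suspended for row in results[i]]

    # Restart the suspended threads and collect the second data.
    for index in suspended:
        gates[index].set()
    for index in suspended:
        threads[index].join()
    second = [row for i in sorted(suspended) for row in results[i]]

    return first + second  # the target data includes the first and second data
```

Passing the load reading as a callable keeps the sketch deterministic; a real deployment would read the utilization rate from the operating system.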
In this embodiment of the application, the electronic device obtains structure information of a data table to be scanned, the structure information including a field name and a field type of the data table to be scanned; obtains, from a data scanning library, a configuration file corresponding to the field type; generates a data scanning tool based on the field name of the data table to be scanned and the configuration file; and obtains target data from the data table to be scanned through the data scanning tool. With this apparatus, the data scanning tool is generated from the structure information of the data table to be scanned, so the data it scans is accurate and usable, which improves both the efficiency of scanning the data table to obtain data and the reliability of data acquisition.
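To make the generation step more concrete, the following Python sketch builds a scanning function from structure information. The representation is assumed, not taken from this application: the "data scanning library" is a dictionary named `SCAN_LIBRARY` that maps each field type to a configuration entry holding a decoder, the structure information is a list of (field name, field type) pairs, and `build_scanner` is a hypothetical name.

```python
# Hypothetical data scanning library: one configuration entry per field type.
SCAN_LIBRARY = {
    "int": {"decode": int},
    "float": {"decode": float},
    "string": {"decode": str},
}


def build_scanner(structure):
    """Generate a scanning function from (field_name, field_type) pairs."""
    decoders = {}
    for field_name, field_type in structure:
        config = SCAN_LIBRARY.get(field_type)
        if config is None:
            raise ValueError(f"no configuration for field type {field_type!r}")
        decoders[field_name] = config["decode"]

    def scan(rows):
        # Keep only the declared fields and decode each one by its field type.
        return [
            {name: decode(row[name]) for name, decode in decoders.items()}
            for row in rows
        ]

    return scan


scanner = build_scanner([("id", "int"), ("name", "string"), ("score", "float")])
decoded = scanner([{"id": "7", "name": "a", "score": "1.5", "extra": "x"}])
```

Because the scanner is derived from the table's own field names and types, fields that are not part of the structure information (such as `extra` here) are ignored.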
The functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules are integrated into one module. The integrated module may be implemented in a form of hardware, or may be implemented in a form of software functional module, which is not limited in this application.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 9, the electronic device 900 includes at least one processor 901 and a memory 902. Optionally, the electronic device may further comprise a network interface 903. The processor 901, the memory 902 and the network interface 903 may exchange data with each other. The network interface 903 is controlled by the processor 901 to send and receive messages; the memory 902 is used to store a computer program comprising program instructions; and the processor 901 is used to execute the program instructions stored in the memory 902. The processor 901 is configured to call the program instructions to execute the above method.
The memory 902 may include volatile memory (volatile memory), such as random-access memory (RAM); the memory 902 may also include a non-volatile memory (non-volatile memory), such as a flash memory (flash memory), a solid-state drive (SSD), etc.; the memory 902 may also comprise a combination of memories of the kind described above.
The processor 901 may be a Central Processing Unit (CPU). In one embodiment, the processor 901 may also be a Graphics Processing Unit (GPU). The processor 901 may also be a combination of a CPU and a GPU.
In one possible implementation, the memory 902 is used to store program instructions. The processor 901 may call the program instructions to perform the following steps:
acquiring structure information of a data table to be scanned; the structure information comprises a field name and a field type of the data table to be scanned;
acquiring a configuration file corresponding to the field type from a data scanning library according to the field type of the data table to be scanned, and generating a data scanning tool based on the field name of the data table to be scanned and the configuration file;
and acquiring target data from the data table to be scanned through the data scanning tool.
In one possible implementation, the processor 901 is further configured to:
acquiring storage information of the data table to be scanned; the storage information comprises a table name of the data table to be scanned and a storage address of the data table to be scanned;
when the processor 901 is configured to obtain the target data from the to-be-scanned data table through the data scanning tool, specifically, the processor is configured to:
and acquiring the target data from the data table to be scanned through the data scanning tool according to the table name of the data table to be scanned and the storage address of the data table to be scanned.
In one possible embodiment, the data table to be scanned comprises N rows of data, where the N rows of data include the ith row of data, N is a positive integer, and i is a positive integer less than or equal to N; the N rows of data are data of a first type; when the processor 901 is configured to obtain the target data from the to-be-scanned data table through the data scanning tool, specifically, the processor is configured to:
scanning the ith row of the data table to be scanned through the data scanning tool and, if the ith row meets a data acquisition condition, acquiring the ith row of data and converting the ith row of data into data of a second type based on the data scanning tool;
determining the target data according to the data of the second type corresponding to the R rows meeting the data acquisition condition; R is a positive integer less than or equal to N.
In one possible embodiment, the N rows of data include N key data, and the ith row of data includes ith key data; the data acquisition condition includes that the ith key data is designated key data; when scanning the ith row of the data table to be scanned through the data scanning tool and acquiring the ith row of data if the ith row meets the data acquisition condition, the processor 901 is specifically configured to:
acquiring the ith row of data if, when the ith row of the data table to be scanned is scanned through the data scanning tool, the ith key data corresponding to the ith row is the designated key data.
In a possible embodiment, the data table to be scanned is stored in a storage server of a storage system, and the storage system further includes a storage client configured with a data acquisition interface for acquiring data from the storage server; when the processor 901 is configured to obtain the target data from the to-be-scanned data table through the data scanning tool, specifically, the processor is configured to:
generating a data acquisition instruction for the data table to be scanned through the data scanning tool, and sending the data acquisition instruction to the storage client, wherein the data acquisition instruction is used for instructing the storage client to acquire data in the data table to be scanned at the storage server according to a data acquisition interface corresponding to the data acquisition instruction;
receiving data in the data table to be scanned, which is sent by the storage client;
and determining the target data according to the data in the data table to be scanned, which is sent by the storage client.
In one possible embodiment, the data table to be scanned comprises at least two data sub-tables to be scanned; the processor 901 is further configured to:
acquiring the number of data sub-tables to be scanned and the specified number of threads;
creating the specified number of threads, and allocating at least one data sub-table to be scanned to each of the threads;
when the processor 901 is configured to obtain the target data from the data table to be scanned through the data scanning tool, specifically, the processor is configured to:
in each of the specified number of threads, acquiring the target data, by using the data scanning tool, from the at least one data sub-table to be scanned allocated to that thread.
In one possible implementation, the processor 901 is further configured to:
acquiring load information; the load information comprises a central processing unit (CPU) utilization rate;
if the CPU utilization rate is greater than or equal to a threshold value, suspending at least one thread;
when acquiring, in each of the specified number of threads, the target data from the at least one allocated data sub-table to be scanned by using the data scanning tool, the processor 901 is specifically configured to:
in each thread that is not suspended, acquiring first data, by using the data scanning tool, from the data sub-table to be scanned allocated to that thread;
after the first data is acquired, restarting the suspended thread and, in the restarted thread, acquiring second data, by using the data scanning tool, from the data sub-table to be scanned allocated to the restarted thread; the target data includes the first data and the second data.
In a specific implementation, the data acquisition apparatus 800, the processor 901, the memory 902, and the like described in this embodiment of the application may carry out the implementations described in the above method embodiments as well as those described in this embodiment, which are not repeated here.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, cause the processor to perform some or all of the steps performed in the above method embodiments. Optionally, the computer-readable storage medium may be volatile or non-volatile. The computer-readable storage medium may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like; the data storage area may store data created according to the use of the blockchain node, and the like.
A computer program product or computer program is also provided in an embodiment of the present application, and includes program instructions that, when executed by a processor, implement some or all of the steps of the above-described method. Alternatively, the program instructions may be stored in a computer-readable storage medium, and a processor of the computer device may read the program instructions from the computer-readable storage medium, and execute the program instructions to cause the computer device to perform some or all of the steps of the method.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program. The computer program may be stored in a computer storage medium, which may be a computer-readable storage medium, and, when executed, the computer program may carry out the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
While the present disclosure has been described with reference to particular embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present disclosure.

Claims (10)

1. A method for data acquisition, the method comprising:
acquiring structure information of a data table to be scanned; the structure information comprises a field name and a field type of the data table to be scanned;
acquiring a configuration file corresponding to the field type from a data scanning library according to the field type of the data table to be scanned, and generating a data scanning tool based on the field name of the data table to be scanned and the configuration file;
and acquiring target data from the data table to be scanned through the data scanning tool.
2. The method of claim 1, further comprising:
acquiring storage information of the data table to be scanned; the storage information comprises a table name of the data table to be scanned and a storage address of the data table to be scanned;
the obtaining of the target data from the data table to be scanned by the data scanning tool includes:
and acquiring the target data from the data table to be scanned through the data scanning tool according to the table name of the data table to be scanned and the storage address of the data table to be scanned.
3. The method according to claim 1, wherein the data table to be scanned comprises N rows of data, the N rows of data comprising ith row of data, N being a positive integer, i being a positive integer less than or equal to N; the N rows of data are data of a first type;
the obtaining of the target data from the data table to be scanned by the data scanning tool includes:
scanning the ith row of the data table to be scanned through the data scanning tool and, if the ith row meets a data acquisition condition, acquiring the ith row of data and converting the ith row of data into data of a second type based on the data scanning tool;
determining the target data according to the data of the second type corresponding to the R rows meeting the data acquisition condition; R is a positive integer less than or equal to N.
4. The method of claim 3, wherein the N rows of data comprise N key data and the ith row of data comprises ith key data; the data acquisition condition includes that the ith key data is designated key data;
the scanning of the ith row of the data table to be scanned through the data scanning tool and, if the ith row meets the data acquisition condition, the acquiring of the ith row of data include:
when the ith row of the data table to be scanned is scanned through the data scanning tool, acquiring the ith row of data if the ith key data corresponding to the ith row is the designated key data.
5. The method according to claim 1, wherein the data table to be scanned is stored in a storage server of a storage system, the storage system further comprises a storage client configured with a data acquisition interface for acquiring data from the storage server;
the obtaining of the target data from the data table to be scanned by the data scanning tool includes:
generating a data acquisition instruction for the data table to be scanned through the data scanning tool, and sending the data acquisition instruction to the storage client, wherein the data acquisition instruction is used for instructing the storage client to acquire data in the data table to be scanned at the storage server according to a data acquisition interface corresponding to the data acquisition instruction;
receiving data in the data table to be scanned, which is sent by the storage client;
and determining the target data according to the data in the data table to be scanned, which is sent by the storage client.
6. The method according to any one of claims 1 to 5, wherein the data table to be scanned comprises at least two data sub-tables to be scanned; the method further comprises the following steps:
acquiring the number of data sub-tables to be scanned and the specified number of threads;
creating the specified number of threads, and allocating at least one data sub-table to be scanned to each of the threads;
the obtaining of the target data from the data table to be scanned by the data scanning tool includes:
in each of the specified number of threads, acquiring the target data, by using the data scanning tool, from the at least one data sub-table to be scanned allocated to that thread.
7. The method of claim 6, further comprising:
acquiring load information; the load information comprises a central processing unit (CPU) utilization rate;
if the CPU utilization rate is greater than or equal to a threshold value, suspending at least one thread;
the acquiring, in each of the specified number of threads, of the target data from the at least one allocated data sub-table to be scanned by using the data scanning tool includes:
in each thread that is not suspended, acquiring first data, by using the data scanning tool, from the data sub-table to be scanned allocated to that thread;
after the first data is acquired, restarting the suspended thread and, in the restarted thread, acquiring second data, by using the data scanning tool, from the data sub-table to be scanned allocated to the restarted thread; the target data includes the first data and the second data.
8. A data acquisition apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring the structural information of the data table to be scanned; the structure information comprises a field name and a field type of the data table to be scanned;
the processing module is used for acquiring a configuration file corresponding to the field type from a data scanning library according to the field type of the data table to be scanned and generating a data scanning tool based on the field name of the data table to be scanned and the configuration file;
the acquisition module is further configured to acquire target data from the to-be-scanned data table through the data scanning tool.
9. An electronic device comprising a processor and a memory, wherein the memory is configured to store a computer program comprising program instructions, and wherein the processor is configured to invoke the program instructions to perform the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions for executing the method of any of claims 1-7 when executed by a processor.
CN202110570068.3A 2021-05-24 2021-05-24 Data acquisition method, data acquisition device, electronic equipment and medium Pending CN115391328A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110570068.3A CN115391328A (en) 2021-05-24 2021-05-24 Data acquisition method, data acquisition device, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110570068.3A CN115391328A (en) 2021-05-24 2021-05-24 Data acquisition method, data acquisition device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN115391328A true CN115391328A (en) 2022-11-25

Family

ID=84114847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110570068.3A Pending CN115391328A (en) 2021-05-24 2021-05-24 Data acquisition method, data acquisition device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN115391328A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination