CN110633280A - Batch data acquisition method and device, readable storage medium and computing equipment - Google Patents

Publication number
CN110633280A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910860041.0A
Other languages
Chinese (zh)
Inventor
张斌
蔡云山
陈志辉
杨秋亮
龚平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Data Co Ltd Of Beijing Asiainfo
Original Assignee
Data Co Ltd Of Beijing Asiainfo
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Data Co Ltd Of Beijing Asiainfo
Priority to CN201910860041.0A
Publication of CN110633280A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Abstract

Embodiments of the invention provide a batch data acquisition method and device, a readable storage medium, and computing equipment for automatically acquiring and processing data from a plurality of data sources in batches, addressing the low efficiency of manual acquisition. The method comprises: acquiring information of data tables of a plurality of first databases; determining a data table of a second database for entering the data tables of the plurality of first databases; determining an acquisition strategy by which an agent cluster acquires the data tables of the first databases; and instructing the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy.

Description

Batch data acquisition method and device, readable storage medium and computing equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a batch data acquisition method and apparatus, a readable storage medium, and a computing device.
Background
As the types of medical data services multiply and the volume of data grows, data mining and data analysis of large amounts of medical data have become increasingly significant.
Medical data acquisition is a prerequisite for data mining and analysis. Acquiring medical data by interfacing with many medical institutions often requires a large number of acquisition tasks, which at present are mostly configured one by one by hand, so labor cost is high and efficiency is low.
Disclosure of Invention
To this end, the present disclosure provides a batch data collection method, apparatus, readable storage medium and computing device in an effort to solve or at least mitigate at least one of the problems identified above.
According to an aspect of an embodiment of the present disclosure, there is provided a batch data acquisition method, including:
acquiring information of data tables of a plurality of first databases;
determining a data table of a second database for entering data tables of the plurality of first databases;
determining an acquisition strategy for acquiring a data table of a first database by an agent cluster;
and instructing the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy.
Optionally, before instructing the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy, the method further includes:
determining mapping rules of the data tables of the plurality of first databases and the data table of the second database according to the information of the data tables of the plurality of first databases and the information of the data table of the second database;
and instructing the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy includes:
instructing the agent cluster to collect data tables of a plurality of first databases according to a collection strategy;
and writing the data tables of the plurality of first databases into the data table of the second database according to the data tables of the plurality of first databases and the mapping rule.
Optionally, the method further comprises:
checking the data table of the second database according to the data tables of the plurality of first databases;
when the data table of the second database is determined to have errors, repairing the data table of the second database;
and updating the mapping rule according to the error and the repairing mode existing in the data table of the second database.
Optionally, the collection strategy comprises:
triggering time, execution period and validity period of each acquisition task;
the trigger time is used for indicating the trigger time point of each acquisition task in each execution cycle;
the execution period is used for indicating the period of executing the acquisition task by the agent cluster;
and a validity period indicating a time period for executing the collection task according to the execution cycle.
Optionally, the collecting strategy further comprises:
a redo interval, a maximum running time, a number of redo attempts on failure, and an overwrite option for each acquisition task;
the redo interval indicates the time interval after which the agent cluster restarts an acquisition task that has failed;
the maximum running time indicates the longest time any acquisition task may run; when the execution time of an acquisition task exceeds the maximum running time, that task is determined to have failed and is terminated;
the number of redo attempts on failure indicates how many times a failed acquisition task is re-executed;
and the overwrite option indicates whether, after an acquisition task fails and is re-executed, the newly acquired data overwrites the data already acquired.
Optionally, the collecting strategy further comprises:
the starting type and priority of each acquisition task;
the starting type is used for indicating the starting sequence of a plurality of acquisition tasks with the same priority;
and the priority is used for instructing the agent cluster to execute the acquisition tasks in descending order of priority.
Optionally, determining a data table of a second database for entering a data table of the plurality of first databases comprises:
determining the service classification information of the data tables of the plurality of first databases according to the information of the data tables of the plurality of first databases;
and determining the data table of the second database corresponding to the service classification information according to the service classification information of the data tables of the plurality of first databases.
According to still another aspect of an embodiment of the present disclosure, there is provided a batch data acquisition apparatus including:
a first database acquisition unit configured to acquire information of data tables of a plurality of first databases;
a second database determination unit configured to determine a data table of a second database for entering data tables of the plurality of first databases;
the acquisition strategy making unit is used for determining an acquisition strategy for acquiring the data table of the first database by the agent cluster;
and the acquisition task execution unit is used for indicating the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy.
According to yet another aspect of embodiments of the present disclosure, there is provided a readable storage medium having executable instructions thereon that, when executed, cause a computing device to perform operations included in the batch data collection method described above.
According to yet another aspect of embodiments of the present disclosure, there is provided a computing device including: a processor; and a memory storing executable instructions that, when executed, cause the processor to perform the operations included in the batch data collection method.
In the embodiment of the disclosure, the collection tasks are automatically executed based on the uniformly configured collection strategy, so that the data tables of the databases from different sources are collected, and compared with a mode of manually configuring each collection task, the data collection efficiency is greatly improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the disclosure and together with the description serve to explain the principles of the disclosure.
FIG. 1 is a schematic block diagram of an exemplary computing device 100;
FIG. 2 is a schematic flow chart diagram of a batch data collection method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a batch data acquisition device according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 is a block diagram of an example computing device 100 arranged to implement a batch data acquisition method according to the present disclosure. In a basic configuration 102, computing device 100 typically includes system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processor 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor. The processor 104 may include one or more levels of cache, such as a level one cache 110 and a level two cache 112, a processor core 114, and registers 116. The example processor core 114 may include an Arithmetic Logic Unit (ALU), a Floating Point Unit (FPU), a digital signal processing core (DSP core), or any combination thereof.
Depending on the desired configuration, system memory 106 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. System memory 106 may include an operating system 120, one or more programs 122, and program data 124. In some implementations, the program 122 can be configured to execute instructions on an operating system by one or more processors 104 using program data 124.
Computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (e.g., output devices 142, peripheral interfaces 144, and communication devices 146) to the basic configuration 102 via the bus/interface controller 130. The example output device 142 includes a graphics processing unit 148 and an audio processing unit 150. They may be configured to facilitate communication with various external devices, such as a display terminal or speakers, via one or more A/V ports 152. Example peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to facilitate communication with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 158. An example communication device 146 may include a network controller 160, which may be arranged to facilitate communications with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures, or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or private-wired network, and various wireless media such as acoustic, Radio Frequency (RF), microwave, Infrared (IR), or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
Computing device 100 may be implemented as part of a small-form factor portable (or mobile) electronic device such as a cellular telephone, a Personal Digital Assistant (PDA), a personal media player device, a wireless web-watch device, a personal headset device, an application specific device, or a hybrid device that includes any of the above functions. Computing device 100 may also be implemented as a personal computer including both desktop and notebook computer configurations.
Among other things, one or more programs 122 of computing device 100 include instructions for performing a batch data collection method according to the present disclosure.
FIG. 2 illustrates a flow chart of a batch data collection method 200 according to the present disclosure, the method 200 beginning at step S210.
S210, acquiring information of data tables of a plurality of first databases;
S220, determining a data table of a second database for entering the data tables of the plurality of first databases;
S230, determining an acquisition strategy for acquiring the data tables of the first databases by the agent cluster;
S240, instructing the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy.
In step S210, the first database refers to databases of different information sources, for example, databases of hospitals in different regions. The acquired information of the data table of the first database comprises: information of the database, for example, a database access address, a database type; and data table information, e.g., name, range, field information of the data table.
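The information acquired in step S210 could be held in a structure along the following lines. This is a minimal sketch: the class and attribute names (`SourceDatabase`, `SourceTable`, `access_address`, `row_range`, and so on) and the example addresses are hypothetical and only mirror the items listed above; the disclosure does not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SourceTable:
    """Data table information: name, range, and field information."""
    name: str
    row_range: str            # hypothetical: which rows of the table to collect
    fields: Dict[str, str]    # field name -> type declared in the source database


@dataclass
class SourceDatabase:
    """Database information: access address and database type."""
    access_address: str
    db_type: str              # e.g. "MySQL", "MongoDB", "Oracle"
    tables: List[SourceTable] = field(default_factory=list)


def list_table_names(databases: List[SourceDatabase]) -> List[str]:
    """Return 'address/table' identifiers for every table of every first database."""
    return [f"{db.access_address}/{t.name}" for db in databases for t in db.tables]
```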
In step S220, the determined data table of the second database for entering the data tables of the plurality of first databases is a data table of a local database, or a data table of a preset database of a storage cluster, so as to uniformly collect data from different sources to a designated database.
In step S230, the acquisition strategy determined for the agent cluster to acquire the data tables of the plurality of first databases applies to all data table acquisition tasks, so that acquisition strategies are managed in a unified way.
In step S240, the agent cluster is instructed to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy. With a uniform acquisition strategy, data from databases of different sources is collected into a designated database, all data acquisition tasks can be completed efficiently and automatically, separate configuration of each acquisition task is avoided, and data acquisition efficiency is improved.
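Steps S220 to S240 can be sketched as pairing every source table with a target table and one shared strategy, then handing the resulting tasks to the agent cluster. The function names `build_tasks` and `dispatch` are hypothetical, and the agent cluster is represented here by a plain callable, which is an assumption made only to keep the example self-contained.

```python
from typing import Callable, Dict, List, Tuple

Policy = Dict[str, object]
Task = Tuple[str, str, Policy]


def build_tasks(source_tables: List[str],
                choose_target: Callable[[str], str],
                policy: Policy) -> List[Task]:
    """S220 and S230: pair each source table with a target table and the shared policy."""
    return [(src, choose_target(src), policy) for src in source_tables]


def dispatch(tasks: List[Task],
             agent: Callable[[str, str, Policy], None]) -> int:
    """S240: hand every task to the agent cluster and return how many were sent."""
    for src, dst, policy in tasks:
        agent(src, dst, policy)
    return len(tasks)
```

Because every task carries a reference to the same policy object, changing the policy once changes it for all acquisition tasks, which is the point of configuring it uniformly.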
Further, before step S240, the method further includes the steps of:
determining mapping rules of the data tables of the plurality of first databases and the data table of the second database according to the information of the data tables of the plurality of first databases and the information of the data table of the second database;
in step S240, instructing the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the collection policy, including:
instructing the agent cluster to collect data tables of a plurality of first databases according to a collection strategy;
and writing the data tables of the plurality of first databases into the data table of the second database according to the data tables of the plurality of first databases and the mapping rule.
In the embodiment of the present disclosure, the mapping rule mainly covers two aspects. The first is the mapping of fields: fields that express the same meaning in databases from different sources are converted into uniform fields. The second is the data mapping rule, which processes field data and uniformly converts data of different database types (such as MySQL, MongoDB, and Oracle) and different parameter definitions into data of the specified parameter types under the specified database type, thereby resolving inconsistent field definitions across databases.
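The two aspects of the mapping rule can be illustrated with a small sketch. The source names, field names, and converters below (`hospital_a`, `pid`, `patient_id`, and so on) are hypothetical examples, not taken from the disclosure; the sketch only shows a field rename followed by a type conversion.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical field mapping: (source database, source field) -> unified field name.
FIELD_MAP: Dict[Tuple[str, str], str] = {
    ("hospital_a", "pid"): "patient_id",
    ("hospital_b", "PATIENT_NO"): "patient_id",
    ("hospital_a", "visit_dt"): "visit_date",
    ("hospital_b", "VISIT_DATE"): "visit_date",
}

# Hypothetical data mapping: unified field name -> converter to the target type.
DATA_MAP: Dict[str, Callable[[Any], Any]] = {
    "patient_id": lambda v: str(v).strip(),
    "visit_date": lambda v: str(v)[:10],  # keep only the YYYY-MM-DD part
}


def map_row(source: str, row: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the field mapping, then the data mapping, to one source row."""
    out: Dict[str, Any] = {}
    for name, value in row.items():
        unified = FIELD_MAP.get((source, name))
        if unified is None:
            continue  # fields without a mapping are skipped in this sketch
        out[unified] = DATA_MAP[unified](value)
    return out
```

Rows from both hypothetical sources end up with the same field names and value types, so they can be written into one target table.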
Further, an embodiment of the present disclosure further provides an updating step of the mapping rule, including:
checking the data table of the second database according to the data tables of the plurality of first databases;
when the data table of the second database is determined to have errors, repairing the data table of the second database;
and updating the mapping rule according to the error and the repairing mode existing in the data table of the second database.
Specifically, the mapping rule that is updated is the data mapping rule. Errors in the data table of the second database are identified by performing character or numerical verification of each field of the second database against the corresponding field of the first database. The comparison algorithm is set according to the underlying data representation logic of each database, which avoids garbled characters, overflow, numerical errors, or character errors caused by differences in data type definitions between databases.
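A minimal sketch of the verification step follows. It assumes source and target rows are already aligned one to one and compares values by their string form; that comparison is a stand-in for the database-specific comparison algorithm, which the disclosure does not spell out. The function name `find_mismatches` and the field names in the usage are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def find_mismatches(source_rows: List[Dict[str, Any]],
                    target_rows: List[Dict[str, Any]],
                    field_pairs: Dict[str, str]) -> List[Tuple[int, str, Any, Any]]:
    """Compare each source field with its mapped target field, row by row.

    Returns (row index, target field, source value, target value) for every
    pair whose string forms differ.
    """
    mismatches = []
    for i, (src, dst) in enumerate(zip(source_rows, target_rows)):
        for src_field, dst_field in field_pairs.items():
            src_value, dst_value = src.get(src_field), dst.get(dst_field)
            if str(src_value) != str(dst_value):
                mismatches.append((i, dst_field, src_value, dst_value))
    return mismatches
```

A value that was clipped on the way into a narrower integer column, for example, shows up as one mismatch entry, which can then drive both the repair and the update of the data mapping rule.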
Optionally, the collection strategy comprises:
triggering time, execution period and validity period of each acquisition task;
the trigger time is used for indicating the trigger time point of each acquisition task in each execution cycle;
the execution period is used for indicating the period of executing the acquisition task by the agent cluster;
and a validity period indicating a time period for executing the collection task according to the execution cycle.
With this acquisition strategy, the data of the databases is acquired automatically, periodically, and on schedule.
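How the trigger time, execution period, and validity period combine can be sketched as follows. The function `trigger_points` and its parameters are hypothetical, and the execution period is assumed to be a whole number of days; the sketch lists the trigger points that fall inside both the validity period and a query window.

```python
from datetime import date, datetime, time, timedelta
from typing import List


def trigger_points(trigger_time: time, period_days: int,
                   valid_from: date, valid_to: date,
                   window_start: date, window_end: date) -> List[datetime]:
    """List trigger points inside both the validity period and the query window."""
    points = []
    day = valid_from
    last_day = min(valid_to, window_end)
    while day <= last_day:
        if day >= window_start:
            points.append(datetime.combine(day, trigger_time))
        day += timedelta(days=period_days)  # one trigger per execution period
    return points
```

For a daily task triggered at 00:00, the function returns one midnight timestamp per day until either the window or the validity period ends.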
Optionally, the collecting strategy further comprises:
a redo interval, a maximum running time, a number of redo attempts on failure, and an overwrite option for each acquisition task;
the redo interval indicates the time interval after which the agent cluster restarts an acquisition task that has failed;
the maximum running time indicates the longest time any acquisition task may run; when the execution time of an acquisition task exceeds the maximum running time, that task is determined to have failed and is terminated;
the number of redo attempts on failure indicates how many times a failed acquisition task is re-executed;
and the overwrite option indicates whether, after an acquisition task fails and is re-executed, the newly acquired data overwrites the data already acquired.
With this acquisition strategy, acquisition failures are handled automatically: when a network or server fails, the acquisition can be retried or terminated.
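The interplay of the redo interval, the maximum running time, and the number of redo attempts might look like the sketch below. It is a simplification under stated assumptions: the hypothetical `task` callable reports its own run time instead of being terminated while still running, and the overwrite option is left out. The name `run_with_redo` is likewise hypothetical.

```python
import time
from typing import Callable, Optional


def run_with_redo(task: Callable[[], float], redo_interval_s: float,
                  max_runtime_s: float, failed_redo_count: int,
                  sleep: Callable[[float], None] = time.sleep) -> Optional[int]:
    """Run `task`, redoing it on failure up to `failed_redo_count` extra times.

    `task` returns its run time in seconds; a run longer than `max_runtime_s`
    or any raised exception counts as a failure. Returns the 1-based number of
    the attempt that succeeded, or None if every attempt failed.
    """
    for attempt in range(1, failed_redo_count + 2):
        try:
            elapsed = task()
            if elapsed <= max_runtime_s:
                return attempt
        except Exception:
            pass  # any error counts as a failed attempt in this sketch
        if attempt <= failed_redo_count:
            sleep(redo_interval_s)  # wait for the redo interval before retrying
    return None
```

Passing the `sleep` function as a parameter keeps the sketch testable without real waiting.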
Optionally, the collecting strategy further comprises:
the starting type and priority of each acquisition task;
the starting type is used for indicating the starting sequence of a plurality of acquisition tasks with the same priority;
and the priority is used for instructing the agent cluster to execute the acquisition tasks in descending order of priority.
With this acquisition strategy, the acquisition tasks are managed by priority: tasks of high importance are set to high priority and tasks of low importance to low priority, so that the usefulness of the acquired data is maximized under non-ideal conditions.
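Ordering by priority, with the start type deciding the order among equal priorities, can be sketched as a sort on two keys. The numeric priority encoding (larger means higher) and the `submitted` sequence number are assumptions, and only a sequential start type is modelled here.

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    name: str
    priority: int    # hypothetical encoding: larger number = higher priority
    submitted: int   # hypothetical sequence number used for sequential start


def execution_order(tasks: List[Task]) -> List[str]:
    """Order by priority, high first; equal priorities start in submission order."""
    ordered = sorted(tasks, key=lambda t: (-t.priority, t.submitted))
    return [t.name for t in ordered]
```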
Optionally, step S220 specifically includes:
determining the service classification information of the data tables of the plurality of first databases according to the information of the data tables of the plurality of first databases;
and determining the data table of the second database corresponding to the service classification information according to the service classification information of the data tables of the plurality of first databases.
Specifically, the service classification information may include items such as the attribution topic, acquisition mode, owning department, owning system, and source channel of an acquisition task, and is used to classify and manage the acquired data.
According to the embodiment of the disclosure, the database is managed based on the service classification information, and the data tables of the database are divided into different service classifications, which makes the database easier for users to search and maintain.
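Choosing the target table from the service classification information can be sketched as a lookup. The routing table, its keys, and the table names below are hypothetical; the sketch classifies a source table by two of the items mentioned above and returns the corresponding table of the second database.

```python
from typing import Dict, Tuple

# Hypothetical routing table: (attribution topic, owning department) -> target table.
TARGET_TABLES: Dict[Tuple[str, str], str] = {
    ("outpatient", "test department"): "outpatient_records",
    ("inpatient", "ward"): "inpatient_records",
}


def classify(table_info: Dict[str, str]) -> Tuple[str, str]:
    """Derive the service classification from the source table information."""
    return (table_info["topic"], table_info["department"])


def choose_target_table(table_info: Dict[str, str]) -> str:
    """Return the second-database table that corresponds to the classification."""
    return TARGET_TABLES[classify(table_info)]
```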
Specific examples of the present invention are given below.
Step one, the user configures batch acquisition tasks;
On the batch acquisition task configuration page, the user enters: acquisition task name: collection task; number: 2019051015245410000; source database: digital_china; target database: phedda; attribution topic: outpatient service; owning department: test department; acquisition mode: incremental. The user can create multiple acquisition task entries on the page and, after selecting a data table of the source database, can fill in the target table name, the Chinese table name, and the acquisition mode for that data table.
Step two, the scheduling task is configured;
On the scheduling task configuration interface, the user selects the agent cluster that will execute the acquisition tasks and fills in: start type: sequential start; priority: medium; execution cycle: day; redo interval: 10 minutes; maximum running time: 3 hours; trigger type: time-triggered; validity period: 5/10/2019 to 5/10/2029; trigger time: 00:00; number of redo attempts on failure: 3; overwrite on re-execution: yes.
Step three, acquisition tasks are managed;
In the acquisition task management interface, the user can view the configured information of each acquisition task and can also add, suspend, release, or cancel tasks.
Referring to fig. 3, a batch data collecting apparatus 300 provided in an embodiment of the present disclosure includes:
a first database acquisition unit 310 configured to acquire information of data tables of a plurality of first databases;
a second database determination unit 320 for determining a data table of a second database for entering data tables of the plurality of first databases;
the acquisition strategy making unit 330 is configured to determine an acquisition strategy for acquiring the data table of the first database by the agent cluster;
and the collection task execution unit 340 is configured to instruct the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the collection policy.
Optionally, the batch data collecting apparatus 300 further includes:
the mapping rule management unit is used for determining the mapping rules of the data tables of the plurality of first databases and the data tables of the second database according to the information of the data tables of the plurality of first databases and the information of the data tables of the second database;
the collection task execution unit 340 is specifically configured to:
instructing the agent cluster to collect data tables of a plurality of first databases according to a collection strategy;
and writing the data tables of the plurality of first databases into the data table of the second database according to the data tables of the plurality of first databases and the mapping rule.
Optionally, the batch data collecting apparatus 300 further includes:
the verification unit is used for verifying the data table of the second database according to the data tables of the plurality of first databases;
when the data table of the second database is determined to have errors, repairing the data table of the second database;
and updating the mapping rule according to the error and the repairing mode existing in the data table of the second database.
Optionally, the acquisition policy formulated by the acquisition policy formulation unit 330 includes:
triggering time, execution period and validity period of each acquisition task;
the trigger time is used for indicating the trigger time point of each acquisition task in each execution cycle;
the execution period is used for indicating the period of executing the acquisition task by the agent cluster;
and the validity period is used for indicating the time period for executing the acquisition task according to the execution cycle.
Optionally, the acquisition policy formulated by the acquisition policy formulation unit 330 further includes:
a redo interval, a maximum running time, a number of redo attempts on failure, and an overwrite option for each acquisition task;
the redo interval indicates the time interval after which the agent cluster restarts an acquisition task that has failed;
the maximum running time indicates the longest time any acquisition task may run; when the execution time of an acquisition task exceeds the maximum running time, that task is determined to have failed and is terminated;
the number of redo attempts on failure indicates how many times a failed acquisition task is re-executed;
and the overwrite option indicates whether, after an acquisition task fails and is re-executed, the newly acquired data overwrites the data already acquired.
Optionally, the acquisition policy formulated by the acquisition policy formulation unit 330 further includes:
the starting type and priority of each acquisition task;
the starting type is used for indicating the starting sequence of a plurality of acquisition tasks with the same priority;
and the priority is used for instructing the agent cluster to execute the acquisition tasks in descending order of priority.
Optionally, the second database determining unit 320 is specifically configured to:
determining the service classification information of the data tables of the plurality of first databases according to the information of the data tables of the plurality of first databases;
and determining the data table of the second database corresponding to the service classification information according to the service classification information of the data tables of the plurality of first databases.
It should be understood that the various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present disclosure, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the disclosure.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store program code, and the processor is configured to perform the various methods of the present disclosure according to instructions in the program code stored in the memory.
By way of example, and not limitation, computer-readable media include computer storage media and communication media. Computer storage media store information such as computer-readable instructions, data structures, program modules or other data. Communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. Combinations of any of the above are also included within the scope of computer-readable media.
It should be appreciated that in the foregoing description of exemplary embodiments of the disclosure, various features of the disclosure are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed disclosure requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this disclosure.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Moreover, those skilled in the art will appreciate that while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the disclosure and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
Furthermore, some of the described embodiments are described herein as a method or combination of method elements that can be performed by a processor of a computer system or by other means of performing the described functions. A processor having the necessary instructions for carrying out the method or method elements thus forms a means for carrying out the method or method elements. Further, an element of an apparatus embodiment described herein is an example of a means for carrying out the function performed by that element for the purpose of carrying out the invention.
As used herein, unless otherwise specified, the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the disclosure has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the disclosure as described herein. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the disclosed subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present disclosure is intended to be illustrative, but not limiting, of the scope of the disclosure, which is set forth in the following claims.

Claims (10)

1. A method for batch data acquisition, comprising:
acquiring information of data tables of a plurality of first databases;
determining a data table of a second database into which the data tables of the plurality of first databases are to be entered;
determining an acquisition strategy by which an agent cluster acquires the data tables of the plurality of first databases;
and instructing the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy.
2. The method of claim 1, wherein instructing the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy further comprises:
determining a mapping rule between the data tables of the plurality of first databases and the data table of the second database according to the information of the data tables of the plurality of first databases and the information of the data table of the second database;
instructing the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy, including:
instructing the agent cluster to acquire the data tables of the plurality of first databases according to the acquisition strategy;
and writing the data tables of the plurality of first databases into the data table of the second database according to the data tables of the plurality of first databases and the mapping rule.
3. The method of claim 2, further comprising:
checking the data table of the second database against the data tables of the plurality of first databases;
when the data table of the second database is determined to contain an error, repairing the data table of the second database;
and updating the mapping rule according to the error found in the data table of the second database and the manner in which it was repaired.
4. The method of claim 1, wherein the acquisition strategy comprises:
a trigger time, an execution period, and a validity period of each acquisition task;
the trigger time indicates the point in time at which each acquisition task is triggered within each execution period;
the execution period indicates the period at which the agent cluster executes the acquisition task;
the validity period indicates the time span during which the acquisition task is executed according to the execution period.
5. The method of claim 4, wherein the acquisition strategy further comprises:
a redo interval, a maximum running time, a number of failed-redo attempts, and an overwrite option of each acquisition task;
the redo interval is the time interval after which an acquisition task is restarted when the agent cluster fails to execute that acquisition task;
the maximum running time indicates the longest time any acquisition task may run; when the execution time of an acquisition task exceeds the maximum running time, that acquisition task is determined to have failed and is terminated;
the number of failed-redo attempts indicates how many times an acquisition task is re-executed when it fails;
the overwrite option indicates whether newly acquired data overwrites previously acquired data when an acquisition task is re-executed after failing.
6. The method of claim 5, wherein the acquisition strategy further comprises:
a start type and a priority of each acquisition task;
the start type indicates the order in which a plurality of acquisition tasks with the same priority are started;
and the priority instructs the agent cluster to execute the acquisition tasks in descending order of priority.
7. The method of claim 1, wherein determining the data table of the second database into which the data tables of the plurality of first databases are to be entered comprises:
determining service classification information of the data tables of the plurality of first databases according to the information of the data tables of the plurality of first databases;
and determining the data table of the second database corresponding to the service classification information according to the service classification information of the data tables of the plurality of first databases.
8. A batch data acquisition device, comprising:
a first database acquisition unit configured to acquire information of data tables of a plurality of first databases;
a second database determination unit configured to determine a data table of a second database into which the data tables of the plurality of first databases are to be entered;
an acquisition strategy formulation unit configured to determine an acquisition strategy by which an agent cluster acquires the data tables of the plurality of first databases;
and an acquisition task execution unit configured to instruct the agent cluster to write the data tables of the plurality of first databases into the data table of the second database according to the acquisition strategy.
9. A readable storage medium having executable instructions stored thereon that, when executed, cause a computing device to perform the operations included in any of claims 1-7.
10. A computing device, comprising:
a processor; and
a memory storing executable instructions that, when executed, cause the processor to perform the operations included in any of claims 1-7.
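
For illustration only, the failure handling recited in claim 5 (redo interval, maximum running time, number of failed-redo attempts, overwrite option) could be sketched in Python as follows. The names `RetryPolicy` and `run_with_policy`, the `sink` list that stands in for the data table of the second database, and the injectable `sleep` and `clock` parameters are assumptions made for this sketch and do not come from the disclosure. The sketch also only marks an over-long attempt as failed after it returns; an actual agent would have to terminate the running task.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RetryPolicy:
    redo_interval_s: float  # wait between a failed attempt and the next attempt
    max_runtime_s: float    # an attempt that runs longer than this counts as failed
    max_redos: int          # how many times a failed task is re-executed
    overwrite: bool         # True: data from a redo replaces earlier data


def run_with_policy(
    task: Callable[[List[str]], None],
    policy: RetryPolicy,
    sink: List[str],
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Run `task` under `policy`; return True if some attempt succeeded in time."""
    for attempt in range(policy.max_redos + 1):
        if attempt > 0:
            sleep(policy.redo_interval_s)  # redo interval before re-executing
            if policy.overwrite:
                sink.clear()               # new data replaces what was acquired so far
        started = clock()
        try:
            task(sink)                     # the task writes acquired rows into sink
        except Exception:
            continue                       # failed attempt: try again if redos remain
        if clock() - started <= policy.max_runtime_s:
            return True
        # Ran too long: treated as a failure (a real agent would terminate the task).
    return False
```

With `overwrite=False` the rows left behind by a failed attempt stay in `sink` and the redo appends to them, which is the behaviour the overwrite option is meant to let an operator choose between.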
CN201910860041.0A 2019-09-11 2019-09-11 Batch data acquisition method and device, readable storage medium and computing equipment Pending CN110633280A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910860041.0A CN110633280A (en) 2019-09-11 2019-09-11 Batch data acquisition method and device, readable storage medium and computing equipment

Publications (1)

Publication Number Publication Date
CN110633280A (en) 2019-12-31

Family

ID=68971965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910860041.0A Pending CN110633280A (en) 2019-09-11 2019-09-11 Batch data acquisition method and device, readable storage medium and computing equipment

Country Status (1)

Country Link
CN (1) CN110633280A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110178842A1 (en) * 2010-01-20 2011-07-21 American Express Travel Related Services Company, Inc. System and method for identifying attributes of a population using spend level data
CN105930389A (en) * 2016-04-14 2016-09-07 北京京东尚科信息技术有限公司 Method and system for transferring data
CN106484857A (en) * 2016-10-09 2017-03-08 珠海经济特区远宏科技有限公司大连分公司 Data collecting system and its method
CN107330238A (en) * 2016-08-12 2017-11-07 中国科学院上海技术物理研究所 Medical information collection, processing, storage and display methods and device
CN108897658A (en) * 2018-05-31 2018-11-27 康键信息技术(深圳)有限公司 Primary database monitoring method, device, computer equipment and storage medium
CN109426600A (en) * 2017-12-21 2019-03-05 中国平安人寿保险股份有限公司 Data acquisition treatment method, device, equipment and readable storage medium storing program for executing
CN109873785A (en) * 2017-12-01 2019-06-11 广州明领基因科技有限公司 Multi-source heterogeneous secure data acquisition system based on semantic Agent
CN110162563A (en) * 2019-05-28 2019-08-23 深圳市网心科技有限公司 A kind of data storage method, system and electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
杜丽丽: "医药协同数据采集技术的研究与实现", 《中国优秀硕士学位论文全文数据库(信息科技辑)》 *
石爱中 等: "《信息系统审计实务》", 30 November 2012 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20191231)