CN111026796B - Multi-source heterogeneous data acquisition method, device, system, medium and equipment - Google Patents

Info

Publication number
CN111026796B
Authority
CN
China
Legal status
Active
Application number
CN201911201662.4A
Other languages
Chinese (zh)
Other versions
CN111026796A (en)
Inventor
李康顺
朱展标
周华智
徐润炫
邱鑫垚
何凯敏
Current Assignee
South China Agricultural University
Original Assignee
South China Agricultural University
Application filed by South China Agricultural University
Priority to CN201911201662.4A
Publication of CN111026796A
Application granted
Publication of CN111026796B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25: Integrating or interfacing systems involving database management systems
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a multi-source heterogeneous data acquisition method, device, system, medium and equipment. The acquisition method comprises the following steps: first, a keyword list is established; a corresponding acquisition grammar is created according to the acquired content of each data source; data acquisition rules are then established from the acquisition grammars and associated with the corresponding keywords in the keyword list; when a data source is to be acquired, its data acquisition rule is sent to the acquisition end, so that the acquisition end acquires data according to that rule; and the data acquisition rule of a newly appearing data source is created from the data acquisition rules already constructed. Because a corresponding data acquisition rule is constructed for each data source, the method can acquire data from different data sources. It solves the technical problem in the prior art that an acquisition tool supports only a single format, so that a separate tool must be designed for each format, and it offers universality, extensibility and reusability.

Description

Multi-source heterogeneous data acquisition method, device, system, medium and equipment
Technical Field
The invention relates to the technical field of informatization, in particular to a multi-source heterogeneous data acquisition method, a device, a system, a medium and equipment.
Background
In the course of enterprise informatization, the business systems and data management systems of an enterprise are built and deployed in stages and are shaped by technical, economic and human factors. As a result, an enterprise accumulates a large amount of business data stored in different ways, and the data management systems it uses differ widely, from simple file databases to complex network databases. Together these form the heterogeneous data sources of the enterprise. Multi-source data is an indispensable part of building information systems and falls mainly into two types: structured data, which is usually stored in a database in the form of data tables, and unstructured data, such as factory equipment data and worker working-condition data. Because the data sources use different collectors, and different collectors differ in communication protocol, transmission rate and, in particular, communication semantics, the data formats collected from the data sources also generally differ. Each data source therefore has to be collected separately with its own acquisition tool, which makes data acquisition, management and analysis very inconvenient.
Disclosure of Invention
The first object of the present invention is to overcome the drawbacks and deficiencies of the prior art by providing a multi-source heterogeneous data acquisition method that can acquire data from different data sources. The method solves the technical problem in the prior art that an acquisition tool supports only a single format, so that a separate tool must be designed for each format, and it offers universality, extensibility and reusability.
The second object of the present invention is to provide a multi-source heterogeneous data acquisition device.
A third object of the present invention is to provide a multi-source heterogeneous data acquisition system.
A fourth object of the present invention is to provide a storage medium.
It is a fifth object of the present invention to provide a computing device.
The first object of the present invention is achieved by the following technical solution: a multi-source heterogeneous data acquisition method, which includes:
establishing a keyword list for storing keywords;
acquiring collected content of each data source, and creating a collection grammar according to the collected content of each data source;
for various current data sources, respectively establishing data acquisition rules according to acquisition grammar, and associating the data acquisition rules to keywords corresponding to a keyword list, wherein the keywords correspond to keywords in acquired contents of the data sources;
aiming at the newly-appearing data source, creating a data acquisition rule according to the currently-constructed data acquisition rule;
when the data source is to be acquired, the data acquisition rule corresponding to the data source is sent to the acquisition end of the data source, so that the acquisition end acquires data according to the corresponding data acquisition rule.
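The first three steps above can be sketched in code. This is a minimal illustration under stated assumptions, not the patented implementation: the names `KeywordTable`, `CollectionRule`, `associate` and `lookup` are hypothetical, and a collection grammar is assumed to be a mapping from a keyword to a processing action applied to the collected value.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: a "collection grammar" is modelled as keyword -> processing action.
Grammar = Dict[str, Callable[[float], float]]


@dataclass
class CollectionRule:
    """Hypothetical container for the rule of one data source."""
    source: str
    grammar: Grammar


@dataclass
class KeywordTable:
    """Keyword list: each keyword maps to the rules associated with it."""
    entries: Dict[str, List[CollectionRule]] = field(default_factory=dict)

    def associate(self, rule: CollectionRule) -> None:
        # Associate the rule with every keyword of its collected content.
        for keyword in rule.grammar:
            self.entries.setdefault(keyword, []).append(rule)

    def lookup(self, keyword: str) -> List[CollectionRule]:
        # Unknown keywords simply have no associated rules.
        return self.entries.get(keyword, [])


table = KeywordTable()
motor_rule = CollectionRule(
    source="motor",
    grammar={"temperature": lambda c: round(c, 1), "speed": lambda v: v},
)
table.associate(motor_rule)
```

Keeping a list of rules per keyword is a design choice of this sketch; it allows several data sources that share a keyword such as `temperature` to be found from the same table entry.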
Preferably, the process of constructing the data collection rule for the newly-appearing data source is as follows:
firstly, determining keywords in acquired content of the data source, then inquiring the keywords in a keyword list, acquiring data acquisition rules related to the keywords aiming at the keywords appearing in the keyword list, and acquiring acquisition grammars related to the keywords from the data acquisition rules; aiming at keywords which do not appear in a keyword list, newly establishing an acquisition grammar according to acquisition contents corresponding to the keywords; and constructing a data acquisition rule of the newly-appearing data source by the acquired acquisition grammar.
Preferably, the data collection rule further includes configuration file content, where the configuration file content includes a center server name, a center server IP, a data source generator name IP, a data cache server name, and a data cache server IP address;
the method further comprises the steps of: and linking the database through the configuration file in the data acquisition rule, and when corresponding data is acquired through the data acquisition rule, transmitting the data to the linked database for storage.
Preferably, the data collection rule further includes the following: the structure of the collected data, the structure of the collected and exported data, the type of the collected and stored data set, the database address, and the rules of the database.
Preferably, the data acquisition rule is converted into json data format and then transmitted to the acquisition end.
The second object of the present invention is achieved by the following technical solution: a multi-source heterogeneous data acquisition device, the device comprising:
the keyword list construction module is used for establishing a keyword list for storing keywords;
the acquisition module is used for acquiring acquired content of each data source;
the collection grammar construction module is used for creating collection grammar according to the collected content of each data source;
the first data acquisition rule construction module is used for respectively constructing data acquisition rules according to acquisition grammar;
the association module is used for associating the data acquisition rule to the corresponding keyword of the keyword list, and the corresponding keyword is the keyword in the acquired content of the data source;
the sending module is used for sending the data acquisition rule corresponding to the data source to the acquisition end of the data source when the data source is to be acquired, so that the acquisition end acquires data according to the corresponding data acquisition rule;
and the second data acquisition rule construction module is used for creating the data acquisition rule of the new data source according to the current constructed data acquisition rule.
The third object of the present invention is achieved by the following technical solution: a multi-source heterogeneous data acquisition system, the system comprising a central server and an acquisition end server; the central server is connected with the acquisition end server through a network, and the acquisition end server is connected with each data acquisition device of the acquisition end;
the central server is used for realizing the multi-source heterogeneous data acquisition method of the first object of the invention;
the acquisition end server is used for realizing the multi-source heterogeneous data acquisition method of the first object of the invention and/or for receiving the data acquisition rules sent by the central server; for sending the data acquisition rules to the data acquisition devices of the acquisition end; and for receiving the data acquired by each data acquisition device according to the data acquisition rules and sending the data to the central server.
Preferably, the system further comprises a distributed cache server, wherein the distributed cache server is connected with the acquisition end server and is used for caching data received by the acquisition end server from the data acquisition equipment.
A fourth object of the present invention is to provide a storage medium storing a computer program which, when executed by a processor, causes the processor to perform the multi-source heterogeneous data collection method according to the first object of the present invention.
A fifth object of the present invention is to provide a computing device, including a processor and a memory for storing a program executable by the processor, wherein the method for collecting multi-source heterogeneous data according to the first object of the present invention is implemented when the processor executes the program stored in the memory.
Compared with the prior art, the invention has the following advantages and effects:
(1) In the multi-source heterogeneous data acquisition method, firstly, a keyword list is established; creating a corresponding acquisition grammar according to the acquisition content of each data source, then respectively establishing data acquisition rules aiming at the acquisition grammar, and associating the data acquisition rules to keywords corresponding to the keyword list; when data sources are to be acquired, the data acquisition rules are sent to the acquisition end, so that the acquisition end can acquire the data according to the corresponding data acquisition rules.
(2) In the multi-source heterogeneous data acquisition method, for a newly appearing data source, the keywords in its acquired content are obtained, the data acquisition rules associated with those keywords are found through the keyword list, and the data acquisition rule of the new data source is constructed from the acquisition grammars in the associated rules. This greatly simplifies constructing data acquisition rules for new data sources whose acquired content is the same as that of existing data sources.
(3) In the multi-source heterogeneous data acquisition method, the data acquisition rule also comprises the configuration file content, and the object and the storage address of the acquired data can be clearly acquired through the configuration file content, so that greater convenience is brought to the acquisition work of the multi-source heterogeneous data.
(4) In the multi-source heterogeneous data acquisition method, the database is linked through the configuration file in the data acquisition rule, and when corresponding data is acquired through the data acquisition rule, the corresponding data is sent to the linked database for storage.
Drawings
FIG. 1 is a flow chart of a multi-source heterogeneous data collection method of the present invention.
FIG. 2 is a block diagram of a multi-source heterogeneous data collection device according to the present invention.
FIG. 3 is a block diagram of the architecture of a computing device of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Example 1
The embodiment discloses a multi-source heterogeneous data acquisition method, which is used for realizing heterogeneous multi-data source acquisition, as shown in fig. 1, and comprises the following specific processes:
S1, establishing a keyword list for storing keywords;
S2, acquiring acquired content of each data source, and creating acquisition grammar according to the acquired content of each data source; the collection grammar in this embodiment refers to semantic actions performed on the collected content, including data processing of the collected content.
Each data source refers to an object collected by each collection device at the collection end, for example, the temperature collected by a thermometer for a certain device, the data source is the device, and the collected content of the data source is the temperature data of the device.
S3, respectively establishing data acquisition rules according to acquisition grammar aiming at various current data sources, and associating the data acquisition rules to keywords corresponding to a keyword list, wherein the keywords correspond to keywords in acquired content of the data sources; if the number of words in the acquired content of the data source is not large, all words of the acquired content can be used as a keyword.
In this embodiment, the data collection rules include, in addition to collection grammar, configuration file contents, collected data structures, collected and derived data structures, collected and stored data set types, database addresses, and rules of the database, where the configuration file contents include a center server name, a center server IP, a data source generator name IP, a data cache server name, and a data cache server IP address. In this embodiment, the database may be linked through a configuration file in the data collection rule, and when corresponding data is collected through the data collection rule, the corresponding data is sent to the linked database for storage.
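The configuration-driven storage described above can be illustrated as follows. This is a sketch only: the patent does not name a database product, so SQLite from the Python standard library is used as a stand-in, and the names `RuleConfig`, `open_database`, `store_sample`, the `samples` table and the placeholder addresses are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleConfig:
    """Hypothetical subset of the configuration file content of a rule."""
    center_server_name: str
    center_server_ip: str
    cache_server_name: str
    cache_server_ip: str
    database_address: str  # assumption: an SQLite path stands in for the database address


def open_database(config: RuleConfig) -> sqlite3.Connection:
    """Link the database named in the rule's configuration."""
    conn = sqlite3.connect(config.database_address)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples (source TEXT, field TEXT, value REAL)"
    )
    return conn


def store_sample(conn: sqlite3.Connection, source: str, field: str, value: float) -> None:
    """Send one collected value to the linked database for storage."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO samples VALUES (?, ?, ?)", (source, field, value))


# Placeholder addresses from the documentation range; ":memory:" keeps the demo self-contained.
config = RuleConfig("center", "192.0.2.10", "master", "192.0.2.20", ":memory:")
conn = open_database(config)
store_sample(conn, "Device_Motor", "temperature", 41.5)
rows = conn.execute("SELECT source, field, value FROM samples").fetchall()
```

Because the storage target travels inside the rule, a collection end that receives the rule needs no separate storage configuration.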
When data from two cameras and one motor is to be collected, the data collection rule created according to the collection grammar in this embodiment may be as follows:
run_name=mes warehouse collection
VERSION=1.0.0
Device_camera_1=ip address
Device_camera_2=ip address
Device_Motor=COM1
Redis_IP=IP
Redis_Name=master
G->A+B+C+D
A->speed
B->running_Time
C->temperature
D->state
G in the acquisition rule is the corresponding acquisition grammar, and A, B, C and D are the acquired contents of the camera.
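One way to read the rule listing above programmatically is sketched here. The interpretation is an assumption: lines of the form `key=value` are treated as settings, and lines of the form `X->Y+Z` are treated as grammar productions with `G` as the start symbol. The function names `parse_rule` and `expand` are hypothetical.

```python
from typing import Dict, List, Tuple

RULE_TEXT = """\
run_name=mes warehouse collection
VERSION=1.0.0
Device_camera_1=ip address
Device_camera_2=ip address
Device_Motor=COM1
Redis_IP=IP
Redis_Name=master
G->A+B+C+D
A->speed
B->running_Time
C->temperature
D->state
"""


def parse_rule(text: str) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Split a rule into settings (key=value) and productions (head->a+b)."""
    settings: Dict[str, str] = {}
    productions: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "->" in line:
            head, body = line.split("->", 1)
            productions[head.strip()] = [part.strip() for part in body.split("+")]
        elif "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
        else:
            raise ValueError(f"unrecognised rule line: {line!r}")
    return settings, productions


def expand(symbol: str, productions: Dict[str, List[str]]) -> List[str]:
    """Expand a symbol into the collected field names (assumes no cyclic productions)."""
    if symbol not in productions:
        return [symbol]  # a terminal, i.e. a collected field name
    fields: List[str] = []
    for part in productions[symbol]:
        fields.extend(expand(part, productions))
    return fields


settings, productions = parse_rule(RULE_TEXT)
fields = expand("G", productions)
```

Under this reading, expanding `G` yields the four collected fields in the order in which they appear in the grammar.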
S4, aiming at the newly-appearing data source, creating a data acquisition rule according to the currently-constructed data acquisition rule; the method comprises the following steps:
for a newly-appearing data source, firstly determining keywords in acquired content of the data source, then inquiring the keywords in a keyword table, acquiring data acquisition rules related to the keywords aiming at the keywords appearing in the keyword table, and acquiring acquisition grammars related to the keywords from the data acquisition rules; aiming at keywords which do not appear in a keyword list, newly establishing an acquisition grammar according to acquisition contents corresponding to the keywords; and constructing a data acquisition rule of the newly-appearing data source by the acquired acquisition grammar.
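The reuse logic of step S4 can be sketched as a single function. This is an illustration under assumptions rather than the method as claimed: the keyword table is modelled as a mapping from a keyword to the rule associated with it, a rule is modelled as a mapping from a keyword to its grammar action, and the names `build_rule_for_new_source` and `make_grammar` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Action = Callable[[object], object]
Rule = Dict[str, Action]  # assumption: a rule maps each keyword to its grammar action


def build_rule_for_new_source(
    keywords: List[str],
    keyword_table: Dict[str, Rule],
    make_grammar: Callable[[str], Action],
) -> Tuple[Rule, List[str]]:
    """Return the new source's rule and the keywords that needed a new grammar."""
    new_rule: Rule = {}
    created: List[str] = []
    for keyword in keywords:
        existing_rule = keyword_table.get(keyword)
        if existing_rule is not None and keyword in existing_rule:
            # Keyword already in the table: reuse the grammar from the associated rule.
            new_rule[keyword] = existing_rule[keyword]
        else:
            # Keyword not in the table: create a grammar for it from scratch.
            new_rule[keyword] = make_grammar(keyword)
            created.append(keyword)
    return new_rule, created


def identity(value: object) -> object:
    return value


motor_rule: Rule = {"temperature": identity, "speed": identity}
keyword_table: Dict[str, Rule] = {kw: motor_rule for kw in motor_rule}

new_rule, created = build_rule_for_new_source(
    ["temperature", "humidity"], keyword_table, lambda kw: identity
)
```

In this example only `humidity` requires a new grammar; `temperature` is taken from the existing motor rule, which is the simplification the method aims for.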
S5, when the data source is to be acquired, transmitting a data acquisition rule corresponding to the data source to an acquisition end of the data source, so that the acquisition end acquires data according to the corresponding data acquisition rule; in this embodiment, the data collection rule is converted into json data format and then transmitted to the collection end.
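The JSON conversion in step S5 might look like the following. The patent does not specify a JSON schema, so the field layout of `rule` and the helper names `encode_rule` and `decode_rule` are assumptions made for illustration; only the standard `json` module is used.

```python
import json
from typing import Any, Dict

# Assumed layout: settings and grammar productions grouped under separate keys.
rule: Dict[str, Any] = {
    "run_name": "mes warehouse collection",
    "version": "1.0.0",
    "devices": {"Device_Motor": "COM1"},
    "cache": {"Redis_Name": "master"},
    "grammar": {
        "G": ["A", "B", "C", "D"],
        "A": ["speed"],
        "B": ["running_Time"],
        "C": ["temperature"],
        "D": ["state"],
    },
}


def encode_rule(rule: Dict[str, Any]) -> bytes:
    """Central-server side: serialise the rule to UTF-8 JSON for transmission."""
    return json.dumps(rule, ensure_ascii=False, sort_keys=True).encode("utf-8")


def decode_rule(payload: bytes) -> Dict[str, Any]:
    """Collection-end side: restore the rule from the received payload."""
    return json.loads(payload.decode("utf-8"))


payload = encode_rule(rule)
received = decode_rule(payload)
```

A text format such as JSON keeps the rule independent of the programming language used at the collection end, which fits the stated goal of universality.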
Example 2
The embodiment discloses a multi-source heterogeneous data acquisition device; as shown in FIG. 2, the device comprises:
the keyword list construction module is used for establishing a keyword list for storing keywords;
the acquisition module is used for acquiring acquired content of each data source;
the collection grammar construction module is used for creating collection grammar according to the collected content of each data source;
the first data acquisition rule construction module is used for respectively constructing data acquisition rules according to acquisition grammar;
the association module is used for associating the data acquisition rule to the corresponding keyword of the keyword list, and the corresponding keyword is the keyword in the acquired content of the data source;
the sending module is used for sending the data acquisition rule corresponding to the data source to the acquisition end of the data source when the data source is to be acquired, so that the acquisition end acquires data according to the corresponding data acquisition rule;
the second data acquisition rule construction module is used for establishing a data acquisition rule aiming at the newly-appearing data source, and specifically comprises the following steps: firstly, determining keywords in acquired content of the data source, then inquiring the keywords in a keyword list, acquiring data acquisition rules related to the keywords aiming at the keywords appearing in the keyword list, and acquiring acquisition grammars related to the keywords from the data acquisition rules; aiming at keywords which do not appear in a keyword list, newly establishing an acquisition grammar according to acquisition contents corresponding to the keywords; and constructing a data acquisition rule of the newly-appearing data source by the acquired acquisition grammar.
The multi-source heterogeneous data acquisition device in this embodiment corresponds to the multi-source heterogeneous data acquisition method in embodiment 1, so specific implementation of each module can be referred to embodiment 1 above, and will not be described in detail herein; it should be noted that, the apparatus provided in this embodiment is only exemplified by the division of the above functional modules, and in practical application, the above functional allocation may be performed by different functional modules according to needs, that is, the internal structure is divided into different functional modules, so as to perform all or part of the functions described above. Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
Example 3
The embodiment discloses a multi-source heterogeneous data acquisition system, which comprises a central server, an acquisition end server and a distributed cache server; the central server is connected with the acquisition end server through a network, the acquisition end server is connected with each data acquisition device of the acquisition end, and the distributed cache server is connected with the acquisition end server;
the central server is used for implementing the multi-source heterogeneous data collection method described in embodiment 1, as follows:
establishing a keyword list for storing keywords;
acquiring collected content of each data source, and creating a collection grammar according to the collected content of each data source;
for various current data sources, respectively establishing data acquisition rules according to acquisition grammar, and associating the data acquisition rules to keywords corresponding to a keyword list, wherein the keywords correspond to keywords in acquired contents of the data sources;
when the data source is to be acquired, transmitting a data acquisition rule corresponding to the data source to an acquisition end of the data source, so that the acquisition end acquires data according to the corresponding data acquisition rule;
aiming at the newly-appearing data source, creating a data acquisition rule according to the currently-constructed data acquisition rule; the method comprises the following steps:
for a newly-appearing data source, firstly determining keywords in acquired content of the data source, then inquiring the keywords in a keyword table, acquiring data acquisition rules related to the keywords aiming at the keywords appearing in the keyword table, and acquiring acquisition grammars related to the keywords from the data acquisition rules; aiming at keywords which do not appear in a keyword list, newly establishing an acquisition grammar according to acquisition contents corresponding to the keywords; and constructing a data acquisition rule of the newly-appearing data source by the acquired acquisition grammar.
The acquisition end server is used for realizing the multi-source heterogeneous data acquisition method described in embodiment 1 and/or for receiving the data acquisition rules sent by the central server; for sending the data acquisition rules to the data acquisition devices of the acquisition end; and for receiving the data acquired by each data acquisition device according to the data acquisition rules and sending the data to the central server.
And the distributed cache server is used for caching the data received by the acquisition end server from the data acquisition equipment.
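The role of the cache between the data acquisition devices and the central server can be sketched with an in-memory stand-in. This is not the distributed cache server itself: the class `CollectionEndBuffer`, its methods and the bounded-queue behaviour are hypothetical choices made to keep the example self-contained.

```python
from collections import deque
from typing import Deque, Dict, List

Sample = Dict[str, object]


class CollectionEndBuffer:
    """In-memory stand-in for the cache used by the acquisition end server."""

    def __init__(self, max_items: int = 1000) -> None:
        # Assumption: when the buffer is full, the oldest samples are dropped.
        self._queue: Deque[Sample] = deque(maxlen=max_items)

    def cache(self, sample: Sample) -> None:
        """Store one sample received from a data acquisition device."""
        self._queue.append(sample)

    def flush(self, batch_size: int) -> List[Sample]:
        """Remove and return up to batch_size samples, oldest first, for forwarding."""
        batch: List[Sample] = []
        while self._queue and len(batch) < batch_size:
            batch.append(self._queue.popleft())
        return batch

    def __len__(self) -> int:
        return len(self._queue)


buffer = CollectionEndBuffer(max_items=100)
for reading in (40.9, 41.5, 42.1):
    buffer.cache({"source": "Device_Motor", "field": "temperature", "value": reading})

batch = buffer.flush(batch_size=2)
```

Buffering in this way lets the acquisition end server keep receiving device data while forwarding it to the central server in batches.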
Example 4
The present embodiment discloses a storage medium, wherein the storage medium stores a computer program, which when executed by a processor, causes the processor to execute the multi-source heterogeneous data collection method described in embodiment 1, as follows:
establishing a keyword list for storing keywords;
acquiring collected content of each data source, and creating a collection grammar according to the collected content of each data source;
for various current data sources, respectively establishing data acquisition rules according to acquisition grammar, and associating the data acquisition rules to keywords corresponding to a keyword list, wherein the keywords correspond to keywords in acquired contents of the data sources;
when the data source is to be acquired, transmitting a data acquisition rule corresponding to the data source to an acquisition end of the data source, so that the acquisition end acquires data according to the corresponding data acquisition rule;
aiming at the newly-appearing data source, creating a data acquisition rule according to the currently-constructed data acquisition rule; the method comprises the following steps:
for a newly-appearing data source, firstly determining keywords in acquired content of the data source, then inquiring the keywords in a keyword table, acquiring data acquisition rules related to the keywords aiming at the keywords appearing in the keyword table, and acquiring acquisition grammars related to the keywords from the data acquisition rules; aiming at keywords which do not appear in a keyword list, newly establishing an acquisition grammar according to acquisition contents corresponding to the keywords; and constructing a data acquisition rule of the newly-appearing data source by the acquired acquisition grammar.
The storage medium in this embodiment may be a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), a USB flash drive, a removable hard disk, or the like.
Example 5
The present embodiment discloses a computing device, as shown in fig. 3, comprising a processor 1402, a memory, an input device 1403, a display 1404, and a network interface 1405 connected by a system bus 1401. The processor 1402 is configured to provide computing and control capabilities, and the memory includes a nonvolatile storage medium 1406 and an internal memory 1407, where the nonvolatile storage medium 1406 stores an operating system, a computer program, and a database, and the internal memory 1407 provides an environment for the operating system and the computer program in the nonvolatile storage medium 1406 to run, and when the computer program is executed by the processor 1402, the multi-source heterogeneous data collection method described in embodiment 1 is implemented as follows:
establishing a keyword list for storing keywords;
acquiring collected content of each data source, and creating a collection grammar according to the collected content of each data source;
for various current data sources, respectively establishing data acquisition rules according to acquisition grammar, and associating the data acquisition rules to keywords corresponding to a keyword list, wherein the keywords correspond to keywords in acquired contents of the data sources;
when the data source is to be acquired, transmitting a data acquisition rule corresponding to the data source to an acquisition end of the data source, so that the acquisition end acquires data according to the corresponding data acquisition rule;
aiming at the newly-appearing data source, creating a data acquisition rule according to the currently-constructed data acquisition rule; the method comprises the following steps:
for a newly-appearing data source, firstly determining keywords in acquired content of the data source, then inquiring the keywords in a keyword table, acquiring data acquisition rules related to the keywords aiming at the keywords appearing in the keyword table, and acquiring acquisition grammars related to the keywords from the data acquisition rules; aiming at keywords which do not appear in a keyword list, newly establishing an acquisition grammar according to acquisition contents corresponding to the keywords; and constructing a data acquisition rule of the newly-appearing data source by the acquired acquisition grammar.
The computing device in this embodiment may be a desktop computer, a notebook computer, a smart phone, a PDA handheld terminal, a tablet computer, or other terminal devices with processor functionality.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them. Any other change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent replacement and is included in the protection scope of the present invention.

Claims (9)

1. The multi-source heterogeneous data acquisition method is characterized by comprising the following steps of:
establishing a keyword list for storing keywords;
acquiring collected content of each data source, and creating a collection grammar according to the collected content of each data source;
for each current data source, establishing a data acquisition rule according to the collection grammar, and associating the data acquisition rule with the corresponding keywords in the keyword list, the corresponding keywords being the keywords in the collected content of that data source; the process of constructing a data acquisition rule for a newly appearing data source is as follows:
first determining the keywords in the collected content of the data source, then querying the keyword list for those keywords; for keywords that appear in the keyword list, obtaining the data acquisition rules associated with the keywords and obtaining the collection grammars associated with the keywords from those rules; for keywords that do not appear in the keyword list, creating new collection grammars according to the collected content corresponding to the keywords; the collection grammars thus obtained constitute the data acquisition rule of the new data source;
for the newly appearing data source, creating its data acquisition rule according to the data acquisition rule constructed above;
when the data source is to be acquired, the data acquisition rule corresponding to the data source is sent to the acquisition end of the data source, so that the acquisition end acquires data according to the corresponding data acquisition rule.
2. The multi-source heterogeneous data collection method according to claim 1, wherein the data collection rule further comprises configuration file content, and the configuration file content comprises a center server name, a center server IP, a data source generator name and IP, a data cache server name and a data cache server IP address;
the method further comprises: linking a database through the configuration file in the data acquisition rule, and, when data is acquired through the data acquisition rule, transmitting the data to the linked database for storage.
3. The multi-source heterogeneous data collection method according to claim 1, wherein the data collection rule further comprises the following contents: the structure of the collected data, the structure of the collected and exported data, the type in which the collected data set is saved, the database address, and the database rules.
4. The multi-source heterogeneous data collection method according to claim 1, wherein the data collection rule is transferred to the collection terminal after being converted into json data format.
5. A multi-source heterogeneous data acquisition device, characterized in that the device comprises:
a keyword list construction module, used for establishing a keyword list for storing keywords;
the acquisition module is used for acquiring acquired content of each data source;
the collection grammar construction module is used for creating collection grammar according to the collected content of each data source;
the first data acquisition rule construction module is used for respectively constructing data acquisition rules according to acquisition grammar;
the association module is used for associating the data acquisition rule to the corresponding keyword of the keyword list, and the corresponding keyword is the keyword in the acquired content of the data source;
the sending module is used for sending the data acquisition rule corresponding to a data source to the acquisition end of the data source when the data source is to be acquired, so that the acquisition end acquires data according to that rule; the process of constructing a data acquisition rule for a newly appearing data source is as follows:
first determining the keywords in the collected content of the data source, then querying the keyword list for those keywords; for keywords that appear in the keyword list, obtaining the data acquisition rules associated with the keywords and obtaining the collection grammars associated with the keywords from those rules; for keywords that do not appear in the keyword list, creating new collection grammars according to the collected content corresponding to the keywords; the collection grammars thus obtained constitute the data acquisition rule of the new data source;
and the second data acquisition rule construction module is used for creating the data acquisition rule of the new data source according to the data acquisition rule constructed above.
6. A multi-source heterogeneous data acquisition system, characterized by comprising a central server and an acquisition end server; the central server is connected with the acquisition end server through a network, and the acquisition end server is connected with each data acquisition device of the acquisition end;
the central server is used for realizing the multi-source heterogeneous data acquisition method according to any one of claims 1-4;
the acquisition end server is used for realizing the multi-source heterogeneous data acquisition method according to any one of claims 1-4 and/or for receiving the data acquisition rules sent by the central server, sending the data acquisition rules to each data acquisition device of the acquisition end, receiving the data acquired by each data acquisition device according to the data acquisition rules, and sending the data to the central server.
7. The multi-source heterogeneous data collection system of claim 6, further comprising a distributed cache server coupled to the collection-side server for caching data received by the collection-side server from the data collection device.
8. A storage medium having stored therein a computer program which, when executed by a processor, causes the processor to perform the multi-source heterogeneous data collection method of any of claims 1-4.
9. A computing device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the multi-source heterogeneous data collection method of any of claims 1-4.
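As an illustration of claims 2 and 4, a data acquisition rule carrying configuration file content might be converted to JSON before being sent to the acquisition end, roughly as sketched below. The field names, the `RuleConfig` and `RulePayload` classes, and the example addresses (taken from the reserved documentation range 192.0.2.0/24) are assumptions made for this sketch; in particular, treating the data source generator name and IP as two separate fields is one reading of claim 2, not something the claims spell out.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class RuleConfig:
    """Configuration file content listed in claim 2 (field names are assumed)."""
    center_server_name: str
    center_server_ip: str
    data_source_generator_name: str
    data_source_generator_ip: str
    cache_server_name: str
    cache_server_ip: str


@dataclass
class RulePayload:
    """A data acquisition rule as it would be sent to the acquisition end."""
    source: str
    grammars: Dict[str, str]
    config: RuleConfig


def to_json(rule: RulePayload) -> str:
    # asdict() recurses into the nested RuleConfig dataclass.
    return json.dumps(asdict(rule), ensure_ascii=False, sort_keys=True)


def from_json(payload: str) -> RulePayload:
    # Rebuild the nested dataclass on the acquisition end.
    data = json.loads(payload)
    return RulePayload(
        source=data["source"],
        grammars=data["grammars"],
        config=RuleConfig(**data["config"]),
    )


# Usage: serialize on the central server, parse on the acquisition end.
example = RulePayload(
    source="greenhouse",
    grammars={"temperature": "GET /sensors/temperature"},
    config=RuleConfig(
        center_server_name="center-01",
        center_server_ip="192.0.2.10",
        data_source_generator_name="generator-01",
        data_source_generator_ip="192.0.2.20",
        cache_server_name="cache-01",
        cache_server_ip="192.0.2.30",
    ),
)
wire = to_json(example)
received = from_json(wire)
```

A plain JSON object keeps the rule readable by acquisition ends written in any language, which fits the heterogeneous setting the claims describe.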
CN201911201662.4A 2019-11-29 2019-11-29 Multi-source heterogeneous data acquisition method, device, system, medium and equipment Active CN111026796B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911201662.4A CN111026796B (en) 2019-11-29 2019-11-29 Multi-source heterogeneous data acquisition method, device, system, medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911201662.4A CN111026796B (en) 2019-11-29 2019-11-29 Multi-source heterogeneous data acquisition method, device, system, medium and equipment

Publications (2)

Publication Number Publication Date
CN111026796A CN111026796A (en) 2020-04-17
CN111026796B true CN111026796B (en) 2023-05-16

Family

ID=70207261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911201662.4A Active CN111026796B (en) 2019-11-29 2019-11-29 Multi-source heterogeneous data acquisition method, device, system, medium and equipment

Country Status (1)

Country Link
CN (1) CN111026796B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737529B (en) * 2020-07-23 2020-12-18 北京东方通科技股份有限公司 Multi-source heterogeneous data acquisition method
CN114661513B (en) * 2022-04-18 2024-01-23 广州菩润信息科技有限公司 Distributed multi-source data acquisition method, system, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092817A (en) * 2013-01-18 2013-05-08 五八同城信息技术有限公司 Data collection method and data collection device based on script engine
CN106528769A (en) * 2016-11-04 2017-03-22 乐视控股(北京)有限公司 Data acquisition method and apparatus
CN109460944A (en) * 2018-12-14 2019-03-12 平安健康保险股份有限公司 Core based on big data protects method, apparatus, equipment and readable storage medium storing program for executing
CN109992603A (en) * 2019-04-04 2019-07-09 北京金堤科技有限公司 A kind of data search method, device, electronic equipment and computer-readable medium
CN110414986A (en) * 2019-06-21 2019-11-05 中国平安财产保险股份有限公司 Cash register method for routing foundation and relevant device based on big data analysis

Also Published As

Publication number Publication date
CN111026796A (en) 2020-04-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant