CN109739818B - Portable high-throughput big data acquisition method and system - Google Patents

Portable high-throughput big data acquisition method and system

Info

Publication number: CN109739818B (application CN201811622024.5A; earlier published as CN109739818A)
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: data, server, data collection, file, database
Inventor: 张晨光 (Zhang Chenguang)
Original and current assignee: Inspur Software Co Ltd (the listed assignee may be inaccurate; Google has not performed a legal analysis)
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Application filed by Inspur Software Co Ltd
Abstract

The invention discloses a portable high-throughput big data acquisition method and system, belonging to the field of data acquisition and aiming to solve the technical problem of instantly collecting and processing data of different data structures from different databases. The technical scheme is as follows: in the portable high-throughput big data acquisition method, a central server sends an instruction, each cluster server starts Logstash, configuration parameters are read through the etcd component of datatrains, and the corresponding configuration file is automatically generated; Logstash reads the configuration file and collects data from various databases accordingly; after sorting, the database data are sent to each server in message form through Kafka, and Kafka and Logstash cooperate at the consumption end to temporarily store the collected data in the corresponding format; the central server then calls the related components to process the collected data. The invention also discloses a portable high-throughput big data acquisition system.

Description

Portable high-throughput big data acquisition method and system
Technical Field
The invention relates to the field of data acquisition, in particular to a portable high-throughput big data acquisition method and system.
Background
Information is the core basis of decision making, so timely and effective data acquisition and data processing are very important. Because data are dispersed and their structures are inconsistent, the traditional data collection mode is very time-consuming and labor-intensive. Therefore, how to instantly collect and process data of different data structures from different databases is a technical problem that urgently needs to be solved.
To solve the above technical problem, attention has turned to enterprise data integration research: data in different systems are reprocessed to form an integrated, analysis-oriented environment, so that rules can be mined from massive information, knowledge can be extracted, and decision making can be assisted. In the prior art, data is extracted by polling the database in a single thread, but because the data volume of some tables is very large, polling each table in a single thread may be excessively time-consuming and make data extraction inefficient.
Patent No. CN106330963A discloses a cross-network multi-node log collection method in which node logs are sent to a headquarters: an application server sends each day's node logs to the headquarters over the external network, the server regularly executes shell scripts every day, and the log files on the server are compressed and sent to the headquarters server; the log data are then sent from the headquarters extranet to the headquarters intranet, where a log ferrying program stores the logs into an intranet database; finally, the log data in the headquarters intranet database are restored to the original log files and sent to a big data platform through the log management tool Logstash. However, this technical scheme cannot instantly collect and process data of different data structures from different databases.
Disclosure of Invention
The technical task of the invention is to provide a portable high-throughput big data acquisition method and system, so as to solve the problem of how to instantly collect and process data of different data structures from different databases.
The technical task of the invention is realized as follows. In the portable high-throughput big data acquisition method, a central server sends an instruction, each cluster server starts Logstash, configuration parameters are read through the etcd component of datatrains, and the corresponding configuration file is automatically generated; Logstash reads the configuration file and collects data from various databases accordingly; after sorting, the database data are sent to each server in message form through Kafka, and Kafka and Logstash cooperate at the consumption end to temporarily store the collected data in the corresponding format; the central server then calls the related components to process the collected data. The method comprises the following specific steps:
S1, the server end generates different parameter message packets according to different service requirements and sends them through a timed task or by manual operation;
S2, after receiving the message packet sent by the server end, the consumption end parses it to obtain the corresponding parameters; data collection is then carried out according to the parameters and completed through a dtp transmission channel;
S3, the collected data are written into a csv file, and the csv file is compressed and encrypted to produce a data collection compressed packet;
S4, the consumption end uploads the data collection compressed packet to the server end via FTP or SFTP (the two transmission modes are provided mainly for system compatibility);
S5, the server end decrypts and decompresses the data collection compressed packet to obtain the decompressed csv file;
S6, judging whether the decompressed csv file belongs to a common table or a large table:
(1) if it is a common table, normal data insertion and deletion are carried out;
(2) if it is a large table, gpload is used to insert the data.
Preferably, in step S6, gpload is used for data insertion; the specific method is as follows:
(1) Java controls shell instructions on the local Linux server; the java.lang.Runtime class encapsulates the runtime environment;
(2) each Java application has a single Runtime instance, which connects the application with its running environment;
(3) a reference to the current Runtime object is obtained through the Runtime.getRuntime() method;
(4) with the reference to the current Runtime object, the methods of the Runtime object are called to control the state and behavior of the Java virtual machine.
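The steps above can be sketched as follows. This is a minimal, hypothetical sketch: it obtains the current Runtime reference and runs a shell command, reading back its standard output; the real command would be a gpload invocation (for example "gpload -f job.yml"), which the patent does not spell out, so a harmless echo stands in here.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class GploadRunner {
    // Runs a shell command on the local Linux server via the JVM's Runtime
    // and returns its standard output. In the patent's scheme the command
    // would be a gpload invocation (assumed, e.g. "gpload -f job.yml").
    static String runShell(String command) throws Exception {
        Runtime rt = Runtime.getRuntime();                  // step (3): current Runtime reference
        Process p = rt.exec(new String[]{"/bin/sh", "-c", command}); // step (1): shell instruction
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();                                        // step (4): control process state
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(runShell("echo gpload-ok"));
    }
}
```

In practice the exit code of waitFor() would also be checked to decide whether the gpload load succeeded.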
Preferably, the data extraction in the data collection process of step S2 is completed by paged polling with select queries, and each table in the database is polled by multiple threads during data collection.
Preferably, when data are extracted in the data collection process, the query for each table in the database is decomposed into sql statements according to the index hit rules, and this index-hit processing improves data extraction efficiency.
Preferably, the multithreaded selection method is specified as follows:
(1) transmission control parameters: the thread number and the page size control the stability of the system, and the thread number is not less than 2;
(2) number of threads to open: the thread number determines the system resources occupied by data extraction, including the number of db connections and system io resources; the range of the thread number is: 1 < thread_num < 10;
(3) query page size: the page size determines the heap space and io resources occupied; heap exhaustion is avoided by controlling these two parameters; the range of the page size is: 10000 < page < 100000;
(4) operation and maintenance personnel actively adjust the performance parameters of the corresponding server as required.
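The paged polling behind these parameters can be sketched as follows: one table extraction is split into paged SELECT statements whose size respects the page bound, so that a pool of worker threads (1 < thread_num < 10 per the text) can each execute a page. The table name, ordering column and row count are illustrative assumptions; the patent does not give the exact sql form.

```java
import java.util.ArrayList;
import java.util.List;

public class PagedPoller {
    // Splits one table extraction into paged SELECT statements; each page can
    // then be handed to a separate worker thread. Table/column names are
    // illustrative, not taken from the patent.
    static List<String> pageQueries(String table, long rowCount, int pageSize) {
        List<String> queries = new ArrayList<>();
        for (long offset = 0; offset < rowCount; offset += pageSize) {
            queries.add("SELECT * FROM " + table
                    + " ORDER BY id LIMIT " + pageSize + " OFFSET " + offset);
        }
        return queries;
    }

    public static void main(String[] args) {
        // 120000 rows with page = 50000 (inside the 10000 < page < 100000 bound)
        // yield three paged queries at offsets 0, 50000 and 100000.
        for (String q : pageQueries("t_order", 120000, 50000)) {
            System.out.println(q);
        }
    }
}
```

In a full implementation the queries would be submitted to a fixed-size ExecutorService whose pool size is the thread_num parameter.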
Preferably, a Memory Leak or Memory Overflow occurs after the occupied heap reaches the capacity limit of the maximum heap; the two cases are determined as follows:
(1) if memory leaks, use a tool to inspect the reference chain from the leaked objects to the GC Roots and find the path through which the leaked objects remain reachable from the GC Roots, which prevents the garbage collector from reclaiming them automatically;
(2) if memory overflows, the objects in memory are genuinely all alive; check whether the virtual machine heap parameters (-Xmx and -Xms) can be increased relative to the physical memory of the machine, check in the code whether some objects have too long a life cycle or hold state for too long, and try to reduce the memory consumption of the program at runtime.
Preferably, the central server calls the related components to process the collected data; the specific steps are as follows:
(i) start the jar and poll the 242 server every 2 minutes;
(ii) acquire the server file information through session.getStdout();
(iii) judge whether the file transmission is complete: if complete, pull the compressed file from the remote server through ch.ethz.ssh2;
(iv) delete the intermediate file on the 242 server through ssh;
(v) decrypt and decompress the file;
(vi) resolve the related parameters from the file name;
(vii) delete duplicate data from the GreenPlum database;
(viii) import the csv file into the GP database;
(ix) end.
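Step (vi) resolves the collection parameters from the uploaded file name. A minimal sketch, assuming a hypothetical naming convention "source-table-yyyyMMdd.csv.zip" (the patent does not specify the actual naming scheme):

```java
public class CollectFileName {
    // Parses collection parameters out of an uploaded file name, as in step (vi).
    // The convention "source-table-yyyyMMdd.csv.zip" is an assumption made for
    // illustration only.
    static String[] parseParams(String fileName) {
        String base = fileName.replaceFirst("\\.csv\\.zip$", "");
        return base.split("-");
    }

    public static void main(String[] args) {
        String[] p = parseParams("provincial-t_order-20181228.csv.zip");
        // prints: source=provincial table=t_order date=20181228
        System.out.println("source=" + p[0] + " table=" + p[1] + " date=" + p[2]);
    }
}
```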
A portable high-throughput big data acquisition system comprises a server end and a consumption end;
the server end is used to generate different parameter message packets according to different service requirements, send them through a timed task or by manual operation, and decrypt and decompress the data collection compressed packet to obtain the decompressed csv file;
the consumption end is used to parse the message packet after receiving it from the server end, obtain the corresponding parameters, perform data collection according to the parameters through a dtp transmission channel, write the collected data into a csv file, compress and encrypt the csv file to produce a data collection compressed packet, and upload the packet to the server end via FTP or SFTP (the two transmission modes are provided mainly for system compatibility).
The portable high-throughput big data acquisition method and system have the following advantages:
(I) the invention uses open-source middleware and data collection techniques to remotely manage multiple server clusters, so that the central server controls multiple distributed servers to collect and transmit data from various data sources; all servers cooperatively complete the operation task, the central server acquires the operation information of the related servers and controls and monitors all servers in real time, the servers are remotely controlled with open-source middleware technology, and the data of all servers are extracted, cleaned, converted, stored, compressed and encrypted before being put into the data collection library, so that relational database data can be conveniently and rapidly extracted and output in the required format;
(II) the central service control flow can be initiated in batches or at fixed points; portability and high efficiency are achieved through open-source components such as kafka, realizing convenient and fast high-throughput data acquisition;
(III) operation convenience: configuration parameters are read through the etcd component of datatrains and the corresponding configuration file is automatically generated; Logstash reads the configuration file, and data collection and sending are completed automatically;
(IV) operation performance: theoretical and practical tests show that the kafka cluster has a performance advantage under large data volumes, as shown in the following table:
Time consumption of kafka under large data volumes:

Transmission environment   Total time (hours)   Total sql query time (hours)   kafka backlog (peak)   Data volume
kafka single instance      10.5                 1.5                            within 3000            Node 1 (3.3G)
kafka cluster              13.5                 0.5                            within 3000            Node 2 (3.3G)
kafka cluster              12.7                 1.4                            within 3000            Node 3 (6.1G)
(V) start-stop operation: the timed task is executed cyclically.
According to the invention, kafka and logstash cooperate to temporarily store data at the consumption end in the corresponding format, and the central server calls the related components to process the collected data, so that even non-software developers in charge of data collection can easily operate it, greatly reducing the cost and time of data collection.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a flow diagram of the present invention;
FIG. 2 is a schematic structural view of example 4.
Detailed Description
The portable high-throughput big data acquisition method and system of the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
Example 1:
In the portable high-throughput big data acquisition method of the invention, a central server sends an instruction, each cluster server starts Logstash, reads configuration parameters through the etcd component of datatrains, and automatically generates the corresponding configuration file; Logstash reads the configuration file and collects data from various databases accordingly; after sorting, the data are sent to each server in message form through Kafka, and Kafka and Logstash cooperate at the consumption end to temporarily store the collected data in the corresponding format; the central server then calls the related components to process the collected data. The method comprises the following specific steps:
S1, the server end generates different parameter message packets according to different service requirements and sends them through a timed task or by manual operation;
S2, after receiving the message packet sent by the server end, the consumption end parses it to obtain the corresponding parameters; data collection is then carried out according to the parameters and completed through a dtp transmission channel. Data extraction during collection is completed by paged polling with select queries, and each table in the database is polled by multiple threads. When data are extracted, the query for each table in the database is decomposed into sql statements according to the index hit rules, which improves data extraction efficiency. The multithreaded selection method is specified as follows:
(1) transmission control parameters: the thread number and the page size control the stability of the system, and the thread number is not less than 2;
(2) number of threads to open: the thread number determines the system resources occupied by data extraction, including the number of db connections and system io resources; the range of the thread number is: 1 < thread_num < 10;
(3) query page size: the page size determines the heap space and io resources occupied; heap exhaustion is avoided by controlling these two parameters; the range of the page size is: 10000 < page < 100000. A Memory Leak or Memory Overflow occurs after the occupied heap reaches the capacity limit of the maximum heap; the two cases are determined as follows:
(a) if memory leaks, use a tool to inspect the reference chain from the leaked objects to the GC Roots and find the path through which the leaked objects remain reachable from the GC Roots, which prevents the garbage collector from reclaiming them automatically;
(b) if memory overflows, the objects in memory are genuinely all alive; check whether the virtual machine heap parameters (-Xmx and -Xms) can be increased relative to the physical memory of the machine, check in the code whether some objects have too long a life cycle or hold state for too long, and try to reduce the memory consumption of the program at runtime.
(4) Operation and maintenance personnel actively adjust the performance parameters of the corresponding server as required.
S3, the collected data are written into a csv file, and the csv file is compressed and encrypted to produce a data collection compressed packet;
S4, the consumption end uploads the data collection compressed packet to the server end via FTP or SFTP (the two transmission modes are provided mainly for system compatibility);
S5, the server end decrypts and decompresses the data collection compressed packet to obtain the decompressed csv file;
S6, judging whether the decompressed csv file belongs to a common table or a large table:
(1) if it is a common table, normal data insertion and deletion are carried out;
(2) if it is a large table, gpload is used to insert the data; the specific method is as follows:
(i) Java controls shell instructions on the local Linux server; the java.lang.Runtime class encapsulates the runtime environment;
(ii) each Java application has a single Runtime instance, which connects the application with its running environment;
(iii) a reference to the current Runtime object is obtained through the Runtime.getRuntime() method;
(iv) with the reference to the current Runtime object, the methods of the Runtime object are called to control the state and behavior of the Java virtual machine.
The code implementing the gpload data insertion scheme is provided in the original publication as image figures and is not reproduced here.
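A gpload load is driven by a YAML control file that names the input csv and the target Greenplum table. The following is a hypothetical sketch of such a job file; all hosts, credentials, paths and table names are placeholders, not values from the patent.

```yaml
# Hypothetical gpload job file for loading a collected csv into Greenplum.
VERSION: 1.0.0.1
DATABASE: collectdb
USER: gpadmin
HOST: gp-master
PORT: 5432
GPLOAD:
  INPUT:
    - SOURCE:
        FILE:
          - /data/collect/t_order_20181228.csv
    - FORMAT: csv
    - DELIMITER: ','
    - HEADER: false
  OUTPUT:
    - TABLE: public.t_order
    - MODE: insert
```

The Java Runtime mechanism described above would then execute a command such as "gpload -f job.yml" against this file.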
Example 2:
As shown in fig. 1, the provincial bureau, the national bureau and the 101 server are taken as examples. A task can be started in two ways: (1) by the provincial bureau's local timed task; (2) by receiving mq scheduling from the national bureau.
The specific steps by which the provincial bureau starts the data acquisition task are as follows:
(1) the provincial bureau starts a data collection task through a timed task or by receiving mq scheduling from the national bureau;
(2) the data are compressed and encrypted;
(3) a message is sent to the central server;
(4) a timer polls every five minutes, requesting transmission permission from the national bureau;
(5) whether national bureau transmission permission has been obtained:
if permission is obtained, the data are uploaded by Ftp to the 242 storage server of the national bureau;
(6) judging whether the upload succeeded:
if the upload succeeded, an mq message is sent to the national bureau; after receiving the mq message, the national bureau modifies the file state and feeds it back to the 242 storage server.
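The permission polling of step (4) can be sketched with a scheduled task. This is a hypothetical sketch: the five-minute interval is shortened so the demo finishes quickly, and the permission predicate stands in for the real mq request to the national bureau, which the patent does not detail.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.IntPredicate;

public class TransmitPoller {
    // Polls for transmission permission at a fixed interval and stops once
    // permission is granted; returns the number of polls made. The predicate
    // receives the attempt count and stands in for the real mq permission check.
    static int pollUntilGranted(long periodMillis, IntPredicate granted)
            throws InterruptedException {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(1);
        int[] attempts = {0};
        timer.scheduleAtFixedRate(() -> {
            attempts[0]++;
            if (granted.test(attempts[0])) {
                done.countDown();   // permission obtained: the Ftp upload would start here
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        done.await();
        timer.shutdownNow();
        return attempts[0];
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate permission being granted on the third poll.
        int attempts = pollUntilGranted(20, n -> n >= 3);
        System.out.println("granted after " + attempts + " polls");
    }
}
```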
The specific steps by which the 101 server processes data are as follows:
(i) start the jar and poll the 242 server every 2 minutes;
(ii) acquire the server file information through session.getStdout();
(iii) judge whether the file transmission is complete: if complete, pull the compressed file from the remote server through ch.ethz.ssh2;
(iv) delete the intermediate file on the 242 storage server through ssh and feed back to the national bureau;
(v) decrypt and decompress the file;
(vi) resolve the related parameters from the file name;
(vii) delete duplicate data from the GreenPlum database;
(viii) import the csv file into the GP database;
(ix) end.
Example 3:
The portable high-throughput big data acquisition system of the invention comprises a server end and a consumption end. The server end is used to generate different parameter message packets according to different service requirements, send them through a timed task or by manual operation, and decrypt and decompress the data collection compressed packet to obtain the decompressed csv file. The consumption end is used to parse the message packet after receiving it from the server end, obtain the corresponding parameters, perform data collection according to the parameters through a dtp transmission channel, write the collected data into a csv file, compress and encrypt the csv file to produce a data collection compressed packet, and upload the packet to the server end via FTP or SFTP (the two transmission modes are provided mainly for system compatibility).
Example 4: take an information center, a business company and an industrial company as examples.
As shown in FIG. 2, the information center includes an industry supervision platform, a marketing big data platform and a GreenPlum database. The business company includes a marketing platform and a DB2/Oracle database. The industrial company includes an analysis command platform and a GreenPlum database.
The data transmission steps from the business company to the information center are as follows:
(1) the DB2/Oracle database is polled;
(2) Logstash reads the configuration file and collects the data to obtain data file-1;
(3) data file-1 is encrypted and compressed by the pushing module and then pushed into the transmission channel;
(4) the information center receives the data transmitted through the transmission channel;
(5) the reading module decrypts and decompresses the data back into data file-1;
(6) data file-1 is stored into the GreenPlum database.
The data transmission steps from the information center to the industrial company are as follows:
(1) the GreenPlum database of the information center is polled;
(2) Logstash reads the configuration file and collects the data to be sent, obtaining data file-2;
(3) data file-2 is encrypted and compressed by the pushing module and then pushed into the transmission channel;
(4) the industrial company receives the data transmitted through the transmission channel;
(5) the reading module decrypts and decompresses the data back into data file-2;
(6) data file-2 is stored into the GreenPlum database of the industrial company.
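The polling and collection performed by Logstash in the flows above is driven by a configuration file. The following is a hypothetical sketch of such a pipeline using the standard jdbc input and kafka output plugins (the overall scheme forwards rows through Kafka); every driver path, host, credential, table and topic name is a placeholder, not a value from the patent.

```conf
# Hypothetical Logstash pipeline: poll a DB2/Oracle table and publish to Kafka.
input {
  jdbc {
    jdbc_driver_library => "/opt/drivers/ojdbc8.jar"       # placeholder driver jar
    jdbc_driver_class => "Java::oracle.jdbc.OracleDriver"  # placeholder class name
    jdbc_connection_string => "jdbc:oracle:thin:@db-host:1521/ORCL"
    jdbc_user => "collector"
    jdbc_password => "secret"
    schedule => "*/5 * * * *"                              # poll every five minutes
    statement => "SELECT * FROM t_order WHERE id > :sql_last_value"
    use_column_value => true
    tracking_column => "id"                                # incremental extraction
  }
}
output {
  kafka {
    bootstrap_servers => "kafka1:9092,kafka2:9092"
    topic_id => "collect-t_order"
    codec => json
  }
}
```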
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (4)

1. A portable high-throughput big data acquisition method, characterized in that a central server sends an instruction, each cluster server starts Logstash, configuration parameters are read through the etcd component of datatrains, the corresponding configuration file is automatically generated, Logstash reads the configuration file and collects data from various databases accordingly, the database data are sent to each server in message form through Kafka after sorting, and Kafka and Logstash cooperate at the consumption end to temporarily store the collected data in the corresponding format; the central server calls the related components to process the collected data; the method comprises the following specific steps:
S1, the server end generates different parameter message packets according to different service requirements and sends them through a timed task or by manual operation;
S2, after receiving the message packet sent by the server end, the consumption end parses it to obtain the corresponding parameters; data collection is carried out according to the parameters and completed through a dtp transmission channel; data extraction in the data collection process is completed by paged polling with select queries; each table in the database is polled by multiple threads during data collection; when data are extracted, the query for each table in the database is decomposed into sql statements according to the index hit rules; the multithreaded selection method is specified as follows:
(1) transmission control parameters: the thread number and the page size control the stability of the system, and the thread number is not less than 2;
(2) number of threads to open: the thread number determines the system resources occupied by data extraction, the system resources comprising the number of db connections and system io resources; the range of the thread number is: 1 < thread_num < 10;
(3) query page size: the page size determines the heap space and io resources occupied; the range of the page size is: 10000 < page < 100000;
(4) operation and maintenance personnel actively adjust the performance parameters of the corresponding server as required;
S3, the collected data are written into a csv file, and the csv file is compressed and encrypted to produce a data collection compressed packet;
S4, the consumption end uploads the data collection compressed packet to the server end via FTP or SFTP;
S5, the server end decrypts and decompresses the data collection compressed packet to obtain the decompressed csv file;
S6, judging whether the decompressed csv file belongs to a common table or a large table:
(1) if it is a common table, normal data insertion and deletion are carried out;
(2) if it is a large table, gpload is used to insert the data.
2. The portable high-throughput big data collection method according to claim 1, characterized in that gpload is used for data insertion in step S6; the specific method is as follows:
(1) Java controls shell instructions on the local Linux server; the java.lang.Runtime class encapsulates the runtime environment;
(2) each Java application has a single Runtime instance, which connects the application with its running environment;
(3) a reference to the current Runtime object is obtained through the Runtime.getRuntime() method;
(4) with the reference to the current Runtime object, the methods of the Runtime object are called to control the state and behavior of the Java virtual machine.
3. The portable high-throughput big data collection method according to claim 1, characterized in that the central server calls the related components to process the collected data; the specific steps are as follows:
(i) start the jar and poll the central server every 2 minutes;
(ii) acquire the server file information through session.getStdout();
(iii) judge whether the file transmission is complete: if complete, pull the compressed file from the remote server through ch.ethz.ssh2;
(iv) after the transmission is finished, delete the intermediate file through ssh;
(v) decrypt and decompress the file;
(vi) resolve the related parameters from the file name;
(vii) delete duplicate data from the GreenPlum database;
(viii) import the csv file into the GP database;
(ix) end.
4. A portable high-throughput big data acquisition system is characterized by comprising a server side and a consumption side;
the server side is used for generating different parameter message packets according to different service requirements, sending the parameter message packets through a timing task or manual operation, and carrying out data decryption and decompression on the data collection compression packets to obtain decompressed csv files; and simultaneously judging whether the decompressed csv file is a common table or a large table: firstly, if the table is a common table, normal data insertion and deletion are carried out; secondly, if the table is large, inserting data by adopting gpload;
the consumption end is used for analyzing after receiving the message packet sent by the server end, acquiring corresponding parameters, then performing data collection according to the parameters, completing data collection through a dtp transmission channel, writing the collected data into a csv file, performing compression and encryption operations on the csv file, producing a data collection compression packet and uploading the data collection compression packet to the server end in an FTP or SFTP mode; wherein, the data extraction in the data collection process is completed by select query paging polling; each table in the multithreading polling database is adopted in the data collection process; when data are extracted in the data collection process, disassembling sql of each table in the database according to the hit rule of the index; the multithreading selection method specifically comprises the following steps:
(1) transmission control parameters: the thread count and the paging size control the stability of the system, and the thread count is not less than 2;
(2) number of threads opened: the thread count determines the system resources occupied by data extraction, including the number of db connections and system io resources; the range of the thread count is: 1 < thread_num < 10;
(3) query page size: the page size determines the heap memory occupied and the amount of io resources consumed; the range of the page size is: 10000 < page < 100000;
(4) operation and maintenance personnel actively adjust the performance parameters of the corresponding server according to requirements.
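The parameter ranges in steps (1)–(3) can be enforced with a small validation check before collection starts. This is a sketch of the stated constraints only; the function name is illustrative.

```python
def validate_transfer_params(thread_num, page_size):
    """Check the transfer-control parameters against the stated ranges.

    Per the text: 1 < thread_num < 10 (so at least 2 threads are opened)
    and 10000 < page_size < 100000.
    """
    if not (1 < thread_num < 10):
        raise ValueError("thread_num must satisfy 1 < thread_num < 10")
    if not (10000 < page_size < 100000):
        raise ValueError("page_size must satisfy 10000 < page_size < 100000")
    return True
```

Operation and maintenance staff can then tune `thread_num` and `page_size` within these bounds to trade extraction throughput against db connection and io pressure.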
CN201811622024.5A 2018-12-28 2018-12-28 Portable high-throughput big data acquisition method and system Active CN109739818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811622024.5A CN109739818B (en) 2018-12-28 2018-12-28 Portable high-throughput big data acquisition method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811622024.5A CN109739818B (en) 2018-12-28 2018-12-28 Portable high-throughput big data acquisition method and system

Publications (2)

Publication Number Publication Date
CN109739818A CN109739818A (en) 2019-05-10
CN109739818B true CN109739818B (en) 2021-04-02

Family

ID=66361770

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811622024.5A Active CN109739818B (en) 2018-12-28 2018-12-28 Portable high-throughput big data acquisition method and system

Country Status (1)

Country Link
CN (1) CN109739818B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110636116B (en) * 2019-08-29 2022-05-10 武汉烽火众智数字技术有限责任公司 Multidimensional data acquisition system and method
CN111241184B (en) * 2020-02-17 2023-07-04 湖南工学院 High throughput rate data processing method for multi-source database of power distribution network
CN111400367B (en) * 2020-02-28 2023-12-29 金蝶蝶金云计算有限公司 Service report generation method, device, computer equipment and storage medium
CN111526176A (en) * 2020-03-26 2020-08-11 青岛奥利普自动化控制系统有限公司 Data acquisition method and system for Claus Ma Fei injection molding machine
CN111866137B (en) * 2020-07-20 2022-08-23 北京百度网讯科技有限公司 Data acquisition dynamic control method and device, electronic equipment and medium
CN112612830B (en) * 2020-12-03 2023-01-31 海光信息技术股份有限公司 Method and system for exporting compressed data in batches and electronic equipment
CN112527836B (en) * 2020-12-08 2022-12-30 航天科技控股集团股份有限公司 Big data query method based on T-BOX platform
CN113377726A (en) * 2021-06-02 2021-09-10 浪潮软件股份有限公司 High-reliability distributed mass data transmission method and tool
CN115883545B (en) * 2023-02-15 2023-05-30 江西飞尚科技有限公司 High-frequency data transmission method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106330963A (en) * 2016-10-11 2017-01-11 江苏电力信息技术有限公司 Cross-network multi-node log collecting method
CN108133017A (en) * 2017-12-21 2018-06-08 广州市申迪计算机系统有限公司 A kind of multi-data source acquisition configuration method and device
CN108365985A (en) * 2018-02-07 2018-08-03 深圳壹账通智能科技有限公司 A kind of cluster management method, device, terminal device and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Java invoking batch or executable files, and a Java process manager implemented with the Runtime and Process classes; ljheee; https://www.cnblogs.com/JamesWang1993/archive/2018/09/03/8548763.html; 2016-07-29; page 2 *
kafka+etcd+es+kibana logging system; SnowXaviera; https://java.ctolib.com/cyhe-LogSystem.html; 2018-07-18; pages 1-7 *
Summary after reading "Understanding the Java Virtual Machine in Depth"; 王菜鸟 (Wang Cainiao); https://www.cnblogs.com/JamesWang1993/archive/2018/09/03/8548763.html; 2018-09-03; Section III, OutOfMemoryError exception *

Also Published As

Publication number Publication date
CN109739818A (en) 2019-05-10

Similar Documents

Publication Publication Date Title
CN109739818B (en) Portable high-throughput big data acquisition method and system
CN109542733B (en) High-reliability real-time log collection and visual retrieval method
CN111124679B (en) Multi-source heterogeneous mass data-oriented time-limited automatic processing method
US8078394B2 (en) Indexing large-scale GPS tracks
CN108985981B (en) Data processing system and method
CN102970158A (en) Log storage and processing method and log server
CN111125260A (en) Data synchronization method and system based on SQL Server
CN104778188A (en) Distributed device log collection method
CN104331435A (en) Low-influence high-efficiency mass data extraction method based on Hadoop big data platform
CN107145576B (en) Big data ETL scheduling system supporting visualization and process
CN109669975B (en) Industrial big data processing system and method
CN112347071A (en) Power distribution network cloud platform data fusion method and power distribution network cloud platform
CN113312376B (en) Method and terminal for real-time processing and analysis of Nginx logs
US20120323924A1 (en) Method and system for a multiple database repository
Murugesan et al. Audit log management in MongoDB
CN107346270B (en) Method and system for real-time computation based radix estimation
CN106919566A (en) A kind of query statistic method and system based on mass data
CN102594889B (en) Data-call-based data synchronization and analysis system
CN109684279B (en) Data processing method and system
CN109471892B (en) Database cluster data processing method and device, storage medium and terminal
Iuhasz et al. Monitoring of exascale data processing
US9250839B1 (en) Printing system for data handling having a primary server for storing active and passive data and a second server for storing normalized and analytical data
CN113886065A (en) Storage and calculation method for acquiring mass data based on NB-lot Internet of things list in distributed environment
CN110515955B (en) Data storage and query method and system, electronic equipment and storage medium
CN113407415A (en) Log management method and device of intelligent terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant