CN109104487A - Data transmission method based on logstash + kafka - Google Patents

Data transmission method based on logstash + kafka

Info

Publication number
CN109104487A
Authority
CN
China
Prior art keywords
data
business data
logstash
kafka
middleware
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810947702.9A
Other languages
Chinese (zh)
Inventor
颜朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Co Ltd
Original Assignee
Inspur Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-08-20
Filing date
2018-08-20
Publication date
2018-12-28
Application filed by Inspur Software Co Ltd
Priority to CN201810947702.9A
Publication of CN109104487A
Legal status: Pending

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/06: Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/2866: Architectures; Arrangements
    • H04L67/30: Profiles

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data transmission method based on logstash + kafka, which comprises a business data production end operation and a business data consumption end operation: 1) the business data production end logstash automatically collects business data from the business system database and transmits it to the middleware kafka; 2) the middleware kafka gathers the data transmitted by each business data production end and sends it uniformly to the business data consumption end logstash; 3) the business data consumption end logstash receives the messages from the middleware kafka, converts them into data files, and stores them in the target database. Compared with the prior art, the method is simple to configure and fast; it collects, processes and stores scattered business data effectively, allows a transmission channel between the business data production end and the business data consumption end to be established quickly and conveniently, scales well, and reduces the difficulty of development, operation and maintenance.

Description

A data transmission method based on logstash + kafka
Technical field
The present invention relates to the field of high-speed data acquisition applications, and specifically to a data transmission method based on logstash + kafka.
Background art
In enterprise applications, every enterprise has its own information system, but data cannot be shared between enterprises, and there is no unified data platform to integrate, process, mine and analyze this business data. This seriously hinders the overall progress of enterprise informatization. To solve this problem, people began to study various data transmission methods, attempting to gather data from different systems efficiently and securely so that, on the basis of a unified, centralized data platform, the data can be further mined and analyzed to discover patterns and support decision-making. Existing data transmission methods fall mainly into the following categories:
1) CDN technology
By adding a new layer of network architecture to the existing network, website content is published to the network "edge" closest to the user, so that users can obtain the content they need from a nearby node, which improves the response speed of website access. The drawbacks are also obvious: updates are not real-time and are indirect, the objects to be updated must be specified, manual intervention is involved, and careful, thorough arrangement is required;
2) Transmission based on the FTP protocol
FTP (File Transfer Protocol) moves files from one computer to another. It is most often used bidirectionally, that is, to transfer data between a remote system and the local one. A user can download files from a remote computer to the host the user is on and then copy them to the user's terminal, or download them directly to the terminal; files on the user's host or terminal can likewise be transferred to the remote computer;
Transferring files over FTP requires setting up an FTP server. When FTP with registered users is used, user names and passwords must also be managed. Most hosts provide an FTP client; a dedicated FTP client or software with integrated FTP support can also be used. Under the software constraints of the People's Bank, transferring data over anonymous FTP is prohibited;
The main drawbacks of FTP file transfer are that the integrity of the transmitted data cannot be guaranteed and that scalability is poor;
3) Transmission based on e-mail
Files are transferred through an e-mail system. E-mail offers fast delivery, varied file types, convenient sending and receiving, a wide range of correspondents, and security. However, the integrity of the transmitted data cannot be guaranteed, and transferring files as mail is inefficient;
4) Transmission based on middleware
Data is transferred through middleware such as MQ and MT, which offers data compression, large-file transfer and resumable transfer, and can deliver files securely and reliably. However, most current middleware-based approaches suffer from low efficiency, require interface development work, and are inflexible in data extraction.
Summary of the invention
The technical task of the present invention is to provide a data transmission method based on logstash + kafka.
The technical task of the present invention is achieved as follows:
A data transmission method based on logstash + kafka, comprising a business data production end operation and a business data consumption end operation;
The operating steps are as follows:
Step 1) the business data production end logstash automatically collects business data from the business system database and transmits it to the middleware kafka;
Step 2) the middleware kafka gathers the data transmitted by each business data production end and sends it uniformly to the business data consumption end logstash;
Step 3) the business data consumption end logstash receives the messages from the middleware kafka, converts them into data files, and stores them in the target database.
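The three steps above can be pictured with a small in-memory sketch. It is only an illustration under stated assumptions: a Python `queue.Queue` stands in for the kafka middleware, the functions `produce` and `consume` and the sample rows are hypothetical, and CSV text stands in for the data file. None of this is part of the configuration described in this document.

```python
import csv
import io
import json
import queue


def produce(rows, broker, topic):
    """Step 1: serialize each collected row as JSON and publish it to the broker."""
    for row in rows:
        broker.put((topic, json.dumps(row)))


def consume(broker, fields):
    """Steps 2 and 3: drain the broker and turn the messages into CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    while not broker.empty():
        _topic, payload = broker.get()
        record = json.loads(payload)
        writer.writerow([record[name] for name in fields])
    return out.getvalue()


# Assumption: an in-memory queue replaces the real kafka middleware.
broker = queue.Queue()
# Hypothetical rows, as if read from the business system database.
rows = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]

produce(rows, broker, "test")
csv_text = consume(broker, ["id", "name"])
print(csv_text)
```

The point of the sketch is the decoupling: the producing side knows only the topic it writes to, and the consuming side knows only the message format and the fields it wants in the file.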
In step 1), automatically collecting business data from the business system database comprises:
The business data to be collected is specified in the logstash configuration file, with no need to develop an interface program.
In step 1), multiple logstash tools can be deployed at the business data production end.
Each logstash tool extracts the data of a different business.
The multiple logstash tools perform data extraction concurrently.
The middleware kafka is deployed as a cluster across multiple servers; likewise, the business data consumption end logstash starts multiple logstash tools according to the business scope of the extracted data, each of which receives different data from the middleware kafka cluster and processes it concurrently into data files.
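The division of work described above (one collector per business, running concurrently, with each consumer reading only its own data) can be sketched as follows. This is a hedged illustration: the dictionary `topics` stands in for topics on the kafka cluster, threads stand in for separate logstash tools, and the names `collect`, `drain`, `orders` and `items` are hypothetical.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

topics = defaultdict(list)      # stand-in for kafka topics, one per business
topics_lock = threading.Lock()  # protects the shared dictionary


def collect(business, rows):
    """One production-end collector: publish every row to its own topic."""
    for row in rows:
        with topics_lock:
            topics[business].append(row)
    return business, len(rows)


def drain(business):
    """One consumption-end worker: read only the topic it is responsible for."""
    with topics_lock:
        return list(topics[business])


# Hypothetical business tables and rows, used only for illustration.
sources = {
    "orders": [{"id": 1}, {"id": 2}],
    "items": [{"id": 10}],
}

# Run one collector per business concurrently.
with ThreadPoolExecutor(max_workers=len(sources)) as pool:
    published = dict(pool.map(lambda item: collect(*item), sources.items()))

print(published)
print(drain("orders"))
```

Because each collector writes to a separate topic, adding a new business in this sketch means adding one more entry to `sources` rather than changing the existing collectors, which mirrors the configuration-only expansion the text describes.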
A data transmission device based on logstash + kafka, comprising a business data production end and a business data consumption end;
The business data production end is used to collect the business data generated by the business systems;
The business data consumption end is used to process the collected data and gather it into a unified data platform.
The business data production end comprises an input module and an output module;
The input module performs data collection, automatically collecting the required business data from the business database;
The output module performs data transmission, converting the collected data into messages and sending them to the kafka queue of the data production end.
The business data consumption end comprises a consumption-end kafka data middleware and a consumption-end logstash;
The consumption-end kafka data middleware is used to transmit and receive the business data messages sent by each business system;
The consumption-end logstash is used to receive the middleware kafka messages, process them into data files, and store them in the unified data platform.
Compared with the prior art, in the data transmission method based on logstash + kafka of the present invention, determining the data content does not require interface development; it is done simply by editing the logstash configuration file at the data production end. When the business expands and the data scope grows, only the configuration file needs to be modified. As the data volume increases, performance can be maintained by scaling out the logstash and kafka clusters. The final data format can be produced by configuring different logstash configuration files according to the enterprise's needs, and the resulting files can be placed at a designated location on the server according to the naming rule given in the configuration file.
Description of the drawings
Figure 1 is a flow diagram of the data transmission method based on logstash + kafka.
Specific embodiments
Embodiment 1:
Device configuration:
A data transmission device based on logstash + kafka, comprising a business data production end and a business data consumption end;
The business data production end is used to collect the business data generated by the business systems. It comprises an input module and an output module. The input module performs data collection, automatically collecting the required business data from the business database. The output module performs data transmission, converting the collected data into messages and sending them to the kafka queue of the data production end.
The business data consumption end is used to process the collected data and gather it into a unified data platform. It comprises a consumption-end kafka data middleware and a consumption-end logstash. The consumption-end kafka data middleware is used to transmit and receive the business data messages sent by each business system. The consumption-end logstash is used to receive the middleware kafka messages, process them into data files, and store them in the unified data platform.
Operating method:
Step 1) the business data production end logstash automatically collects business data from the business system database and transmits it to the middleware kafka; the business data to be collected is specified in the logstash configuration file, with no need to develop an interface program;
Multiple logstash tools can be deployed at the business data production end; each logstash tool extracts the data of a different business; the multiple logstash tools perform data extraction concurrently.
Step 2) the middleware kafka gathers the data transmitted by each business data production end and sends it uniformly to the business data consumption end logstash;
Step 3) the business data consumption end logstash receives the messages from the middleware kafka, converts them into data files, and stores them in the target database.
The middleware kafka is deployed as a cluster across multiple servers; likewise, the business data consumption end logstash starts multiple logstash tools according to the business scope of the extracted data, each of which receives different data from the middleware kafka cluster and processes it concurrently into data files.
Business data production end configuration file sample:
input {
  jdbc {
    # Oracle source database (host masked in the original)
    jdbc_connection_string => "jdbc:oracle:thin:@xxx.xxx.xxx.xxx:1521:db1"
    jdbc_user => "user1"
    jdbc_password => "upwd1"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_driver_library => "/opt/langchao/logstash-5.6.6/jdbcdrivers/ojdbc6.jar"
    jdbc_paging_enabled => "false"
    jdbc_page_size => "50000"
    # Select only rows created after the value recorded by the previous run
    statement => "SELECT 'table1' as tablename, id, name, to_char(createdate, 'yyyy-MM-dd HH24:mi:ss') as createdate FROM produce where to_char(createdate, 'yyyy-MM-dd HH24:mi:ss') > to_char(:sql_last_value) order by createdate"
    schedule => "0 0 16 * * *"
    type => "test"
    record_last_run => true
    use_column_value => true
    parameters => {"createdate" => "2018-02-28 23:59:59"}
    tracking_column => "createdate"
    tracking_column_type => "numeric"
    # File in which the last tracking value is kept between runs
    last_run_metadata_path => "/opt/langchao/logstash-5.6.6/runcache/tables/table1"
    clean_run => false
  }
}
output {
  # Events tagged with type "test" are published to the kafka topic "test"
  if [type] == "test" {
    kafka {
      codec => json_lines {
        charset => "UTF-8"
      }
      topic_id => "test"
      bootstrap_servers => "xxx.xxx.xxx.xxx:9092"
    }
  }
}
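The production end sample relies on incremental extraction: each scheduled run selects only rows whose `createdate` is greater than `:sql_last_value`, and the last value seen is recorded for the next run. The sketch below imitates that idea with Python's built-in `sqlite3` instead of Oracle and Logstash. The function `extract_increment`, the in-memory table and its rows are assumptions made for illustration; the sketch does not reproduce how the jdbc input plugin itself stores or parses the tracking value.

```python
import sqlite3


def extract_increment(conn, last_value):
    """Return the rows newer than last_value and the updated tracking value."""
    rows = conn.execute(
        "SELECT id, name, createdate FROM produce "
        "WHERE createdate > ? ORDER BY createdate",
        (last_value,),
    ).fetchall()
    # Keep the old marker when nothing new arrived.
    new_last = rows[-1][2] if rows else last_value
    return rows, new_last


# Assumption: an in-memory SQLite table replaces the Oracle business table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE produce (id INTEGER, name TEXT, createdate TEXT)")
conn.executemany(
    "INSERT INTO produce VALUES (?, ?, ?)",
    [
        (1, "a", "2018-02-28 10:00:00"),
        (2, "b", "2018-03-01 09:00:00"),
        (3, "c", "2018-03-02 08:00:00"),
    ],
)

# First run starts from the initial marker; the second run reuses the new one.
first_batch, marker = extract_increment(conn, "2018-02-28 23:59:59")
second_batch, marker_after = extract_increment(conn, marker)
print(first_batch, marker)
print(second_batch, marker_after)
```

Timestamps formatted as `yyyy-MM-dd HH:mm:ss` sort the same way as strings and as dates, which is why the sketch, like the SQL statement in the sample, can compare them as text.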
Business data consumption end configuration file sample:
input {
  kafka {
    codec => json {
      charset => "UTF-8"
    }
    auto_offset_reset => "earliest"
    topics => ["test"]
    bootstrap_servers => "10.1.80.238:9092"
    max_poll_records => "10"
    request_timeout_ms => "300000"
    session_timeout_ms => "180000"
  }
}
filter {
  date {
    match => ["message","UNIX_MS"]
    target => "@timestamp"
  }
  # Shift @timestamp forward by 8 hours via a temporary field
  ruby {
    code => "event.set('timestamp', event.get('@timestamp').time.localtime + 8*60*60)"
  }
  ruby {
    code => "event.set('@timestamp',event.get('timestamp'))"
  }
  mutate {
    remove_field => ["timestamp"]
  }
  json {
    source => "message"
  }
}
output {
  # Write events of the plm_item table to a dated CSV file
  if [tabname] == "plm_item" {
    csv {
      codec => plain {
        charset => "UTF-8"
      }
      path => "/opt/ldatas/data/%{category}/%{+YYYYMMdd}/%{tabname}_%{comid}_%{+YYYYMMdd-H}.csv"
      dir_mode => 0777
      file_mode => 0777
      filename_failure => "/opt/ldatas/failures/filefailures.txt"
      fields => ["item_id", "item_name"]
      csv_options => {"col_sep" => ","}
    }
  }
}
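The consumption end sample does two things that are easy to miss: it shifts the event timestamp forward by eight hours, and it builds the output file path from event fields and the shifted timestamp. The sketch below shows the resulting path for one hypothetical event. The functions `shift_hours` and `render_path` and the field values are assumptions; the reading of `%{+YYYYMMdd-H}` as a date plus an unpadded hour of day follows Joda-style format letters and should be checked against the Logstash version in use.

```python
from datetime import datetime, timedelta


def shift_hours(ts, hours=8):
    """Mimic the ruby filter: move the timestamp forward by a fixed offset."""
    return ts + timedelta(hours=hours)


def render_path(event, ts):
    """Mimic the csv output path template for one event."""
    day = ts.strftime("%Y%m%d")
    hour = ts.hour  # unpadded hour of day, as assumed for the "H" pattern
    return (
        f"/opt/ldatas/data/{event['category']}/{day}/"
        f"{event['tabname']}_{event['comid']}_{day}-{hour}.csv"
    )


# Hypothetical event fields; only tabname "plm_item" appears in the sample.
event = {"category": "plm", "tabname": "plm_item", "comid": "c01"}
original = datetime(2018, 8, 20, 18, 30, 0)

shifted = shift_hours(original)
path = render_path(event, shifted)
print(path)
```

In this example the shift moves the event across midnight, so the file lands in the directory for the following day. That is the practical effect of adjusting the timestamp before the path is rendered.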
Those skilled in the art can readily implement the present invention from the above specific embodiments. It should be understood, however, that the present invention is not limited to these specific embodiments. On the basis of the disclosed embodiments, those skilled in the art may combine different technical features as desired to realize different technical solutions.

Claims (9)

1. A data transmission method based on logstash + kafka, characterized in that it comprises a business data production end operation and a business data consumption end operation;
The operating steps are as follows:
Step 1) the business data production end logstash automatically collects business data from the business system database and transmits it to the middleware kafka;
Step 2) the middleware kafka gathers the data transmitted by each business data production end and sends it uniformly to the business data consumption end logstash;
Step 3) the business data consumption end logstash receives the messages from the middleware kafka, converts them into data files, and stores them in the target database.
2. The data transmission method according to claim 1, characterized in that, in step 1), automatically collecting business data from the business system database comprises:
The business data to be collected is specified in the logstash configuration file, with no need to develop an interface program.
3. The data transmission method according to claim 1, characterized in that, in step 1), multiple logstash tools can be deployed at the business data production end.
4. The data transmission method according to claim 3, characterized in that each logstash tool extracts the data of a different business.
5. The data transmission method according to claim 3, characterized in that the multiple logstash tools perform data extraction concurrently.
6. The data transmission method according to claim 1, characterized in that the middleware kafka is deployed as a cluster across multiple servers; likewise, the business data consumption end logstash starts multiple logstash tools according to the business scope of the extracted data, each of which receives different data from the middleware kafka cluster and processes it concurrently into data files.
7. A data transmission device based on logstash + kafka, characterized in that it comprises a business data production end and a business data consumption end;
The business data production end is used to collect the business data generated by the business systems;
The business data consumption end is used to process the collected data and gather it into a unified data platform.
8. The data transmission device according to claim 7, characterized in that the business data production end comprises an input module and an output module;
The input module performs data collection, automatically collecting the required business data from the business database;
The output module performs data transmission, converting the collected data into messages and sending them to the kafka queue of the data production end.
9. The data transmission device according to claim 7, characterized in that the business data consumption end comprises a consumption-end kafka data middleware and a consumption-end logstash;
The consumption-end kafka data middleware is used to transmit and receive the business data messages sent by each business system;
The consumption-end logstash is used to receive the middleware kafka messages, process them into data files, and store them in the unified data platform.
CN201810947702.9A 2018-08-20 2018-08-20 Data transmission method based on logstash + kafka Pending CN109104487A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201810947702.9A | 2018-08-20 | 2018-08-20 | Data transmission method based on logstash + kafka

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201810947702.9A | 2018-08-20 | 2018-08-20 | Data transmission method based on logstash + kafka

Publications (1)

Publication Number | Publication Date
CN109104487A (en) | 2018-12-28

Family

ID=64850446

Family Applications (1)

Application Number | Status / Publication | Priority Date | Filing Date | Title
CN201810947702.9A | Pending, CN109104487A (en) | 2018-08-20 | 2018-08-20 | Data transmission method based on logstash + kafka

Country Status (1)

Country | Link
CN (1) | CN109104487A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110401724A (en) * 2019-08-22 2019-11-01 北京旷视科技有限公司 File management method, ftp server and storage medium
CN110442436A (en) * 2019-07-12 2019-11-12 平安普惠企业管理有限公司 Process management method and relevant apparatus based on container
CN111753007A (en) * 2020-06-16 2020-10-09 国家电网有限公司客户服务中心 Pluggable component data aggregation system and method based on multiple systems

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294672A (en) * 2016-08-08 2017-01-04 杭州玳数科技有限公司 The method and system that a kind of daily record represents in real time and inquires about
CN106330963A (en) * 2016-10-11 2017-01-11 江苏电力信息技术有限公司 Cross-network multi-node log collecting method
CN106844171A (en) * 2016-12-27 2017-06-13 浪潮软件集团有限公司 Mass operation and maintenance implementation method
US20170272516A1 (en) * 2016-03-17 2017-09-21 International Business Machines Corporation Providing queueing in a log streaming messaging system
CN108365985A (en) * 2018-02-07 2018-08-03 深圳壹账通智能科技有限公司 A kind of cluster management method, device, terminal device and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170272516A1 (en) * 2016-03-17 2017-09-21 International Business Machines Corporation Providing queueing in a log streaming messaging system
CN106294672A (en) * 2016-08-08 2017-01-04 杭州玳数科技有限公司 The method and system that a kind of daily record represents in real time and inquires about
CN106330963A (en) * 2016-10-11 2017-01-11 江苏电力信息技术有限公司 Cross-network multi-node log collecting method
CN106844171A (en) * 2016-12-27 2017-06-13 浪潮软件集团有限公司 Mass operation and maintenance implementation method
CN108365985A (en) * 2018-02-07 2018-08-03 深圳壹账通智能科技有限公司 A kind of cluster management method, device, terminal device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘锴: "海量数据日志系统架构分析与应用", 《长春工业大学学报》 *
王力群: "基于日志分析平台的监控系统的设计与实现", 《计算机应用与软件》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442436A (en) * 2019-07-12 2019-11-12 平安普惠企业管理有限公司 Process management method and relevant apparatus based on container
CN110401724A (en) * 2019-08-22 2019-11-01 北京旷视科技有限公司 File management method, ftp server and storage medium
CN110401724B (en) * 2019-08-22 2022-04-12 北京旷视科技有限公司 File management method, file transfer protocol server and storage medium
CN111753007A (en) * 2020-06-16 2020-10-09 国家电网有限公司客户服务中心 Pluggable component data aggregation system and method based on multiple systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181228