CN110413701A - Distributed data base storage method, system, equipment and storage medium

Info

Publication number
CN110413701A
Authority
CN
China
Prior art keywords
data
distributed
storage
data base
distributed data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910730157.2A
Other languages
Chinese (zh)
Inventor
张宁
董延峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Manyun Software Technology Co Ltd
Original Assignee
Jiangsu Manyun Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Manyun Software Technology Co Ltd
Priority to CN201910730157.2A
Publication of CN110413701A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a distributed database storage method, system, device and storage medium. The method comprises: obtaining data storage configuration data, which includes information on a synchronization table in a Kafka message queue and information on a landing table in a distributed database; receiving, by a Flink distributed stream processing engine, data to be stored from the synchronization table in the Kafka message queue; and inserting, by the Flink distributed stream processing engine, the data to be stored into the landing table of the distributed database. With this solution, the time spent in the data processing pipeline is greatly reduced, business scenarios with multiple sinks and multiple sources are supported, the solution is user-friendly, and the efficiency of data processing and analysis is greatly improved.

Description

Distributed data base storage method, system, equipment and storage medium
Technical field
The present invention relates to the field of big data processing, and in particular to a distributed database storage method, system, device and storage medium.
Background
With the rapid development of the Internet and Internet technology, the volume of data generated every day is growing exponentially, and processing and analyzing this mass of data has enormous practical value. As the share of real-time data grows, traditional offline computation finds it increasingly difficult to meet analysis needs, so stream computing is being used more and more widely.
The massively parallel data analysis engine Greenplum is a database built on a PostgreSQL core and offers resource sharing, high concurrency and fast data processing. In today's complex and varied data processing scenarios, MySQL cannot be scaled out in a distributed manner and the architecture of Hive is better suited to offline analysis, so Greenplum retains a clear advantage in near-real-time scenarios. Once data has landed, distributed computation is performed and results are output quickly, which meets the growing real-time requirements of big data analysis. Because the computation must be timely and accurate, the timeliness and accuracy of the ETL (Extract, Transform, Load) step that lands source data in Greenplum also become particularly important.
Although the Greenplum distributed database has been under development for many years, its options for landing real-time data are very limited, and no other ETL solution is currently available on the market. The data transfer method published by Greenplum uses the JSON data format: a local property file is configured and the data is transferred after a command is run. However, this approach does not support corresponding data processing, its transfer rate is relatively low, and tables must be configured one by one, which greatly reduces productivity and gives poor support for near-real-time scenarios.
Summary of the invention
In view of the problems in the prior art, the purpose of the present invention is to provide a distributed database storage method, system, device and storage medium that are easy to configure and use and that offer low latency and high throughput.
An embodiment of the present invention provides a distributed database storage method comprising the following steps:
obtaining data storage configuration data, the data storage configuration data including information on a synchronization table in a Kafka message queue and information on a landing table in a distributed database;
receiving, by a Flink distributed stream processing engine, data to be stored from the synchronization table in the Kafka message queue;
inserting, by the Flink distributed stream processing engine, the data to be stored into the landing table of the distributed database.
Optionally, inserting, by the Flink distributed stream processing engine, the data to be stored into the landing table of the distributed database includes the following steps:
the Flink distributed stream processing engine parses the data source type of the data to be stored;
the Flink distributed stream processing engine selects, according to the data source type, the operation to perform on the landing table of the distributed database.
Optionally, the data source type includes an insert type, an update type and a delete type;
selecting the operation to perform on the landing table of the distributed database includes the following steps:
if the data to be stored is of the insert type, inserting the data to be stored into the landing table of the distributed database;
if the data to be stored is of the update type, using the data to be stored to update the corresponding data in the landing table of the distributed database;
if the data to be stored is of the delete type, deleting from the landing table of the distributed database the data that corresponds to the data to be stored.
Optionally, after the Flink distributed stream processing engine receives the data to be stored from the synchronization table in the Kafka message queue, the method further includes the following step:
the Flink distributed stream processing engine performs data filtering and data format conversion on the data to be stored.
Optionally, obtaining the data storage configuration data includes the following step:
obtaining task information of a pending task from a real-time computing platform, the task information including the data storage configuration data.
Optionally, the method further includes the following step:
writing monitoring parameters to an OpenTSDB time series database.
Optionally, the method further includes the following step:
displaying the monitoring parameters according to the user's visualization configuration data.
Optionally, the method further includes the following steps:
determining whether the monitoring parameters meet a preset task anomaly alert condition;
if so, determining the task anomaly alert level corresponding to the data storage parameter, and selecting the corresponding alert method according to a preset mapping between alert methods and task anomaly alert levels.
An embodiment of the present invention also provides a distributed database storage system that applies the distributed database storage method, the system comprising:
a configuration acquisition module for obtaining data storage configuration data, the data storage configuration data including information on a synchronization table in a Kafka message queue and information on a landing table in a distributed database;
a data processing module for receiving, by means of a Flink distributed stream processing engine, data to be stored from the synchronization table in the Kafka message queue and inserting the data to be stored into the landing table of the distributed database.
An embodiment of the present invention also provides a distributed database storage device, comprising:
a processor;
a memory storing instructions executable by the processor;
wherein the processor is configured to perform the steps of the distributed database storage method by executing the executable instructions.
An embodiment of the present invention also provides a computer-readable storage medium for storing a program which, when executed, implements the steps of the distributed database storage method.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the present disclosure.
The distributed database storage method, system, device and storage medium provided by the present invention have the following advantages:
The present invention solves the problems in the prior art by using the real-time computation engine Flink for data processing. Real-time data from the Kafka message queue is parsed and processed efficiently in the Flink distributed stream processing engine, and the database is then updated according to the needs of the business scenario. Performing the ETL process in the Flink distributed stream processing engine and exploiting its efficient data processing greatly reduces the time spent in the data processing pipeline. Business scenarios with multiple sinks and multiple sources are supported, and the solution is user-friendly. Data processed by the Flink distributed stream processing engine is inserted directly into, or used to update, the corresponding table, so that once the data has landed, data analysts can carry out business analysis directly, which greatly improves data processing efficiency.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent upon reading the detailed description of non-limiting embodiments with reference to the following drawings.
Fig. 1 is a flow chart of the distributed database storage method of an embodiment of the present invention;
Fig. 2 is a flow chart of data processing by the Flink distributed stream processing engine of an embodiment of the present invention;
Fig. 3 is a flow chart of monitoring the distributed database storage process of an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of the distributed database storage system of an embodiment of the present invention;
Fig. 5 is an architecture diagram of the distributed database storage system of an embodiment of the present invention;
Fig. 6 is a schematic diagram of the distributed database storage device of an embodiment of the present invention;
Fig. 7 is a schematic diagram of the computer-readable storage medium of an embodiment of the present invention.
Detailed description
Example embodiments will now be described more fully with reference to the drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
In addition, the drawings are only schematic illustrations of the present disclosure and are not necessarily drawn to scale. Identical reference numerals in the figures denote identical or similar parts, so their repeated description is omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
As shown in Fig. 1, an embodiment of the present invention provides a distributed database storage method comprising the following steps:
S100: obtaining data storage configuration data, the data storage configuration data including information on a synchronization table in a Kafka message queue and information on a landing table in a distributed database;
S200: the Flink distributed stream processing engine receives data to be stored from the synchronization table in the Kafka message queue;
S300: the Flink distributed stream processing engine inserts the data to be stored into the landing table of the distributed database. The distributed database may be a Greenplum distributed database, but the present invention is not limited thereto; application to other distributed databases is also possible and falls within the scope of protection of the present invention. The following description uses the Greenplum database as an example.
The distributed database storage method of the present invention therefore uses the real-time computation engine Flink for data processing. Real-time data from the Kafka message queue is parsed and processed efficiently in the Flink distributed stream processing engine, and the database is then updated according to the needs of the business scenario. Performing the ETL process in the Flink distributed stream processing engine and exploiting its efficient data processing greatly reduces the time spent in the data processing pipeline. Business scenarios with multiple sinks and multiple sources are supported, and the solution is user-friendly.
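Steps S100 to S300 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `StorageConfig` fields, the message layout and the in-memory `database` dictionary are hypothetical stand-ins for the Kafka source, the Flink job and the Greenplum landing table, none of which the patent specifies at this level of detail.

```python
from dataclasses import dataclass


@dataclass
class StorageConfig:
    # Hypothetical field names: the source only states that the configuration
    # carries synchronization-table and landing-table information.
    sync_table: str
    landing_table: str


def run_pipeline(config, messages, database):
    """Consume rows of the configured sync table (S200) and land them (S300)."""
    rows = [m["row"] for m in messages if m["table"] == config.sync_table]
    database.setdefault(config.landing_table, []).extend(rows)
    return len(rows)


# S100: obtain the configuration (hard-coded here for illustration).
config = StorageConfig(sync_table="orders", landing_table="ods_orders")

# Stand-in for messages read from the Kafka message queue.
messages = [
    {"table": "orders", "row": {"id": 1, "amount": 10}},
    {"table": "users", "row": {"id": 7}},
    {"table": "orders", "row": {"id": 2, "amount": 25}},
]

database = {}  # stand-in for the distributed database
inserted = run_pipeline(config, messages, database)
```

In a real deployment the list comprehension would be replaced by a streaming source and the dictionary by a database sink; the sketch only shows how the configuration ties a synchronization table to its landing table.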
As shown in Fig. 2, in this embodiment, after step S200, in which the Flink distributed stream processing engine receives the data to be stored from the synchronization table in the Kafka message queue, the method further includes the following step:
S210: the Flink distributed stream processing engine performs data filtering and data format conversion on the data to be stored. Data filtering here may mean filtering the data according to preset filter conditions, for example removing duplicate data or invalid data. Data format conversion may mean converting the data into the required format.
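A minimal sketch of step S210 follows, assuming messages arrive as JSON strings with `id` and `amount` fields. The field names, the duplicate rule and the target types are assumptions made for illustration; the patent leaves the filter conditions and target format open.

```python
import json


def clean(raw_messages):
    """Drop invalid and duplicate messages, then normalise field types."""
    seen = set()
    cleaned = []
    for raw in raw_messages:
        try:
            record = json.loads(raw)
            # Format conversion: coerce the fields to the types the table expects.
            row = {"id": int(record["id"]), "amount": float(record["amount"])}
        except (ValueError, KeyError, TypeError):
            continue  # invalid data: not JSON, missing field, or wrong type
        key = (row["id"], row["amount"])
        if key in seen:
            continue  # duplicate data
        seen.add(key)
        cleaned.append(row)
    return cleaned


raw = [
    '{"id": "1", "amount": "9.5"}',
    "not json",
    '{"id": "1", "amount": "9.5"}',
    '{"amount": "3"}',
]
result = clean(raw)
```

Only the first message survives: the second is not valid JSON, the third repeats the first, and the fourth lacks an `id`.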
In the prior art, when data is stored in a distributed database, all source data is often written to the database directly, without any processing. For example, a table in the database may hold several update records for the same item; when users analyze the data in that table, they must replay the update records in chronological order before they can obtain the latest content.
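The extra work this places on the analyst can be illustrated as follows. The sketch assumes each stored record carries a hypothetical `id` and a timestamp `ts`; these names are not from the patent.

```python
def latest_rows(change_log):
    """Replay update records in time order and keep the last one per id."""
    latest = {}
    for record in sorted(change_log, key=lambda r: r["ts"]):
        latest[record["id"]] = record
    return latest


# Raw records as they would sit in an unprocessed table.
change_log = [
    {"id": 1, "ts": 3, "status": "paid"},
    {"id": 1, "ts": 1, "status": "new"},
    {"id": 2, "ts": 2, "status": "new"},
]
current = latest_rows(change_log)
```

Every query against the raw table has to repeat this replay, which is the cost the embodiment avoids by applying updates when the data lands.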
To solve this problem, in this embodiment, inserting, by the Flink distributed stream processing engine, the data to be stored into the landing table of the distributed database includes the following steps:
S310: the Flink distributed stream processing engine parses the data source type of the data to be stored;
S320: the Flink distributed stream processing engine selects, according to the data source type, the operation to perform on the landing table of the distributed database.
Specifically, the data source type may include an insert type, an update type and a delete type. Step S320, selecting the operation to perform on the landing table of the distributed database, includes the following steps:
S321: if the data to be stored is of the insert type, inserting the data to be stored into the landing table of the distributed database;
S322: if the data to be stored is of the update type, using the data to be stored to update the corresponding data in the landing table of the distributed database;
S323: if the data to be stored is of the delete type, deleting from the landing table of the distributed database the data that corresponds to the data to be stored.
Thus, through steps S310 and S320, the Flink distributed stream processing engine classifies the data by data source type and inserts it directly, updates the corresponding table, or deletes the data from the corresponding table. Once the data has landed, data analysts can use the data in the distributed database directly for business analysis, which improves the efficiency of data processing and data analysis.
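The dispatch in steps S321 to S323 can be sketched as below. As an assumption for the sake of a runnable example, an in-memory SQLite database stands in for the Greenplum landing table, and the `op`/`row` message layout and the `landing` table schema are hypothetical.

```python
import sqlite3


def apply_change(conn, change):
    """Choose the landing-table operation from the record's type (S320)."""
    op, row = change["op"], change["row"]
    if op == "insert":    # S321
        conn.execute(
            "INSERT INTO landing (id, status) VALUES (?, ?)",
            (row["id"], row["status"]),
        )
    elif op == "update":  # S322
        conn.execute(
            "UPDATE landing SET status = ? WHERE id = ?",
            (row["status"], row["id"]),
        )
    elif op == "delete":  # S323
        conn.execute("DELETE FROM landing WHERE id = ?", (row["id"],))
    else:
        raise ValueError(f"unknown data source type: {op!r}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE landing (id INTEGER PRIMARY KEY, status TEXT)")

changes = [
    {"op": "insert", "row": {"id": 1, "status": "new"}},
    {"op": "insert", "row": {"id": 2, "status": "new"}},
    {"op": "update", "row": {"id": 1, "status": "paid"}},
    {"op": "delete", "row": {"id": 2}},
]
for change in changes:
    apply_change(conn, change)

rows = conn.execute("SELECT id, status FROM landing ORDER BY id").fetchall()
```

After the four changes the landing table holds only the current state of row 1, so an analyst can query it without replaying the change history.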
In this embodiment, obtaining the data storage configuration data in step S100 includes the following step:
obtaining task information of a pending task from a real-time computing platform, the task information including the data storage configuration data. The data storage configuration data in the task information may include information on multiple synchronization tables and multiple landing tables. That is, configurations with multiple sinks and multiple sources are supported, and by configuring a single task the user can route multiple tables to their respective destinations.
When publishing a task on the real-time computing platform, the user configures, according to business needs, the tables in the Kafka message queue that need to be synchronized and the names of the tables in the Greenplum database in which they land, and sets the task parallelism and similar parameters according to the data volume of the synchronization tables. The data storage configuration data may include, but is not limited to, the Kafka broker address, the topic, the synchronization table names, the landing table names in the Greenplum database, and unique key constraints. When the Flink distributed stream processing engine consumes data from the Kafka message queue, the consumption rate is governed by the task parallelism.
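One possible shape for such a task configuration is sketched below. The JSON layout, the key names (`brokers`, `topic`, `parallelism`, `mappings`) and the sample values are all assumptions; the patent lists the kinds of information involved but not a concrete format.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TableMapping:
    sync_table: str        # table name carried in the Kafka messages
    landing_table: str     # target table in the distributed database
    unique_key: List[str]  # columns forming the unique key constraint


@dataclass
class StorageTask:
    brokers: str
    topic: str
    parallelism: int
    mappings: List[TableMapping]

    def route(self) -> Dict[str, str]:
        """Map each synchronization table to its landing table."""
        return {m.sync_table: m.landing_table for m in self.mappings}


def parse_task(task_json: str) -> StorageTask:
    raw = json.loads(task_json)
    return StorageTask(
        brokers=raw["brokers"],
        topic=raw["topic"],
        parallelism=int(raw.get("parallelism", 1)),
        mappings=[TableMapping(**m) for m in raw["mappings"]],
    )


task_json = json.dumps({
    "brokers": "kafka-1:9092",
    "topic": "binlog",
    "parallelism": 4,
    "mappings": [
        {"sync_table": "orders", "landing_table": "ods_orders", "unique_key": ["id"]},
        {"sync_table": "drivers", "landing_table": "ods_drivers", "unique_key": ["id"]},
    ],
})
task = parse_task(task_json)
```

A single task here carries two table mappings, which mirrors the multi-table routing described above; `task.route()` gives the lookup a job would use to pick the landing table for each incoming record.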
To make the distributed database storage method reliable, as shown in Fig. 3, in this embodiment the method further includes a monitoring step, specifically step S410: writing monitoring parameters to an OpenTSDB time series database. OpenTSDB is a distributed, scalable time series database based on HBase. It is mainly used in monitoring systems, for example to collect, store and query monitoring data from large clusters.
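As an illustration of step S410, the sketch below builds data points in the JSON shape accepted by OpenTSDB's HTTP `/api/put` endpoint (metric, timestamp, value, tags). The metric names and tag values are hypothetical, and the sketch only constructs the request body; it does not send it.

```python
import json
import time


def to_datapoint(metric, value, tags, timestamp=None):
    """Build one data point in the JSON form used by OpenTSDB's /api/put."""
    if not tags:
        # OpenTSDB requires at least one tag on every data point.
        raise ValueError("at least one tag is required")
    ts = int(timestamp if timestamp is not None else time.time())
    return {"metric": metric, "timestamp": ts, "value": value, "tags": dict(tags)}


# Hypothetical monitoring parameters for one storage task.
points = [
    to_datapoint("ingest.records_out", 1200, {"task": "orders_to_gp"}, 1565000000),
    to_datapoint("ingest.consumer_lag", 35, {"task": "orders_to_gp"}, 1565000000),
]
payload = json.dumps(points)  # body that would be POSTed to /api/put
```

Tagging each point with the task name is a design choice made here so that a dashboard can filter and group metrics per storage task.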
In this embodiment, the distributed database storage method further includes the following step:
S420: displaying the monitoring parameters according to the user's visualization configuration data. Visualization configuration and display can be implemented with Grafana, a cross-platform, open-source metrics analysis and visualization tool that can query the collected data, display it visually, and send timely notifications.
After publishing a task through the real-time computing platform, the user can additionally enable monitoring and alerting. In this embodiment, the distributed database storage method further includes an anomaly alerting step, which specifically includes the following steps:
S431: determining whether the monitoring parameters meet a preset task anomaly alert condition;
S432: if so, determining the task anomaly alert level corresponding to the data storage parameter, and selecting the corresponding alert method according to a preset mapping between alert methods and task anomaly alert levels;
S433: if not, no task anomaly alert is triggered, and real-time monitoring of the distributed database storage process continues.
When enabling monitoring and alerting, the user can enter the information the alerts require, such as the phone number to call and the DingTalk account name. Step S432 issues alerts according to the task anomaly level: for example, a phone alert when the task fails, a DingTalk alert when a task metric deviates from its baseline, and DingTalk and email alerts when event-tracking data is abnormal. The alerting in this embodiment is therefore more flexible and more user-friendly.
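Steps S431 to S433 amount to classifying the monitored values and looking up the alert method for the resulting level. The sketch below is one way to express that; the level names, metric keys and the lag threshold are assumptions chosen to mirror the examples in the text.

```python
from enum import Enum


class Level(Enum):
    TASK_FAILED = "task_failed"
    METRIC_ABNORMAL = "metric_abnormal"
    TRACKING_DATA_ABNORMAL = "tracking_data_abnormal"


# Preset mapping between alert levels and alert methods (S432).
CHANNELS = {
    Level.TASK_FAILED: ["phone"],
    Level.METRIC_ABNORMAL: ["dingtalk"],
    Level.TRACKING_DATA_ABNORMAL: ["dingtalk", "email"],
}


def classify(metrics, max_lag=1000):
    """Return the alert level for the monitored values, or None (S431)."""
    if metrics.get("task_state") == "FAILED":
        return Level.TASK_FAILED
    if metrics.get("consumer_lag", 0) > max_lag:
        return Level.METRIC_ABNORMAL
    if metrics.get("tracking_errors", 0) > 0:
        return Level.TRACKING_DATA_ABNORMAL
    return None


def alert_channels(metrics):
    """Select alert methods; an empty list means no alert is raised (S433)."""
    level = classify(metrics)
    return [] if level is None else CHANNELS[level]


failed = alert_channels({"task_state": "FAILED"})
healthy = alert_channels({"task_state": "RUNNING", "consumer_lag": 5})
```

Keeping the level-to-method mapping in a table separate from the classification logic lets the alert methods be reconfigured without touching the anomaly checks.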
As shown in Fig. 4, an embodiment of the present invention also provides a distributed database storage system that applies the distributed database storage method, the system comprising:
a configuration acquisition module M100 for obtaining data storage configuration data, the data storage configuration data including information on a synchronization table in a Kafka message queue and information on a landing table in a distributed database;
a data processing module M200 for receiving, by means of a Flink distributed stream processing engine, data to be stored from the synchronization table in the Kafka message queue and inserting the data to be stored into the landing table of the distributed database.
The function of each module is implemented in the same way as the corresponding part of the distributed database storage method described above. For example, the configuration acquisition module M100 can use the specific implementation of step S100, and the data processing module M200 can use the specific implementations of steps S200 and S300, which are not repeated here.
The distributed database storage system of the present invention therefore uses the real-time computation engine Flink for data processing. Real-time data from the Kafka message queue is parsed and processed efficiently in the Flink distributed stream processing engine, and the database is then updated according to the needs of the business scenario. Performing the ETL process in the Flink distributed stream processing engine and exploiting its efficient data processing greatly reduces the time spent in the data processing pipeline. Business scenarios with multiple sinks and multiple sources are supported, and the solution is user-friendly.
Further, in this embodiment, the distributed database storage system also includes a data monitoring module M300, which writes monitoring parameters to an OpenTSDB time series database and can further display the monitoring parameters according to the user's visualization configuration data.
Further, in this embodiment, the distributed database storage system also includes an anomaly alerting module M400, which determines whether the monitoring parameters meet a preset task anomaly alert condition. If so, it determines the task anomaly alert level corresponding to the data storage parameter and selects the corresponding alert method according to the preset mapping between alert methods and task anomaly alert levels; otherwise, no task anomaly alert is triggered.
In this embodiment, therefore, the distributed database storage system ensures the reliability of the storage process through the data monitoring module M300 and the anomaly alerting module M400, and its alerting is more flexible and more user-friendly.
Fig. 5 shows the architecture diagram of the distributed database storage system of an embodiment of the present invention. Spring MVC is part of the Spring framework: after the Spring framework became the mainstream framework for Java EE development, the Spring development team introduced an MVC framework on top of it, mainly to support the development of web applications. MyBatis is a persistence layer framework that supports custom SQL, stored procedures and advanced mappings. MyBatis can use simple XML or annotations for configuration and mapping, mapping interfaces and Java POJOs (Plain Ordinary Java Objects) to records in the database. MySQL is a relational database management system; a relational database keeps data in separate tables rather than placing all data in one large store, which improves speed and flexibility.
An embodiment of the present invention also provides a distributed database storage device, including a processor and a memory storing instructions executable by the processor, wherein the processor is configured to perform the steps of the distributed database storage method by executing the executable instructions.
Those skilled in the art will understand that aspects of the present invention may be implemented as a system, a method or a program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode and the like), or an embodiment combining hardware and software aspects, all of which may be referred to here as a "circuit", "module" or "platform".
An electronic device 600 according to this embodiment of the present invention is described below with reference to Fig. 6. The electronic device 600 shown in Fig. 6 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in Fig. 6, the electronic device 600 takes the form of a general-purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 connecting different platform components (including the storage unit 620 and the processing unit 610), a display unit 640, and so on.
The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to the various exemplary embodiments of the present invention described in the method section of this specification above. For example, the processing unit 610 may perform the steps shown in Fig. 1.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory (RAM) unit 6201 and/or a cache memory unit 6202, and may further include a read-only memory (ROM) unit 6203.
The storage unit 620 may also include a program/utility 6204 having a set of (at least one) program modules 6205. Such program modules 6205 include, but are not limited to: an operating system, one or more application programs, other program modules and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 630 may represent one or more of several types of bus structure, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 600 may also communicate with one or more external devices 700 (such as a keyboard, a pointing device or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (such as a router or modem) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication can take place through an input/output (I/O) interface 650. The electronic device 600 can also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter 660. The network adapter 660 can communicate with the other modules of the electronic device 600 through the bus 630. It should be understood that, although not shown in the figure, other hardware and/or software modules can be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives and data backup storage platforms.
The embodiment of the present invention also provides a kind of computer readable storage medium, and for storing program, described program is carried out Described in Shi Shixian the step of distributed data base storage method.In some possible embodiments, each side of the invention Face is also implemented as a kind of form of program product comprising program code, when described program product is transported on the terminal device When row, said program code is for carrying out the terminal device in this specification above-mentioned electronic prescription circulation processing method part The step of various illustrative embodiments according to the present invention of description.
Refering to what is shown in Fig. 7, describing the program product for realizing the above method of embodiment according to the present invention 800, can using portable compact disc read only memory (CD-ROM) and including program code, and can in terminal device, Such as it is run on PC.However, program product of the invention is without being limited thereto, in this document, readable storage medium storing program for executing can be with To be any include or the tangible medium of storage program, the program can be commanded carry out system, device or device use or It is in connection.
Described program product can be using any combination of one or more readable mediums.Readable medium can be readable letter Number medium or readable storage medium storing program for executing.Readable storage medium storing program for executing for example can be but be not limited to electricity, magnetic, optical, electromagnetic, infrared ray or System, device or the device of semiconductor, or any above combination.The more specific example of readable storage medium storing program for executing is (non exhaustive List) include: electrical connection with one or more conducting wires, portable disc, hard disk, random access memory (RAM), read-only Memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disc read only memory (CD-ROM), light storage device, magnetic memory device or above-mentioned any appropriate combination.
The computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
Program code for carrying out the operations of the invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).
In summary, compared with the prior art, the distributed database storage method, system, equipment, and storage medium provided by the invention have the following advantages:
The invention solves problems in the prior art. It performs ETL processing with the Flink distributed data stream engine and uses the engine's efficient data processing to greatly shorten the data processing pipeline. After being processed by the Flink distributed data stream engine, data is inserted directly into, or used to update, the corresponding table; once the data has landed, data analysts can perform business analysis on it directly, which greatly improves data processing efficiency. To address the single-table configuration problem, the scheme supports business scenario configurations with multiple sink nodes and multiple sources, so that a user can configure a single task that splits and delivers data to multiple tables. The scheme also configures a monitoring stack that is popular in the industry, the opentsdb time series database with grafana visualization, and raises a DingTalk alarm promptly when data processing becomes abnormal, thereby monitoring parameters such as the data consumption rate.
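The flow summarized above can be illustrated with a minimal, library-free Python sketch. It is not the patented implementation: the Kafka synchronization table is simulated by a list of dictionaries, the landing tables by in-memory dictionaries keyed on a primary key, and the names `LoadConfig`, `apply_change`, `run_task`, together with the message fields `op`, `pk`, and `row`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List

# Hypothetical record layout: {"op": "insert"|"update"|"delete", "pk": ..., "row": {...}}
Change = Dict[str, object]
Table = Dict[object, Dict[str, object]]


@dataclass
class LoadConfig:
    """Assumed shape of the data loading configuration data."""
    sync_table: str  # synchronization table (topic) to read from
    landing_tables: List[str] = field(default_factory=list)  # one or more sink tables


def apply_change(table: Table, change: Change) -> None:
    """Select the operation on a landing table from the type of the change."""
    op, pk = change["op"], change["pk"]
    if op == "insert":
        table[pk] = dict(change["row"])
    elif op == "update":
        # Update only the row that already corresponds to this key.
        if pk in table:
            table[pk].update(change["row"])
    elif op == "delete":
        table.pop(pk, None)
    else:
        raise ValueError(f"unknown operation type: {op!r}")


def run_task(config: LoadConfig, stream: Iterable[Change],
             database: Dict[str, Table]) -> None:
    """Fan each change out to every landing table named in the configuration."""
    for change in stream:
        if "pk" not in change:  # simple data filtering step: drop malformed records
            continue
        for name in config.landing_tables:
            apply_change(database.setdefault(name, {}), change)


config = LoadConfig(sync_table="orders_sync", landing_tables=["orders", "orders_copy"])
stream = [
    {"op": "insert", "pk": 1, "row": {"status": "new"}},
    {"op": "insert", "pk": 2, "row": {"status": "new"}},
    {"op": "update", "pk": 1, "row": {"status": "paid"}},
    {"op": "delete", "pk": 2},
    {"row": {"status": "orphan"}},  # malformed record, filtered out
]
database: Dict[str, Table] = {}
run_task(config, stream, database)
print(database["orders"])  # {1: {'status': 'paid'}}
```

In this sketch one task configuration names two landing tables, so every change read from the simulated synchronization table is delivered to both; a real deployment would replace the list with a Kafka source and the dictionaries with writes to the distributed database.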
The above is a further detailed description of the invention in conjunction with specific preferred embodiments, and the specific implementation of the invention is not to be regarded as limited to these descriptions. Those of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the concept of the invention, and all of these shall be regarded as falling within the protection scope of the invention.

Claims (11)

1. A distributed database storage method, characterized by comprising the following steps:
obtaining data loading configuration data, the data loading configuration data comprising information on a synchronization table in a Kafka message queue and information on a landing table in a distributed database;
receiving, by a Flink distributed data stream engine, data to be stored from the synchronization table in the Kafka message queue;
inserting, by the Flink distributed data stream engine, the data to be stored into the landing table of the distributed database.
2. The distributed database storage method according to claim 1, characterized in that inserting, by the Flink distributed data stream engine, the data to be stored into the landing table of the distributed database comprises the following steps:
parsing, by the Flink distributed data stream engine, the data source type of the data to be stored;
selecting, by the Flink distributed data stream engine, the mode of operation on the landing table of the distributed database according to the data source type.
3. The distributed database storage method according to claim 2, characterized in that the data source type comprises an insert type, an update type, and a delete type;
selecting the mode of operation on the landing table of the distributed database comprises the following steps:
if the data to be stored is of the insert type, inserting the data to be stored into the landing table of the distributed database;
if the data to be stored is of the update type, updating the corresponding data in the landing table of the distributed database with the data to be stored;
if the data to be stored is of the delete type, deleting the data in the landing table of the distributed database that corresponds to the data to be stored.
4. The distributed database storage method according to claim 1, characterized in that, after the Flink distributed data stream engine receives the data to be stored from the synchronization table in the Kafka message queue, the method further comprises the following step:
performing, by the Flink distributed data stream engine, data filtering and data format conversion on the data to be stored.
5. The distributed database storage method according to claim 1, characterized in that obtaining the data loading configuration data comprises the following step:
obtaining task information of a task to be processed from a real-time computing platform, the task information comprising the data loading configuration data.
6. The distributed database storage method according to claim 1, characterized in that the method further comprises the following step:
writing monitoring parameters into an opentsdb time series database.
7. The distributed database storage method according to claim 6, characterized in that the method further comprises the following step:
displaying the monitoring parameters according to visualization configuration data of a user.
8. The distributed database storage method according to claim 6, characterized in that the method further comprises the following steps:
judging whether the monitoring parameters meet a preset task abnormality alarm condition;
if so, determining the task abnormality alarm level corresponding to the data loading parameter, and selecting the corresponding alarm mode to raise an alarm according to a preset mapping between alarm modes and task abnormality alarm levels.
9. A distributed database storage system, characterized in that it is applied to the distributed database storage method according to any one of claims 1 to 8, the system comprising:
a configuration obtaining module for obtaining data loading configuration data, the data loading configuration data comprising information on a synchronization table in a Kafka message queue and information on a landing table in a distributed database;
a data processing module for receiving, based on a Flink distributed data stream engine, data to be stored from the synchronization table in the Kafka message queue, and inserting the data to be stored into the landing table of the distributed database.
10. A distributed database storage device, characterized by comprising:
a processor;
a memory in which instructions executable by the processor are stored;
wherein the processor is configured to perform, by executing the executable instructions, the steps of the distributed database storage method according to any one of claims 1 to 8.
11. A computer-readable storage medium for storing a program, characterized in that the program, when executed, implements the steps of the distributed database storage method according to any one of claims 1 to 8.
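Claims 6 to 8 describe recording monitoring parameters, checking them against a preset alarm condition, and choosing an alarm mode from a mapping between alarm levels and alarm modes. The decision logic might be sketched in Python as follows. The thresholds, level names, alarm modes, and the `consumption_lag` parameter are illustrative assumptions rather than values from the patent, and both the write to opentsdb and the actual notification are replaced by plain return values.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical alarm conditions: (minimum lag, level), checked from most to least severe.
ALERT_LEVELS: List[Tuple[int, str]] = [(10_000, "critical"), (1_000, "warning")]

# Hypothetical preset mapping between alarm level and alarm mode.
ALERT_MODES: Dict[str, str] = {"critical": "phone", "warning": "chat_message"}


def alert_level(consumption_lag: int) -> Optional[str]:
    """Return the alarm level for a monitoring parameter, or None if it is normal."""
    for threshold, level in ALERT_LEVELS:
        if consumption_lag >= threshold:
            return level
    return None


def select_alert_mode(consumption_lag: int) -> Optional[str]:
    """Map the alarm level to its preset alarm mode; None means no alarm is raised."""
    level = alert_level(consumption_lag)
    return ALERT_MODES[level] if level is not None else None


for lag in (50, 2_500, 40_000):
    print(lag, select_alert_mode(lag))
```

Keeping the level thresholds and the level-to-mode mapping in two separate tables mirrors the claim wording, where the alarm level is determined first and the alarm mode is then looked up from a preset mapping.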
CN201910730157.2A 2019-08-08 2019-08-08 Distributed data base storage method, system, equipment and storage medium Pending CN110413701A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910730157.2A CN110413701A (en) 2019-08-08 2019-08-08 Distributed data base storage method, system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110413701A true CN110413701A (en) 2019-11-05

Family

ID=68366599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910730157.2A Pending CN110413701A (en) 2019-08-08 2019-08-08 Distributed data base storage method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110413701A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180074852A1 (en) * 2016-09-14 2018-03-15 Salesforce.Com, Inc. Compact Task Deployment for Stream Processing Systems
CN108040074A (en) * 2018-01-26 2018-05-15 华南理工大学 A kind of real-time network unusual checking system and method based on big data
CN109271412A (en) * 2018-09-28 2019-01-25 中国-东盟信息港股份有限公司 The real-time streaming data processing method and system of smart city
CN109558400A (en) * 2018-11-28 2019-04-02 北京锐安科技有限公司 Data processing method, device, equipment and storage medium
CN109684352A (en) * 2018-12-29 2019-04-26 江苏满运软件科技有限公司 Data analysis system, method, storage medium and electronic equipment
CN109840253A (en) * 2019-01-10 2019-06-04 北京工业大学 Enterprise-level big data platform framework
CN109951463A (en) * 2019-03-07 2019-06-28 成都古河云科技有限公司 A kind of Internet of Things big data analysis method stored based on stream calculation and novel column

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111159135A (en) * 2019-12-23 2020-05-15 五八有限公司 Data processing method and device, electronic equipment and storage medium
CN111339175A (en) * 2020-02-28 2020-06-26 成都运力科技有限公司 Data processing method and device, electronic equipment and readable storage medium
CN111339175B (en) * 2020-02-28 2023-08-11 成都运力科技有限公司 Data processing method, device, electronic equipment and readable storage medium
CN111538789A (en) * 2020-04-27 2020-08-14 咪咕文化科技有限公司 Data synchronization method and device, electronic equipment and storage medium
CN111538789B (en) * 2020-04-27 2023-08-15 咪咕文化科技有限公司 Data synchronization method, device, electronic equipment and storage medium
CN111625300A (en) * 2020-06-08 2020-09-04 成都信息工程大学 Efficient data acquisition loading method and system
CN111625300B (en) * 2020-06-08 2023-03-24 成都信息工程大学 Efficient data acquisition loading method and system
CN112039968A (en) * 2020-08-25 2020-12-04 中央广播电视总台 Data processing system
CN112288907A (en) * 2020-10-28 2021-01-29 山东超越数控电子股份有限公司 Vehicle real-time monitoring method
CN112559453A (en) * 2020-12-09 2021-03-26 恒安嘉新(北京)科技股份公司 Data storage method and device, electronic equipment and storage medium
CN113836120A (en) * 2021-11-29 2021-12-24 江苏金恒信息科技股份有限公司 Breakpoint resume method and system based on data acquisition engine to data application
CN113836120B (en) * 2021-11-29 2022-03-11 江苏金恒信息科技股份有限公司 Breakpoint resume method and system based on data acquisition engine to data application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191105