CN107038260A - An efficient parallel loading method that keeps titan real-time data consistent - Google Patents

Info

Publication number
CN107038260A
Authority
CN
China
Prior art keywords
data
module
titan
pieceofdata
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710390469.4A
Other languages
Chinese (zh)
Other versions
CN107038260B (en)
Inventor
毛洪亮
唐积强
王秀文
李焱余
苏沐冉
马秀娟
吴震
徐小磊
张露晨
李传海
李斌斌
蒲路
谢铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING SCISTOR TECHNOLOGY Co Ltd
National Computer Network and Information Security Management Center
Original Assignee
BEIJING SCISTOR TECHNOLOGY Co Ltd
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-05-27
Filing date
2017-05-27
Publication date
2017-08-11
Application filed by BEIJING SCISTOR TECHNOLOGY Co Ltd and National Computer Network and Information Security Management Center
Priority to CN201710390469.4A
Publication of CN107038260A
Application granted; publication of CN107038260B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an efficient parallel loading method that keeps titan real-time data consistent, and belongs to the field of big data processing. First, the graph database titan is divided into 7 modules that work concurrently. The cleaning rule management module updates the filtering rules in real time. The data receiving module receives pieceOfData records and puts them into queue1. The data cleaning module filters out unqualified data and puts the remaining records into queue2. The ID conversion module interacts with the high-speed index module to determine whether the mappings between the two points in the current pieceOfData and titan internal IDs already exist. If they do, the points are replaced with the corresponding titan internal ID attribute and ID value, and the result is saved as pieceOfDataT and put into queue4. Otherwise, the points that have not been loaded are put into a HashSet and the corresponding pieceOfData is put into queue3. The remaining-data loading module loads pieceOfDataT into titan with multiple threads in parallel. The point loading module adds the points in the HashSet to titan and adds the mappings between the points and their titan IDs to the high-speed index module. Each module completes part of the work alone or in cooperation with others, which improves overall loading efficiency.

Description

An efficient parallel loading method that keeps titan real-time data consistent
Technical field
The invention belongs to the field of big data processing and relates to an efficient and safe method for preprocessing and loading real-time data into a graph database, specifically an efficient parallel loading method that keeps titan real-time data consistent.
Background
With the continuous development of computer technology and the rising level of informatization, data volumes are growing rapidly and data structures are becoming more complex. Traditional relational databases are difficult to use in many scenarios, which has led to the emergence of various non-relational databases.
A graph database is one kind of non-relational database and is well suited to storing relational network data. Among the many graph databases, titan is an excellent and easy-to-use distributed graph database with high scalability: enlarging the cluster linearly raises the upper limit of graph storage, and storage and retrieval of very large graphs are supported. It is therefore applied in many scenarios. However, when loading real-time data, titan can only load with a single thread in order to guarantee data consistency. Real-time loading is therefore inefficient, which is a significant limitation, and it cannot meet the loading demands of high-volume real-time data.
Summary of the invention
To address the low efficiency and unreliability of the graph database titan when handling high-volume real-time data in the prior art, the invention provides an efficient parallel loading method that keeps titan real-time data consistent.
The specific steps are as follows:
Step 1: The graph database titan is divided into 7 modules, and the 7 modules run concurrently;
The 7 modules are: the data receiving module, the cleaning rule management module, the data cleaning module, the ID conversion module, the high-speed index module, the point loading module and the remaining-data loading module;
The data receiving module receives the data to be processed and puts it into a bounded queue;
The cleaning rule management module updates the filtering rules dynamically by monitoring the rule file;
The data cleaning module filters unwanted data out of the bounded queue according to the rules given by the cleaning rule management module;
The ID conversion module replaces the points in the cleaned data with the IDs of the corresponding points in the graph database.
The high-speed index module speeds up ID conversion.
The point loading module loads the points found during ID conversion that do not yet exist in the graph database and, once loading completes, adds the mappings between the points and their IDs to the high-speed index module.
The remaining-data loading module substantially increases the loading speed of graph data by loading in parallel.
Step 2: Multiple threads of the data receiving module work in parallel. Each thread loops, obtaining data from a data source such as a message queue or CSV file, parsing it into multiple pieceOfData records, and putting them into bounded queue queue1.
A pieceOfData record consists of two points, the relation between the two points, and the attributes of the points and the relation;
Bounded queue queue1 holds the data obtained from the data source;
Step 3: The cleaning rule management module reads the rule configuration file periodically, or reads it when a client request is received, so that the filtering rules are updated dynamically in real time;
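Step 3 can be pictured as a small holder that re-reads the rule file whenever its modification time changes. The following is a minimal Python sketch under stated assumptions, not the patented implementation: the class name `RuleStore`, its method names, and the JSON layout (a plain list of rule objects) are hypothetical, since the source does not reproduce the rule file format.

```python
import json
import os
import threading


class RuleStore:
    """Holds the current filtering rules and reloads them on demand.

    Hypothetical helper: the rule file is assumed to contain a JSON list.
    """

    def __init__(self, path):
        self._path = path
        self._lock = threading.Lock()
        self._mtime = None
        self._rules = []

    def refresh(self):
        """Re-read the file if it changed; return True when rules were reloaded."""
        mtime = os.stat(self._path).st_mtime_ns
        with self._lock:
            if mtime == self._mtime:
                return False
            with open(self._path, encoding="utf-8") as fh:
                self._rules = json.load(fh)
            self._mtime = mtime
            return True

    def rules(self):
        """Return a copy so cleaning threads never see a half-updated list."""
        with self._lock:
            return list(self._rules)
```

A timer thread could call `refresh()` periodically, and a client request handler could call it directly; both paths would then share the same reload logic.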
Step 4: Multiple threads of the data cleaning module work in parallel. Each thread loops, taking one pieceOfData record at a time from bounded queue queue1 and checking it against the cleaning rules. If the record meets a filter condition it is discarded; otherwise it is put into bounded queue queue2.
Queue2 holds the data from bounded queue queue1 that remains after filtering;
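The cleaning stage in step 4 is a classic multi-consumer loop between two queues. The sketch below illustrates it with Python's standard `queue` and `threading` modules. The `STOP` sentinel, the function names and the `should_drop` predicate are assumptions introduced for the example; the source only says that processing stops when a termination condition is met.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel marking the end of the input


def cleaning_worker(queue1, queue2, should_drop):
    """Move records from queue1 to queue2, discarding those that match a rule."""
    while True:
        piece = queue1.get()
        if piece is STOP:
            queue1.put(STOP)   # let sibling workers see the sentinel too
            break
        if not should_drop(piece):
            queue2.put(piece)


def run_cleaning(pieces, should_drop, workers=4, capacity=1000):
    """Run the cleaning stage over an iterable and return the surviving records."""
    queue1 = queue.Queue(maxsize=capacity)  # bounded: put() blocks when full
    queue2 = queue.Queue()  # unbounded only so this demo can drain it afterwards
    threads = [
        threading.Thread(target=cleaning_worker, args=(queue1, queue2, should_drop))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for piece in pieces:
        queue1.put(piece)
    queue1.put(STOP)
    for t in threads:
        t.join()
    kept = []
    while not queue2.empty():
        kept.append(queue2.get())
    return kept
```

Because `queue1` is bounded, a fast producer is slowed down to the pace of the cleaning threads instead of exhausting memory.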
Step 5: Multiple threads of the ID conversion module work in parallel. Each thread loops, taking cleaned and filtered pieceOfData records from bounded queue queue2 and processing them;
The specific steps are:
Step 501: Determine whether the mappings between both points in the current pieceOfData record and titan internal IDs are present in the high-speed index module. If so, go to step 502; otherwise, go to step 503;
Step 502: The ID conversion module takes the mappings from the high-speed index module, replaces the points in the pieceOfData record with the corresponding ID attribute and ID value, saves the result as a pieceOfDataT record, and puts the pieceOfDataT record into bounded queue queue4;
A pieceOfDataT record is a pieceOfData record whose points have been replaced by the corresponding ID attribute and ID value;
Queue4 holds pieceOfDataT records;
Step 503: The mapping between at least one point in the current pieceOfData record and a titan internal ID has not been loaded into the high-speed index module. The ID conversion module puts each point that has not been loaded into a HashSet and puts the pieceOfData record into bounded queue queue3;
Queue3 holds the pieceOfData records taken from bounded queue queue2 for which the mapping between at least one point and a titan internal ID has not been loaded into the high-speed index module.
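The routing decision in steps 501 to 503 can be sketched as a single function. This is an illustrative Python sketch, not the authors' code: a plain `dict` stands in for the high-speed index module, a Python `set` stands in for the HashSet, and the record layout `(src_point, dst_point, relation)` is an assumption.

```python
import queue
import threading


def convert(piece, index, pending_points, pending_lock, queue3, queue4):
    """Route one record per steps 501-503; return True if it was converted.

    piece: (src_point, dst_point, relation), each point being a hashable key.
    index: dict mapping point -> internal ID (stand-in for the index module).
    """
    src, dst, relation = piece
    src_id = index.get(src)
    dst_id = index.get(dst)
    if src_id is not None and dst_id is not None:
        # Step 502: both mappings are known, so emit the ID-substituted record.
        queue4.put((src_id, dst_id, relation))
        return True
    # Step 503: remember the unknown points and park the original record.
    with pending_lock:
        if src_id is None:
            pending_points.add(src)
        if dst_id is None:
            pending_points.add(dst)
    queue3.put(piece)
    return False
```

Collecting unknown points in a set means that a point referenced by many parked records is queued for creation only once.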
Step 6: Multiple threads of the remaining-data loading module work in parallel. Each thread loops, obtaining pieceOfDataT records from bounded queue queue4 and loading them into the titan database;
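Step 6 amounts to several consumer threads draining queue4 into the database. The Python sketch below shows that shape only. `FakeGraph` is an in-memory stand-in and not a titan API, and the `STOP` marker and function names are hypothetical.

```python
import queue
import threading

STOP = object()  # hypothetical end-of-stream marker


class FakeGraph:
    """In-memory stand-in for the graph database; not a titan API."""

    def __init__(self):
        self._lock = threading.Lock()
        self.edges = []

    def add_edge(self, src_id, dst_id, relation):
        with self._lock:
            self.edges.append((src_id, dst_id, relation))


def load_worker(queue4, graph):
    """Consume ID-substituted records until the stop marker arrives."""
    while True:
        item = queue4.get()
        if item is STOP:
            queue4.put(STOP)  # pass the marker on to the other workers
            return
        graph.add_edge(*item)


def load_all(records, graph, workers=4):
    """Feed records through a bounded queue4 to a pool of loader threads."""
    queue4 = queue.Queue(maxsize=64)
    threads = [threading.Thread(target=load_worker, args=(queue4, graph))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for rec in records:
        queue4.put(rec)
    queue4.put(STOP)
    for t in threads:
        t.join()
```

Every record that reaches this stage already refers to existing points by internal ID, which is why the method can load these records with multiple threads.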
Step 7: The point loading module interacts with the high-speed index module and, once the termination condition is met, ends all threads;
The specific steps are as follows:
Step 701: Determine whether the termination condition is met. If so, all threads end; otherwise, go to step 702;
Step 702: Determine whether bounded queue queue3 is full, or whether the time since the data in the HashSet was last loaded exceeds the time threshold t. If so, perform step 703; otherwise, sleep for time t1 and return to step 701;
The threshold t is a system initialization parameter and is set according to actual conditions;
Step 703: Each thread of the point loading module loads points from the HashSet and adds the mappings between those points and titan internal IDs to the high-speed index module;
Step 704: The point loading module clears the HashSet and records the current time as the time the data in the HashSet was loaded;
Step 705: All pieceOfData records in bounded queue queue3 are put into bounded queue queue2 and bounded queue queue3 is emptied. Return to step 701.
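One round of steps 703 to 705 can be sketched as follows. This Python sketch makes several assumptions that are not in the source: the function name, the `create_point` callback that stands in for the database write, a `dict` as the index, and a slight reordering in which the set is snapshotted and cleared before the points are loaded.

```python
import queue


def point_load_round(pending_points, pending_lock, index, create_point, queue3, queue2):
    """Run one round of steps 703-705; return (points_created, records_requeued)."""
    # Snapshot and reset the set under the lock so converter threads can keep adding.
    with pending_lock:
        batch = list(pending_points)
        pending_points.clear()

    created = 0
    for point in batch:
        if point not in index:                  # skip points loaded in an earlier round
            index[point] = create_point(point)  # load the point, then record its ID
            created += 1

    requeued = 0
    while True:
        try:
            piece = queue3.get_nowait()
        except queue.Empty:
            break
        queue2.put(piece)                       # parked records get converted again
        requeued += 1
    return created, requeued
```

The `point not in index` guard is a defensive choice for this sketch: a converter thread may add a point to the set just after that point was created, and the guard keeps the next round from creating it twice.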
The advantages of the invention are:
1) The efficient parallel loading method that keeps titan real-time data consistent greatly improves the loading performance of titan real-time data; loading speed is increased by more than 20 times.
2) The efficient parallel loading method that keeps titan real-time data consistent is an efficient and safe method for preprocessing and loading real-time data. It greatly improves loading efficiency while keeping data consistent, and data filtering rules can be modified and added in real time.
Brief description of the drawings
Fig. 1 is a structure diagram of the 7 modules into which the invention divides the graph database titan;
Fig. 2 is a flow chart of the efficient parallel loading method of the invention that keeps titan real-time data consistent.
Detailed description of the embodiments
The specific implementation of the invention is described in detail below with reference to the accompanying drawings.
To greatly improve the loading performance of titan real-time data while guaranteeing data consistency, the invention proposes an efficient parallel loading method that keeps titan real-time data consistent. It consists broadly of three parts: real-time data cleaning, storage control, and the handling of new points;
The rule management thread updates the filtering rules dynamically in real time. The main thread receives pieceOfData records and puts them into bounded queue queue1. The data cleaning module filters out unqualified data according to the cleaning rules and puts the rest into bounded queue queue2. The ID conversion module takes data from bounded queue queue2 and interacts with the high-speed index module to determine whether the mappings between the two points in the current pieceOfData record and titan internal IDs exist. If they do, it takes the titan internal ID attribute and ID value for each point from the index, substitutes them for the points, saves the result as a pieceOfDataT record, and puts it into bounded queue queue4. Otherwise, it puts the points that have not been loaded into the HashSet and puts the corresponding pieceOfData record into bounded queue queue3. The remaining-data loading module obtains pieceOfDataT records from bounded queue queue4 and loads them into titan with multiple threads in parallel;
The point loading control thread determines whether data cleaning has finished. If it has not, the thread checks whether bounded queue queue3 is full. If queue3 is not full, the thread sleeps for a period of time and waits for it to fill. Otherwise, multiple threads load the points in the HashSet and add the mappings between those points and titan internal IDs to the high-speed index module. The point loading module then clears the HashSet, puts all pieceOfData records in bounded queue queue3 back into bounded queue queue2, and empties bounded queue queue3.
The specific steps, shown in Fig. 2, are as follows:
Step 1: The graph database titan is divided into 7 modules, and the 7 modules run concurrently;
As shown in Fig. 1, the 7 modules are: the data receiving module, the cleaning rule management module, the data cleaning module, the ID conversion module, the high-speed index module, the point loading module and the remaining-data loading module. Each module completes part of the work alone or in cooperation with others, which improves overall loading efficiency.
The first module, the data receiving module, receives the data to be processed from sources such as a message queue or CSV file and puts it into a bounded queue.
The second module, the data cleaning module, filters out unwanted data according to given rules; the rules include exact matching, fuzzy matching and regular-expression matching.
The third module, the cleaning rule management module, updates the filtering rules dynamically by monitoring the rule file.
The filtering rule file is a JSON file; its concrete structure is shown in Annex 1.
Filtering rule file:
The fourth module, the ID conversion module, replaces the points in the data with the IDs of the corresponding points in the graph database.
The fifth module, the high-speed index module, is a key-value structure; it speeds up ID conversion in the fourth module.
The sixth module, the point loading module, loads the points found during ID conversion in the fourth module that do not yet exist in the graph database and, once loading completes, adds the mappings between the points and their IDs to the high-speed index module.
The seventh module, the remaining-data loading module, substantially increases the loading speed of graph data by loading in parallel.
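The three rule kinds named for the second module (exact, fuzzy and regular-expression matching) can be illustrated with a short Python sketch. The rule layout used here, a dict with `type` and `pattern` keys, is hypothetical because the source does not reproduce the rule file. Treating "fuzzy" as substring containment is likewise an assumption made for the example.

```python
import re


def matches(rule, value):
    """Return True when value satisfies one rule.

    rule is a hypothetical dict: {"type": "exact" | "fuzzy" | "regex", "pattern": str}.
    "fuzzy" is interpreted here as substring containment, which is an assumption.
    """
    kind, pattern = rule["type"], rule["pattern"]
    if kind == "exact":
        return value == pattern
    if kind == "fuzzy":
        return pattern in value
    if kind == "regex":
        return re.search(pattern, value) is not None
    raise ValueError(f"unknown rule type: {kind!r}")


def should_drop(rules, value):
    """A value is filtered out as soon as any rule matches it."""
    return any(matches(rule, value) for rule in rules)
```

Keeping each rule as plain data makes it straightforward to swap the whole rule list at run time when the rule file changes.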
Step 2: Multiple threads of the data receiving module work in parallel. Each thread loops, obtaining graph data from a data source such as a message queue or CSV file, parsing it into multiple pieceOfData records, and putting them into bounded queue queue1.
Graph data is any topological graph data composed of points and edges;
A pieceOfData record consists of two points, the relation between the two points, and the attributes of the points and the relation. A point is a key-value pair that uniquely identifies a specific point, for example uid=9867;
Bounded queue queue1 holds the data obtained from the data source;
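To make the record shape concrete, the sketch below parses CSV text into pieceOfData-like objects and places them in a bounded queue. It is a Python illustration under stated assumptions: the `PieceOfData` field names and the column layout (source point, destination point, relation, then `key=value` attributes) are hypothetical, as the source does not define a file layout.

```python
import csv
import io
import queue
from dataclasses import dataclass, field


@dataclass
class PieceOfData:
    src: tuple       # key-value pair identifying the first point, e.g. ("uid", "9867")
    dst: tuple       # key-value pair identifying the second point
    relation: str    # label of the relation between the two points
    attrs: dict = field(default_factory=dict)  # attributes of the points and relation


def parse_rows(text):
    """Yield one PieceOfData per non-empty CSV row.

    Assumed column layout (hypothetical): src, dst, relation, then key=value cells.
    """
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        src_key, src_val = row[0].split("=", 1)
        dst_key, dst_val = row[1].split("=", 1)
        attrs = dict(cell.split("=", 1) for cell in row[3:])
        yield PieceOfData((src_key, src_val), (dst_key, dst_val), row[2], attrs)


queue1 = queue.Queue(maxsize=2)  # bounded: put() blocks once two records are waiting
```

With a `maxsize` set, a receiving thread calling `queue1.put(record)` waits whenever the cleaning stage falls behind.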
Step 3: The cleaning rule management module reads the rule configuration file periodically, or reads it when a client request is received, so that the filtering rules are updated dynamically in real time;
Step 4: Multiple threads of the data cleaning module work in parallel. Each thread loops, taking one pieceOfData record at a time from bounded queue queue1 and checking it against the cleaning rules. If the record meets a filter condition it is discarded; otherwise it is put into bounded queue queue2.
Queue2 holds the data from bounded queue queue1 that remains after filtering;
Step 5: Multiple threads of the ID conversion module work in parallel. Each thread loops, taking a cleaned and filtered pieceOfData record from bounded queue queue2 and determining whether the mappings between both points in the current record and titan internal IDs are present in the high-speed index module. If so, go to step 6; otherwise, go to step 8;
Step 6: The ID conversion module takes the mappings from the high-speed index module, replaces the points in the pieceOfData record with the corresponding titan internal ID attribute and ID value, saves the result as a pieceOfDataT record, and puts it into bounded queue queue4;
A pieceOfDataT record is a pieceOfData record whose points have been replaced by the corresponding ID attribute and ID value;
Queue4 holds pieceOfDataT records;
Step 7: Multiple threads of the remaining-data loading module work in parallel. Each thread loops, obtaining pieceOfDataT records from bounded queue queue4 and loading them into the titan database, then returning to step 5;
Step 8: The mapping between at least one point in the current pieceOfData record and a titan internal ID has not been loaded into the high-speed index module. The ID conversion module puts each point that has not been loaded into the HashSet and puts the corresponding pieceOfData record into bounded queue queue3;
Queue3 holds the pieceOfData records taken from bounded queue queue2 for which the mapping between at least one point and a titan internal ID has not been loaded into the high-speed index module.
Step 9: Determine whether bounded queue queue3 is full or the elapsed time has reached the set threshold t. If so, perform step 10; otherwise, return to step 5;
The threshold t is a system initialization parameter and is set according to actual conditions;
When either of the two conditions is met, that is, bounded queue queue3 is full or the elapsed time has reached the set threshold t, processing continues with the subsequent steps. Conversely, when bounded queue queue3 is not full and the elapsed time has not reached the threshold t, the current thread sleeps and waits while the ID conversion module processes records from bounded queue queue2 whose points are not in the high-speed index module, putting those points into the HashSet and the corresponding pieceOfData records into bounded queue queue3, until bounded queue queue3 is full or the elapsed time reaches the set threshold t;
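The trigger in step 9 is an "either condition" test followed by a sleep when neither holds. The Python sketch below expresses that logic. The function names, the `finished` event and the sleep interval parameter are assumptions for the example, and elapsed time is measured with `time.monotonic()` as one reasonable choice.

```python
import queue
import time


def should_load_points(queue3, last_load, threshold_t, now=None):
    """True when queue3 is full or at least threshold_t seconds passed since last_load."""
    if now is None:
        now = time.monotonic()
    return queue3.full() or (now - last_load) >= threshold_t


def wait_for_trigger(queue3, last_load, threshold_t, sleep_t1, finished):
    """Sleep in steps of sleep_t1 until a load is due; return False if finished first.

    finished is a threading.Event-like object (hypothetical termination signal).
    """
    while not finished.is_set():
        if should_load_points(queue3, last_load, threshold_t):
            return True
        time.sleep(sleep_t1)
    return False
```

Note that `queue.Queue.full()` only ever returns `True` for a queue created with a positive `maxsize`, so the size condition relies on queue3 being bounded; the time threshold covers the case where records trickle in too slowly to fill it.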
Step 10: The point loading module determines whether data cleaning has finished. If so, all threads end; otherwise, go to step 11;
Step 11: Each thread of the point loading module loads points from the HashSet and adds the mappings between those points and titan internal IDs to the high-speed index module;
Step 12: The point loading module clears the HashSet, puts all pieceOfData records in bounded queue queue3 into bounded queue queue2, and empties bounded queue queue3. Return to step 5.
It should be noted and understood that various modifications and improvements can be made to the invention described in detail above without departing from the spirit and scope of the invention as required by the appended claims. Therefore, the scope of the claimed technical solution is not limited by any specific exemplary teaching given.

Claims (4)

1. An efficient parallel loading method that keeps titan real-time data consistent, characterized in that the specific steps are as follows:
Step 1: The graph database titan is divided into 7 modules, and the 7 modules run concurrently;
The 7 modules are: the data receiving module, the cleaning rule management module, the data cleaning module, the ID conversion module, the high-speed index module, the point loading module and the remaining-data loading module;
Step 2: Multiple threads of the data receiving module work in parallel; each thread loops, obtaining data from a data source such as a message queue or CSV file, parsing it into multiple pieceOfData records, and putting them into bounded queue queue1;
A pieceOfData record consists of two points, the relation between the two points, and the attributes of the points and the relation;
Step 3: The cleaning rule management module reads the rule configuration file periodically, or reads it when a client request is received, so that the filtering rules are updated dynamically in real time;
Step 4: Multiple threads of the data cleaning module work in parallel; each thread loops, taking one pieceOfData record at a time from bounded queue queue1 and checking it against the cleaning rules; if the record meets a filter condition it is discarded, otherwise it is put into bounded queue queue2;
Step 5: Multiple threads of the ID conversion module work in parallel; each thread loops, taking cleaned and filtered pieceOfData records from bounded queue queue2 and processing them;
Step 6: Multiple threads of the remaining-data loading module work in parallel; each thread loops, obtaining pieceOfDataT records from bounded queue queue4 and loading them into the titan database;
Step 7: The point loading module interacts with the high-speed index module and, once the termination condition is met, ends all threads.
2. The efficient parallel loading method that keeps titan real-time data consistent as claimed in claim 1, characterized in that, in step 1, the data receiving module receives the data to be processed and puts it into a bounded queue;
The cleaning rule management module updates the filtering rules dynamically by monitoring the rule file;
The data cleaning module filters unwanted data out of the bounded queue according to the rules given by the cleaning rule management module;
The ID conversion module replaces the points in the cleaned data with the IDs of the corresponding points in the graph database;
The high-speed index module speeds up ID conversion;
The point loading module loads the points found during ID conversion that do not yet exist in the graph database and, once loading completes, adds the mappings between the points and their IDs to the high-speed index module;
The remaining-data loading module substantially increases the loading speed of graph data by loading in parallel.
3. The efficient parallel loading method that keeps titan real-time data consistent as claimed in claim 1, characterized in that step 5 is specifically:
Step 501: Determine whether the mappings between both points in the current pieceOfData record and titan internal IDs are present in the high-speed index module; if so, go to step 502, otherwise go to step 503;
Step 502: The ID conversion module takes the mappings from the high-speed index module, replaces the points in the pieceOfData record with the corresponding ID attribute and ID value, saves the result as a pieceOfDataT record, and puts the pieceOfDataT record into bounded queue queue4;
A pieceOfDataT record is a pieceOfData record whose points have been replaced by the corresponding ID attribute and ID value;
Step 503: The mapping between at least one point in the current pieceOfData record and a titan internal ID has not been loaded into the high-speed index module; the ID conversion module puts each point that has not been loaded into a HashSet and puts the pieceOfData record into bounded queue queue3;
Queue3 holds the pieceOfData records taken from bounded queue queue2 for which the mapping between at least one point and a titan internal ID has not been loaded into the high-speed index module.
4. The efficient parallel loading method that keeps titan real-time data consistent as claimed in claim 1, characterized in that step 7 is specifically:
Step 701: Determine whether the termination condition is met; if so, all threads end, otherwise go to step 702;
Step 702: Determine whether bounded queue queue3 is full, or whether the time since the data in the HashSet was last loaded exceeds the time threshold t; if so, perform step 703, otherwise sleep for time t1 and return to step 701;
The threshold t is a system initialization parameter and is set according to actual conditions;
Step 703: Each thread of the point loading module loads points from the HashSet and adds the mappings between those points and titan internal IDs to the high-speed index module;
Step 704: The point loading module clears the HashSet and records the current time as the time the data in the HashSet was loaded;
Step 705: All pieceOfData records in bounded queue queue3 are put into bounded queue queue2 and bounded queue queue3 is emptied; return to step 701.
CN201710390469.4A 2017-05-27 2017-05-27 Efficient parallel loading method capable of keeping titan real-time data consistency Active CN107038260B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710390469.4A CN107038260B (en) 2017-05-27 2017-05-27 Efficient parallel loading method capable of keeping titan real-time data consistency

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710390469.4A CN107038260B (en) 2017-05-27 2017-05-27 Efficient parallel loading method capable of keeping titan real-time data consistency

Publications (2)

Publication Number Publication Date
CN107038260A (en) 2017-08-11
CN107038260B (en) 2020-03-10

Family

ID=59539492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710390469.4A Active CN107038260B (en) 2017-05-27 2017-05-27 Efficient parallel loading method capable of keeping titan real-time data consistency

Country Status (1)

Country Link
CN (1) CN107038260B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103279546A (en) * 2013-05-13 2013-09-04 清华大学 Graph data query method
WO2014130035A1 (en) * 2013-02-21 2014-08-28 Bluearc Uk Limited Object-level replication of cloned objects in a data storage system
CN106095977A (en) * 2016-06-20 2016-11-09 环球大数据科技有限公司 The distributed approach of a kind of data base and system
CN106126583A (en) * 2016-06-20 2016-11-16 环球大数据科技有限公司 The collection group strong compatibility processing method of a kind of distributed chart database and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
黄权隆: "HybriG:一种高效处理大量重边的属性图存储架构", 《计算机学报》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109189743A (en) * 2018-06-26 2019-01-11 国家计算机网络与信息安全管理中心 A kind of the super node identification filter method and system of the low consumption of resources towards the real-time diagram data of big flow
CN109189743B (en) * 2018-06-26 2021-09-28 国家计算机网络与信息安全管理中心 Super node recognition filtering method and system with low resource consumption and oriented to large-flow real-time graph data
CN112597145A (en) * 2020-12-29 2021-04-02 恩亿科(北京)数据科技有限公司 Real-time data cleaning method, system, electronic equipment and storage medium
CN112685419A (en) * 2020-12-31 2021-04-20 北京赛思信安技术股份有限公司 Distributed efficient parallel loading method capable of keeping consistency of janusGraph data

Also Published As

Publication number Publication date
CN107038260B (en) 2020-03-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant