CN108073620A - A fast retrieval method based on a graph data structure - Google Patents

A fast retrieval method based on a graph data structure

Info

Publication number
CN108073620A
Authority
CN
China
Prior art keywords
data
distributed
request
query
message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611001983.6A
Other languages
Chinese (zh)
Inventor
张伯轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Changfeng Science Technology Industry Group Corp
Original Assignee
China Changfeng Science Technology Industry Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Changfeng Science Technology Industry Group Corp
Priority to CN201611001983.6A
Publication of CN108073620A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/547 Messaging middleware

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A fast retrieval method based on a graph data structure, comprising: (1) data source acquisition: the basic graph structure is formed in memory by manual import; distributed message middleware monitors multiple data acquisition systems in real time, and when a data acquisition system receives a data source update or change request, the information is fed back to the distributed system, which is updated in real time; (2) query and retrieval request: the distributed message middleware monitors in real time the graph data query requests generated by the query system, and when a request is received, it is sent by multiple threads to multiple distributed nodes, executed in parallel, and the result is returned quickly to the query system; (3) graph data retrieval: according to the request from the message middleware, each distributed node performs parallel retrieval computation; (4) sending retrieval results: the results of the in-memory parallel query and retrieval on the distributed nodes are integrated, and the query result is sent by multiple threads to the message requester.

Description

A fast retrieval method based on a graph data structure
Technical field
The present invention relates to the field of big data retrieval technology, and in particular to a fast retrieval method based on a graph data structure.
Background
In the information age, persisting data to a relational database has been a very traditional means of data persistence. With the arrival of the big data era, however, this storage approach makes querying and retrieving big data very difficult: operations are cumbersome, speed is low, and real-time update and retrieval are slow. To solve this problem, the applicant provides a technical solution that stores and associates multi-source data by means of a connected graph structure and uses distributed parallel computing to achieve distributed storage and parallel retrieval and querying of graph data. Updating a data source only requires changing the structure and connection attributes of the graph data in the affected nodes, which is fast and efficient. Graph-structured storage makes the storage and retrieval of data relationships flexible, greatly broadens the range of usable data sources, and offers good compatibility, while the parallel computing framework keeps queries fast and efficient, so that real-time retrieval and querying are readily achieved.
Summary of the invention
The object of the invention is to use a graph data structure, combined with a distributed parallel computing framework, to design an interface that stores multiple data sources and data relationships in a distributed manner and provides real-time querying. Data retrieval runs through the parallel computing framework, enabling fast retrieval and querying of graph data.
The technical solution of the invention is as follows:
A fast retrieval method based on a graph data structure, characterized by comprising:
(1) Data source acquisition: data source acquisition comprises basic data entry and real-time data update. For basic data entry, the basic graph structure is formed in memory by manual import, and a cache is configured to keep the basic graph data structure resident in memory, enabling real-time reading and retrieval. For real-time data update, distributed message middleware monitors multiple data acquisition systems in real time; when a data acquisition system receives a data source update or change request, a listener feeds the information back to the distributed system, updating the in-memory graph data in real time;
(2) Query and retrieval request: the distributed message middleware monitors in real time the graph data query requests generated by the query system. When a request is received, it is sent by multiple threads to multiple distributed nodes, and node feedback is judged according to the heartbeat mechanism, so that the request is split into instructions that are executed in parallel across the distributed nodes and the result is returned quickly to the query system;
(3) Graph data retrieval: according to the request from the message middleware, each distributed node performs parallel retrieval computation, and the distributed query results are merged and integrated through a MapReduce process, ensuring that the query and retrieval process is fast and complete;
(4) Sending retrieval results: the results of the in-memory parallel query and retrieval on the distributed nodes are integrated and, based on the information received through the middleware heartbeat mechanism, the query result is sent by multiple threads to the message requester, achieving fast retrieval and querying of graph-structured data.
The invention makes storing, updating, and querying massive data more convenient, simplifies data management, and speeds up work. The data relationship network is built on a connected graph, which is more flexible and convenient and highly scalable. The message middleware and the computing framework are both implemented in a distributed manner, and multiple Slaver nodes and the Master communicate and respond to each other through a heartbeat mechanism. If a node goes down, the JobTracker automatically reassigns that node's tasks according to the configuration, ensuring that data storage, retrieval, and querying continue normally, quickly, and efficiently. The distributed deployment combines the performance of inexpensive machines, which saves cost and yields a more efficient system service.
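The heartbeat-based failover described above can be illustrated with a minimal Python sketch. The patent does not give an implementation, so the Master class, the timeout value, and the "least-loaded live node" reassignment rule below are all assumptions made for illustration; they do not reproduce the JobTracker's actual scheduling policy.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Master:
    """Hypothetical coordinator that tracks heartbeats and task assignments."""
    timeout: float = 3.0                              # assumed: seconds of silence before a node counts as down
    last_seen: dict = field(default_factory=dict)     # node name -> time of last heartbeat
    tasks: dict = field(default_factory=dict)         # node name -> list of task ids

    def heartbeat(self, node, now=None):
        # Record that a node reported in; `now` can be injected for testing.
        self.last_seen[node] = time.monotonic() if now is None else now
        self.tasks.setdefault(node, [])

    def live_nodes(self, now=None):
        # A node is live if its last heartbeat is within the timeout window.
        now = time.monotonic() if now is None else now
        return sorted(n for n, seen in self.last_seen.items()
                      if now - seen <= self.timeout)

    def reassign_dead(self, now=None):
        # Move every task of a silent node to the least-loaded live node.
        live = self.live_nodes(now)
        moved = {}
        if not live:
            return moved
        for node in sorted(self.tasks):
            if node in live:
                continue
            orphaned, self.tasks[node] = self.tasks[node], []
            for task in orphaned:
                target = min(live, key=lambda n: len(self.tasks[n]))
                self.tasks[target].append(task)
                moved[task] = target
        return moved
```

In this sketch a node that stops sending heartbeats is simply treated as down once the timeout elapses; a real deployment would also have to handle nodes that come back after their tasks were moved.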
Brief description of the drawings
Fig. 1 is a block diagram of the data source acquisition system;
Fig. 2 is a schematic diagram of the in-memory graph data storage method;
Fig. 3 is a schematic diagram of the real-time graph data retrieval and query method.
Detailed description
A distributed parallel query computing framework is built according to the data source acquisition system shown in Fig. 1, the in-memory graph data storage shown in Fig. 2, and the real-time graph data retrieval and querying shown in Fig. 3. Configuration files are set up for the caching mechanism and the graph structure storage mode, and the heartbeat mechanism for distributed communication and the fault-tolerance mechanism are configured; with these in place, the fast retrieval system based on a graph data structure can be built.
The method specifically comprises the following steps:
(1) Data source acquisition:
As shown in Fig. 1, data source acquisition comprises basic data entry and real-time data update.
Basic data entry: the basic graph structure is formed in memory by manual import, and a cache is configured to keep the basic graph data structure resident in memory, enabling real-time reading and retrieval.
Real-time data update: the distributed message middleware monitors multiple data acquisition systems in real time. When a data acquisition system receives a data source update or change request, a listener feeds the information back to the distributed system, updating the in-memory graph data in real time.
The in-memory graph data storage method is shown in Fig. 2: the basic data guarantees the structural composition of the basic graph data, and changes entered as real-time data are forwarded by the listener to update the graph data in real time.
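As a rough illustration of step (1), the following Python sketch keeps a graph in memory, loads it once from imported base data, and then applies update messages drained from a queue. The queue stands in for the distributed message middleware and listener; the MemoryGraph class, the message fields ("op", "id", "src", "dst", "attrs"), and the operation names are hypothetical and not taken from the patent.

```python
import queue
from collections import defaultdict


class MemoryGraph:
    """Hypothetical in-memory graph: node attributes plus adjacency with edge attributes."""

    def __init__(self):
        self.nodes = {}                    # node id -> attribute dict
        self.edges = defaultdict(dict)     # source id -> {target id: attribute dict}

    def load_base(self, nodes, edges):
        # Basic data entry: build the basic graph structure from imported records.
        for node_id, attrs in nodes:
            self.nodes[node_id] = dict(attrs)
        for src, dst, attrs in edges:
            self.edges[src][dst] = dict(attrs)

    def apply_update(self, msg):
        # Real-time update: change only the affected node or edge.
        op = msg["op"]
        if op == "upsert_node":
            self.nodes.setdefault(msg["id"], {}).update(msg.get("attrs", {}))
        elif op == "upsert_edge":
            self.edges[msg["src"]].setdefault(msg["dst"], {}).update(msg.get("attrs", {}))
        elif op == "delete_edge":
            self.edges[msg["src"]].pop(msg["dst"], None)
        else:
            raise ValueError(f"unknown update operation: {op!r}")


def drain_updates(graph, updates):
    """Listener stand-in: apply every pending message and return how many were applied."""
    applied = 0
    while True:
        try:
            msg = updates.get_nowait()
        except queue.Empty:
            return applied
        graph.apply_update(msg)
        applied += 1
```

Because each message touches a single node or edge, an update never requires rebuilding the resident graph, which matches the description's point that only the affected structure and connection attributes change.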
(2) Query and retrieval request:
As shown in Fig. 3, the distributed message middleware monitors in real time the graph data query requests generated by the query system. When a request is received, it is sent by multiple threads to multiple distributed nodes, and node feedback is judged according to the heartbeat mechanism, so that the request is split into instructions that are executed in parallel across the distributed nodes and the result is returned quickly to the query system.
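Step (2) can be sketched in Python with a thread pool that fans one query out to every node currently considered alive. This is only an illustration under stated assumptions: the dispatch and make_shard functions are hypothetical, each "node" is modelled as a local callable rather than a remote machine, and the set of live nodes is passed in directly instead of being derived from real heartbeats.

```python
from concurrent.futures import ThreadPoolExecutor


def make_shard(adjacency):
    """Hypothetical node: answers a neighbour query from its local part of the graph."""
    def search(vertex):
        return sorted(adjacency.get(vertex, []))
    return search


def dispatch(query, shards, alive):
    """Send `query` to every live shard in parallel and return {node: partial result}."""
    # Only nodes that passed the (assumed) heartbeat check receive the request.
    targets = {name: fn for name, fn in shards.items() if name in alive}
    if not targets:
        raise RuntimeError("no live nodes available for the query")
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in targets.items()}
        # Collect each node's partial answer once its thread finishes.
        return {name: future.result() for name, future in futures.items()}
```

The partial results are kept per node here; merging them into one answer is a separate step.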
(3) Graph data retrieval:
As shown in Fig. 3, according to the request from the message middleware, each distributed node performs parallel retrieval computation, and the distributed query results are merged and integrated through a MapReduce process, ensuring that the query and retrieval process is fast and complete.
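The MapReduce-style merge in step (3) can be illustrated with a small single-process Python sketch. It is an assumption-laden toy rather than the patent's implementation: shards are plain edge lists, the query asks for the neighbours of a set of vertices, and the function names (map_shard, shuffle, reduce_neighbors, map_reduce) are invented for the example. The map phase emits (vertex, neighbour) pairs per shard, the shuffle groups them by vertex, and the reduce phase removes duplicates.

```python
from collections import defaultdict


def map_shard(shard_edges, wanted):
    """Map phase: emit (vertex, neighbour) for every local edge that starts at a wanted vertex."""
    for src, dst in shard_edges:
        if src in wanted:
            yield src, dst


def shuffle(pairs):
    """Shuffle phase: group emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_neighbors(values):
    """Reduce phase: merge partial neighbour lists into one sorted, duplicate-free list."""
    return sorted(set(values))


def map_reduce(shards, wanted):
    """Run map over every shard, then shuffle and reduce into {vertex: neighbours}."""
    pairs = [pair for shard in shards for pair in map_shard(shard, wanted)]
    return {key: reduce_neighbors(values) for key, values in shuffle(pairs).items()}
```

Because the reduce step deduplicates, an edge stored on more than one shard appears only once in the merged answer.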
(4) Sending retrieval results:
As shown in Fig. 3, the results of the in-memory parallel query and retrieval on the distributed nodes are integrated and, based on the information received through the middleware heartbeat mechanism, the query result is sent by multiple threads to the message requester, achieving fast retrieval and querying of graph-structured data.
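For step (4), one possible way to return merged results to requesters over multiple threads is sketched below in Python. The patent does not say how replies are addressed, so the ReplyRouter class, the per-requester inbox queues, and the request id used to match a reply to its query are all assumptions introduced for illustration.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class ReplyRouter:
    """Hypothetical reply channel: one inbox queue per registered requester."""

    def __init__(self):
        self._inboxes = {}

    def register(self, requester_id):
        # Create and return the inbox a requester will read its results from.
        inbox = queue.Queue()
        self._inboxes[requester_id] = inbox
        return inbox

    def send(self, requester_id, request_id, result):
        # Tag the result with the request id so the requester can match it to its query.
        self._inboxes[requester_id].put({"request_id": request_id, "result": result})


def send_all(router, replies, workers=4):
    """Send every (requester_id, request_id, result) reply using a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(router.send, *reply) for reply in replies]
        for future in futures:
            future.result()   # re-raise any error that occurred while sending
```

Since replies are sent concurrently, their arrival order within one inbox is not guaranteed, which is why each reply carries its request id.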

Claims (1)

1. A fast retrieval method based on a graph data structure, characterized by comprising:
(1) Data source acquisition: data source acquisition comprises basic data entry and real-time data update. For basic data entry, the basic graph structure is formed in memory by manual import, and a cache is configured to keep the basic graph data structure resident in memory, enabling real-time reading and retrieval. For real-time data update, distributed message middleware monitors multiple data acquisition systems in real time; when a data acquisition system receives a data source update or change request, a listener feeds the information back to the distributed system, updating the in-memory graph data in real time;
(2) Query and retrieval request: the distributed message middleware monitors in real time the graph data query requests generated by the query system. When a request is received, it is sent by multiple threads to multiple distributed nodes, and node feedback is judged according to the heartbeat mechanism, so that the request is split into instructions that are executed in parallel across the distributed nodes and the result is returned quickly to the query system;
(3) Graph data retrieval: according to the request from the message middleware, each distributed node performs parallel retrieval computation, and the distributed query results are merged and integrated through a MapReduce process, ensuring that the query and retrieval process is fast and complete;
(4) Sending retrieval results: the results of the in-memory parallel query and retrieval on the distributed nodes are integrated and, based on the information received through the middleware heartbeat mechanism, the query result is sent by multiple threads to the message requester, achieving fast retrieval and querying of graph-structured data.
CN201611001983.6A 2016-11-14 2016-11-14 A kind of method for quickly retrieving based on graph data structure Pending CN108073620A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611001983.6A CN108073620A (en) 2016-11-14 2016-11-14 A kind of method for quickly retrieving based on graph data structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611001983.6A CN108073620A (en) 2016-11-14 2016-11-14 A kind of method for quickly retrieving based on graph data structure

Publications (1)

Publication Number Publication Date
CN108073620A true CN108073620A (en) 2018-05-25

Family

ID=62161942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611001983.6A Pending CN108073620A (en) 2016-11-14 2016-11-14 A kind of method for quickly retrieving based on graph data structure

Country Status (1)

Country Link
CN (1) CN108073620A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090204593A1 (en) * 2008-02-11 2009-08-13 Yahoo! Inc. System and method for parallel retrieval of data from a distributed database
CN101609610A (en) * 2009-07-17 2009-12-23 中国民航大学 A kind of Flight Information data acquisition unit and disposal route thereof
CN103336808A (en) * 2013-06-25 2013-10-02 中国科学院信息工程研究所 System and method for real-time graph data processing based on BSP (Board Support Package) model
CN103731298A (en) * 2013-11-15 2014-04-16 中国航天科工集团第二研究院七〇六所 Large-scale distributed network safety data acquisition method and system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710314A (en) * 2018-12-20 2019-05-03 四川新网银行股份有限公司 A method of based on graph structure distributed parallel mode construction figure
CN110659292A (en) * 2019-09-21 2020-01-07 北京海致星图科技有限公司 Spark and Ignite-based distributed real-time graph construction and query method and system
CN111488492A (en) * 2020-04-08 2020-08-04 北京百度网讯科技有限公司 Method and apparatus for retrieving graph database
CN111488492B (en) * 2020-04-08 2023-11-17 北京百度网讯科技有限公司 Method and device for searching graph database
CN114742691A (en) * 2022-05-19 2022-07-12 支付宝(杭州)信息技术有限公司 Graph data sampling method and system
CN114742691B (en) * 2022-05-19 2023-08-18 支付宝(杭州)信息技术有限公司 Graph data sampling method and system

Similar Documents

Publication Publication Date Title
CN108073620A (en) A kind of method for quickly retrieving based on graph data structure
CN111400326B (en) Smart city data management system and method thereof
CN111327681A (en) Cloud computing data platform construction method based on Kubernetes
CN105512336A (en) Method and device for mass data processing based on Hadoop
CN109840253A (en) Enterprise-level big data platform framework
CN107679192A (en) More cluster synergistic data processing method, system, storage medium and equipment
CN107395669A (en) A kind of collecting method and system based on the real-time distributed big data of streaming
CN109710731A (en) A kind of multidirectional processing system of data flow based on Flink
CN106790718A (en) Service call link analysis method and system
CN105468735A (en) Stream preprocessing system and method based on mass information of mobile internet
CN106339509A (en) Power grid operation data sharing system based on large data technology
CN103106249A (en) Data parallel processing system based on Cassandra
CN110674154B (en) Spark-based method for inserting, updating and deleting data in Hive
CN106296788B (en) Across the computer room Cluster Rendering of one kind disposes realization system
CN105320085A (en) Method, apparatus and system for acquiring industrial automation data
CN103927314B (en) A kind of method and apparatus of batch data processing
CN102456031A (en) MapReduce system and method for processing data streams
CN109284195A (en) A kind of real-time representation data calculation method and system
WO2014117295A1 (en) Performing an index operation in a mapreduce environment
CN112287015A (en) Image generation system, image generation method, electronic device, and storage medium
CN107733696A (en) A kind of machine learning and artificial intelligence application all-in-one dispositions method
CN101986661A (en) Improved MapReduce data processing method under virtual machine cluster
CN106250566A (en) A kind of distributed data base and the management method of data operation thereof
CN104834635A (en) Data processing method and device
CN105574032A (en) Rule matching operation method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180525