CN102622414A - Peer-to-peer structure based distributed high-dimensional indexing parallel query framework - Google Patents

Info

Publication number
CN102622414A
Authority
CN
China
Prior art keywords
index
website
module
working terminal
index block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012100381150A
Other languages
Chinese (zh)
Other versions
CN102622414B (en)
Inventor
丁贵广
林梓佳
文海龙
王建民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Application filed by Tsinghua University
Priority to CN 201210038115
Publication of CN102622414A
Application granted
Publication of CN102622414B
Legal status: Active

Abstract

The invention provides a distributed high-dimensional index parallel query framework based on a peer-to-peer structure, comprising an index creation module, a monitor module, a peer-to-peer site cluster and a soft load balancing module. The index creation module creates the index and stores the index block files. The monitor module detects the available memory of the working sites in the peer-to-peer site cluster and the index block information of each working site, and coordinates the working sites according to the detection results. The soft load balancing module controls load balancing across the working sites and is periodically synchronized by the monitor module, so that it can adjust and update its list of currently available working sites in time. The framework can search a massive set of candidate objects reliably and in real time while supporting dynamic additions and removals, and its query speed is fast.

Description

Distributed high-dimensional index parallel query framework based on a peer-to-peer structure
Technical field
The present invention relates to the technical field of computer information retrieval, and in particular to a distributed high-dimensional index parallel query framework based on a peer-to-peer structure.
Background
The rapid development of technologies such as content-based image retrieval (CBIR) and video retrieval has led to increasingly wide development and application of high-dimensional indexing. High-dimensional indexing is applied mainly to speed up retrieval and thereby improve the user's query experience. In the vast majority of existing applications, a query against a high-dimensional index amounts to retrieving the k nearest neighbors (kNN) of the query object at the semantic level.
In general, content-based retrieval first converts the massive set of candidate objects (images, videos and so on) into high-dimensional feature vectors, which preserve the characteristics of the objects as far as possible in quantified form. The extracted high-dimensional vectors are then indexed. When a query object is submitted, it is quantized into a high-dimensional feature vector of the same dimension, and the high-dimensional index structure is used to retrieve its kNN quickly and return the result. In most application scenarios the index can be created offline, so creation has no strict real-time requirement, but it does demand stability, reliability and completeness. For querying, by contrast, the practical requirements are clear and strict: retrieval must be as accurate, fast and reliable as possible, and must support a massive set of candidate objects that grows and shrinks dynamically. This is the problem on which the present invention focuses.
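For reference, the kNN query described above can be illustrated with an exact brute-force search. This is only a minimal sketch to make the definition concrete: the function name `knn`, the Euclidean distance and the dictionary layout are illustrative assumptions and are not taken from the patent, which relies on an index structure precisely to avoid scanning every candidate.

```python
import heapq
import math

def knn(query, candidates, k):
    """Exact kNN by brute force.

    candidates maps an object id to its feature vector (assumed layout).
    Returns up to k (distance, object_id) pairs, nearest first.
    """
    scored = ((math.dist(query, vec), obj_id) for obj_id, vec in candidates.items())
    return heapq.nsmallest(k, scored)
```

A high-dimensional index replaces the linear scan in `knn` with a tree search, usually at the cost of returning approximate rather than exact neighbors.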
As such applications have spread, high-dimensional indexing has attracted wide attention in both academia and industry, and a large number of algorithms and schemes for improving the efficiency of index creation and querying have emerged. Current strategies for improving efficiency fall into three broad classes. The first class trades kNN accuracy for efficiency, that is, it queries only approximate kNN. The second class speeds up index creation and querying by changing the index structure; however, after decades of research, no index structure has yet been found that efficiently supports hundreds of dimensions or more. The third class improves the architecture of the indexing system, speeding up creation and querying by introducing a parallel indexing system; however, because of technical barriers in building real-time, reliable parallel systems, such techniques are currently difficult to apply widely. In addition, the traditional master-slave index query framework lacks self-organization and is prone to performance bottlenecks.
Summary of the invention
The present invention aims to solve at least one of the technical problems described above.
To this end, one object of the present invention is to propose a distributed high-dimensional index parallel query framework based on a peer-to-peer structure. The framework retrieves from a massive set of candidate objects reliably and in real time, supports dynamic addition and removal of candidates, and offers fast queries.
To achieve the above object, embodiments of the present invention propose a distributed high-dimensional index parallel query framework based on a peer-to-peer structure, comprising an index creation module, a monitor module, a peer site cluster and a soft load balancing module. The index creation module partitions the massive set of candidate objects, creates an index for each partition to obtain a plurality of index block files, and stores the index block files, where each index block file contains index block information. The monitor module detects the free memory of the working sites in the peer site cluster and the index block information of each working site, coordinates the index blocks loaded by each working site according to the detection results, and sends peer site list update instructions to the working sites. Each working site in the peer site cluster loads or unloads index block files according to its own index block information, queries the corresponding index block files according to query requests sent by users, and integrates and outputs the query results. The soft load balancing module obtains the current working site list from the monitor module and performs load balancing control over the current working sites according to that list; it is periodically synchronized by the monitor module so that it can adjust and update the list of currently available working sites.
The framework according to the embodiments of the present invention adopts an efficient index structure, specifically the Hybrid Spill Tree as the basic high-dimensional index structure, together with a peer-site architecture, and thereby provides fairly comprehensive guarantees of real-time performance, reliability, stability, extensibility and self-organization. It supports efficient queries, is highly extensible, can easily be ported to various content-based retrieval systems, and offers fast queries. In addition, the structure of the framework is clear and easy to implement.
In addition, the framework according to the above embodiments of the present invention may have the following additional technical features:
In one embodiment of the present invention, the index creation module further comprises an index creation submodule, which uses the MapReduce framework to create indexes for a plurality of data partitions in parallel to obtain the plurality of index block files, and a distributed storage system, which stores the plurality of index block files.
In one embodiment of the present invention, the monitor module is further configured to send an index load instruction to a non-working site cluster after detecting that the index block files have been updated, so that the non-working site cluster enters the working state. After the index block files have finished loading, the monitor module performs a cluster switch: the site cluster that has loaded the latest index is put into use, and the original working site cluster is placed in the non-working state.
In one embodiment of the present invention, the monitor module determines from the detection results whether any of the current working sites has failed and, when a failed site is detected, sends a peer site list update instruction to the currently available working sites.
In one embodiment of the present invention, a non-working site registers with the monitor module when it joins the working site cluster, so that the monitor module updates the working site list, sends the updated working site list to the soft load balancing module, and at the same time sends the updated peer site list to all working sites.
In one embodiment of the present invention, when a working site in the peer site cluster receives a query request, it creates a main process and, according to its own index block information, a plurality of corresponding child processes. Each child process queries its corresponding loaded index block and sends the query results to the main process, which integrates and sends them.
In one embodiment of the present invention, the main process also responds to query requests distributed by other working sites.
In one embodiment of the present invention, each child process monitors the main process so that it closes automatically if the main process exits unexpectedly.
In one embodiment of the present invention, the soft load balancing module comprises a plurality of soft load balancing sites and maintains the list of currently available working sites.
In one embodiment of the present invention, the index block files use the Hybrid Spill Tree data structure.
Additional aspects and advantages of the present invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments taken in conjunction with the drawings, in which:
Fig. 1 is a schematic diagram of the distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to an embodiment of the present invention; and
Fig. 2 is an architecture diagram of the distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it.
In the description of the present invention, it should be understood that terms indicating orientation or positional relationships, such as "center", "longitudinal", "lateral", "up", "down", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner" and "outer", are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they are therefore not to be construed as limiting the present invention. In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless otherwise expressly specified and limited, the terms "mounted", "connected" and "coupled" are to be understood broadly. For example, a connection may be fixed, detachable or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediary; and it may be an internal connection between two elements. Those of ordinary skill in the art can understand the specific meaning of these terms in the present invention according to the specific circumstances.
The distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to the embodiments of the present invention is described below with reference to the drawings.
Referring to Fig. 1 and Fig. 2, the distributed high-dimensional index parallel query framework 100 based on a peer-to-peer structure according to an embodiment of the present invention comprises an index creation module Builder, a monitor module Monitor, a peer site cluster and a soft load balancing module Balancer, where:
The index creation module Builder partitions the massive set of candidate objects, creates an index for each partition to obtain a plurality of index block files, and stores the index block files, where each index block file contains index block information.
The monitor module Monitor detects the free memory of the working sites in the peer site cluster and the index block information of each working site, coordinates the index blocks loaded by each working site according to the detection results, and sends peer site list update instructions to the working sites. Specifically, Monitor monitors, coordinates and manages the work among the working sites. First, Monitor periodically obtains the free memory and index block information of each working site, for example through a mechanism similar to heartbeat packet detection. From the results it determines whether any index block is duplicated or missing and coordinates the sites intelligently according to their free memory, requiring some working sites to unload index blocks or to load missing ones, so as to ensure minimum redundancy and completeness of the index blocks. Second, Monitor provides management of the peer site cluster. Heartbeat detection can discover failed sites, in which case Monitor notifies the other sites to adjust their peer site lists. When a new site joins, it registers with Monitor automatically, and Monitor notifies the other working sites to adjust their peer site lists, so that every site knows the sites in the same index cluster in close to real time. In addition, Monitor is responsible for synchronization with the load balancing module Balancer: it synchronizes the list of valid sites it has detected in the current index cluster to each soft load balancing site in Balancer, so that Balancer can select a valid, available working site for every query.
Each working site in the peer site cluster loads or unloads index block files according to its own index block information, queries the corresponding index block files according to query requests sent by users, and integrates and outputs the query results.
The soft load balancing module Balancer obtains the current working site list from the monitor module Monitor and performs load balancing control over the current working sites according to that list. Balancer is periodically synchronized by Monitor so that it can adjust and update the list of currently available working sites.
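The Monitor's coordination of duplicated and missing index blocks, as described for the monitor module above, might look roughly like the following sketch. It is illustrative only: the function name `plan_coordination`, the report layout, the rule of keeping the copy on the first holder, and the assumption that every index block costs one unit of memory are all hypothetical simplifications, not details given in the patent.

```python
from collections import defaultdict

def plan_coordination(all_blocks, site_reports):
    """Return (unload, load) instruction lists of (site, block) pairs.

    all_blocks: ids of every index block that must be served.
    site_reports: site -> {"free_mem": int, "blocks": set of block ids},
    as gathered from heartbeat replies (assumed layout).
    """
    holders = defaultdict(list)
    for site, report in sorted(site_reports.items()):
        for block in sorted(report["blocks"]):
            holders[block].append(site)

    free = {site: report["free_mem"] for site, report in site_reports.items()}
    unload, load = [], []

    # Duplicated blocks: keep the first holder's copy, unload the others.
    for block, sites in sorted(holders.items()):
        for site in sites[1:]:
            unload.append((site, block))
            free[site] += 1

    # Missing blocks: give each one to the site with the most free memory.
    for block in sorted(set(all_blocks) - set(holders)):
        candidates = [s for s in sorted(free) if free[s] >= 1]
        if not candidates:
            continue  # no site can host it; a real system would raise an alert
        target = max(candidates, key=lambda s: free[s])
        load.append((target, block))
        free[target] -= 1
    return unload, load
```

Running the plan on every heartbeat round keeps the index complete while holding redundancy to a minimum, which is the goal the description states for this coordination.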
The distributed high-dimensional index parallel query framework 100 based on a peer-to-peer structure according to the embodiments of the present invention adopts an efficient index structure, specifically the Hybrid Spill Tree as the basic high-dimensional index structure, together with a peer-site architecture, and thereby provides fairly comprehensive guarantees of real-time performance, reliability, stability, extensibility and self-organization. It supports efficient queries, is highly extensible, can easily be ported to various content-based retrieval systems, and offers fast queries. In addition, the structure of the framework is clear and easy to implement.
In one embodiment of the present invention, the index creation module Builder further comprises an index creation submodule and a distributed storage system, where:
The index creation submodule uses the MapReduce framework to create indexes for a plurality of data partitions in parallel to obtain the plurality of index block files. The distributed storage system stores the plurality of index block files.
Specifically, because index creation may be carried out offline, it has only modest real-time requirements but fairly strict requirements for reliability and stability. The mature MapReduce framework is therefore used directly to create the index in parallel, and a distributed storage system, which makes the underlying storage structure transparent, is used to store the index block files.
It should be understood that, in the index creation module Builder, dividing the massive set of candidate objects into a plurality of parts and creating a corresponding index block for each part lays the foundation for the subsequent distributed parallel query. Using the MapReduce framework, the index is periodically re-created efficiently in parallel, and once a new index has been created successfully it overwrites the old index files, so as to follow changes in the candidate objects.
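The partition-then-index step can be sketched as follows. This is a minimal single-machine illustration, not the MapReduce job itself: `partition`, `build_index_block` and `build_all_blocks` are hypothetical names, and a plain dictionary stands in for the Hybrid Spill Tree that the patent builds for each block.

```python
def partition(items, num_parts):
    """Split items into num_parts nearly equal contiguous parts."""
    size, extra = divmod(len(items), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(items[start:end])
        start = end
    return parts

def build_index_block(block_id, part):
    """Stand-in for building one index block from (object_id, vector) pairs."""
    return {"block_id": block_id, "entries": dict(part)}

def build_all_blocks(candidates, num_parts):
    """Index every partition independently, as the map phase would."""
    parts = partition(candidates, num_parts)
    return [build_index_block(i, part) for i, part in enumerate(parts)]
```

Because each partition is indexed independently, the per-block work has no shared state and can be handed to parallel workers without coordination.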
In one embodiment of the present invention, the monitor module Monitor is further configured to send an index load instruction to a non-working site cluster after detecting that the index block files have been updated, so that the non-working site cluster enters the working state. After the index block files have finished loading, the monitor module performs a cluster switch: the site cluster that has loaded the latest index is put into use, the original working site cluster is placed in the non-working state, and the latest working site list is sent to the soft load balancing module. In other words, Monitor watches for changes to the index files; once the changes have settled, it requires the non-working sites to load the index, and after loading completes it performs the cluster switch so that those sites become the working sites and replace the previous ones.
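The cluster switch described above resembles a two-cluster (active/standby) rollover. A minimal sketch follows, under the assumption that loading is reported as a simple success flag; the class name `ClusterSwitch`, the `load_fn` callback and the version counter are illustrative and do not come from the patent.

```python
class ClusterSwitch:
    """Tracks which of two site clusters is serving queries."""

    def __init__(self, working, standby):
        self.working = working
        self.standby = standby
        # Index version currently loaded by each cluster (hypothetical bookkeeping).
        self.loaded_version = {working: 0, standby: 0}

    def on_index_updated(self, new_version, load_fn):
        """Ask the standby cluster to load the new index, then switch.

        load_fn(cluster, version) is a hypothetical callable that returns True
        once every site in the cluster has loaded its index blocks.
        """
        if not load_fn(self.standby, new_version):
            return self.working  # loading failed: keep serving the old index
        self.loaded_version[self.standby] = new_version
        self.working, self.standby = self.standby, self.working
        return self.working
```

Switching only after loading has finished means queries are always answered by a cluster that holds a complete index, either the old one or the new one.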
In one embodiment of the present invention, the monitor module determines from the detection results whether any current working site has failed and, when a failed site is detected, sends a peer site list update instruction to the currently available working sites. That is, through the heartbeat detection described in the above embodiment, Monitor can discover failed sites and notify the other sites to adjust their peer site lists.
Further, when a non-working site joins the working site cluster, it registers with the monitor module Monitor, so that Monitor updates the working site list, sends the updated working site list to the soft load balancing module, and at the same time sends instructions requiring the other working sites to update their peer site lists.
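Registration and heartbeat-based failure detection can be sketched with a small registry. The class `SiteRegistry`, the explicit `now` timestamps and the fixed timeout are assumptions made for illustration; the patent only states that a mechanism similar to heartbeat packet detection is used.

```python
class SiteRegistry:
    """Monitor-side bookkeeping of working sites (illustrative only)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # site -> time of the last heartbeat reply

    def register(self, site, now):
        """A site announces itself; the updated peer list is returned."""
        self.last_seen[site] = now
        return self.peer_list()

    def heartbeat(self, site, now):
        """Record a heartbeat reply from a registered site."""
        if site in self.last_seen:
            self.last_seen[site] = now

    def expire(self, now):
        """Drop sites whose last reply is older than the timeout."""
        failed = sorted(s for s, t in self.last_seen.items() if now - t > self.timeout)
        for site in failed:
            del self.last_seen[site]
        return failed

    def peer_list(self):
        return sorted(self.last_seen)
```

Whenever `register` or `expire` changes the membership, the new `peer_list()` is what would be pushed to the remaining working sites and to the soft load balancing module.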
As a concrete example, when a working site in the peer site cluster receives a query request, it creates a main process and, according to its own index block information, a plurality of corresponding child processes, so that each child process queries its corresponding loaded index block and sends the query results to the main process, which integrates and sends them. Further, the main process also responds to query requests distributed by other working sites; that is, after a working site receives a user query request, it distributes the request further and requires the other sites to assist in completing the query. In this example, each child process monitors the main process so that it closes automatically if the main process exits unexpectedly.
Specifically, the main process of a working site is responsible for three tasks:
1. Responding to the heartbeat detection of the monitor module Monitor by reporting the free memory of the working site, the index blocks it is responsible for, and so on. Accordingly, a working site actively registers with Monitor when it starts its service. The main process also maintains the index block information: when a specified index block needs to be loaded, it creates a corresponding child process to manage that block, and when a specified index block needs to be unloaded, it closes the child process responsible for that block. In addition, the main process monitors the state of its child processes, and when it finds that a child process has failed, it promptly reports the missing index block to Monitor.
2. Responding to query requests distributed by other working sites: the main process passes the request to its child processes, integrates the results they return within a specified time threshold, and returns the integrated result to the working site that distributed the request.
3. Accepting user query requests: the main process distributes the request in parallel to the other working sites, performs a second integration of the results returned by every working site (including itself) within a specified time threshold, and returns the result to the user.
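The parallel distribution with a time threshold described in the tasks above can be sketched as a scatter-gather step. This sketch uses threads from Python's standard library purely for illustration; the patent mentions message passing such as MPI for inter-site communication. The function `scatter_gather` and the `ask_site` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, wait

def scatter_gather(query, sites, ask_site, timeout_s):
    """Send the query to every site in parallel and keep what arrives in time.

    ask_site(site, query) is a hypothetical callable returning a list of
    (distance, object_id) pairs. Sites that fail or miss the deadline are skipped.
    """
    if not sites:
        return []
    pool = ThreadPoolExecutor(max_workers=len(sites))
    futures = [pool.submit(ask_site, site, query) for site in sites]
    done, _late = wait(futures, timeout=timeout_s)
    pool.shutdown(wait=False)  # do not block on sites that missed the deadline
    results = []
    for future in done:
        if future.exception() is None:
            results.extend(future.result())
    return sorted(results)
```

Skipping late or failed sites trades some completeness for a bounded response time, which matches the emphasis the description places on real-time behavior.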
The child processes of a working site are responsible for the following:
1. Loading the index block file, expanding it in memory, and mapping the one-to-one correspondence between feature vectors and candidate objects.
2. Responding to query requests from the main process by querying the index structure for a predetermined number of results most similar to the specified object, and returning the results to the main process.
3. Monitoring the activity of the main process, so that the child process closes itself when the main process exits unexpectedly and does not continue to occupy resources.
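One common way for a child process to notice that its parent has gone away is to poll the parent process id, which changes on POSIX systems when an orphaned child is re-parented. The sketch below takes that approach as an assumption; the patent does not say how the monitoring is done, and `watch_parent` and its parameters are hypothetical.

```python
import os
import time

def watch_parent(original_ppid, on_parent_exit, poll_s=1.0,
                 get_ppid=os.getppid, sleep=time.sleep):
    """Block until the parent pid changes, then run the shutdown callback.

    original_ppid is the parent pid recorded when the child started.
    get_ppid and sleep are injectable so the loop can be exercised in tests.
    """
    while get_ppid() == original_ppid:
        sleep(poll_s)
    on_parent_exit()
```

In a child process this would typically run in a background thread, with `on_parent_exit` releasing the loaded index block and terminating the process.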
It should be understood that index blocks in the embodiments of the present invention are managed by child processes rather than threads. The main reason is that, when a new index block is loaded while memory is low, a failed load in a child process does not crash the main process, so the index blocks that are already loaded do not all become invalid. This choice likewise serves to maximize the stability and reliability of the system.
In one embodiment of the present invention, the soft load balancing module comprises a plurality of soft load balancing sites. Each soft load balancing site schedules query requests appropriately so that the load across the working sites is balanced overall, avoiding the situation in which one site is overloaded while others are lightly loaded. The soft load balancing module Balancer obtains the list of valid working sites in the current index cluster through synchronization with the monitor module Monitor, receives requests from the query interface, and selects a suitable working site to return to the query interface for the query call.
It should be noted that, to keep synchronization smooth, a soft load balancing site, like a working site, actively registers with the Monitor site when it starts its service. In general, if the hardware of the working sites is similar, the soft load balancing site can simply select a working site at random and return it to the query interface for the subsequent query call.
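The random selection policy mentioned above is simple enough to sketch directly. The class `Balancer` below and its `sync` and `pick` methods are illustrative names; only the ideas of a synchronized working site list and random choice among similar sites come from the description.

```python
import random

class Balancer:
    """Soft load balancing site that picks a working site at random."""

    def __init__(self, rng=None):
        self.sites = []
        self.rng = rng if rng is not None else random.Random()

    def sync(self, working_sites):
        """Replace the local list with the one pushed by Monitor."""
        self.sites = list(working_sites)

    def pick(self):
        """Return one currently available working site."""
        if not self.sites:
            raise LookupError("no working site is currently available")
        return self.rng.choice(self.sites)
```

Uniform random choice spreads queries evenly on average when sites have similar hardware; for heterogeneous sites a weighted choice would be the natural variation.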
In the embodiments of the present invention, a query interface module Searcher may also be provided. Searcher is mainly responsible for two functions:
1. Quantizing the query object into a vector of the same dimension as the indexed vectors, submitting that vector for querying, and converting the query results into the form required by the external interface.
2. Encapsulating the underlying scheduling relationships and hiding them from the upper layers, which reduces the coupling of the framework and improves the extensibility of the system. In practice, this query operation consists of two steps: obtaining a valid working site from the soft load balancing module Balancer, and submitting the query to that working site to obtain the results.
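The two Searcher functions above can be sketched as a single function whose collaborators are passed in as callables. Everything here is an assumption for illustration: the name `search`, the callables `quantize`, `pick_site` and `submit`, and the dictionary result format are not defined by the patent.

```python
def search(query_object, k, quantize, pick_site, submit):
    """Quantize the query, obtain a working site, query it, format the result.

    quantize(obj) -> feature vector with the same dimension as the index.
    pick_site() -> a working site obtained from the soft load balancing module.
    submit(site, vector, k) -> list of (distance, object_id) pairs.
    """
    vector = quantize(query_object)
    site = pick_site()              # step 1: get a valid working site
    hits = submit(site, vector, k)  # step 2: submit the query to that site
    # Convert to the shape an external caller might expect (assumed format).
    return [{"id": obj_id, "distance": dist} for dist, obj_id in hits]
```

Because callers only see `search`, the scheduling between Balancer and the working sites can change without affecting them, which is the decoupling the description aims for.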
In one example of the present invention, to meet the real-time requirement, the query part does not use the MapReduce framework, which is unsuitable here; instead, more efficient message-passing mechanisms such as MPI handle communication and coordination between sites.
Embodiments of the present invention have the following advantages:
1. High parallelism, efficient queries, good real-time performance and high accuracy.
The framework is a distributed index query framework that uses the relatively efficient high-dimensional index structure Hybrid Spill Tree together with efficient communication mechanisms such as MPI, so that fast query response is ensured as far as possible in every respect. In addition, compared with a Hybrid Spill Tree in a stand-alone environment, the distributed index structure of the embodiments can effectively improve accuracy. The main reason is that each query for a predetermined number of approximate neighbors uses a redundancy strategy, and the results then pass through two stages of integration and filtering, so that the results finally returned cover the vast majority of the exact neighbors of the predetermined number. In the first stage, a working site integrates the results returned by its local child processes and filters out part of the redundant results. In the second stage, the working site that issued the global query request integrates the results returned by the other working sites and filters out further redundant results. It should also be noted that the Hybrid Spill Tree has tunable parameters, so suitable parameters can be selected in practice to obtain a good trade-off between efficiency and accuracy.
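The two-stage integration and filtering described above can be sketched with one merge function applied twice: once per working site over its child processes, and once at the site that issued the global query. The function name `merge_top_k` and the (distance, object_id) result format are illustrative assumptions.

```python
import heapq

def merge_top_k(result_lists, k):
    """Merge several (distance, object_id) lists, drop duplicates, keep k best."""
    best = {}
    for results in result_lists:
        for dist, obj_id in results:
            # Redundant hits for the same object collapse to the closest one.
            if obj_id not in best or dist < best[obj_id]:
                best[obj_id] = dist
    return heapq.nsmallest(k, ((d, o) for o, d in best.items()))

# Stage 1: a working site merges the results of its local child processes.
site_a = merge_top_k([[(0.1, "x"), (0.4, "y")], [(0.1, "x"), (0.3, "z")]], 2)
# Stage 2: the site that issued the global query merges the per-site results.
final = merge_top_k([site_a, [(0.2, "w"), (0.3, "z")]], 3)
```

Filtering at the first stage keeps the amount of data sent between sites small, while the second stage removes duplicates that only become visible once results from different sites are combined.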
2. High system reliability and stability.
As the module analysis of the framework shows, the present invention introduces the monitor module Monitor to monitor and manage the loading state of the index blocks and to manage the working sites accordingly, coordinating them intelligently according to their memory so as to ensure, as far as possible, minimum redundancy and completeness of the global index and thereby the reliability of queries. In addition, multiple abnormal conditions are taken into account, including machine crashes, abnormal exit of a child process and abnormal exit of the main process. To handle these, relatively sophisticated monitoring and coordination mechanisms are introduced, so that the system keeps running fairly robustly and continues to receive and answer external queries even when several anomalies occur, possibly at the same time. One point deserves mention: on the surface, the monitor module Monitor resembles the master site of a traditional master-slave scheme and might become the bottleneck of the whole framework. Analysis shows, however, that the load on Monitor is clearly lower than that on the master site of a master-slave structure. Moreover, conventional techniques such as hot standby or clustering can further ensure that a Monitor site and a Balancer site are running normally at all times, which strongly supports the stability of the system. Finally, because the embodiments adopt an architecture based on a peer-to-peer structure, dependence on any bottleneck point is smaller: even if Monitor fails completely, the system can still respond to queries as long as one Balancer site and several working sites are operating normally.
3, extensibility and scalability are better.
The high-dimensional index parallel query framework proposed by the embodiment of the invention is not designed for one particular application; it processes feature vectors directly. This gives the framework strong extensibility: it can be applied to content-based image retrieval (CBIR) as well as to fields such as video retrieval and audio retrieval. In addition, working sites and soft load-balancing sites actively register with the Monitor after starting their services and submit a deregistration request to the Monitor when they exit, and the Monitor then carries out the follow-up notification or synchronization. This mechanism makes it convenient to add or remove sites dynamically: there is no need to touch other sites or restart the system, and only the site being added or removed has to be configured. All of this shows that the proposed framework adapts well to dynamic scaling of the system. Furthermore, in the proposed peer-to-peer framework any working site can act both as the site responsible for a global query and as an assisting site for one, which makes operation more flexible.
4, self-organization and intelligent management capabilities are stronger.
The proposed distributed high-dimensional index parallel query framework based on a peer-to-peer structure provides a two-way guarantee for the management and coordination of sites. On the one hand, a working site or soft load-balancing site registers with or deregisters from the Monitor site when it starts or exits, and the Monitor site performs the follow-up notification and synchronization. On the other hand, the Monitor site itself maintains the list of all soft load-balancing sites, the information of all peer site clusters, and the information of every working site in the peer site cluster currently in use (the index cluster), and it coordinates and manages these sites (including removing duplicate index blocks, filling in missing ones, and so on). In addition, the query framework supports combinations of heterogeneous sites and can intelligently distribute index blocks and perform other scheduling and coordination work according to the available memory of each site, so that system resources are used fully and appropriately. The soft load-balancing sites additionally adjust the load of each working site at the level of soft balancing. All of this shows that the proposed framework has strong self-organization and intelligent management capabilities.
5, support for massive data that is dynamically added and removed.
The present invention combines the relatively mature MapReduce framework with a corresponding distributed storage system to connect the creation of the high-dimensional index with querying. Parallelizing index creation with MapReduce mitigates, to some extent, the high cost of insertions and deletions in high-dimensional index structures: the index is updated by rebuilding it quickly. Introducing a distributed storage system makes the underlying storage transparent, so that the subprocesses in the working sites can read index blocks conveniently without complex configuration. Together, these two aspects lay the foundation for creating high-dimensional index structures over massive data and make the approach feasible. Index creation and querying are connected through the distributed storage system: the index creation module is only responsible for rebuilding the index periodically to reflect changes in the data, while the Monitor site periodically checks the index files for updates and, once it detects a stable update, loads the index and switches the index cluster, so that the system can respond promptly to dynamic changes in massive data.
[embodiment]
In practice, the validity and feasibility of the framework were verified by tests on 120-dimensional feature vectors of one million images. The table below gives experimental results for the distributed high-dimensional index parallel query framework based on a peer-to-peer structure, listing the response times of several queries in the actual test. Note that in this experiment the time threshold of every query was set to 1 second, the images were split into 5 parts with the same number of index block files created, and the peer site cluster consisted of 5 heterogeneous ordinary PCs (four configured with 2 GB of memory and a Pentium E5300 dual-core CPU, and one configured with 4 GB of memory and a Pentium E5200 dual-core CPU). In addition, the parameters of the Hybrid Spill Tree were not tuned; empirical values were used instead, so that the results reflect the responsiveness of the system more truthfully and comprehensively.
Query number       1     2     3     4     5     6     7     8     9     10    Average
Query time (ms)    139   121   77    64    88    57    90    66    65    85    85.2
Table 1
Table 1 shows that the query time fluctuates to some degree, but the overall response time remains short: the average response time is 85.2 milliseconds, which basically satisfies the retrieval needs of practical applications. It can be expected that tuning the parameters of the Hybrid Spill Tree, using more powerful servers and setting a shorter query time limit would reduce the query time further and make the system even more efficient.
A more concrete implementation of the distributed high-dimensional index parallel query framework based on a peer-to-peer structure is as follows:
Implementation of the index creation module (the high-dimensional index creation module, Builder):
The main job of the high-dimensional index creation module is to rebuild the index over the massive set of candidate objects periodically, so that changes in the candidate objects are reflected in time. Because this module has low real-time requirements but strict stability and reliability requirements, this embodiment implements it on the relatively mature and stable MapReduce framework, combined with a distributed storage system, so that index creation is highly parallelized and index rebuilding is both efficient and reliable.
The implementation of the Builder module therefore focuses on two things. One is setting up a relatively mature open-source framework and a distributed storage system. The other is setting up a timed task that periodically partitions the specified massive set of candidate objects into suitable data blocks, creates the index in parallel through MapReduce tasks, and finally stores each index block file in the transparent distributed storage system.
In addition, the Builder module is responsible for maintaining the index block files. Specifically, after the new index has been created successfully in full, it removes the old index block files and moves all the new index block files to the designated path in the distributed storage system, where the Monitor module can detect them. Note that the time window during which the index block files are being overwritten does not cause monitoring errors. The monitoring function in the Monitor module does not start loading and switching to the new index as soon as it first detects a change; it keeps tracking the new index files for several more rounds and performs the switch only after it judges that the index files are essentially stable. Even if an index is loaded prematurely, the system reloads it when the index files change again, which ultimately guarantees that the index switch is correct.
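The tracking behaviour described above can be sketched roughly as follows. This is a minimal sketch, not the patent's implementation: the class name `IndexStabilityTracker`, the use of last-modified times as the comparison key, and the `requiredStableRounds` threshold are all assumptions made for illustration.

```java
import java.util.Map;

/** Decides when a changed set of index block files has settled enough to be loaded. */
class IndexStabilityTracker {
    private final int requiredStableRounds; // hypothetical policy parameter
    private Map<String, Long> loaded;       // snapshot of the index currently in service
    private Map<String, Long> candidate;    // changed snapshot that is being tracked
    private int stableRounds;

    IndexStabilityTracker(Map<String, Long> loaded, int requiredStableRounds) {
        this.loaded = loaded;
        this.requiredStableRounds = requiredStableRounds;
    }

    /**
     * Feed one periodic snapshot (file name -> last-modified time).
     * Returns true exactly when the index switch should be started.
     */
    boolean observe(Map<String, Long> snapshot) {
        if (snapshot.equals(loaded)) {       // nothing has changed
            candidate = null;
            stableRounds = 0;
            return false;
        }
        if (snapshot.equals(candidate)) {    // same change seen again
            stableRounds++;
        } else {                             // first sighting, or files still changing
            candidate = snapshot;
            stableRounds = 0;
        }
        if (stableRounds < requiredStableRounds) {
            return false;
        }
        loaded = candidate;                  // treat the stable snapshot as the new index
        candidate = null;
        stableRounds = 0;
        return true;
    }
}
```

If the files change again after a switch, the next differing snapshot simply starts a new tracking cycle, which corresponds to the reload case mentioned in the text.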
Implementation of the monitor module (Monitor):
The Monitor module is mainly responsible for three tasks: monitoring updates of the index files; monitoring, coordinating and managing the working sites (Peer sites); and synchronizing the soft load-balancing sites (Balancer sites). In this embodiment, communication between the Monitor module and the other modules and sites, as well as among the other modules and sites, relies mainly on RMI (Remote Method Invocation), the efficient remote communication interface of the Java language, which plays a role in Java comparable to that of MPI and is widely used in developing parallel environments. Given how RMI works, the Monitor module acts both as a provider (callee) of RMI services and as a caller of RMI services while performing these three tasks; the same holds for Peer sites, subprocesses and so on.
As a provider of RMI services, the Monitor needs to offer the following service interfaces: registerClusterPeer, registerBalancer, unregisterClusterPeer and unregisterBalancer. The registerClusterPeer and registerBalancer interfaces handle the registration of Peer sites and Balancer sites, where a Peer site must state the peer site cluster it belongs to when registering; the unregisterClusterPeer and unregisterBalancer interfaces handle deregistration requests from Peer sites and Balancer sites. Through these four interfaces, the Monitor site maintains the Peer site information of all peer site clusters and the information of all Balancer sites.
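As an illustration, the four services could be declared as a Java RMI remote interface along the following lines. Only the four method names come from the text; the parameter lists, the address format and the `InMemoryMonitor` bookkeeping class are assumptions, and exporting the object (for example through `UnicastRemoteObject`) is left out of the sketch.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** The four Monitor services named in the text; the parameters are assumed. */
interface MonitorService extends Remote {
    void registerClusterPeer(String clusterId, String peerAddress) throws RemoteException;
    void unregisterClusterPeer(String clusterId, String peerAddress) throws RemoteException;
    void registerBalancer(String balancerAddress) throws RemoteException;
    void unregisterBalancer(String balancerAddress) throws RemoteException;
}

/** In-memory bookkeeping only; notification of other sites is omitted. */
class InMemoryMonitor implements MonitorService {
    private final Map<String, Set<String>> peersByCluster = new ConcurrentHashMap<>();
    private final Set<String> balancers = ConcurrentHashMap.newKeySet();

    @Override
    public void registerClusterPeer(String clusterId, String peerAddress) {
        // A Peer site states which peer site cluster it belongs to when registering.
        peersByCluster.computeIfAbsent(clusterId, k -> ConcurrentHashMap.newKeySet()).add(peerAddress);
    }

    @Override
    public void unregisterClusterPeer(String clusterId, String peerAddress) {
        Set<String> peers = peersByCluster.get(clusterId);
        if (peers != null) {
            peers.remove(peerAddress);
        }
    }

    @Override
    public void registerBalancer(String balancerAddress) {
        balancers.add(balancerAddress);
    }

    @Override
    public void unregisterBalancer(String balancerAddress) {
        balancers.remove(balancerAddress);
    }

    /** Read-only copy of the Peer sites registered for one cluster. */
    Set<String> peersOf(String clusterId) {
        return Set.copyOf(peersByCluster.getOrDefault(clusterId, Set.of()));
    }

    /** Read-only copy of the registered Balancer sites. */
    Set<String> balancers() {
        return Set.copyOf(balancers);
    }
}
```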
As a caller of RMI services, the Monitor carries out its work mainly by calling the service interfaces of other modules or sites.
For index file update monitoring, the Monitor site starts a timer thread that checks the index files at regular intervals and determines by comparison whether they have changed. After it first detects a change, the Monitor site keeps tracking the files and starts the index switching task only once it has confirmed that the index files have stopped changing. Switching the index means that the Monitor site notifies an idle peer site cluster to load the new index blocks and judges, according to a predetermined policy, whether loading has succeeded. If it has, the Monitor switches the current index cluster directly to the new cluster and synchronizes the Balancer sites.
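The switch between the serving cluster and the idle cluster can be pictured with the short sketch below. It assumes that the loading step and the Balancer synchronization are supplied as callbacks; `ClusterSwitcher`, `trySwitch` and the cluster names are hypothetical, and the "predetermined policy" for judging a successful load is reduced to a boolean result.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Keeps track of which peer site cluster currently serves queries. */
class ClusterSwitcher {
    private final AtomicReference<String> activeCluster;

    ClusterSwitcher(String initialCluster) {
        this.activeCluster = new AtomicReference<>(initialCluster);
    }

    String active() {
        return activeCluster.get();
    }

    /**
     * Asks the idle cluster to load the new index and switches only on success.
     * loadNewIndex stands in for the remote load calls plus the success policy;
     * syncBalancers stands in for pushing the new site list to the Balancer sites.
     */
    boolean trySwitch(String idleCluster, Predicate<String> loadNewIndex, Consumer<String> syncBalancers) {
        if (idleCluster.equals(activeCluster.get())) {
            return false;                    // the serving cluster is never reloaded in place
        }
        if (!loadNewIndex.test(idleCluster)) {
            return false;                    // loading failed: the old cluster keeps serving
        }
        activeCluster.set(idleCluster);      // queries are now routed to the new cluster
        syncBalancers.accept(idleCluster);
        return true;
    }
}
```

Because the old cluster keeps serving until the new one has loaded, a failed or slow load does not interrupt queries in this sketch.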
For monitoring, coordinating and managing the working sites, the Monitor site likewise maintains a timer thread that monitors the available memory, the index block status and so on of all Peer sites in the current index cluster; according to a specified policy, it treats a site that has failed to respond several times as a failed site and reports it to the administrators. In addition, when any of the four service interfaces provided by the Monitor site is called, the Monitor site updates the corresponding Peer site list or Balancer site list that it keeps in memory and notifies the other sites. Specifically, when a Peer site registers or deregisters, the Monitor site responds to the request and then notifies the other sites in the same peer site cluster to update their peer site lists. Furthermore, from the available-memory and index block information reported by each Peer site, the Monitor identifies duplicated or missing index blocks and, taking the available memory of the Peer sites into account, intelligently selects suitable sites to unload or load index blocks, so as to keep redundancy as low as possible while keeping the whole index complete.
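One possible reading of the de-duplication and gap-filling step is sketched below. The selection rule is an assumption, since the text only says that suitable sites are chosen according to available memory: here a missing block is loaded on the Peer site with the most free memory, and a duplicated block is kept only on the holder with the most free memory. The sketch does not account for the memory a newly loaded block consumes, and it assumes every Peer site has reported its free memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class IndexBlockCoordinator {
    /** Returns actions such as "LOAD block-3 ON peerB" or "UNLOAD block-1 FROM peerA". */
    static List<String> plan(Set<String> allBlocks,
                             Map<String, Set<String>> loadedByPeer,
                             Map<String, Long> freeMemoryByPeer) {
        List<String> actions = new ArrayList<>();
        for (String block : new TreeSet<>(allBlocks)) {          // sorted for a stable order
            List<String> holders = new ArrayList<>();
            for (Map.Entry<String, Set<String>> e : new TreeMap<>(loadedByPeer).entrySet()) {
                if (e.getValue().contains(block)) {
                    holders.add(e.getKey());
                }
            }
            if (holders.isEmpty()) {
                // Missing block: load it where the most memory is available.
                String target = Collections.max(freeMemoryByPeer.entrySet(),
                        Map.Entry.comparingByValue()).getKey();
                actions.add("LOAD " + block + " ON " + target);
            } else if (holders.size() > 1) {
                // Duplicated block: keep one copy, unload it from the tighter sites.
                holders.sort(Comparator.comparingLong(p -> freeMemoryByPeer.get(p)));
                for (String peer : holders.subList(0, holders.size() - 1)) {
                    actions.add("UNLOAD " + block + " FROM " + peer);
                }
            }
        }
        return actions;
    }
}
```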
For synchronizing the soft load balancing, the Monitor site starts a timer thread and, by directly calling the service provided by the Balancer sites, synchronizes the list of available sites of the current index cluster that it maintains to all Balancer sites. The Balancer sites thus reflect index changes in time and stay consistent with the Monitor site, so that they can provide the query interface module with a suitable Peer site to act as the master site of a query.
Note that the Monitor module needs an external configuration file for its initialization. The most important configuration items are: the index monitoring interval; the Peer site monitoring interval; the path under which the index files are stored in the distributed storage system; the available peer site clusters and the Peer sites within them (including IP addresses, RMI service ports, etc.); the information of all Balancer sites; and some configuration items related to RMI or to the coordination policy.
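The text does not give the format of this configuration file. Purely as an illustration, the items listed above could be held in a Java properties file and read as shown below; every key name, address, path and value is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class MonitorConfig {
    // Hypothetical keys and values; the source only lists the kinds of items.
    static final String SAMPLE = String.join("\n",
            "index.monitor.interval.seconds=60",
            "peer.monitor.interval.seconds=10",
            "index.path=/index/current",
            "cluster.A.peers=192.168.0.11:1099,192.168.0.12:1099",
            "cluster.B.peers=192.168.0.21:1099,192.168.0.22:1099",
            "balancers=192.168.0.31:1099",
            "peer.max.missed.pings=3");

    /** Parses properties-format text into a Properties object. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the "host:port" entries configured for one peer site cluster. */
    static String[] peersOf(Properties props, String cluster) {
        String value = props.getProperty("cluster." + cluster + ".peers", "");
        return value.isEmpty() ? new String[0] : value.split(",");
    }
}
```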
Implementation of the host process in a peer site:
The host process of a peer site is the main provider of the query service. It is responsible for three functions: responding to monitoring requests from the Monitor site and maintaining the peer site list and the local index block information; responding to global search requests from other sites and merging the query results of its subprocesses; and distributing global search requests to other peer sites and merging the query results they return.
For responding to monitoring requests from the Monitor site and maintaining the local index blocks, a Peer site registers with the Monitor when it starts, so that the Monitor, and through it the other peer sites in the same peer site cluster, learn of its existence; this is one of the places where the Peer site acts as a caller of RMI services. After startup, the Peer site must provide at least the following services to the Monitor: Ping, addFederal, removeFederal, loadIndex and unloadIndex. The Ping interface returns the available memory of the Peer site and the information of the index blocks it manages. The addFederal and removeFederal interfaces add or remove an entry in the peer site list that the Peer site keeps in memory, so that global search requests are distributed correctly. The loadIndex and unloadIndex interfaces load and unload a specified index block: in the actual implementation, loadIndex has the host process start a subprocess dedicated to managing that index block, and unloadIndex forcibly terminates the corresponding subprocess, which unloads the index from memory. Meanwhile, the host process keeps monitoring its subprocesses; when a subprocess meets the specified expiry policy (for example, several consecutive response timeouts), the host process declares it failed and reports this to the Monitor site. Correspondingly, the host process starts a PingForChildProcess service for its subprocesses to answer their monitoring requests, which lets the subprocesses keep running.
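The expiry policy mentioned above (several consecutive response timeouts) might be tracked as in the following sketch. The class name `ExpiryPolicy`, the threshold parameter and the string identifiers for subprocesses are assumptions; the text does not say how the policy is actually represented.

```java
import java.util.HashMap;
import java.util.Map;

/** Counts consecutive monitoring timeouts per subprocess. */
class ExpiryPolicy {
    private final int maxConsecutiveTimeouts;   // hypothetical threshold
    private final Map<String, Integer> timeouts = new HashMap<>();

    ExpiryPolicy(int maxConsecutiveTimeouts) {
        this.maxConsecutiveTimeouts = maxConsecutiveTimeouts;
    }

    /**
     * Records the outcome of one monitoring round for a subprocess.
     * Returns true when the subprocess should be reported as failed.
     */
    synchronized boolean record(String childId, boolean respondedInTime) {
        if (respondedInTime) {
            timeouts.remove(childId);            // any timely response resets the count
            return false;
        }
        int count = timeouts.merge(childId, 1, Integer::sum);
        return count >= maxConsecutiveTimeouts;
    }
}
```

The same counting scheme would also fit the Monitor's check for Peer sites that repeatedly fail to respond, although the text does not state that the two policies are identical.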
For responding to global search requests from other sites and merging the results of its subprocesses, each Peer site opens a search service to the other peer sites in order to receive their global search requests. On receiving such a request, the Peer site calls the service interface of its subprocesses and forwards the retrieval request to all subprocesses it maintains. Once it has obtained the query results returned by each subprocess, the Peer site merges them in descending order of accuracy, removes redundant entries, and returns the result to the peer site that distributed the global query request.
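The merging step can be sketched as follows, under the assumption that each result carries an object identifier and a numeric accuracy score where higher is better. The `Hit` class, the `limit` parameter and the tie-breaking by identifier are illustrative choices, not details from the text.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ResultMerger {
    /** One candidate returned by a subprocess or a peer site (assumed shape). */
    static final class Hit {
        final String objectId;
        final double accuracy;

        Hit(String objectId, double accuracy) {
            this.objectId = objectId;
            this.accuracy = accuracy;
        }
    }

    /** Merges partial result lists, drops duplicates and sorts by accuracy, highest first. */
    static List<Hit> merge(List<List<Hit>> partialResults, int limit) {
        Map<String, Hit> best = new HashMap<>();
        for (List<Hit> partial : partialResults) {
            for (Hit hit : partial) {
                // Redundant entries: keep only the most accurate hit per object.
                best.merge(hit.objectId, hit, (a, b) -> a.accuracy >= b.accuracy ? a : b);
            }
        }
        List<Hit> merged = new ArrayList<>(best.values());
        merged.sort(Comparator.comparingDouble((Hit h) -> h.accuracy).reversed()
                .thenComparing((Hit h) -> h.objectId));
        return new ArrayList<>(merged.subList(0, Math.min(limit, merged.size())));
    }
}
```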
For distributing global search requests to other peer sites and merging the results they return, the Peer site provides a globalSearch service for the query interface module to call. On receiving a query request from the query interface module, the Peer site uses multiple threads to distribute the retrieval request to the other Peer sites in the same peer site cluster, and at the same time starts a local search by submitting the retrieval request to all locally maintained subprocesses. After sending the global search request, the Peer site must merge both the results of its local subprocesses and the results returned by the other peer sites. Merging proceeds in descending order of accuracy and removes redundant entries, and the final result is returned to the query interface module.
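A multi-threaded fan-out with a query time limit can be sketched with the standard `ExecutorService.invokeAll` call. In this sketch each `Callable` stands in for one remote search call (to another Peer site or to the local subprocesses); how the real system enforces its time threshold is not described in the text, so dropping late or failed answers is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class GlobalSearchFanOut {
    /** Runs all calls in parallel and returns the answers that arrive within the time limit. */
    static <R> List<R> queryAll(List<Callable<R>> calls, long timeLimitMillis) {
        List<R> results = new ArrayList<>();
        if (calls.isEmpty()) {
            return results;
        }
        ExecutorService pool = Executors.newFixedThreadPool(calls.size());
        try {
            // invokeAll cancels every task that has not finished when the limit expires.
            for (Future<R> future : pool.invokeAll(calls, timeLimitMillis, TimeUnit.MILLISECONDS)) {
                if (future.isCancelled()) {
                    continue;                    // this site did not answer in time
                }
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    // This site failed; the query continues with the remaining answers.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return results;
    }
}
```

The partial results collected this way would then go through the merging step (descending accuracy, redundancy removed) before being returned to the query interface module.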
A Peer site likewise needs an external configuration file for initialization. The main configuration items are: the range of RMI service ports available to subprocesses; the query time limit; configuration items for the local RMI service (IP address, service port number, etc.); parameters for starting subprocesses (run-time parameters, the subprocess monitoring interval, etc.); the information of the Peer sites in the same index cluster (IP addresses, RMI service port numbers, etc.); and the Monitor site information (IP address, RMI service port number, etc.).
Implementation of the peer site subprocess (Child Process):
A peer site subprocess is a subprocess started by the host process of a peer site; each such subprocess is solely responsible for managing and querying one index block. A single Peer site therefore has multiple subprocesses and manages and queries multiple index blocks at the same time.
When the host process creates and starts a subprocess, it specifies the port of the subprocess's RMI service, the path of the index block file in the distributed storage system, the service port number of the host process, and so on. After the subprocess starts, it obtains the corresponding index block file through the interface of the distributed storage system, expands it in memory, and builds the mapping between query objects and feature vectors in preparation for subsequent queries.
As a caller of RMI services, the subprocess mainly calls the RMI service of the host process to perform heartbeat detection on it, so that the subprocess can shut itself down when the host process exits abnormally and avoid occupying resources unnecessarily.
As a provider of RMI services, the subprocess mainly offers the search service interface: it accepts the query vector and the related request from the host process, runs the query in the Hybrid Spill Tree index structure held in memory, and returns the final result to the host process, which merges and sends the results.
The configuration of a subprocess is passed in by the host process as process arguments; no external configuration file is needed.
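Passing the configuration as process arguments could look like the sketch below, which builds the command line for one subprocess with the standard `ProcessBuilder` class. The entry-point class name `ChildProcessMain`, the flag names and the example path are hypothetical; the text only states which pieces of information the host process hands over.

```java
import java.util.List;

class ChildProcessLauncher {
    /** Builds the command line for a subprocess that manages one index block. */
    static List<String> command(String classpath, int childRmiPort,
                                String indexBlockPath, int hostRmiPort) {
        return List.of(
                "java", "-cp", classpath, "ChildProcessMain",   // hypothetical entry point
                "--rmi-port=" + childRmiPort,                   // port of the child's RMI service
                "--index-block=" + indexBlockPath,              // path in the distributed storage system
                "--host-port=" + hostRmiPort);                  // port of the host process, for heartbeats
    }

    /** Prepares, but does not start, the subprocess. */
    static ProcessBuilder prepare(String classpath, int childRmiPort,
                                  String indexBlockPath, int hostRmiPort) {
        return new ProcessBuilder(command(classpath, childRmiPort, indexBlockPath, hostRmiPort))
                .inheritIO();
    }
}
```

Calling `start()` on the returned builder would launch the subprocess, and `destroyForcibly()` on the resulting `Process` would correspond to the forced termination that the text associates with unloadIndex.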
Implementation of the soft load-balancing module (Balancer):
The soft load-balancing module has two main tasks in the system: accepting synchronization from the Monitor site, and providing the query interface module with a suitable available Peer site to act as the master site of a query.
As a caller of RMI services, the Balancer site registers by calling the Monitor site's service after it has started its own services, informing the Monitor site that it is ready to receive Peer site list synchronization.
As a provider of RMI services, the Balancer site mainly offers two service interfaces: SynchronizeCluster and getAProperPeer. SynchronizeCluster is called mainly by the Monitor site and synchronizes the list of valid Peer sites in the current index cluster; getAProperPeer is called by the query interface and returns an available Peer site in the current index cluster.
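The two Balancer services can be illustrated with the minimal in-memory sketch below. Round-robin selection is an assumption, since the text does not say how a "proper" Peer site is chosen; the method names follow the services in the text but use Java's lower-camel-case convention, and the RMI plumbing is omitted.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class SoftBalancer {
    private volatile List<String> availablePeers = List.of();
    private final AtomicInteger next = new AtomicInteger();

    /** Called periodically by the Monitor with the valid Peer sites of the current index cluster. */
    void synchronizeCluster(List<String> peers) {
        availablePeers = List.copyOf(peers);
    }

    /** Returns a Peer site to act as the master site of one query (round-robin, assumed). */
    String getAProperPeer() {
        List<String> snapshot = availablePeers;     // read once so the list cannot change mid-call
        if (snapshot.isEmpty()) {
            throw new IllegalStateException("no available Peer site");
        }
        return snapshot.get(Math.floorMod(next.getAndIncrement(), snapshot.size()));
    }
}
```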
The configuration of a Balancer site mainly consists of the Monitor site information (including IP address and RMI service port) and some parameters for starting the Balancer site's own RMI service.
Implementation of the query interface module (Searcher):
The query interface accepts external query requests, quantizes the query object into a feature vector, obtains an available Peer site from a Balancer site as the entry point of the search, and converts the final search results into the form required by the external interface. The query interface module is the link between upper-layer applications and the underlying scheduling: by encapsulating the underlying communication and call relationships, it provides a convenient interface for developing upper-layer applications. In practice, this module can therefore be packaged as a link library or third-party package and offered to upper-layer developers for further development.
Specifically, the query interface module needs to implement four functions: accepting and quantizing the query object, calling a Balancer site to obtain an available Peer site, calling that Peer site to perform the search, and processing the search results.
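The four functions can be strung together as in the sketch below, where each step is injected as a function so that the remote calls can be replaced by stubs. The class name `Searcher` echoes the module name in the text, but its constructor, the use of `double[]` for the feature vector and the list of object identifiers as the raw result are all assumptions.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

/** Query interface pipeline: quantize, pick a Peer site, search, format. */
class Searcher<Q, R> {
    private final Function<Q, double[]> quantizer;                            // query object -> feature vector
    private final Supplier<String> properPeer;                                // stands in for getAProperPeer
    private final BiFunction<String, double[], List<String>> globalSearch;    // stands in for globalSearch
    private final Function<List<String>, R> formatter;                        // result post-processing

    Searcher(Function<Q, double[]> quantizer,
             Supplier<String> properPeer,
             BiFunction<String, double[], List<String>> globalSearch,
             Function<List<String>, R> formatter) {
        this.quantizer = quantizer;
        this.properPeer = properPeer;
        this.globalSearch = globalSearch;
        this.formatter = formatter;
    }

    R search(Q queryObject) {
        double[] vector = quantizer.apply(queryObject);           // 1. accept and quantize
        String peer = properPeer.get();                           // 2. ask a Balancer site for a Peer site
        List<String> hits = globalSearch.apply(peer, vector);     // 3. run the search through that Peer site
        return formatter.apply(hits);                             // 4. convert to the external form
    }
}
```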
The configuration of the query interface mainly concerns the Balancer site information (including the IP address and service port of the Balancer site).
Connection and overall operation of the system:
The proposed distributed high-dimensional index parallel query framework based on a peer-to-peer structure has strong self-organization and intelligent management capabilities, so there is no strict requirement on the order in which the modules of the system are started. In general, however, it is advisable to run the index creation module first, then start the Monitor module, the Balancer module and the peer site cluster, and finally start the query interface module and begin accepting external query requests. For connecting the system, it is advisable to plan the clusters first and configure each module's configuration file properly, so that the system stabilizes quickly after startup, completes tasks such as loading the index, and can respond to external query requests.
After all modules have started, the overall operation of the system should be tested fairly thoroughly, especially under the various abnormal conditions that may arise, in order to assess the reliability and stability of the system. Actual tests and analysis show that the framework provided by the invention is fairly complete in terms of reliability and stability: it withstands several kinds of exceptions well, such as machine crashes, abnormal exits of the host process and abnormal exits of subprocesses, and its overall operation is good.
Any process or method described in a flow chart or otherwise described herein may be understood as representing a module, segment or portion of code that comprises one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes other implementations, as should be understood by those skilled in the art to which the embodiments of the invention belong.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the embodiments above, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system.
Those of ordinary skill in the art will understand that all or part of the steps of the methods of the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, may each exist physically on their own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory (ROM), a magnetic disk, an optical disc, or the like.
In this specification, a description that refers to the terms "an embodiment", "some embodiments", "an example", "a specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations may be made to these embodiments without departing from the principle and spirit of the invention; the scope of the invention is defined by the appended claims and their equivalents.

Claims (10)

1. A distributed high-dimensional index parallel query framework based on a peer-to-peer structure, characterized by comprising: an index creation module, a monitor module, a peer site cluster and a soft load-balancing module, wherein
the index creation module is configured to partition the candidate objects and create an index for each partition to obtain a plurality of index block files, and to store the index block files, wherein the index block files comprise index block information,
the monitor module is configured to detect the available-memory information of the working sites in the peer site cluster and the index block information corresponding to each working site, to coordinate the index blocks loaded by each working site according to the detection results, and to send peer site list update instructions to each working site,
the working sites in the peer site cluster load or unload the corresponding index block files according to their own index block information, perform queries in the corresponding index block files according to query requests sent by users, and merge and output the query results,
the soft load-balancing module is configured to obtain the current working site list from the monitor module and to perform load-balancing control over the current working sites according to the working site list, and the soft load-balancing module is synchronized periodically by the monitor module so that the soft load-balancing module adjusts and updates the list of currently available working sites.
2. The distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to claim 1, characterized in that the index creation module further comprises:
an index creation submodule, which uses the MapReduce framework to create indexes for a plurality of data partitions in parallel to obtain the plurality of index block files; and
a distributed storage system, which is used to store the plurality of index block files.
3. The distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to claim 1, characterized in that
the monitor module is further configured to send, after detecting that the index block files have been updated, an index loading instruction to a non-working site cluster so that the non-working site cluster enters the working state; after the index block files have been loaded, the monitor module performs a cluster switch, putting the working site cluster that has loaded the latest index into use and placing the original working site cluster in the non-working state.
4. The distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to claim 1, characterized in that the monitor module judges, according to the detection results, whether a failed site exists among the current working sites, and sends a peer site list update instruction to the currently available working sites when it detects a failed site.
5. The distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to claim 1, characterized in that, when a non-working site joins the working site cluster, it registers with the monitor module, so that the monitor module updates the working site list, sends the updated working site list to the soft load-balancing module, and at the same time sends the updated peer site list to all the working sites.
6. The distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to claim 1, characterized in that, when each working site in the peer site cluster receives a query request, it creates a host process and, according to its own index block information, creates a plurality of corresponding subprocesses, so that each subprocess queries its corresponding loaded index block and sends the query result to the host process, and the host process merges and sends the results.
7. The distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to claim 6, characterized in that the host process is further configured to respond to query requests distributed by other working sites.
8. The distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to claim 6, characterized in that the subprocess monitors the host process so that it shuts down automatically when the host process exits unexpectedly.
9. The distributed high-dimensional index parallel query framework based on a peer-to-peer structure according to claim 1, characterized in that the soft load-balancing module comprises a plurality of soft load-balancing sites, and the soft load-balancing module maintains the list of currently available working sites.
10. The peer-to-peer structure based distributed high-dimensional indexing parallel query framework according to any one of claims 1-9, characterized in that the index block file adopts a Hybrid Spill Tree data structure.
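
By way of non-limiting illustration only, the coordination recited in claims 4, 5 and 9 might be sketched as follows. The sketch is not taken from the patent text: the class names `WorkingSite`, `SoftLoadBalancer` and `Monitor`, and all method names, are hypothetical. Direct in-memory method calls stand in for the network messages, a site is treated as failed when it is absent from a detection result, and pushing the list to the load balancer after a failure is an assumption drawn from the abstract rather than from claim 4.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingSite:
    """Hypothetical working site that keeps its own copy of the peer site list."""
    name: str
    peers: list = field(default_factory=list)

    def update_peers(self, site_names):
        # The peers of a site are all other currently available working sites.
        self.peers = [n for n in site_names if n != self.name]


@dataclass
class SoftLoadBalancer:
    """Hypothetical soft load balance site holding the available working site list."""
    available: list = field(default_factory=list)

    def update_sites(self, site_names):
        self.available = list(site_names)


class Monitor:
    """Hypothetical monitor module coordinating sites and the load balancer."""

    def __init__(self, balancer):
        self.balancer = balancer
        self.sites = {}  # site name -> WorkingSite

    def register(self, site):
        # Claim 5: a joining site registers, then the updated lists are pushed out.
        self.sites[site.name] = site
        self._broadcast()

    def handle_detection(self, alive_names):
        # Claim 4: sites missing from the detection result are treated as failed
        # (assumption), and the remaining sites receive an updated peer list.
        failed = [n for n in self.sites if n not in alive_names]
        for name in failed:
            del self.sites[name]
        if failed:
            self._broadcast()
        return failed

    def _broadcast(self):
        names = sorted(self.sites)
        self.balancer.update_sites(names)   # working site list to the balancer
        for site in self.sites.values():    # peer site list to every working site
            site.update_peers(names)


# Usage: three sites join, then site "b" is missing from a detection result.
balancer = SoftLoadBalancer()
monitor = Monitor(balancer)
a, b, c = WorkingSite("a"), WorkingSite("b"), WorkingSite("c")
for s in (a, b, c):
    monitor.register(s)
failed = monitor.handle_detection({"a", "c"})
```

In this sketch the monitor is the single point that mutates the site list, so the load balancer and the working sites only ever receive complete, consistent snapshots of it.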
CN 201210038115 2012-02-17 2012-02-17 Peer-to-peer structure based distributed high-dimensional indexing parallel query framework Active CN102622414B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201210038115 CN102622414B (en) 2012-02-17 2012-02-17 Peer-to-peer structure based distributed high-dimensional indexing parallel query framework

Publications (2)

Publication Number Publication Date
CN102622414A (en) 2012-08-01
CN102622414B (en) 2013-11-06

Family

ID=46562333

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201210038115 Active CN102622414B (en) 2012-02-17 2012-02-17 Peer-to-peer structure based distributed high-dimensional indexing parallel query framework

Country Status (1)

Country Link
CN (1) CN102622414B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455531A (en) * 2013-02-01 2013-12-18 深圳信息职业技术学院 Parallel indexing method supporting real-time biased query of high dimensional data
CN104572785A (en) * 2013-10-29 2015-04-29 阿里巴巴集团控股有限公司 Method and device for establishing index in distributed form
CN106681794A (en) * 2016-12-07 2017-05-17 同济大学 Interest behavior based distributed virtual environment cache management method
CN106686117A (en) * 2017-01-20 2017-05-17 郑州云海信息技术有限公司 Distributed calculation cluster data storage processing system and method
CN106844752A (en) * 2017-02-16 2017-06-13 中电海康集团有限公司 A kind of entity relationship searching method and device based on data correlation network model
CN108279943A (en) * 2017-01-05 2018-07-13 腾讯科技(深圳)有限公司 Index loading method and device
CN108681592A (en) * 2018-05-15 2018-10-19 北京三快在线科技有限公司 Index switching method, device, system and index switching control device
CN108920552A (en) * 2018-06-19 2018-11-30 浙江工业大学 A kind of distributed index method towards multi-source high amount of traffic
CN109154937A (en) * 2016-04-29 2019-01-04 思科技术公司 The dynamic of inquiry response is transmitted as a stream
CN109710642A (en) * 2018-12-18 2019-05-03 中科曙光国际信息产业有限公司 The parallel processing system (PPS) of index polymerization based on big data framework
CN109788068A (en) * 2019-02-14 2019-05-21 腾讯科技(深圳)有限公司 Heartbeat state information report method, device and equipment and computer storage medium
CN110221910A (en) * 2019-06-19 2019-09-10 北京百度网讯科技有限公司 Method and apparatus for executing MPI operation
CN111338560A (en) * 2018-12-19 2020-06-26 北京奇虎科技有限公司 Cache reconstruction method and device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106970837B (en) * 2017-03-29 2020-05-26 联想(北京)有限公司 Information processing method and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853283A (en) * 2010-05-21 2010-10-06 南京邮电大学 Construction method for multidimensional data-oriented semantic indexing peer-to-peer network
US20100281166A1 (en) * 2007-11-09 2010-11-04 Manjrasoft Pty Ltd Software Platform and System for Grid Computing

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
《IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING》 20040730 Heng Tao Shen et al. "Efficient Semantic-Based Content Search in P2P Network" 813-826 Vol. 16, No. 7 *
《Proceedings of the First Symposium》 20040331 Chunqiang Tang et al. "Hybrid Global-Local Indexing for Efficient Peer-to-Peer Information Retrieval" 1-28 , *
《计算机集成制造系统》 20110831 陈f凤娟等 "面向云环境的图像高维特征索引框架" 1827-1833 1-10 Vol. 17, No. 8 *
CHUNQIANG TANG ET AL.: ""Hybrid Global-Local Indexing for Efficient Peer-to-Peer Information Retrieval"", 《PROCEEDINGS OF THE FIRST SYMPOSIUM》 *
HENG TAO SHEN ET AL.: ""Efficient Semantic-Based Content Search in P2P Network"", 《IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING》 *
陈F凤娟等: ""面向云环境的图像高维特征索引框架"", 《计算机集成制造系统》 *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455531B (en) * 2013-02-01 2016-12-28 深圳信息职业技术学院 A kind of parallel index method supporting high dimensional data to have inquiry partially in real time
CN103455531A (en) * 2013-02-01 2013-12-18 深圳信息职业技术学院 Parallel indexing method supporting real-time biased query of high dimensional data
CN104572785B (en) * 2013-10-29 2018-07-03 阿里巴巴集团控股有限公司 A kind of distributed method and apparatus for creating index
CN104572785A (en) * 2013-10-29 2015-04-29 阿里巴巴集团控股有限公司 Method and device for establishing index in distributed form
CN109154937A (en) * 2016-04-29 2019-01-04 思科技术公司 The dynamic of inquiry response is transmitted as a stream
CN106681794B (en) * 2016-12-07 2020-04-10 长春市三昧动漫设计有限公司 Interest behavior based distributed virtual environment cache management method
CN106681794A (en) * 2016-12-07 2017-05-17 同济大学 Interest behavior based distributed virtual environment cache management method
CN108279943A (en) * 2017-01-05 2018-07-13 腾讯科技(深圳)有限公司 Index loading method and device
CN108279943B (en) * 2017-01-05 2020-09-11 腾讯科技(深圳)有限公司 Index loading method and device
CN106686117B (en) * 2017-01-20 2020-04-03 郑州云海信息技术有限公司 Data storage processing system and method of distributed computing cluster
CN106686117A (en) * 2017-01-20 2017-05-17 郑州云海信息技术有限公司 Distributed calculation cluster data storage processing system and method
CN106844752A (en) * 2017-02-16 2017-06-13 中电海康集团有限公司 A kind of entity relationship searching method and device based on data correlation network model
CN108681592A (en) * 2018-05-15 2018-10-19 北京三快在线科技有限公司 Index switching method, device, system and index switching control device
CN108681592B (en) * 2018-05-15 2021-05-25 北京三快在线科技有限公司 Index switching method, device and system and index switching central control device
CN108920552A (en) * 2018-06-19 2018-11-30 浙江工业大学 A kind of distributed index method towards multi-source high amount of traffic
CN108920552B (en) * 2018-06-19 2022-04-29 浙江工业大学 Distributed index method for multi-source large data stream
CN109710642A (en) * 2018-12-18 2019-05-03 中科曙光国际信息产业有限公司 The parallel processing system (PPS) of index polymerization based on big data framework
CN111338560A (en) * 2018-12-19 2020-06-26 北京奇虎科技有限公司 Cache reconstruction method and device
CN109788068A (en) * 2019-02-14 2019-05-21 腾讯科技(深圳)有限公司 Heartbeat state information report method, device and equipment and computer storage medium
CN109788068B (en) * 2019-02-14 2020-11-03 腾讯科技(深圳)有限公司 Heartbeat state information reporting method, device and equipment and computer storage medium
CN110221910A (en) * 2019-06-19 2019-09-10 北京百度网讯科技有限公司 Method and apparatus for executing MPI operation
CN110221910B (en) * 2019-06-19 2022-08-02 北京百度网讯科技有限公司 Method and apparatus for performing MPI jobs

Also Published As

Publication number Publication date
CN102622414B (en) 2013-11-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant