CN108460072A - Power distribution and consumption data retrieval method and system - Google Patents

Power distribution and consumption data retrieval method and system

Info

Publication number
CN108460072A
Authority
CN
China
Prior art keywords
index
node
data
electricity consumption
consumption data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711434002.1A
Other languages
Chinese (zh)
Inventor
吴新玲
谢伟
张书翰
田传波
乔克
闫爱梅
佘家驹
郭乃网
苏运
黄芙蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Information and Telecommunication Co Ltd
State Grid Shanghai Electric Power Co Ltd
Beijing Guodiantong Network Technology Co Ltd
Original Assignee
State Grid Information and Telecommunication Co Ltd
State Grid Shanghai Electric Power Co Ltd
Beijing Guodiantong Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Information and Telecommunication Co Ltd, State Grid Shanghai Electric Power Co Ltd, Beijing Guodiantong Network Technology Co Ltd
Priority to CN201711434002.1A
Publication of CN108460072A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/134 Distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a power distribution and consumption data retrieval method and system. The method includes: dividing a distributed database or file system into multiple data storage nodes and storing the power distribution and consumption data in the corresponding data storage nodes; having an index cluster system, through a distributed index-building mode, build and manage the index shard files corresponding to the data storage nodes and the power distribution and consumption data; and having a query cluster system, through a distributed retrieval mode and based on the index shard files, retrieve the power distribution and consumption data stored in the data storage nodes and provide retrieval results. The method and system address the problem of indexing massive data; offer high throughput, high scalability, high concurrency, and high fault tolerance; suit concurrent access to very large data sets; reduce the load on the retrieval master node; and can improve the availability of the query service, speed up its responses, and improve the user's query experience.

Description

Power distribution and consumption data retrieval method and system
Technical field
The present invention relates to the field of information retrieval and search technology, and in particular to a power distribution and consumption data retrieval method and system.
Background
As construction of the smart distribution and consumption grid deepens, the number of acquisition terminals has grown sharply and the acquisition frequency has risen substantially, creating challenges in effectively integrating, efficiently storing, and scaling multi-source heterogeneous massive data. The distribution and consumption business is gradually becoming more intelligent and lean, and needs stronger cross-business, cross-platform data analysis and processing capabilities. This places higher requirements on the efficiency of data storage and processing, on the accuracy and timeliness of value mining, and on human-computer interaction and visualization. Power distribution and consumption data includes structured data and unstructured data. Structured data is data in which items of the same class share the same storage format and length, can be divided into fixed basic components, and can be expressed in tabular form, such as characters and numbers. Unstructured data is data that has no obvious structure and cannot be divided into fixed basic components, such as sound, graphics, images, animation, video, and photos. Structured data usually describes the static basic information of spatial data, while unstructured data describes its spatio-temporal information. Although the two kinds of data differ greatly in structure, they describe the same spatial information from different angles and form a whole when queried and invoked.
An index file represents the one-to-one correspondence between logical records and physical records and can be used in services such as querying. As the data volume keeps growing, the size of the index file quickly exceeds the available space of a single server, and the time spent building a single huge index file and executing queries on it also keeps increasing. A new power distribution and consumption data retrieval method is therefore needed.
Summary of the invention
In view of this, one technical problem to be solved by the present invention is to provide a power distribution and consumption data retrieval method and system.
According to one aspect of the present invention, a power distribution and consumption data retrieval method is provided, including: dividing a distributed database or file system into multiple data storage nodes based on a preset storage architecture for power distribution and consumption data, and storing the power distribution and consumption data in the corresponding data storage nodes, wherein the power distribution and consumption data includes structured and unstructured power distribution and consumption data; establishing an index cluster system in the distributed database or file system, the index cluster system using a distributed index-building mode to build and manage the index shard files corresponding to the data storage nodes and the power distribution and consumption data; and establishing a query cluster system in the distributed database or file system, the query cluster system using a distributed retrieval mode and the index shard files to retrieve the power distribution and consumption data stored in the data storage nodes and provide retrieval results.
Optionally, one index master node and multiple index nodes are established, and each index node is assigned its corresponding index shard files, wherein an index shard file is the index file corresponding to a data table that stores power distribution and consumption data. The index master node receives index tasks for the power distribution and consumption data and sends them to the index nodes through a message mechanism. Each index node processes its corresponding index shard files based on the index task, wherein the index tasks include adding an index, updating an index, and deleting an index.
Optionally, if the index task is an incremental add-index task, then when power distribution and consumption data is stored in a data storage node, the incremental add-index task is sent to the index master node through a system message, wherein the data carried by the task includes the data table name and the content of the power distribution and consumption data. The index master node determines the index shard file corresponding to this data according to the data table name, generates an index-build message based on the data table name, the content, and the index shard file information, and puts the message into a distributed index message queue. An index node takes the index-build message from the index message queue and determines, based on the data table name, whether the message is to be handled by this index node. If so, it builds an index for the data based on the message and updates the index shard file corresponding to the data table name. If not, it determines from the data table name which index node should handle the message and sends the message to that index node for processing. If the index node determined to handle the index-build message has failed, the index master node reassigns an index node for the message and sends the message to that index node for processing.
Optionally, if the index task is a batch add-index task, the index master node invokes MapReduce to divide the received batch add-index task into multiple data sets and distributes the data sets to multiple index nodes, wherein each record in the data sets includes the data table name and the content of the power distribution and consumption data. The index nodes each build indexes from the data sets they receive, merge the indexes built, and update the index shard files corresponding to the data table names.
Optionally, a query master node, query nodes, and a query client are established. The query master node receives the index shard files sent by the index master node and distributes them to the query nodes according to the load of each query node. On receiving a user query request sent by the query client, the query master node determines the index shard files corresponding to the queried power distribution and consumption data and, based on these index shard files, determines at least one query node. It then generates, according to the load status of the at least one query node, a list of query nodes that will provide the query service and sends the list to the query client. The query client sends the query request to the query nodes in the list, and each of these query nodes executes the query request against the index shard files assigned to it and stored locally. The query client merges the query results returned by the query nodes in the list and provides the merged result.
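The query flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `pick_nodes` and `scatter_gather`, the idea that each shard is held by several query nodes, the numeric load values, and the score-based merge order are all hypothetical and do not appear in the source.

```python
def pick_nodes(shard_holders, loads, shards_needed):
    """Query master: for each required index shard, choose the least-loaded
    query node that holds it (ties broken by node name). Hypothetical helper."""
    return {
        shard: min(shard_holders[shard], key=lambda node: (loads[node], node))
        for shard in shards_needed
    }


def scatter_gather(node_list, local_search, query):
    """Query client: send the query to every node in the list and merge the
    partial results. Merging by descending score is an assumption."""
    hits = []
    for shard, node in node_list.items():
        hits.extend(local_search(node, shard, query))
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

Here `local_search` stands in for whatever a query node does against its locally stored index shard file; it is expected to return `(document_id, score)` pairs.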
Optionally, the types of messages sent between the index master node and the index nodes through the distributed index message queue include update-index, delete-index, and mode-switch messages. The index cluster system processes the index shard files, wherein the processing includes adding, updating, deleting, and merging. The index cluster system notifies the query master node of the processing result for an index shard file through a first notification message queue. The query master node generates a query node message based on the processing result and sends it through a second notification message queue to the query nodes corresponding to that index shard file, so that each such query node updates its locally stored copy of the file. The distributed database includes an HBase database, and the distributed file system includes an HDFS file system.
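As a rough sketch of the two-stage notification described above, the following Python uses in-process `queue.Queue` objects in place of the distributed notification queues. The function names, the message fields, and the version counter used to represent "updating the local copy" are illustrative assumptions only.

```python
from queue import Queue

first_queue = Queue()   # index cluster -> query master (first notification queue)
second_queue = Queue()  # query master -> query nodes (second notification queue)


def index_cluster_notify(shard, operation):
    """Index cluster reports the result of processing one index shard file."""
    first_queue.put({"shard": shard, "op": operation})


def query_master_relay(shard_to_nodes):
    """Query master turns each processing result into one message per query
    node that holds the affected shard."""
    while not first_queue.empty():
        result = first_queue.get()
        for node in shard_to_nodes.get(result["shard"], []):
            second_queue.put({"node": node, **result})


def query_nodes_refresh(local_versions):
    """Query nodes refresh their local shard copies; a version counter stands
    in for re-fetching the file (an assumption for this sketch)."""
    while not second_queue.empty():
        msg = second_queue.get()
        key = (msg["node"], msg["shard"])
        local_versions[key] = local_versions.get(key, 0) + 1
    return local_versions
```

Keeping the two queues separate mirrors the text: the index cluster never needs to know which query nodes hold a shard, because only the query master tracks that mapping.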
According to another aspect of the present invention, a power distribution and consumption data retrieval system is provided, including: data storage nodes for storing power distribution and consumption data, wherein a distributed database or file system is divided into multiple data storage nodes based on a preset storage architecture for power distribution and consumption data, the data is stored in the corresponding data storage nodes, and the data includes structured and unstructured power distribution and consumption data; an index cluster system for building and managing, through a distributed index-building mode, the index shard files corresponding to the data storage nodes and the power distribution and consumption data; and a query cluster system for retrieving, through a distributed retrieval mode and based on the index shard files, the power distribution and consumption data stored in the data storage nodes and providing retrieval results.
Optionally, the index cluster system includes one index master node and multiple index nodes. The index master node is configured to assign each index node its corresponding index shard files, wherein an index shard file is the index file corresponding to a data table that stores power distribution and consumption data, and to receive index tasks for the power distribution and consumption data and send them to the index nodes through a message mechanism. Each index node is configured to process its corresponding index shard files based on the index task, wherein the index tasks include adding an index, updating an index, and deleting an index.
Optionally, the data storage node is configured to, if the index task is an incremental add-index task, send the incremental add-index task to the index master node through a system message when power distribution and consumption data is stored in the data storage node, wherein the data carried by the task includes the data table name and the content of the power distribution and consumption data. The index master node is configured to determine the index shard file corresponding to this data according to the data table name, generate an index-build message based on the data table name, the content, and the index shard file information, and put the message into a distributed index message queue. The index node is configured to take the index-build message from the index message queue and determine, based on the data table name, whether the message is to be handled by this index node; if so, to build an index for the data based on the message and update the index shard file corresponding to the data table name; if not, to determine from the data table name which index node should handle the message and send the message to that index node for processing. The index master node is further configured to, if the index node determined to handle the index-build message has failed, reassign an index node for the message and send the message to that index node for processing.
Optionally, the index master node is configured to, if the index task is a batch add-index task, invoke MapReduce to divide the received batch add-index task into multiple data sets and distribute the data sets to multiple index nodes, wherein each record in the data sets includes the data table name and the content of the power distribution and consumption data. The index nodes are configured to each build indexes from the data sets they receive, merge the indexes built, and update the index shard files corresponding to the data table names.
Optionally, the query cluster system includes a query master node, query nodes, and a query client. The query master node is configured to receive the index shard files sent by the index master node and distribute them to the query nodes according to the load of each query node; on receiving a user query request sent by the query client, to determine the index shard files corresponding to the queried power distribution and consumption data and, based on these index shard files, determine at least one query node; and to generate, according to the load status of the at least one query node, a list of query nodes that will provide the query service and send the list to the query client. The query client is configured to send the query request to the query nodes in the list, wherein each of these query nodes executes the query request against the index shard files assigned to it and stored locally, and to merge the query results returned by the query nodes in the list and provide the merged result.
Optionally, the types of messages sent between the index master node and the index nodes through the distributed index message queue include update-index, delete-index, and mode-switch messages. The index cluster system processes the index shard files, wherein the processing includes adding, updating, deleting, and merging. The index cluster system notifies the query master node of the processing result for an index shard file through a first notification message queue. The query master node generates a query node message based on the processing result and sends it through a second notification message queue to the query nodes corresponding to that index shard file, so that each such query node updates its locally stored copy of the file. The distributed database includes an HBase database, and the distributed file system includes an HDFS file system.
In the power distribution and consumption data retrieval method and system of the present invention, a distributed database or file system is divided into multiple data storage nodes, and the power distribution and consumption data is stored in the corresponding data storage nodes; the index cluster system, through a distributed index-building mode, builds and manages the index shard files corresponding to the data storage nodes and the power distribution and consumption data; and the query cluster system, through a distributed retrieval mode and based on the index shard files, retrieves the power distribution and consumption data stored in the data storage nodes and provides retrieval results. The invention provides a new retrieval method for structured and unstructured data that addresses the problem of indexing massive data, offers high throughput, high scalability, high concurrency, and high fault tolerance, suits concurrent access to very large data sets, reduces the load on the retrieval master node, and can improve the availability of the query service, speed up its responses, and improve the user's query experience.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of one embodiment of the power distribution and consumption data retrieval method according to the present invention;
Fig. 2 is a schematic flowchart of index building in one embodiment of the power distribution and consumption data retrieval method according to the present invention;
Fig. 3 is a schematic flowchart of data retrieval in one embodiment of the power distribution and consumption data retrieval method according to the present invention;
Fig. 4 is a schematic module diagram of one embodiment of the power distribution and consumption data retrieval system according to the present invention.
Detailed description
Various exemplary embodiments of the present invention are now described in detail with reference to the drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present invention.
It should also be understood that, for ease of description, the sizes of the parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present invention or its application or use.
Techniques, methods, and devices known to a person of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the specification.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
Embodiments of the present invention can be applied to a computer system/server, which can operate with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations suitable for use with the computer system/server include, but are not limited to: smartphones, personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems.
The computer system/server can be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, object programs, components, logic, data structures, and so on, which perform particular tasks or implement particular abstract data types. The computer system/server can be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located in local or remote computing system storage media that include storage devices.
Fig. 1 is a schematic flowchart of one embodiment of the power distribution and consumption data retrieval method according to the present invention. As shown in Fig. 1:
Step 101: based on a preset storage architecture for power distribution and consumption data, divide a distributed database or file system into multiple data storage nodes and store the power distribution and consumption data in the corresponding data storage nodes.
The preset storage architecture can be any of various distributed storage architectures. The distributed database includes an HBase database and the like, and the distributed file system includes an HDFS file system and the like. Power distribution and consumption data includes structured and unstructured power distribution and consumption data. The HDFS file system or HBase database is divided into multiple data storage nodes, and a data storage node can be a single server or a set of servers.
For example, HDFS is a distributed file system developed on the basis of GFS. It is designed to be deployed on inexpensive hardware, offers high throughput, high scalability, high concurrency, and high fault tolerance, and suits applications that need concurrent access to, storage of, and processing of very large data sets. HDFS also provides a simple consistency model and, being highly fault tolerant, can keep data safe in the event of node failure or other system failures.
Step 102: establish an index cluster system in the distributed database or file system. The index cluster system uses a distributed index-building mode to build and manage the index shard files corresponding to the data storage nodes and the power distribution and consumption data.
For example, an index shard file is the index file of a data table of power distribution and consumption data. The power distribution and consumption data is stored in such data tables, and the multiple data tables can be stored on multiple data storage nodes. The indexes in an index shard file are the indexes of the power distribution and consumption data stored in the corresponding data table.
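To make the one-shard-per-data-table idea concrete, here is a small Python sketch that builds an index for a single table. The source does not say what kind of index is used; the inverted index (term to row keys), the whitespace tokenizer, the function name `build_index_shard`, and the sample table are assumptions chosen for illustration.

```python
from collections import defaultdict


def build_index_shard(table_name, rows):
    """Build an inverted index (term -> sorted row keys) for one data table.
    `rows` maps a row key to its text content."""
    postings = defaultdict(set)
    for row_key, text in rows.items():
        for term in text.lower().split():
            postings[term].add(row_key)
    return {
        "table": table_name,
        "postings": {term: sorted(keys) for term, keys in postings.items()},
    }


# Hypothetical table of meter records; names and contents are made up.
shard = build_index_shard("meter_readings", {
    "r1": "feeder A overload",
    "r2": "feeder B normal",
})
```

Because each shard covers exactly one table, shards can be built, stored, and replaced independently of one another.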
Step 103: establish a query cluster system in the distributed database or file system. The query cluster system uses a distributed retrieval mode and the index shard files to retrieve the power distribution and consumption data stored in the data storage nodes and provide retrieval results.
In one embodiment, for unstructured data, the construction methods of objects of different classes are highly repetitive, and effectively encapsulating each single object causes redundant storage; the object-oriented data model does not support relation types, yet complex relationships exist among unstructured objects. The original object-oriented data model can be extended by merging the hypertext data model with it to construct an extended object-oriented data model. A node-link (N-L) model can be used, whose core elements are the node (Node) and the link (Link).
Nodes store unstructured and structured information, and links represent the relationships between pieces of information. Each item of unstructured or structured data is a node in the model; for example, the elements corresponding to a map layer and the information contained in each element are all regarded as nodes, and the correlations between these nodes form links. The storage location of each piece of data is defined in the database, and these storage spaces are the nodes; a database server or a cluster of servers can be a node, and a link is an abstraction of the correspondence between data tables. The entire power distribution and consumption data set is partitioned according to a certain strategy, each data subset (data storage node) is indexed, the index shard files are distributed and stored on different hosts, and finally a query service is provided externally.
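A minimal Python sketch of the node-link (N-L) model might look as follows. The class names, the `kind` and `relation` fields, and the undirected neighbour lookup are illustrative assumptions; the source only states that nodes hold structured or unstructured information and that links represent relationships between them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    kind: str              # "structured" or "unstructured"
    payload: object = None


@dataclass
class NodeLinkModel:
    nodes: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (source_id, target_id, relation)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_link(self, source_id, target_id, relation):
        # A link may only connect nodes that already exist in the model.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must be existing nodes")
        self.links.append((source_id, target_id, relation))

    def neighbours(self, node_id):
        """Return ids of nodes linked to node_id, in either direction."""
        out = []
        for source, target, _ in self.links:
            if source == node_id:
                out.append(target)
            elif target == node_id:
                out.append(source)
        return out
```

For instance, a structured transformer record and an unstructured photo of the same transformer would be two nodes joined by one link, so a query that reaches either one can follow the link to the other.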
Fig. 2 is a schematic flowchart of index building in one embodiment of the power distribution and consumption data retrieval method according to the present invention. As shown in Fig. 2:
Step 201: establish one index master node and multiple index nodes.
Step 202: assign each index node its corresponding index shard files. An index shard file is the index file corresponding to a data table that stores power distribution and consumption data, and the correspondence between index shard files and index nodes is set in advance.
Step 203: the index master node receives index tasks for the power distribution and consumption data and sends them to the index nodes through a message mechanism.
Step 204: each index node processes its corresponding index shard files based on the index task, wherein the index tasks include adding an index, updating an index, and deleting an index.
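The three task types in step 204 can be illustrated with a short Python sketch of how an index node might apply a task to one shard. The enum `IndexTask`, the function `apply_task`, and the in-memory shard layout (term mapped to a set of row keys) are hypothetical and chosen only to show the semantics of add, update, and delete.

```python
from enum import Enum


class IndexTask(Enum):
    ADD = "add"
    UPDATE = "update"
    DELETE = "delete"


def apply_task(shard, task, row_key, terms=()):
    """Apply one index task to a shard held as {term: set(row_keys)}."""
    if task in (IndexTask.UPDATE, IndexTask.DELETE):
        # Remove every existing posting for this row.
        for keys in shard.values():
            keys.discard(row_key)
    if task in (IndexTask.ADD, IndexTask.UPDATE):
        for term in terms:
            shard.setdefault(term, set()).add(row_key)
    # Drop terms that no longer point at any row.
    for term in [t for t, keys in shard.items() if not keys]:
        del shard[term]
    return shard
```

Treating an update as "delete old postings, then add new ones" keeps the shard consistent even when the new content shares no terms with the old content.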
If the index task is an incremental create-index task, then whenever power distribution and consumption data is stored in a data storage node, an incremental create-index task is sent to the index master node as a system message. The data carried by the task includes the data table name and the content of the power distribution and consumption data. The data table name is the name of the table on the data storage node in which the data is stored.
The index master node uses the data table name in the incremental create-index task to determine the index shard file corresponding to the data, generates an index-build message from the data table name, the content, and the index shard file information, and places the message in the distributed index message queue.
An index node takes an index-build message from the index message queue and uses the data table name to determine whether the message is to be handled by that node. If so, it builds the index for the data according to the message and updates the index shard file corresponding to the data table name. If not, it determines from the data table name which index node should handle the message and forwards the message to that node.
If the index node designated to handle an index-build message is found to have failed, the index master node assigns a new index node to the message and sends the message to that node for processing.
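The incremental flow described above can be sketched in a few lines of Python. This is a minimal single-process illustration, not the patented implementation: the class names `IndexMaster` and `IndexNode`, the function `process_next`, the table and shard names, and the rule of picking the first healthy node as a replacement are all assumptions made for the example, and a `deque` stands in for the distributed index message queue.

```python
from collections import deque


class IndexMaster:
    """Hypothetical index master: maps table names to shards and shards to nodes."""

    def __init__(self, shard_of_table, node_of_shard):
        self.shard_of_table = dict(shard_of_table)  # table name -> index shard file
        self.node_of_shard = dict(node_of_shard)    # index shard file -> index node id
        self.queue = deque()                        # stand-in for the distributed queue

    def submit(self, table, content):
        """Turn an incremental create-index task into an index-build message."""
        shard = self.shard_of_table[table]
        self.queue.append({"table": table, "content": content, "shard": shard})

    def owner(self, message):
        return self.node_of_shard[message["shard"]]

    def reassign(self, message, healthy_nodes):
        """Assumed policy: hand the shard to the first healthy node by id."""
        new_owner = sorted(healthy_nodes)[0]
        self.node_of_shard[message["shard"]] = new_owner
        return new_owner


class IndexNode:
    def __init__(self, node_id, alive=True):
        self.node_id = node_id
        self.alive = alive
        self.shards = {}  # shard file -> list of indexed (table, content) entries

    def build(self, message):
        entries = self.shards.setdefault(message["shard"], [])
        entries.append((message["table"], message["content"]))


def process_next(master, nodes, receiver_id):
    """The receiving node handles its own message, forwards it, or triggers failover."""
    message = master.queue.popleft()
    owner_id = master.owner(message)
    action = "handled" if owner_id == receiver_id else "forwarded"
    if not nodes[owner_id].alive:
        healthy = [nid for nid, node in nodes.items() if node.alive]
        owner_id = master.reassign(message, healthy)
        action = "reassigned"
    nodes[owner_id].build(message)
    return action, owner_id
```

In this sketch the ownership check is done by shard, which the master derives from the table name, so it mirrors the table-name-based routing in the text while keeping the lookup tables explicit.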
If the index task is a batch create-index task, the index master node invokes MapReduce to divide the received task into multiple data sets and distributes them to multiple index nodes. Each record in a data set includes the data table name and the content of the power distribution and consumption data. Each index node builds an index from the data set it receives; the resulting indexes are then merged, and the index shard file corresponding to each data table name is updated.
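The batch path follows a split, build, and merge pattern. The sketch below imitates that pattern in plain Python rather than using Hadoop MapReduce itself; the function names, the round-robin split, the `(table, row_id, content)` record layout, and the whitespace tokenizer are assumptions chosen for illustration.

```python
from collections import defaultdict


def split_batch(records, n_parts):
    """Divide a batch create-index task into n_parts data sets (round-robin)."""
    return [records[i::n_parts] for i in range(n_parts)]


def build_partial_index(records):
    """Map step: one index node builds an inverted index over its data set."""
    index = defaultdict(lambda: defaultdict(set))
    for table, row_id, content in records:
        for term in content.lower().split():
            index[table][term].add(row_id)
    return index


def merge_indexes(partials):
    """Reduce step: merge partial indexes, grouped by data table name."""
    merged = defaultdict(lambda: defaultdict(set))
    for partial in partials:
        for table, terms in partial.items():
            for term, rows in terms.items():
                merged[table][term] |= rows
    return merged
```

Grouping the merged index by table name reflects the text's one-shard-file-per-table correspondence; a real deployment would write each group to its shard file instead of keeping it in memory.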
In one embodiment, the index cluster system is responsible for building indexes in a distributed manner. It has a master-slave structure: index tasks are decomposed and assigned to the index nodes so that the nodes build indexes in parallel, which improves the capacity to process massive data. The index master node is the top level of a tree structure. A lookup first locates the top-level node through the index and then searches the lower-level indexes, which avoids the drawback of conventional one-by-one lookup and improves data-processing capacity. The index cluster system raises the throughput of the whole system, reduces the load on the index master node, and keeps the index master node from becoming a system bottleneck.
The index cluster supports batch and incremental index task modes, with incremental mode as the system default. Each time the system stores a data record, it sends the index master node a system message carrying an incremental index task. According to the attributes and content of the data in the message, the index master node invokes the index partitioning strategy to determine which index shard file the data belongs to, and stores the message in the distributed index message queue.
The index nodes obtain messages from the message queue under mutual exclusion. If a message belongs to the index node that obtained it, the node processes it immediately; otherwise the node forwards the message to the corresponding node, which processes it. If the corresponding index node has failed, the index master node receives a notice that the message cannot be processed and assigns a new index node to the message.
In one embodiment, the index shard files are stored in a distributed file system. When a new index shard file appears in the distributed file system, or an existing index shard file is updated, the query master node receives a corresponding notification. The query master node allocates new index shard files according to the load of each query node.
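Load-based allocation of a new shard file could look like the following sketch. The text does not define how load is measured, so this example assumes load is simply the number of shards a query node already serves; `assign_shard` and the node and shard names are hypothetical.

```python
def assign_shard(shard, loads, assignments):
    """Give a new index shard file to the least-loaded query node.

    loads:       query node id -> current load (assumed: number of shards served)
    assignments: query node id -> list of shard files assigned so far
    Ties are broken by node id so the choice is deterministic.
    """
    node = min(loads, key=lambda n: (loads[n], n))
    assignments.setdefault(node, []).append(shard)
    loads[node] += 1
    return node
```

A production scheduler would more likely weigh CPU, memory, or query rate, but the selection step would keep the same shape.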
HDFS can be used to share index files between the index cluster and the query cluster. It helps ensure the data consistency and safety of the index files and gives the whole system a highly robust index file storage service, so that the query cluster can continue to provide a stable query service while the index cluster is building indexes.
Fig. 3 is a flowchart of data retrieval in one embodiment of the power distribution and consumption data retrieval method of the present invention. As shown in Fig. 3:
Step 301: establish a query master node, query nodes, and a query client.
Step 302: the query master node receives the index shard files sent by the index master node and distributes them to the query nodes according to the load of each query node.
Step 303: on receiving a user query request sent by the query client, the query master node determines the index shard files corresponding to the queried power distribution and consumption data and, based on those index shard files, determines at least one query node.
Step 304: a list of query nodes that will provide the query service is generated according to the load state of the at least one query node and sent to the query client.
Step 305: the query client sends the query request to the query nodes in the query node list, and each of these query nodes executes the query request against the index shard files that have been assigned to it and stored locally.
Step 306: the query client merges the query results returned by the query nodes in the query node list and provides the merged result.
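Steps 303 to 306 amount to a select, fan-out, and merge sequence, sketched below. The `QueryNode` class, the numeric `load` field, the per-shard term dictionaries, and the choice to merge by de-duplicating and sorting row ids are assumptions for illustration; the source does not specify how results are merged.

```python
class QueryNode:
    """Hypothetical query node holding local copies of index shard files."""

    def __init__(self, node_id, load, shards):
        self.node_id = node_id
        self.load = load
        self.shards = shards  # shard name -> {term: [row ids]}

    def search(self, shard, term):
        return self.shards.get(shard, {}).get(term, [])


def select_nodes(nodes, shards_needed):
    """Query master: for each required shard, pick the least-loaded node holding it."""
    plan = []
    for shard in shards_needed:
        holders = [n for n in nodes if shard in n.shards]
        if holders:
            plan.append((shard, min(holders, key=lambda n: (n.load, n.node_id))))
    return plan


def client_query(plan, term):
    """Query client: send the request to each listed node and merge the results."""
    hits = set()
    for shard, node in plan:
        hits.update(node.search(shard, term))
    return sorted(hits)
```

The plan returned by `select_nodes` plays the role of the query node list that the master returns to the client in step 304.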
In one embodiment, the query cluster system consists of three parts: the query master node, the query nodes, and the query client. The query cluster system also uses a master-slave structure so that index shard files can be deployed quickly to each query node, which improves the availability of the query service, increases its response speed, and improves the user's query experience.
The query master node tracks the load information of every child node in the cluster. When responding to a user query request, the query master node selects a list of query nodes according to the load of each node and returns it to the client, and the query client queries according to that query node list. The query nodes mainly provide the query service to the query client. A user issues a query through the query client, which obtains the results returned by each query node and finally merges them. Multiple query nodes perform the query in a distributed manner, and the results of the lower-level query nodes are merged to find the data the user requested.
The types of message sent between the index master node and the index nodes through the distributed index message queue include update-index, delete-index, and mode-switch messages. The index cluster system processes index shard files; the processing includes creating, updating, deleting, and merging. Through a first notification message queue, the index cluster system notifies the query master node of the result of processing an index shard file. The query master node generates query node messages from the processing result and, through a second notification message queue, sends them to the query nodes corresponding to that index shard file, so that those query nodes update their locally stored copies of the index shard file.
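The two-queue notification chain can be illustrated as follows. This is a sketch under stated assumptions: both queues are in-process `deque` objects, `holders` is a hypothetical map from shard file to the query nodes that store it, and a local copy is modelled as a version counter that is bumped on every non-delete operation.

```python
from collections import deque


def publish_result(first_queue, shard, operation):
    """Index cluster: report the result of processing an index shard file."""
    first_queue.append({"shard": shard, "operation": operation})


def dispatch_to_query_nodes(first_queue, second_queue, holders):
    """Query master: expand each result into one message per node holding the shard."""
    while first_queue:
        result = first_queue.popleft()
        for node_id in holders.get(result["shard"], []):
            second_queue.append({"node": node_id, **result})


def apply_updates(second_queue, local_copies):
    """Query nodes: refresh or drop the locally stored copy of the shard file."""
    while second_queue:
        msg = second_queue.popleft()
        shards = local_copies.setdefault(msg["node"], {})
        if msg["operation"] == "delete":
            shards.pop(msg["shard"], None)
        else:
            # Assumed representation: bump a local version number on each refresh.
            shards[msg["shard"]] = shards.get(msg["shard"], 0) + 1
```

Separating the two queues keeps the index cluster unaware of which query nodes hold a shard; only the query master needs that mapping.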
In one embodiment, index cluster system messages are used for communication between the index master node and the index nodes. Message types include create-index, update-index, delete-index, and mode-switch messages. In incremental index mode, the index master node determines the index shard corresponding to the data according to the index partitioning strategy and the shard distribution information, records the shard's information in the message, and then stores the message in the message queue. Update and delete operations follow the same process as the message generated for a new index. When the index cluster switches from incremental to batch mode, the index master node adds a mode-switch message to the message queue. When an index node obtains this message, it notifies the relevant nodes to suspend their current index tasks and enter batch index mode.
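A node-side handler for these message types might be organized as below. The message field names, the `Mode` enum, and in particular the decision to defer incremental messages that arrive while the node is in batch mode are assumptions made for this sketch; the text only states that current index tasks are suspended on a mode switch.

```python
from enum import Enum


class Mode(Enum):
    INCREMENTAL = "incremental"
    BATCH = "batch"


class IndexNodeState:
    """Hypothetical per-node state machine driven by index cluster messages."""

    def __init__(self):
        self.mode = Mode.INCREMENTAL  # incremental mode is the system default
        self.applied = []             # (message type, shard) pairs processed so far
        self.deferred = []            # messages held back while in batch mode

    def handle(self, message):
        kind = message["type"]
        if kind == "switch_mode":
            self.mode = Mode(message["mode"])
        elif self.mode is Mode.BATCH:
            # Assumption: incremental work is suspended, not dropped.
            self.deferred.append(message)
        else:
            self.applied.append((kind, message["shard"]))
```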
When the index cluster system creates, updates, deletes, or merges index files in the distributed file system, it notifies the relevant query nodes in the query cluster to update their local index files. The index node inserts a message into the message queue of the query master node, and the query master node processes it. Message types include: check index, deploy index, add index, reinitialize index, reload index, reload index shard, delete index, and delete index shard.
Message communication in the query cluster system is mainly triggered by the message communication between the index cluster system and the query cluster system. When the query master node obtains a new message, it parses the message, generates multiple query node messages, and passes them to the query nodes. Each query node receives its message and carries out the corresponding task. Message types between the query nodes and the master node include deploy index shard, load index shard, update index shard, and delete index shard.
In one embodiment, as shown in Fig. 4, the present invention provides a power distribution and consumption data retrieval system comprising data storage nodes 41, an index cluster system 42, and a query cluster system 43. The data storage nodes 41 store the power distribution and consumption data. Multiple data storage nodes 41 are partitioned in a distributed database or file system according to a preset storage architecture for the data, and the data is stored in the corresponding data storage nodes 41.
The index cluster system 42 uses a distributed index-building mode to build and manage the index shard files corresponding to the data storage nodes and the power distribution and consumption data. The query cluster system 43 uses a distributed search mode and the index shard files to retrieve the power distribution and consumption data stored in the data storage nodes and provides the retrieval results.
In one embodiment, the index cluster system 42 includes one index master node 421 and multiple index nodes 422. The index master node 421 assigns each index node its corresponding index shard files, where an index shard file is the index shard file corresponding to a data table that stores power distribution and consumption data. The index master node 421 receives index tasks for the power distribution and consumption data and sends them to the index nodes through a message mechanism. Each index node 422 processes its corresponding index shard files according to the index task; index tasks include create-index, update-index, and delete-index tasks.
If the index task is an incremental create-index task, then whenever power distribution and consumption data is stored in a data storage node, the data storage node 41 sends an incremental create-index task to the index master node 421 as a system message. The data carried by the task includes the data table name and the content of the power distribution and consumption data. The index master node 421 uses the data table name to determine the index shard file corresponding to the data, generates an index-build message from the data table name, the content, and the index shard file information, and places the message in the distributed index message queue.
An index node 422 takes an index-build message from the index message queue and uses the data table name to determine whether the message is to be handled by that node. If so, the index node 422 builds the index for the data according to the message and updates the index shard file corresponding to the data table name. If not, the index node 422 determines from the data table name which index node should handle the message and forwards the message to that node. If the index master node 421 finds that the index node designated to handle an index-build message has failed, it assigns a new index node to the message and sends the message to that node for processing.
If the index task is a batch create-index task, the index master node 421 invokes MapReduce to divide the received task into multiple data sets and distributes them to multiple index nodes 422. Each record in a data set includes the data table name and the content of the power distribution and consumption data. Each index node 422 builds an index from the data set it receives; the resulting indexes are then merged, and the index shard file corresponding to each data table name is updated.
In one embodiment, the query cluster system 43 includes a query master node 431, query nodes 432, and a query client 433. The query master node 431 receives the index shard files sent by the index master node and distributes them to the query nodes according to the load of each query node. On receiving a user query request sent by the query client, the query master node 431 determines the index shard files corresponding to the queried power distribution and consumption data and, based on those index shard files, determines at least one query node. The query master node 431 generates a list of query nodes that will provide the query service according to the load state of the at least one query node and sends it to the query client. The query client 433 sends the query request to the query nodes in the query node list, and each of these query nodes 432 executes the query request against the index shard files that have been assigned to it and stored locally. The query client 433 merges the query results returned by the query nodes 432 in the query node list and provides the merged result.
In the power distribution and consumption data retrieval method and system of the above embodiments, multiple data storage nodes are partitioned in a distributed database or file system, and the power distribution and consumption data is stored in the corresponding data storage nodes. The index cluster system uses a distributed index-building mode to build and manage the index shard files corresponding to the data storage nodes and the data. The query cluster system uses a distributed search mode and the index shard files to retrieve the data stored in the data storage nodes and provides the retrieval results. This provides a new retrieval method for structured and unstructured data that addresses the problem of indexing massive data. It is characterized by high throughput, high scalability, high concurrency, and high fault tolerance, and is suited to concurrent access to very large data sets. It reduces the load on the retrieval master node, improves the availability of the query service, increases the response speed of the service, and improves the user's query experience.
The method and system of the present invention may be implemented in many ways, for example in software, hardware, firmware, or any combination of software, hardware, and firmware. The order of the method steps given above is for illustration only; unless specifically stated otherwise, the steps of the method of the present invention are not limited to the order described above. In addition, in some embodiments the present invention may be embodied as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the method according to the present invention. The present invention therefore also covers a recording medium storing a program for executing the method according to the present invention.
The description of the present invention is given for the sake of example and explanation and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be obvious to those of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and design various embodiments, with various modifications, suited to particular uses.

Claims (12)

1. A power distribution and consumption data retrieval method, characterized by comprising:
partitioning multiple data storage nodes in a distributed database or file system based on a preset storage architecture for power distribution and consumption data, and storing the power distribution and consumption data in the corresponding data storage nodes; wherein the power distribution and consumption data includes structured power distribution and consumption data and unstructured power distribution and consumption data;
establishing an index cluster system in the distributed database or file system, the index cluster system using a distributed index-building mode to build and manage index shard files corresponding to the data storage nodes and the power distribution and consumption data;
establishing a query cluster system in the distributed database or file system, the query cluster system using a distributed search mode and the index shard files to retrieve the power distribution and consumption data stored in the data storage nodes and providing retrieval results.
2. The method according to claim 1, characterized by further comprising:
establishing one index master node and multiple index nodes, and assigning each index node its corresponding index shard files, wherein an index shard file is the index shard file corresponding to a data table that stores power distribution and consumption data;
the index master node receiving index tasks for the power distribution and consumption data and sending the index tasks to the index nodes through a message mechanism;
the index nodes processing their corresponding index shard files based on the index tasks; wherein the index tasks include create-index, update-index, and delete-index tasks.
3. The method according to claim 2, characterized by further comprising:
if the index task is an incremental create-index task, then when the power distribution and consumption data is stored in the data storage node, sending an incremental create-index task to the index master node as a system message; wherein the data carried by the incremental create-index task includes the data table name and the content of the power distribution and consumption data;
the index master node determining, from the data table name of the power distribution and consumption data, the index shard file corresponding to the data, generating an index-build message based on the data table name, the content, and the index shard file information, and placing the message in a distributed index message queue;
the index node obtaining the index-build message from the index message queue and determining, based on the data table name of the power distribution and consumption data, whether the message is to be handled by that index node; if so, building the index for the power distribution and consumption data based on the index-build message and updating the index shard file corresponding to the data table name; if not, determining, based on the data table name, the index node that handles the index-build message and sending the index-build message to that index node for processing;
if the index node that handles the index-build message is determined to have failed, the index master node assigning a new index node to the index-build message and sending the index-build message to that index node for processing.
4. The method according to claim 3, characterized by further comprising:
if the index task is a batch create-index task, the index master node invoking MapReduce to divide the received batch create-index task into multiple data sets and distributing the multiple data sets to multiple index nodes; wherein each record in the multiple data sets includes the data table name and the content of the power distribution and consumption data;
the multiple index nodes each building an index based on the data set received, merging the indexes built, and updating the index shard file corresponding to the data table name of the power distribution and consumption data.
5. The method according to claim 4, characterized by further comprising:
establishing a query master node, query nodes, and a query client;
the query master node receiving the index shard files sent by the index master node and distributing the index shard files to the query nodes according to the load of each query node;
on receiving a user query request sent by the query client, the query master node determining the index shard files corresponding to the queried power distribution and consumption data and determining at least one query node based on those index shard files;
generating, according to the load state of the at least one query node, a list of query nodes that provide the query service and sending it to the query client;
the query client sending the query request to the query nodes in the query node list, and the query nodes in the query node list executing the query request based on the index shard files assigned to them and stored locally;
the query client merging the query results returned by the query nodes in the query node list and providing the merged query result.
6. The method according to claim 5, characterized in that:
the types of message sent between the index master node and the index nodes through the distributed index message queue include update-index, delete-index, and mode-switch messages;
the index cluster system processes the index shard files; wherein the processing includes creating, updating, deleting, and merging; the index cluster system notifies the query master node of the result of processing the index shard files through a first notification message queue;
the query master node generates query node messages based on the processing result and sends the query node messages through a second notification message queue to the query nodes corresponding to the index shard file, so that those query nodes update the locally stored index shard file;
wherein the distributed database includes an HBase database, and the distributed file system includes the HDFS file system.
7. A power distribution and consumption data retrieval system, characterized by comprising:
data storage nodes for storing power distribution and consumption data;
wherein multiple data storage nodes are partitioned in a distributed database or file system based on a preset storage architecture for power distribution and consumption data, and the power distribution and consumption data is stored in the corresponding data storage nodes; wherein the power distribution and consumption data includes structured power distribution and consumption data and unstructured power distribution and consumption data;
an index cluster system for building and managing, through a distributed index-building mode, index shard files corresponding to the data storage nodes and the power distribution and consumption data;
a query cluster system for retrieving, through a distributed search mode and based on the index shard files, the power distribution and consumption data stored in the data storage nodes and providing retrieval results.
8. The power distribution and consumption data retrieval system according to claim 7, characterized in that:
the index cluster system includes one index master node and multiple index nodes;
the index master node is configured to assign each index node its corresponding index shard files, wherein an index shard file is the index shard file corresponding to a data table that stores power distribution and consumption data, and to receive index tasks for the power distribution and consumption data and send the index tasks to the index nodes through a message mechanism;
the index nodes are configured to process their corresponding index shard files based on the index tasks; wherein the index tasks include create-index, update-index, and delete-index tasks.
9. The power distribution and consumption data retrieval system according to claim 8, characterized in that:
the data storage node is configured to, if the index task is an incremental create-index task, send an incremental create-index task to the index master node as a system message when the power distribution and consumption data is stored in the data storage node; wherein the data carried by the incremental create-index task includes the data table name and the content of the power distribution and consumption data;
the index master node is configured to determine, from the data table name of the power distribution and consumption data, the index shard file corresponding to the data, generate an index-build message based on the data table name, the content, and the index shard file information, and place the message in a distributed index message queue;
the index node is configured to obtain the index-build message from the index message queue and determine, based on the data table name of the power distribution and consumption data, whether the message is to be handled by that index node; if so, to build the index for the power distribution and consumption data based on the index-build message and update the index shard file corresponding to the data table name; if not, to determine, based on the data table name, the index node that handles the index-build message and send the index-build message to that index node for processing;
the index master node is configured to, if the index node that handles the index-build message is determined to have failed, assign a new index node to the index-build message and send the index-build message to that index node for processing.
10. The power distribution and consumption data retrieval system according to claim 9, characterized in that:
the index master node is configured to, if the index task is a batch create-index task, invoke MapReduce to divide the received batch create-index task into multiple data sets and distribute the multiple data sets to multiple index nodes; wherein each record in the multiple data sets includes the data table name and the content of the power distribution and consumption data;
the multiple index nodes each build an index based on the data set received, merge the indexes built, and update the index shard file corresponding to the data table name of the power distribution and consumption data.
11. The power distribution and consumption data retrieval system according to claim 10, characterized in that:
the query cluster system includes a query master node, query nodes, and a query client;
the query master node is configured to receive the index shard files sent by the index master node and distribute the index shard files to the query nodes according to the load of each query node; on receiving a user query request sent by the query client, to determine the index shard files corresponding to the queried power distribution and consumption data and determine at least one query node based on those index shard files; and to generate, according to the load state of the at least one query node, a list of query nodes that provide the query service and send it to the query client;
the query client is configured to send the query request to the query nodes in the query node list, wherein the query nodes in the query node list execute the query request based on the index shard files assigned to them and stored locally; and to merge the query results returned by the query nodes in the query node list and provide the merged query result.
12. The power distribution and consumption data retrieval system according to claim 11, characterized in that:
the types of message sent between the index master node and the index nodes through the distributed index message queue include update-index, delete-index, and mode-switch messages;
the index cluster system processes the index shard files; wherein the processing includes creating, updating, deleting, and merging; the index cluster system notifies the query master node of the result of processing the index shard files through a first notification message queue;
the query master node generates query node messages based on the processing result and sends the query node messages through a second notification message queue to the query nodes corresponding to the index shard file, so that those query nodes update the locally stored index shard file;
wherein the distributed database includes an HBase database, and the distributed file system includes the HDFS file system.
CN201711434002.1A 2017-12-26 2017-12-26 With electricity consumption data retrieval method and system Pending CN108460072A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711434002.1A CN108460072A (en) 2017-12-26 2017-12-26 With electricity consumption data retrieval method and system

Publications (1)

Publication Number Publication Date
CN108460072A (en) 2018-08-28

Family

ID=63220682

Country Status (1)

Country Link
CN (1) CN108460072A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779185A (en) * 2012-06-29 2012-11-14 浙江大学 High-availability distribution type full-text index method
CN103390038A (en) * 2013-07-16 2013-11-13 西安交通大学 HBase-based incremental index creation and retrieval method
CN106599153A (en) * 2016-12-07 2017-04-26 河北中废通网络技术有限公司 Multi-data-source-based waste industry search system and method

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110716933A (en) * 2019-09-29 2020-01-21 浙江大学 Novel urban rail train big data-oriented high-flexibility distributed index method
CN110716933B (en) * 2019-09-29 2022-03-15 浙江大学 Novel urban rail train big data-oriented high-flexibility distributed index method
CN110968762A (en) * 2019-12-05 2020-04-07 北京天融信网络安全技术有限公司 Adjusting method and device for retrieval
CN110968762B (en) * 2019-12-05 2023-07-18 北京天融信网络安全技术有限公司 Adjustment method and device for retrieval
CN111737052A (en) * 2020-06-19 2020-10-02 中国工商银行股份有限公司 Distributed object storage system and method
CN111737052B (en) * 2020-06-19 2023-07-07 中国工商银行股份有限公司 Distributed object storage system and method
CN111949833A (en) * 2020-08-17 2020-11-17 北京字节跳动网络技术有限公司 Index construction method, data processing method, device, electronic equipment and medium
CN112231501A (en) * 2020-10-20 2021-01-15 浙江大华技术股份有限公司 Portrait library data storage and retrieval method and device and storage medium
CN113612705A (en) * 2021-08-02 2021-11-05 广西电网有限责任公司 Power grid monitoring system data transmission method based on Hash algorithm fragmentation and recombination
CN113612705B (en) * 2021-08-02 2023-08-22 广西电网有限责任公司 Hash algorithm slicing and recombination-based power grid monitoring system data transmission method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180828
