CN108460072A - Power distribution and consumption data retrieval method and system - Google Patents
Power distribution and consumption data retrieval method and system
- Publication number
- CN108460072A (CN201711434002.1A)
- Authority
- CN
- China
- Prior art keywords
- index
- node
- data
- electricity consumption
- consumption data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
- G06F16/134—Distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/14—Details of searching files based on file metadata
- G06F16/148—File search processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24532—Query optimisation of parallel queries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Library & Information Science (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and system for retrieving power distribution and consumption data. The method includes: dividing a distributed database or file system into multiple data storage nodes and storing the power distribution and consumption data in the corresponding data storage nodes; an index cluster system that, using a distributed index-building mode, builds and manages the index shard files corresponding to the data storage nodes and to the power distribution and consumption data; and a query cluster system that, using a distributed retrieval mode and the index shard files, retrieves the power distribution and consumption data stored in the data storage nodes and provides the retrieval results. The method and system address the problem of indexing massive data; offer high throughput, high scalability, high concurrency and high fault tolerance; suit concurrent access to very large data sets; reduce the load on the retrieval master node; and improve the availability and response speed of the query service and thus the user's query experience.
Description
Technical field
The present invention relates to the field of information retrieval and search technology, and in particular to a method and system for retrieving power distribution and consumption data.
Background art
As construction of the smart distribution and consumption grid advances, the number of acquisition terminals has grown sharply and the acquisition frequency has increased substantially. This creates challenges in integrating multi-source heterogeneous mass data effectively, storing it efficiently and scaling. The distribution and consumption business is gradually becoming more intelligent and lean, which calls for stronger cross-business, cross-platform data analysis and processing capability. This in turn places higher demands on the efficiency of data storage and processing, on the accuracy and timeliness of value mining, and on human-computer interaction and visualization. Power distribution and consumption data includes structured data and unstructured data. Structured data is data in which items of the same class share the same storage format and length, can be divided into fixed basic elements, and can be expressed in tabular form, such as characters and numbers. Unstructured data has no obvious structure and cannot be divided into fixed basic components, such as sound, graphics, images, animation, video and photos. Structured data usually describes the static basic information of spatial data, whereas unstructured data describes its spatio-temporal information. Although the two kinds of data differ greatly in structure, they describe the same spatial information from different angles and form a whole when queried and called.
An index file represents the one-to-one correspondence between logical records and physical records and can be used by services such as querying. As the data volume keeps growing, the index file soon exceeds the storage space of a single server, and both the time needed to build a single huge index file and the time needed to run queries against it keep increasing. A new method for retrieving power distribution and consumption data is therefore needed.
Summary of the invention
In view of this, one technical problem to be solved by the present invention is to provide a method and system for retrieving power distribution and consumption data.
According to one aspect of the present invention, a method for retrieving power distribution and consumption data is provided, including: dividing a distributed database or file system into multiple data storage nodes based on a preset storage architecture for the power distribution and consumption data, and storing the data in the corresponding data storage nodes, where the data includes structured and unstructured power distribution and consumption data; establishing an index cluster system in the distributed database or file system, where the index cluster system, using a distributed index-building mode, builds and manages the index shard files corresponding to the data storage nodes and to the data; and establishing a query cluster system in the distributed database or file system, where the query cluster system, using a distributed retrieval mode and the index shard files, retrieves the data stored in the data storage nodes and provides the retrieval results.
Optionally, an index master node and multiple index nodes are established, and each index node is assigned its corresponding index shard file, where an index shard file is the index shard file corresponding to a data table that stores power distribution and consumption data. The index master node receives index tasks for the data and sends them to the index nodes through a message mechanism. Each index node processes its corresponding index shard file based on the index task. The index tasks include new-index, update-index and delete-index tasks.
Optionally, if the index task is an incremental new-index task, then whenever power distribution and consumption data is stored in a data storage node, an incremental new-index task is sent to the index master node by system message. The data carried by the incremental new-index task includes the data table name and the content of the power distribution and consumption data. The index master node determines the index shard file for the data from the data table name, generates an index-build message from the data table name, the content and the index shard file information, and puts the message into the distributed index message queue. An index node takes the index-build message from the index message queue and uses the data table name to determine whether the message is to be processed by this node. If so, it builds an index for the data based on the message and updates the index shard file corresponding to the data table name. If not, it uses the data table name to determine which index node should process the message and sends the message to that node. If the index node determined to process the message has failed, the index master node reassigns the message to another index node and sends it there for processing.
Optionally, if the index task is a batch new-index task, the index master node invokes MapReduce to divide the received batch new-index task into multiple data sets and distributes the data sets to multiple index nodes. Each record in the data sets includes the data table name and the content of the power distribution and consumption data. The index nodes each build an index from the data set they receive, merge the indexes they have built, and update the index shard file corresponding to the data table name.
Optionally, a query master node, query nodes and a query client are established. The query master node receives the index shard files sent by the index master node and distributes them to the query nodes according to the load of each query node. On receiving a user query request from the query client, the query master node determines the index shard file corresponding to the queried data and, based on that index shard file, determines at least one query node. It then generates, according to the load of those query nodes, a list of query nodes that will serve the query and sends the list to the query client. The query client sends query requests to the query nodes in the list, and each of those query nodes executes the request against the index shard files that have been assigned to it and stored locally. The query client merges the query results returned by the query nodes in the list and provides the merged result.
Optionally, the message types sent between the index master node and the index nodes through the distributed index message queue include update-index, delete-index and mode-switch messages. The index cluster system processes the index shard files, where the processing includes adding, updating, deleting and merging. The index cluster system notifies the query master node of the processing result for an index shard file through a first notification message queue. The query master node generates a query node message from the processing result and sends it through a second notification message queue to the query node corresponding to that index shard file, so that the query node updates its locally stored copy of the index shard file. The distributed database includes an HBase database, and the distributed file system includes the HDFS file system.
According to another aspect of the present invention, a system for retrieving power distribution and consumption data is provided, including: data storage nodes for storing the power distribution and consumption data, where a distributed database or file system is divided into multiple data storage nodes based on a preset storage architecture for the data, the data is stored in the corresponding data storage nodes, and the data includes structured and unstructured power distribution and consumption data; an index cluster system for building and managing, using a distributed index-building mode, the index shard files corresponding to the data storage nodes and to the data; and a query cluster system for retrieving, using a distributed retrieval mode and the index shard files, the data stored in the data storage nodes and providing the retrieval results.
Optionally, the index cluster system includes an index master node and multiple index nodes. The index master node is configured to assign each index node its corresponding index shard file, where an index shard file is the index shard file corresponding to a data table that stores power distribution and consumption data, and to receive index tasks for the data and send them to the index nodes through a message mechanism. Each index node is configured to process its corresponding index shard file based on the index task. The index tasks include new-index, update-index and delete-index tasks.
Optionally, the data storage node is configured, if the index task is an incremental new-index task, to send an incremental new-index task to the index master node by system message whenever power distribution and consumption data is stored in the data storage node. The data carried by the incremental new-index task includes the data table name and the content of the power distribution and consumption data. The index master node is configured to determine the index shard file for the data from the data table name, generate an index-build message from the data table name, the content and the index shard file information, and put the message into the distributed index message queue. The index node is configured to take the index-build message from the index message queue and use the data table name to determine whether the message is to be processed by this node; if so, to build an index for the data based on the message and update the index shard file corresponding to the data table name; if not, to use the data table name to determine which index node should process the message and send the message to that node. The index master node is further configured, if the index node determined to process the message has failed, to reassign the message to another index node and send it there for processing.
Optionally, the index master node is configured, if the index task is a batch new-index task, to invoke MapReduce to divide the received batch new-index task into multiple data sets and distribute the data sets to multiple index nodes. Each record in the data sets includes the data table name and the content of the power distribution and consumption data. The index nodes each build an index from the data set they receive, merge the indexes they have built, and update the index shard file corresponding to the data table name.
Optionally, the query cluster system includes a query master node, query nodes and a query client. The query master node is configured to receive the index shard files sent by the index master node and distribute them to the query nodes according to the load of each query node; on receiving a user query request from the query client, to determine the index shard file corresponding to the queried data and, based on that index shard file, determine at least one query node; and to generate, according to the load of those query nodes, a list of query nodes that will serve the query and send the list to the query client. The query client is configured to send query requests to the query nodes in the list, where each of those query nodes executes the request against the index shard files that have been assigned to it and stored locally; and to merge the query results returned by the query nodes in the list and provide the merged result.
Optionally, the message types sent between the index master node and the index nodes through the distributed index message queue include update-index, delete-index and mode-switch messages. The index cluster system processes the index shard files, where the processing includes adding, updating, deleting and merging. The index cluster system notifies the query master node of the processing result for an index shard file through a first notification message queue. The query master node generates a query node message from the processing result and sends it through a second notification message queue to the query node corresponding to that index shard file, so that the query node updates its locally stored copy of the index shard file. The distributed database includes an HBase database, and the distributed file system includes the HDFS file system.
In the method and system of the present invention, a distributed database or file system is divided into multiple data storage nodes and the power distribution and consumption data is stored in the corresponding data storage nodes; the index cluster system, using a distributed index-building mode, builds and manages the index shard files corresponding to the data storage nodes and to the data; and the query cluster system, using a distributed retrieval mode and the index shard files, retrieves the data stored in the data storage nodes and provides the retrieval results. This provides a new retrieval method for structured and unstructured data that addresses the problem of indexing massive data, offers high throughput, high scalability, high concurrency and high fault tolerance, suits concurrent access to very large data sets, reduces the load on the retrieval master node, and improves the availability and response speed of the query service and thus the user's query experience.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of one embodiment of the power distribution and consumption data retrieval method of the present invention;
Fig. 2 is a flow diagram of index building in one embodiment of the power distribution and consumption data retrieval method of the present invention;
Fig. 3 is a flow diagram of data retrieval in one embodiment of the power distribution and consumption data retrieval method of the present invention;
Fig. 4 is a module diagram of one embodiment of the power distribution and consumption data retrieval system of the present invention.
Detailed description
Various exemplary embodiments of the present invention will now be described in detail with reference to the drawings. Note that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the invention.
It should also be understood that, for ease of description, the parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application or its uses.
Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the specification.
Note that similar reference numerals and letters denote similar items in the following drawings, so once an item has been defined in one drawing it need not be discussed further in subsequent drawings.
Embodiments of the present invention can be applied to computer systems/servers, which can operate with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments and/or configurations suitable for use with a computer system/server include, but are not limited to: smartphones, personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems.
A computer system/server can be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. In general, program modules may include routines, programs, object programs, components, logic, data structures and so on that perform particular tasks or implement particular abstract data types. The computer system/server can be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In such an environment, program modules may be located on local or remote computing system storage media that include storage devices.
Fig. 1 is a flow diagram of one embodiment of the power distribution and consumption data retrieval method of the present invention. As shown in Fig. 1:
Step 101: based on a preset storage architecture for the power distribution and consumption data, divide a distributed database or file system into multiple data storage nodes and store the data in the corresponding data storage nodes.
The preset storage architecture may be any of various distributed storage architectures. The distributed database includes, for example, an HBase database, and the distributed file system includes, for example, the HDFS file system. The power distribution and consumption data includes structured and unstructured power distribution and consumption data. The HDFS file system or HBase database is divided into multiple data storage nodes, each of which may be a single server or a set of servers.
For example, HDFS is a distributed file system developed on the basis of GFS and designed to be deployed on inexpensive hardware. It offers high throughput, high scalability, high concurrency and high fault tolerance, and suits applications that need concurrent access to, storage of and processing of very large data sets. HDFS also provides a simple consistency model and is highly fault tolerant, so data remains safe in the event of node failure or other system failures.
Step 102: establish an index cluster system in the distributed database or file system. Using a distributed index-building mode, the index cluster system builds and manages the index shard files corresponding to the data storage nodes and to the power distribution and consumption data.
For example, an index shard file is the index file of a data table of power distribution and consumption data. The data is stored in such data tables, and the data tables may be stored on multiple data storage nodes. The index in an index shard file is the index of the data stored in the corresponding data table.
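The relationship between a data table and its index shard file can be illustrated with a minimal sketch. The inverted-index layout, the whitespace tokenization, the class name `IndexShard` and the table names are all assumptions made for illustration; the text does not specify the internal format of an index shard file.

```python
from collections import defaultdict


class IndexShard:
    """Hypothetical index shard file for one data table (a tiny inverted index)."""

    def __init__(self, table_name):
        self.table_name = table_name
        self.postings = defaultdict(set)  # term -> set of record ids

    def add(self, record_id, content):
        # Naive whitespace tokenization; an assumption, not taken from the text.
        for term in content.lower().split():
            self.postings[term].add(record_id)

    def search(self, term):
        # .get avoids creating empty entries for unknown terms.
        return sorted(self.postings.get(term.lower(), set()))


# One shard per data table; the tables themselves may sit on different storage nodes.
shards = {name: IndexShard(name) for name in ("meter_readings", "outage_reports")}
shards["meter_readings"].add("r1", "feeder A voltage high")
shards["meter_readings"].add("r2", "feeder B voltage normal")
shards["outage_reports"].add("o1", "feeder A outage")
```

Keeping one shard per table means a lookup only touches the shard of the table being queried, which is the property the later steps rely on.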
Step 103: establish a query cluster system in the distributed database or file system. Using a distributed retrieval mode and the index shard files, the query cluster system retrieves the power distribution and consumption data stored in the data storage nodes and provides the retrieval results.
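The query flow summarized earlier (the query master selects query nodes by load, the client queries those nodes directly and merges their results) can be sketched as follows. This is an in-process illustration only: the class names, the numeric `load` field, the `fan_out` parameter and the dictionary-based shards are hypothetical, and no real network or HBase/HDFS access is involved.

```python
class QueryNode:
    """Hypothetical query node holding local copies of index shards."""

    def __init__(self, name, load):
        self.name = name
        self.load = load      # current load; lower is better
        self.shards = {}      # table name -> {term: [record ids]}

    def execute(self, table, term):
        return self.shards.get(table, {}).get(term, [])


class QueryMaster:
    def __init__(self, nodes):
        self.nodes = nodes

    def nodes_for(self, table):
        # Nodes that hold the shard for this table, least loaded first.
        holders = [n for n in self.nodes if table in n.shards]
        return sorted(holders, key=lambda n: n.load)


def client_query(master, table, term, fan_out=2):
    # The client asks the master for a node list, queries those nodes
    # directly, then merges and de-duplicates the partial results.
    merged = set()
    for node in master.nodes_for(table)[:fan_out]:
        merged.update(node.execute(table, term))
    return sorted(merged)


q1 = QueryNode("q1", load=0.7)
q1.shards["meter_readings"] = {"voltage": ["r1"]}
q2 = QueryNode("q2", load=0.2)
q2.shards["meter_readings"] = {"voltage": ["r2", "r1"]}
q3 = QueryNode("q3", load=0.1)  # holds no shard for this table
master = QueryMaster([q1, q2, q3])
```

Because the client talks to the query nodes itself, the master only hands out a node list, which is consistent with the stated goal of reducing load on the master node.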
In one embodiment, for unstructured data, the construction methods of objects of different classes are highly repetitive, so encapsulating each object individually causes redundant storage; moreover, the object-oriented data model does not support relational types, although complex relationships exist between unstructured objects. The original object-oriented data model can therefore be extended by combining a hypertext data model with it, yielding an extended object-oriented data model. A node-link (N-L) model may be used, whose core elements are nodes (Node) and links (Link).
Nodes store unstructured and structured information, and links represent the relationships between pieces of information. Every piece of unstructured or structured data is a node in the model; for example, the elements of a map layer and the information contained in each element are all treated as nodes, and the relationships between these nodes form links. Storage locations for the data are defined and partitioned in the database; these storage spaces are the nodes, and a node may be a single database server or a cluster of servers. A link is an abstraction of the correspondence between data tables. The entire set of power distribution and consumption data is partitioned according to a certain strategy, an index is built for each data subset (data storage node), and the index shard files are distributed across different hosts, which then provide the query service externally.
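A minimal sketch of the node-link (N-L) idea is given below, assuming a simple in-memory representation. The class names, the `kind` and `relation` fields and the sample identifiers are hypothetical; the text names only the concepts Node and Link.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    kind: str               # "structured" or "unstructured"
    payload: object = None


@dataclass
class Link:
    source: str
    target: str
    relation: str


@dataclass
class NodeLinkModel:
    nodes: dict = field(default_factory=dict)
    links: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def link(self, source, target, relation):
        self.links.append(Link(source, target, relation))

    def neighbours(self, node_id):
        # Follow links outward from one node.
        return [self.nodes[ln.target] for ln in self.links if ln.source == node_id]


model = NodeLinkModel()
model.add_node(Node("transformer-17", "structured", {"rated_kva": 400}))
model.add_node(Node("photo-17", "unstructured", b"\x89PNG"))
model.link("transformer-17", "photo-17", "inspection_photo")
```

Here a structured equipment record and an unstructured photo are separate nodes, and the link records that they describe the same object.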
Fig. 2 is a flow diagram of index building in one embodiment of the power distribution and consumption data retrieval method of the present invention. As shown in Fig. 2:
Step 201: establish an index master node and multiple index nodes.
Step 202: assign each index node its corresponding index shard file, where an index shard file is the index shard file corresponding to a data table that stores power distribution and consumption data. The correspondence between index shard files and index nodes is set in advance.
Step 203: the index master node receives index tasks for the power distribution and consumption data and sends them to the index nodes through a message mechanism.
Step 204: each index node processes its corresponding index shard file based on the index task. The index tasks include new-index, update-index and delete-index tasks.
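The three task types of step 204 can be sketched as a small dispatch function. The shard is modelled here as a plain dictionary from record id to content, and the enum and function names are hypothetical; the sketch only shows how an index node might branch on the task type.

```python
from enum import Enum


class IndexTask(Enum):
    ADD = "add"
    UPDATE = "update"
    DELETE = "delete"


def apply_task(shard, task, record_id, content=None):
    """Apply one index task to a shard modelled as {record_id: content}."""
    if task is IndexTask.ADD:
        # Adding an already indexed record leaves the existing entry untouched.
        shard.setdefault(record_id, content)
    elif task is IndexTask.UPDATE:
        # Updates only apply to records that are already indexed.
        if record_id in shard:
            shard[record_id] = content
    elif task is IndexTask.DELETE:
        shard.pop(record_id, None)
    return shard


shard = {}
apply_task(shard, IndexTask.ADD, "r1", "voltage high")
apply_task(shard, IndexTask.UPDATE, "r1", "voltage normal")
apply_task(shard, IndexTask.UPDATE, "r2", "ignored")
```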
If the index task is an incremental new-index task, then whenever power distribution and consumption data is stored in a data storage node, an incremental new-index task is sent to the index master node by system message. The data carried by the incremental new-index task includes the data table name and the content of the power distribution and consumption data. The data table name is the name of the data table on the data storage node in which the data is stored.
The index master node uses the data table name in the incremental new-index task to determine the index shard file for the data, generates an index-build message from the data table name, the content and the index shard file information, and puts the message into the distributed index message queue.
An index node takes the index-build message from the index message queue and uses the data table name to determine whether the message is to be processed by this node. If so, it builds an index for the data based on the message and updates the index shard file corresponding to the data table name. If not, it uses the data table name to determine which index node should process the message and sends the message to that node.
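The incremental path described above can be sketched in a single process, with a `deque` standing in for the distributed index message queue. The static `SHARD_OWNER` table, the `.idx` file naming and the class names are hypothetical; the text states only that ownership is decided from the data table name.

```python
from collections import deque

# Hypothetical static assignment of index shards (one per table) to index nodes.
SHARD_OWNER = {"meter_readings": "index-1", "outage_reports": "index-2"}


class IndexMaster:
    def __init__(self, queue):
        self.queue = queue

    def submit(self, table, content):
        # Resolve the shard from the table name and enqueue an index-build message.
        message = {"table": table, "content": content, "shard": table + ".idx"}
        self.queue.append(message)


class IndexNode:
    def __init__(self, name, peers):
        self.name = name
        self.peers = peers    # name -> IndexNode, used for forwarding
        self.shards = {}      # shard file name -> list of indexed contents

    def handle(self, message):
        owner = SHARD_OWNER[message["table"]]
        if owner == self.name:
            self.shards.setdefault(message["shard"], []).append(message["content"])
            return self.name
        # Not ours: forward to the node that owns this table's shard.
        return self.peers[owner].handle(message)


queue = deque()
master = IndexMaster(queue)
peers = {}
node1 = IndexNode("index-1", peers)
node2 = IndexNode("index-2", peers)
peers.update({"index-1": node1, "index-2": node2})

master.submit("outage_reports", "feeder A outage")
handled_by = node1.handle(queue.popleft())
```

In this run the message is picked up by `index-1`, which does not own the shard, so it is forwarded to `index-2`.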
If the index node responsible for an index-building message is found to have failed, the index master node reassigns the message to another index node and sends the message to that node for processing.
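The incremental flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names `IndexMaster` and `IndexNode`, the in-memory `deque` standing in for the distributed index message queue, the round-robin initial shard assignment, and the CRC32 rule that maps a table name to a shard are all hypothetical.

```python
import zlib
from collections import deque


def shard_for(table_name, shard_count):
    """Hypothetical sharding rule: stable hash of the data table name."""
    return zlib.crc32(table_name.encode("utf-8")) % shard_count


class IndexNode:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.shards = {}  # shard id -> list of (table, content) entries

    def build(self, msg):
        # "Build the index" is reduced here to appending to the shard.
        self.shards.setdefault(msg["shard"], []).append((msg["table"], msg["content"]))


class IndexMaster:
    def __init__(self, nodes, shard_count):
        self.nodes = nodes
        self.shard_count = shard_count
        self.queue = deque()  # stand-in for the distributed index message queue
        # Assumed initial assignment: shards spread round robin over nodes.
        self.owner = {s: nodes[s % len(nodes)] for s in range(shard_count)}

    def submit(self, table, content):
        """Turn an incremental task into an index-building message."""
        shard = shard_for(table, self.shard_count)
        self.queue.append({"table": table, "content": content, "shard": shard})

    def reassign(self, shard):
        """Give the shard to the first live node when its owner has failed."""
        self.owner[shard] = next(n for n in self.nodes if n.alive)
        return self.owner[shard]


def consume(node, master):
    """`node` takes one message, then handles it or forwards it."""
    msg = master.queue.popleft()
    target = master.owner[msg["shard"]]
    if not target.alive:
        target = master.reassign(msg["shard"])
    target.build(msg)
    return "local" if target is node else "forwarded"
```

In this sketch the failure check happens at consume time; a real deployment would more likely rely on heartbeats or a coordination service, which the source does not specify.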
If the index task is a batch index-creation task, the index master node uses MapReduce to divide the received task into multiple data sets and distributes them to multiple index nodes. Each record in a data set includes the data table name and the content of the power distribution and consumption data. Each index node builds an index from the data set it receives; the resulting indexes are then merged, and the index shard file corresponding to the data table name is updated.
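The batch path can be illustrated with a small map-and-merge sketch in plain Python. It does not use Hadoop; it only mimics the shape of the computation. The record layout `(record_id, table, content)`, the whitespace tokenizer, and the function names are assumptions made for the example.

```python
from collections import defaultdict


def split(records, parts):
    """Divide the batch into `parts` data sets (round robin)."""
    return [records[i::parts] for i in range(parts)]


def build_partial(data_set):
    """Map step on one index node: (table, term) -> set of record ids."""
    index = defaultdict(set)
    for rec_id, table, content in data_set:
        for term in content.lower().split():
            index[(table, term)].add(rec_id)
    return index


def merge(partials):
    """Reduce step: merge partial indexes into one index per table name."""
    merged = defaultdict(lambda: defaultdict(set))
    for partial in partials:
        for (table, term), ids in partial.items():
            merged[table][term] |= ids
    return merged


records = [
    (1, "meter", "voltage high"),
    (2, "meter", "voltage low"),
    (3, "outage", "feeder fault"),
]
shards = merge(build_partial(s) for s in split(records, 2))
```

Here each entry of `shards` plays the role of the index shard file for one data table; writing it back to shared storage is left out.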
In one embodiment, the index cluster system is responsible for building indexes in a distributed manner. It has a master-slave structure: index tasks are decomposed and assigned to the index nodes so that the nodes build indexes in parallel, which improves the capacity to process massive data. The index master node is the top level of a tree structure. A lookup first locates the top-level node through the index and then follows the lower-level indexes, which avoids the cost of a conventional one-by-one search and improves data-processing capacity. The index cluster system raises the throughput of the whole system and reduces the load on the index master node, so that the master node does not become a system bottleneck.
The index cluster supports both batch and incremental indexing modes; incremental mode is the system default. Each time the system stores a record, it also sends a system message carrying an incremental indexing task to the index master node. Based on the attributes and content of the data in the message, the index master node invokes the index sharding strategy to determine which index shard file the data belongs to, and stores the message in the distributed index message queue.
The index nodes obtain messages from the message queue under mutual exclusion, so each message is taken by only one node. If a message belongs to the index node that obtained it, that node processes it immediately; otherwise the node forwards the message to the responsible node, which processes it. If the responsible index node has failed, the index master node is notified that the message could not be processed and reassigns the message to another index node.
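Mutually exclusive consumption can be demonstrated with Python's thread-safe `queue.Queue`, used here as a local stand-in for the distributed message queue. The function name `run_index_nodes` and the worker names are hypothetical, and threads replace real index nodes purely for illustration.

```python
import queue
import threading


def run_index_nodes(messages, node_count):
    """Let `node_count` workers compete for messages; return (node, message) pairs."""
    q = queue.Queue()
    for msg in messages:
        q.put(msg)
    handled = []
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                # Queue.get_nowait is thread-safe: an item goes to one caller only.
                msg = q.get_nowait()
            except queue.Empty:
                return
            with lock:
                handled.append((name, msg))

    threads = [
        threading.Thread(target=worker, args=(f"index-node-{i}",))
        for i in range(node_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled
```

Which worker receives which message varies from run to run, but every message appears in the result exactly once.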
In one embodiment, the index shard files are stored in a distributed file system. When a new index shard file appears in the distributed file system, or an existing one is updated, the query master node receives a corresponding notification. The query master node assigns new index shard files according to the load of each query node.
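One simple way to assign shard files by load is a greedy least-loaded rule, sketched below. The source does not say how load is measured; this example assumes load is the number of shard files a query node already holds, and the function name `assign_shards` is hypothetical.

```python
def assign_shards(shard_files, node_loads):
    """Assign each new shard file to the currently least-loaded query node.

    node_loads maps a query node name to its assumed load (shard count).
    Ties are broken by node name so the result is deterministic.
    """
    loads = dict(node_loads)  # copy so the caller's mapping is not modified
    assignment = {}
    for shard in shard_files:
        node = min(loads, key=lambda n: (loads[n], n))
        assignment[shard] = node
        loads[node] += 1
    return assignment


plan = assign_shards(["s1", "s2", "s3"], {"q1": 0, "q2": 1})
```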
HDFS can be used to share index files between the index cluster and the query cluster. This helps ensure the consistency and safety of the index files and gives the whole system a highly robust index file storage service, so that the query cluster can continue to provide a stable query service while the index cluster is building indexes.
Fig. 3 is a flow diagram of data retrieval in one embodiment of the power distribution and consumption data retrieval method of the present invention. As shown in the figure:
Step 301: a query master node, query nodes, and a query client are established.
Step 302: the query master node receives the index shard files sent by the index master node and assigns them to the query nodes according to the load of each query node.
Step 303: on receiving a user query request sent by the query client, the query master node determines the index shard files corresponding to the queried power distribution and consumption data and, based on those files, determines at least one query node.
Step 304: a list of the query nodes that will serve the query is generated according to the load state of the at least one query node and is sent to the query client.
Step 305: the query client sends the query request to the query nodes in the query node list, and each of those nodes executes the request against the index shard files assigned to it and stored locally.
Step 306: the query client merges the query results returned by the query nodes in the query node list and provides the merged result.
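Steps 303 to 306 can be sketched as a plan-then-fan-out query. The `QueryNode` class, the numeric `load` field, the dictionary-based shard contents, and the use of a thread pool in place of network calls are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class QueryNode:
    def __init__(self, name, load, shards):
        self.name = name
        self.load = load      # assumed scalar load reported to the master
        self.shards = shards  # shard id -> {term: set of record ids}

    def search(self, shard_id, term):
        """Run the query against a locally stored shard."""
        return self.shards.get(shard_id, {}).get(term, set())


def pick_nodes(nodes, shard_ids):
    """Query master: for each shard, choose the least-loaded node holding it."""
    plan = {}
    for shard_id in shard_ids:
        holders = [n for n in nodes if shard_id in n.shards]
        plan[shard_id] = min(holders, key=lambda n: n.load)
    return plan


def client_query(plan, term):
    """Query client: send the request to every planned node and merge results."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(node.search, shard_id, term)
            for shard_id, node in plan.items()
        ]
        results = set()
        for future in futures:
            results |= future.result()
    return sorted(results)
```

`pick_nodes` raises `ValueError` if no node holds a requested shard; a fuller version would report that case to the client instead.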
In one embodiment, the query cluster system consists of three parts: the query master node, the query nodes, and the query client. The query cluster system also uses a master-slave structure, so that index shard files can be deployed quickly to each query node. This improves the availability of the query service, shortens its response time, and improves the user's query experience.
The query master node tracks the load of every child node in the cluster. When responding to a user query request, it selects a list of query nodes according to the load of each node and returns the list to the client; the query client then queries the nodes in that query node list. The query nodes provide the query service to the query client. A user issues a query through the query client, which obtains the results returned by each query node and finally merges them. Multiple query nodes thus carry out a distributed query, and the results of the lower-level query nodes are merged to find the data the user requested.
The types of message sent between the index master node and the index nodes through the distributed index message queue include update-index, delete-index, and mode-switch messages. The index cluster system processes the index shard files; the processing includes adding, updating, deleting, and merging. The index cluster system notifies the query master node of the result of processing the index shard files through a first notification message queue. The query master node generates a query node message from the processing result and sends it, through a second notification message queue, to the query node that corresponds to the index shard file, so that the query node updates its locally stored copy of that file.
In one embodiment, message communication within the index cluster system is used between the index master node and the index nodes. The message types include create-index, update-index, delete-index, and mode-switch messages. In incremental indexing mode, the index master node determines the index shard for the data according to the index sharding strategy and the shard distribution information, records the shard information in the message, and finally stores the message in the message queue. Updating or deleting an index follows the same process as the message generated for a new index. When the index cluster switches from incremental mode to batch mode, the index master node adds a mode-switch message to the message queue. When an index node obtains this message, it notifies the relevant nodes to suspend their current indexing tasks and enter batch indexing mode.
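The message types and the mode switch can be modelled with a small state machine. The enum values, the `IndexNodeState` class, and in particular the choice to defer incremental messages that arrive during batch mode are assumptions; the source only says that current indexing tasks are suspended.

```python
from enum import Enum


class MsgType(Enum):
    CREATE = "create_index"
    UPDATE = "update_index"
    DELETE = "delete_index"
    MODE_SWITCH = "mode_switch"


class IndexNodeState:
    def __init__(self):
        self.mode = "incremental"  # incremental mode is the default
        self.index = {}            # key -> indexed content
        self.deferred = []         # incremental work paused by batch mode

    def handle(self, msg_type, key=None, value=None):
        if msg_type is MsgType.MODE_SWITCH:
            self.mode = "batch"
            return
        if self.mode == "batch":
            # Assumed behaviour: hold incremental messages until batch work ends.
            self.deferred.append((msg_type, key, value))
            return
        if msg_type is MsgType.DELETE:
            self.index.pop(key, None)
        else:
            # CREATE and UPDATE both write the entry, matching the source's
            # note that update follows the same process as create.
            self.index[key] = value
```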
When the index cluster system adds, updates, deletes, or merges index files in the distributed file system, it notifies the relevant query nodes in the query cluster to update their local index files. The index node inserts a message into the message queue of the query master node, and the query master node processes the message. The message types include: check index, deploy index, add index, reinitialize index, reload index, reload index shard, delete index, and delete index shard.
Message communication within the query cluster system is mainly triggered by the message communication between the index cluster system and the query cluster system. When the query master node obtains a new message, it parses the message, generates multiple query node messages, and passes them to the individual query nodes. Each query node receives its message and carries out the corresponding task. The message types between the query nodes and the query master node include: deploy index shard, load index shard, update index shard, and delete index shard.
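How the query master node might expand one notification into per-node messages is sketched below. The mapping from processing results to query node actions, the dictionary message format, and the function name `fan_out` are hypothetical; the source lists the message types but not how they correspond.

```python
# Assumed mapping from index-cluster processing results to query node actions.
ACTION_FOR_RESULT = {
    "add": "deploy_shard",
    "update": "update_shard",
    "merge": "load_shard",
    "delete": "delete_shard",
}


def fan_out(notification, shard_holders):
    """Turn one index-cluster notification into one message per affected query node.

    notification: {"shard": shard id, "result": "add" | "update" | "merge" | "delete"}
    shard_holders: shard id -> collection of query node names holding that shard
    """
    action = ACTION_FOR_RESULT[notification["result"]]
    shard = notification["shard"]
    return [
        {"node": node, "action": action, "shard": shard}
        for node in sorted(shard_holders.get(shard, []))
    ]
```

A shard that no query node holds yields an empty list, so the master has nothing to send.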
In one embodiment, as shown in Fig. 4, the present invention provides a power distribution and consumption data retrieval system that includes data storage nodes 41, an index cluster system 42, and a query cluster system 43. The data storage nodes 41 store the power distribution and consumption data. Multiple data storage nodes 41 are set up in a distributed database or file system according to a preset storage architecture for the data, and the data is stored in the corresponding data storage node 41.
The index cluster system 42 builds and manages, through a distributed index-building mode, the index shard files that correspond to the data storage nodes and the power distribution and consumption data. The query cluster system 43 retrieves, through a distributed search mode and based on the index shard files, the power distribution and consumption data stored in the data storage nodes, and provides the retrieval results.
In one embodiment, the index cluster system 42 includes one index master node 421 and multiple index nodes 422. The index master node 421 assigns to each index node its corresponding index shard files, where an index shard file is the index shard file for a data table that stores power distribution and consumption data. The index master node 421 receives index tasks for the power distribution and consumption data and sends them to the index nodes through a message mechanism. Each index node 422 processes the corresponding index shard file according to the index task; index tasks include creating an index, updating an index, and deleting an index.
If the index task is an incremental index-creation task, then when power distribution and consumption data is stored in a data storage node, the data storage node 41 sends an incremental index-creation task to the index master node 421 as a system message. The data carried by the task includes the data table name and the content of the power distribution and consumption data. The index master node 421 determines the corresponding index shard file from the data table name, generates an index-building message from the data table name, the content, and the index shard file information, and places the message in the distributed index message queue.
An index node 422 obtains an index-building message from the index message queue and uses the data table name to determine whether it is the node responsible for the message. If it is, the index node 422 builds the index for the data according to the message and updates the index shard file corresponding to the data table name. If it is not, the index node 422 uses the data table name to determine which index node should process the message and sends the message to that node for processing. If the index master node 421 determines that the index node responsible for an index-building message has failed, it reassigns the message to another index node and sends the message to that node for processing.
If the index task is a batch index-creation task, the index master node 421 uses MapReduce to divide the received task into multiple data sets and distributes them to multiple index nodes 422. Each record in a data set includes the data table name and the content of the power distribution and consumption data. Each index node 422 builds an index from the data set it receives; the resulting indexes are then merged, and the index shard file corresponding to the data table name is updated.
In one embodiment, the query cluster system 43 includes a query master node 431, query nodes 432, and a query client 433. The query master node 431 receives the index shard files sent by the index master node and assigns them to the query nodes according to the load of each query node. On receiving a user query request sent by the query client, the query master node 431 determines the index shard files corresponding to the queried power distribution and consumption data and, based on those files, determines at least one query node. The query master node 431 generates a list of the query nodes that will serve the query, according to the load state of the at least one query node, and sends the list to the query client. The query client 433 sends the query request to the query nodes in the query node list, and each query node 432 in the list executes the request against the index shard files assigned to it and stored locally. The query client 433 merges the query results returned by the query nodes 432 in the list and provides the merged result.
In the power distribution and consumption data retrieval method and system of the above embodiments, multiple data storage nodes are set up in a distributed database or file system, and the power distribution and consumption data is stored in the corresponding data storage nodes. The index cluster system builds and manages, through a distributed index-building mode, the index shard files corresponding to the data storage nodes and the data. The query cluster system retrieves the data stored in the data storage nodes through a distributed search mode, based on the index shard files, and provides the retrieval results. This provides a new retrieval method for both structured and unstructured data that addresses the problem of indexing massive data. It offers high throughput, high scalability, high concurrency, and high fault tolerance, is suited to concurrent access to very large data sets, reduces the load on the retrieval master node, and improves the availability and response speed of the query service as well as the user's query experience.
The method and system of the present invention may be implemented in many ways, for example in software, hardware, firmware, or any combination of software, hardware, and firmware. The order of the method steps given above is for illustration only; the steps of the method of the present invention are not limited to the order described above unless specifically stated otherwise. In addition, in some embodiments the present invention may be embodied as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the method according to the present invention. The present invention therefore also covers a recording medium storing a program for executing the method according to the present invention.
The description of the present invention is given for the purposes of illustration and description and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and design various embodiments, with various modifications, suited to particular uses.
Claims (12)
1. A power distribution and consumption data retrieval method, characterized by comprising:
setting up multiple data storage nodes in a distributed database or file system based on a preset storage architecture for power distribution and consumption data, and storing the power distribution and consumption data in the corresponding data storage nodes, wherein the power distribution and consumption data includes structured power distribution and consumption data and unstructured power distribution and consumption data;
establishing an index cluster system in the distributed database or file system, wherein the index cluster system builds and manages, through a distributed index-building mode, index shard files corresponding to the data storage nodes and the power distribution and consumption data;
establishing a query cluster system in the distributed database or file system, wherein the query cluster system retrieves, through a distributed search mode and based on the index shard files, the power distribution and consumption data stored in the data storage nodes, and provides retrieval results.
2. The method of claim 1, characterized by further comprising:
establishing one index master node and multiple index nodes, and assigning to each index node its corresponding index shard files, wherein an index shard file is the index shard file corresponding to a data table that stores power distribution and consumption data;
the index master node receiving an index task for the power distribution and consumption data and sending the index task to the index node through a message mechanism;
the index node processing the corresponding index shard file based on the index task, wherein the index task includes creating an index, updating an index, and deleting an index.
3. The method of claim 2, characterized by further comprising:
if the index task is an incremental index-creation task, sending an incremental index-creation task to the index master node as a system message when the power distribution and consumption data is stored in the data storage node, wherein the data carried by the incremental index-creation task includes the data table name and the content of the power distribution and consumption data;
the index master node determining, according to the data table name, the index shard file corresponding to the power distribution and consumption data, generating an index-building message based on the data table name, the content, and the index shard file information, and placing the message in a distributed index message queue;
the index node obtaining the index-building message from the index message queue and determining, based on the data table name, whether the message is to be processed by this index node; if so, building an index for the power distribution and consumption data based on the index-building message and updating the index shard file corresponding to the data table name; if not, determining, based on the data table name, the index node that is to process the index-building message and sending the message to that index node for processing;
if the index node that is to process the index-building message is determined to have failed, the index master node reassigning the index-building message to another index node and sending the message to that index node for processing.
4. The method of claim 3, characterized by further comprising:
if the index task is a batch index-creation task, the index master node using MapReduce to divide the received batch index-creation task into multiple data sets and distributing the multiple data sets to multiple index nodes, wherein each record in the multiple data sets includes the data table name and the content of the power distribution and consumption data;
the multiple index nodes each building an index based on the data set received, merging the indexes built, and updating the index shard file corresponding to the data table name of the power distribution and consumption data.
5. The method of claim 4, characterized by further comprising:
establishing a query master node, query nodes, and a query client;
the query master node receiving the index shard files sent by the index master node and assigning the index shard files to the query nodes according to the load of each query node;
on receiving a user query request sent by the query client, the query master node determining the index shard files corresponding to the queried power distribution and consumption data and determining at least one query node based on those index shard files;
generating, according to the load state of the at least one query node, a list of query nodes that provide the query service, and sending the list to the query client;
the query client sending the query request to the query nodes in the query node list, and the query nodes in the query node list executing the query request based on the index shard files assigned to them and stored locally;
the query client merging the query results returned by the query nodes in the query node list and providing the merged query result.
6. The method of claim 5, characterized in that:
the types of message sent between the index master node and the index nodes through the distributed index message queue include update-index, delete-index, and mode-switch messages;
the index cluster system processes the index shard files, wherein the processing includes adding, updating, deleting, and merging; the index cluster system notifies the query master node of the result of processing the index shard files through a first notification message queue;
the query master node generates a query node message based on the processing result and sends the query node message, through a second notification message queue, to the query node corresponding to the index shard file, so that the query node updates its locally stored copy of the index shard file;
wherein the distributed database includes an HBase database, and the distributed file system includes an HDFS file system.
7. A power distribution and consumption data retrieval system, characterized by comprising:
data storage nodes, configured to store power distribution and consumption data;
wherein multiple data storage nodes are set up in a distributed database or file system based on a preset storage architecture for power distribution and consumption data, and the power distribution and consumption data is stored in the corresponding data storage nodes; and wherein the power distribution and consumption data includes structured power distribution and consumption data and unstructured power distribution and consumption data;
an index cluster system, configured to build and manage, through a distributed index-building mode, index shard files corresponding to the data storage nodes and the power distribution and consumption data;
a query cluster system, configured to retrieve, through a distributed search mode and based on the index shard files, the power distribution and consumption data stored in the data storage nodes, and to provide retrieval results.
8. The power distribution and consumption data retrieval system of claim 7, characterized in that:
the index cluster system includes one index master node and multiple index nodes;
the index master node is configured to assign to each index node its corresponding index shard files, wherein an index shard file is the index shard file corresponding to a data table that stores power distribution and consumption data, and to receive an index task for the power distribution and consumption data and send the index task to the index node through a message mechanism;
the index node is configured to process the corresponding index shard file based on the index task, wherein the index task includes creating an index, updating an index, and deleting an index.
9. The power distribution and consumption data retrieval system of claim 8, characterized in that:
the data storage node is configured to, if the index task is an incremental index-creation task, send an incremental index-creation task to the index master node as a system message when the power distribution and consumption data is stored in the data storage node, wherein the data carried by the incremental index-creation task includes the data table name and the content of the power distribution and consumption data;
the index master node is configured to determine, according to the data table name, the index shard file corresponding to the power distribution and consumption data, to generate an index-building message based on the data table name, the content, and the index shard file information, and to place the message in a distributed index message queue;
the index node is configured to obtain the index-building message from the index message queue and to determine, based on the data table name, whether the message is to be processed by this index node; if so, to build an index for the power distribution and consumption data based on the index-building message and update the index shard file corresponding to the data table name; if not, to determine, based on the data table name, the index node that is to process the index-building message and send the message to that index node for processing;
the index master node is further configured to, if the index node that is to process the index-building message is determined to have failed, reassign the index-building message to another index node and send the message to that index node for processing.
10. The power distribution and consumption data retrieval system of claim 9, characterized in that:
the index master node is configured to, if the index task is a batch index-creation task, use MapReduce to divide the received batch index-creation task into multiple data sets and distribute the multiple data sets to multiple index nodes, wherein each record in the multiple data sets includes the data table name and the content of the power distribution and consumption data;
the multiple index nodes each build an index based on the data set received, merge the indexes built, and update the index shard file corresponding to the data table name of the power distribution and consumption data.
11. The power distribution and consumption data retrieval system of claim 10, characterized in that:
the query cluster system includes a query master node, query nodes, and a query client;
the query master node is configured to receive the index shard files sent by the index master node and assign the index shard files to the query nodes according to the load of each query node; on receiving a user query request sent by the query client, to determine the index shard files corresponding to the queried power distribution and consumption data and determine at least one query node based on those index shard files; and to generate, according to the load state of the at least one query node, a list of query nodes that provide the query service and send the list to the query client;
the query client is configured to send the query request to the query nodes in the query node list, wherein the query nodes in the query node list execute the query request based on the index shard files assigned to them and stored locally; and to merge the query results returned by the query nodes in the query node list and provide the merged query result.
12. The power distribution and consumption data retrieval system of claim 11, characterized in that:
the types of message sent between the index master node and the index nodes through the distributed index message queue include update-index, delete-index, and mode-switch messages;
the index cluster system processes the index shard files, wherein the processing includes adding, updating, deleting, and merging; the index cluster system notifies the query master node of the result of processing the index shard files through a first notification message queue;
the query master node generates a query node message based on the processing result and sends the query node message, through a second notification message queue, to the query node corresponding to the index shard file, so that the query node updates its locally stored copy of the index shard file;
wherein the distributed database includes an HBase database, and the distributed file system includes an HDFS file system.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711434002.1A CN108460072A (en) | 2017-12-26 | 2017-12-26 | With electricity consumption data retrieval method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108460072A true CN108460072A (en) | 2018-08-28 |
Family
ID=63220682
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711434002.1A Pending CN108460072A (en) | 2017-12-26 | 2017-12-26 | With electricity consumption data retrieval method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108460072A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110716933A (en) * | 2019-09-29 | 2020-01-21 | 浙江大学 | Novel urban rail train big data-oriented high-flexibility distributed index method |
CN110968762A (en) * | 2019-12-05 | 2020-04-07 | 北京天融信网络安全技术有限公司 | Adjusting method and device for retrieval |
CN111737052A (en) * | 2020-06-19 | 2020-10-02 | 中国工商银行股份有限公司 | Distributed object storage system and method |
CN111949833A (en) * | 2020-08-17 | 2020-11-17 | 北京字节跳动网络技术有限公司 | Index construction method, data processing method, device, electronic equipment and medium |
CN112231501A (en) * | 2020-10-20 | 2021-01-15 | 浙江大华技术股份有限公司 | Portrait library data storage and retrieval method and device and storage medium |
CN113612705A (en) * | 2021-08-02 | 2021-11-05 | 广西电网有限责任公司 | Power grid monitoring system data transmission method based on Hash algorithm fragmentation and recombination |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102779185A (en) * | 2012-06-29 | 2012-11-14 | Zhejiang University | High-availability distribution type full-text index method |
CN103390038A (en) * | 2013-07-16 | 2013-11-13 | Xi'an Jiaotong University | HBase-based incremental index creation and retrieval method |
CN106599153A (en) * | 2016-12-07 | 2017-04-26 | Hebei Zhongfeitong Network Technology Co., Ltd. | Multi-data-source-based waste industry search system and method |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110716933A (en) * | 2019-09-29 | 2020-01-21 | Zhejiang University | Novel urban rail train big data-oriented high-flexibility distributed index method |
CN110716933B (en) * | 2019-09-29 | 2022-03-15 | Zhejiang University | Novel urban rail train big data-oriented high-flexibility distributed index method |
CN110968762A (en) * | 2019-12-05 | 2020-04-07 | Beijing Topsec Network Security Technology Co., Ltd. | Adjusting method and device for retrieval |
CN110968762B (en) * | 2019-12-05 | 2023-07-18 | Beijing Topsec Network Security Technology Co., Ltd. | Adjustment method and device for retrieval |
CN111737052A (en) * | 2020-06-19 | 2020-10-02 | Industrial and Commercial Bank of China Ltd. | Distributed object storage system and method |
CN111737052B (en) * | 2020-06-19 | 2023-07-07 | Industrial and Commercial Bank of China Ltd. | Distributed object storage system and method |
CN111949833A (en) * | 2020-08-17 | 2020-11-17 | Beijing ByteDance Network Technology Co., Ltd. | Index construction method, data processing method, device, electronic equipment and medium |
CN112231501A (en) * | 2020-10-20 | 2021-01-15 | Zhejiang Dahua Technology Co., Ltd. | Portrait library data storage and retrieval method and device and storage medium |
CN113612705A (en) * | 2021-08-02 | 2021-11-05 | Guangxi Power Grid Co., Ltd. | Power grid monitoring system data transmission method based on Hash algorithm fragmentation and recombination |
CN113612705B (en) * | 2021-08-02 | 2023-08-22 | Guangxi Power Grid Co., Ltd. | Hash algorithm slicing and recombination-based power grid monitoring system data transmission method |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108460072A (en) | With electricity consumption data retrieval method and system | |
CN109492040B (en) | System suitable for processing mass short message data in data center | |
CN102779185B (en) | High-availability distribution type full-text index method | |
Ding et al. | Enabling smart transportation systems: A parallel spatio-temporal database approach | |
CN106708917B (en) | A kind of data processing method, device and OLAP system | |
CN104506632B (en) | One kind is based on distributed polycentric resource sharing system and method | |
CN102567495B (en) | Mass information storage system and implementation method | |
US8635250B2 (en) | Methods and systems for deleting large amounts of data from a multitenant database | |
CN109600447B (en) | Method, device and system for processing data | |
US20080086464A1 (en) | Efficient method of location-based content management and delivery | |
CN105005611B (en) | A kind of file management system and file management method | |
JP5719323B2 (en) | Distributed processing system, dispatcher and distributed processing management device | |
US10158709B1 (en) | Identifying data store requests for asynchronous processing | |
CN106484713A (en) | A kind of based on service-oriented Distributed Request Processing system | |
CN107343021A (en) | A kind of Log Administration System based on big data applied in state's net cloud | |
CN111209364A (en) | Mass data access processing method and system based on crowdsourcing map updating | |
US20220318074A1 (en) | System and method for structuring and accessing tenant data in a hierarchical multi-tenant environment | |
CN109635189A (en) | A kind of information search method, device, terminal device and storage medium | |
CN113127526A (en) | Distributed data storage and retrieval system based on Kubernetes | |
CN103412883B (en) | Semantic intelligent information distribution subscription method based on P2P technology | |
CN110019085A (en) | A kind of distributed time series database based on HBase | |
Dehne et al. | VOLAP: A scalable distributed system for real-time OLAP with high velocity data | |
CN110929126A (en) | Distributed crawler scheduling method based on remote procedure call | |
Ye | Research on the key technology of big data service in university library | |
Cortés et al. | A scalable architecture for spatio-temporal range queries over big location data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180828 |