CN104750727A - Column type memory storage and query device and column type memory storage and query method

Publication number
CN104750727A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310744231.9A
Other languages
Chinese (zh)
Other versions
CN104750727B (en)
Inventor
杜海亮
陈晓峰
贲福才
户顺义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bright Oceans Inter Telecom Co Ltd
Original Assignee
Bright Oceans Inter Telecom Co Ltd

Abstract

The invention discloses a columnar in-memory storage and query device comprising a memory storage unit, a data loading unit, a query engine and a row reading unit. The memory storage unit creates column arrays in memory to store the data of each column of a row storage structure; the data loading unit loads the target column data of the row storage structure into the memory storage unit; the query engine queries the loaded column data according to a cascade query condition input by a data consumer, obtains the column data position identifiers that satisfy the condition, and sends them to the row reading unit; the row reading unit uses those identifiers to fetch, from the row storage structure, the full data of every row whose row number matches an identifier, and sends it to the data consumer. The device keeps data compactly in memory and supports efficient queries. The invention further provides a columnar in-memory storage and query method.

Description

A columnar in-memory query device and a columnar in-memory query method
Technical field
The present invention relates to the field of data storage, and in particular to a columnar in-memory query device and method.
Background
With the rapid development of the Internet, the volume of business data of all kinds is growing sharply. How data is stored directly affects query efficiency, and query efficiency in turn directly affects how efficiently business is processed. Storing data soundly and guaranteeing efficient reads and writes is therefore the basis for improving business processing efficiency.
The choice of in-memory data structure matters greatly for memory footprint and computational efficiency. Because memory is limited, data should be stored in a way that uses few resources. The first question is how small is small enough: an int occupies 4 bytes in memory, so N ints should occupy, at minimum, close to 4*N bytes, and a footprint of that order can be considered small. Going below it requires compression or a special encoding, and compressed or encoded data must be decompressed and decoded at computation time, which inevitably hurts efficiency. Except in special cases, therefore, an in-memory table should store the original values and avoid unnecessary computation.
Of the storage methods in current use, the most common is row storage. Every row has the same data structure, and a row must support multiple columns of many data types. Most applications insert and fetch data one row at a time, so the overwhelming majority of data is stored by row. Queries, however, are usually made on columns. A query must traverse the rows to obtain each row's object pointer and then fetch the value of the relevant column through that pointer; with a generic structure the value may have to be looked up by column index or column name. Each cell is then compared to decide whether the row is selected, and if it is, the row pointer is placed in the result queue. This work, looking up a value in every row and performing a type conversion or comparison on it, consumes a large amount of memory and lowers query efficiency.
In the prior art, in-memory data is mostly stored as row objects in complex data structures, which is computationally inefficient and consumes a great deal of memory. Another approach is an in-memory database, but it too consumes a great deal of memory, its performance depends on indexes, and its interface is relatively complex, using SQL expressions and a parser to interpret queries. By their nature these technologies have limited query and loading efficiency; their conditions are highly flexible but their APIs are relatively complex, and they are not well suited to scenarios that call for efficient in-memory queries.
A storage and query approach that supports fast, efficient in-memory queries is therefore needed.
Summary of the invention
The invention provides a columnar in-memory query device, characterized in that the device comprises:
A memory storage unit, for creating column arrays in memory to store the data of each column of a row storage structure;
A data loading unit, for loading the target column data of the row storage structure into the memory storage unit;
A query engine, for querying, according to a cascade query condition input by a data consumer, the column data that the data loading unit has loaded into the memory storage unit, obtaining the column data position identifiers that satisfy the cascade query condition, and sending them to a row reading unit;
A row reading unit, for obtaining from the row storage structure, according to the column data position identifiers obtained by the query engine, the full data of every row whose row number matches a column data position identifier, and sending it to the data consumer.
Preferably, the device further comprises:
A cache unit, for storing query data of the query engine; when the cascade query condition input by the data consumer is consistent with a stored cascade query condition and the column arrays that condition targets have not changed, the cache unit sends the stored column data position identifiers satisfying the condition to the row reading unit;
A period control unit, for setting an in-memory retention period for the memory storage unit and deleting column data in the arrays that exceeds that period, and also for setting a cache period for the cache unit and deleting cached data that exceeds the cache period;
A lock control module, for setting a read lock and a write lock on the memory storage unit and controlling read-write locking and unlocking of the memory storage unit.
In detail, the memory storage unit further comprises:
A column data storage module, for creating column arrays that store the data of different columns; each column array is named after the field name of its corresponding column in the row storage structure, and the array index serves as the column data position identifier;
A data table storage module, for creating data tables and storing the column arrays that belong to the same row storage structure in the same data table;
The memory storage unit may comprise multiple data table storage modules, and each data table storage module may comprise multiple column data storage modules; each column data storage module holds the data of one column of the row storage structure, and the column data position identifier is the same as the row number in the corresponding row storage structure;
Column data is preferably stored in the column data storage module using simple types.
Preferably, a version number is set for the data table held in each data table storage module, and the version number is updated when the data of the corresponding row storage structure changes.
Specifically, the query engine further comprises:
A rule definition and storage module, for defining and storing the parsing rules of the cascade query condition;
A cascade query condition parsing module, for receiving the cascade query condition input by the data consumer and parsing it according to the parsing rules defined by the rule definition and storage module;
A query module, for querying the column arrays stored by the column data storage module step by step according to the parsed cascade query condition, and obtaining the column data position identifiers that satisfy the cascade query condition.
Specifically, the cache unit further comprises:
A query data storage module, for storing the cascade query condition, the version information of the data table the condition targets, and the column data position identifiers obtained by the query;
A consistency check module, for checking whether the cascade query condition input by the data consumer is consistent with the stored cascade query condition, and whether the version information of the data table the data consumer asks to query is consistent with the stored data table version information;
A query result sending module, for sending the column data position identifiers that the query data storage module stores for this cascade query condition to the row reading unit when the consistency check module finds both to be consistent.
More specifically, the consistency check module further comprises:
A cascade query condition check module, for checking whether the cascade query condition input by the data consumer is consistent with the stored cascade query condition, under the rule that two cascade query conditions are consistent if their strings are identical and they contain no non-idempotent function;
A target data check module, for checking whether the version information of the data table the data consumer asks to query is consistent with the stored data table version information.
The invention also discloses a columnar in-memory query method, the method comprising:
Creating column arrays in memory to store the data of each column of a row storage structure;
Loading the column data of the row storage structure into the created column arrays;
Querying the column data loaded into the column arrays according to a cascade query condition input by a data consumer, and obtaining the column data position identifiers that satisfy the cascade query condition;
Obtaining from the row storage structure, according to the obtained column data position identifiers, the full data of every row whose row number matches a column data position identifier, and sending it to the data consumer.
Preferably, the method further comprises:
Storing in a cache the cascade query condition used by a query, the column arrays it targeted, and the column data position identifiers obtained; when the cascade query condition input by the data consumer is identical to a stored cascade query condition and the column arrays that condition targets have not changed, obtaining, according to the stored column data position identifiers satisfying the condition, the full data of every row whose row number matches those identifiers;
Setting an in-memory retention period and deleting column data in the column arrays that exceeds that period;
Setting a cache period and deleting data in the cache that exceeds the cache period;
Setting a read lock and a write lock, and controlling read-write locking and unlocking of the column arrays.
In detail, creating column arrays in memory to store the data of each column of the row storage structure specifically comprises:
Creating column arrays that store the data of different columns, naming each column array after the field name of its corresponding column in the row storage structure, the index of the column array serving as the column data position identifier;
Creating data tables and storing the column arrays that belong to the same row storage structure in the same data table;
Multiple data tables may be created; each data table comprises multiple column arrays, each column array holds the data of one column of the row storage structure, and the column data position identifier is the same as the row number in the corresponding row storage structure;
Column data is preferably stored using simple types.
Preferably, version number information is stored for each data table and updated when column data in that data table changes.
In detail, querying the column data loaded into the column arrays according to the cascade query condition input by the data consumer, and obtaining the column data position identifiers that satisfy the cascade query condition, specifically comprises:
Defining and storing the parsing rules of the cascade query condition;
Receiving the cascade query condition input by the data consumer and parsing it according to the stored parsing rules;
Querying the data stored in the column arrays step by step according to the parsed cascade query condition, and obtaining the column data position identifiers that satisfy the cascade query condition.
In more detail, determining that the cascade query condition input by the data consumer is consistent with a stored cascade query condition, and that the column arrays that condition targets have not changed, specifically comprises:
Checking whether the cascade query condition input by the data consumer is consistent with the stored cascade query condition, under the rule that two cascade query conditions are consistent if their strings are identical and they contain no non-idempotent function;
If the version information of the data table the data consumer asks to query is consistent with the stored data table version information, determining that the column arrays the cascade query condition targets have not changed.
The invention loads the data of a row storage structure into memory and stores it as column arrays. A query finds the qualifying data and obtains its column data position identifiers (that is, the indexes of the positions the data occupies in the array); these identifiers give the row numbers that satisfy the query condition, from which the full rows are obtained. The benefit is that a query need not traverse every row, fetch each row's information and then compare the target column. Instead, all the data of that column in the row storage structure is placed in one array and compared in a single pass, yielding the row numbers that satisfy the condition and then the matching row data. This saves much of the time otherwise spent traversing rows and comparing a value in each, and improves query efficiency. The invention also provides cascade query conditions that the data consumer can compose freely; querying step by step progressively narrows the query range and further improves efficiency. Preferably, the invention also provides a cache that keeps the state and results of past queries: when a cascade query condition arrives again and neither the condition nor the target column arrays have changed, the row information is read directly from the stored column array position identifiers and sent to the data consumer, improving query efficiency further. In summary, the invention provides a columnar in-memory query approach that supports fast, efficient queries.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of a columnar in-memory query device according to Embodiment 1 of the present invention;
Fig. 2 is a schematic structural diagram of the device according to Embodiment 2 of the present invention;
Fig. 3 is a schematic workflow diagram, with a worked example, of a columnar in-memory query device according to Embodiment 3 of the present invention;
Fig. 4 is a flowchart of a columnar in-memory query method according to Embodiment 4 of the present invention;
Fig. 5 is a flowchart of the method according to Embodiment 5 of the present invention;
Fig. 6 is a flowchart of how Embodiment 6 of the present invention parses the cascade query condition and performs the consistency check;
Fig. 7 is a schematic diagram of the cascade query hierarchy in Embodiment 6 of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below with reference to the drawings and examples, so that how the invention applies technical means to solve the technical problem and achieve its technical effect can be fully understood and put into practice.
As shown in Fig. 1, Embodiment 1 of the present invention discloses a columnar in-memory query device comprising the following structure:
Memory storage unit 1, for creating column arrays in memory to store the data of each column of a row storage structure.
Because memory is limited, each column's data is stored in a simple array. An array has only indexes and values, and this simple structure takes up little memory.
Data loading unit 2, for loading the target column data of the row storage structure into the memory storage unit 1.
The data loading unit obtains from the data source the target column data to be queried and loads it. The loading period and the size of each load are determined by the data source. The data source can publish to data consumers which data it has loaded, and data consumers can query the loaded data according to the published information. The data loading unit loads data in response to load commands from the data source.
A typical row storage structure is shown in Table 1 below: a student table with 10,000 rows.
Table 1: Example of a row storage structure
Row number    Name (name)    Age (age)    Class (class)
0             Zhang          17           1
1             Lee            22           2
2             Wang           21           3
3             Zhao           19           2
...
9999          Money          21           1
To find all rows of Table 1 with age=29, row storage proceeds as follows:
(1) Traverse the rows, obtaining the row object pointer of each row.
(2) Fetch the value of the age column through the row object pointer; with a generic structure, the value may have to be looked up by the index or name of the age column.
(3) Because columns are usually stored in a generic structure, each age cell must be converted to int (or otherwise compared); the converted int is tested for equality with 29 to decide whether the row is selected, and if so its row pointer is placed in the result queue.
(4) When the search completes, all result row pointers have been obtained.
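Steps (1) to (4) can be sketched roughly as follows. This is an illustration only, not code from the patent: the class name RowScanSketch, the Object[] row layout and the column order (name, age, class) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the row-wise scan in steps (1)-(4).
// Each row is a generic Object[] whose cells must be cast before comparison.
class RowScanSketch {
    // Column order assumed for this sketch: name, age, class.
    static final int AGE_COLUMN = 1;

    static List<Object[]> findByAge(List<Object[]> rows, int wantedAge) {
        List<Object[]> result = new ArrayList<>();
        for (Object[] row : rows) {                 // (1) visit every row object
            Object cell = row[AGE_COLUMN];          // (2) fetch the age cell by column index
            int age = (Integer) cell;               // (3) convert the generic cell to int
            if (age == wantedAge) {
                result.add(row);                    //     keep a reference to the matching row
            }
        }
        return result;                              // (4) all matching rows
    }
}
```

Every row costs a pointer dereference, a cell lookup and a cast before the comparison can happen, which is the overhead the column arrays described next are meant to avoid.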
With column arrays, each column of this row storage structure is stored in an array: in Table 1, name becomes one column array, age another, and class a third.
The data loading unit loads the name, age and class columns into the memory storage unit as the name, age and class column arrays.
When the data loading unit loads data from several row storage structures into the memory storage unit, the concept of a data table, DataTable, can be introduced. A set of column arrays created by the memory storage unit is called a DataTable; one DataTable comprises multiple column arrays, and one column array holds the data of one column of the row storage structure. The data loading unit loads each column of a row storage structure into its own column array within one DataTable.
On loading, the data table can be named after the row storage structure and each column array after its column.
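A minimal sketch of such a data table, assuming the three columns of Table 1, might look like this. The class DataTableSketch and its methods are hypothetical names chosen for the illustration; the patent does not specify the structure at this level of detail.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a data table is a named set of column arrays.
class DataTableSketch {
    final String name;                                          // named after the row storage structure
    final Map<String, Object> columns = new LinkedHashMap<>();  // column name -> int[] or String[]

    DataTableSketch(String name) { this.name = name; }

    // Load the student rows of Table 1 into one array per column.
    static DataTableSketch loadStudents(Object[][] rows) {
        int n = rows.length;
        String[] names = new String[n];
        int[] ages = new int[n];
        int[] classes = new int[n];
        for (int i = 0; i < n; i++) {          // array index i equals the 0-based row number
            names[i] = (String) rows[i][0];
            ages[i] = (Integer) rows[i][1];
            classes[i] = (Integer) rows[i][2];
        }
        DataTableSketch table = new DataTableSketch("student");
        table.columns.put("name", names);
        table.columns.put("age", ages);
        table.columns.put("class", classes);
        return table;
    }

    int[] intColumn(String column) { return (int[]) columns.get(column); }
}
```

The casts happen once, at load time, so that later queries can work on primitive arrays directly.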
Query engine 3, for querying, according to the cascade query condition input by the data consumer, the column data that the data loading unit 2 has loaded into the memory storage unit 1, obtaining the column data position identifiers that satisfy the cascade query condition, and sending them to the row reading unit.
A cascade query condition is written by the data consumer to suit its own query needs and is made up of individual query conditions chained one after another. The query engine evaluates these conditions directly against the column data in the column arrays. This avoids the cumbersome row-storage procedure of traversing the rows, fetching the target column from each row, evaluating it, and recording the row number when the condition is met.
Once every value in the column array has been evaluated against the query condition, the query engine records the column data position identifiers that satisfy the condition and sends them to the row reading unit.
Row reading unit 4, for obtaining from the row storage structure, according to the column data position identifiers obtained by the query engine 3, the full data of every row whose row number matches a column data position identifier, and sending it to the data consumer.
After the column data position identifiers are obtained, each is mapped to the corresponding row number in the row storage structure, and that row number is used to fetch all the fields of the row from the row storage structure for sending to the data consumer.
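The hand-off between the query engine and the row reading unit can be sketched as below. The names ColumnScanSketch, positionsWhereEquals and readRows are hypothetical, and the sketch assumes 0-based row numbers so that a position identifier is itself the row number.

```java
import java.util.Arrays;

// Hypothetical sketch of the query engine / row reading unit hand-off.
class ColumnScanSketch {
    // Query engine: scan one primitive column array and collect matching indexes.
    static int[] positionsWhereEquals(int[] column, int wanted) {
        int[] hits = new int[column.length];
        int count = 0;
        for (int i = 0; i < column.length; i++) {
            if (column[i] == wanted) {
                hits[count++] = i;      // the array index is the position identifier
            }
        }
        return Arrays.copyOf(hits, count);
    }

    // Row reading unit: map each position identifier back to a full row.
    // Assumes 0-based row numbers, so the identifier is the row number itself.
    static Object[][] readRows(Object[][] rowStore, int[] positions) {
        Object[][] rows = new Object[positions.length][];
        for (int i = 0; i < positions.length; i++) {
            rows[i] = rowStore[positions[i]];
        }
        return rows;
    }
}
```

The scan touches only one contiguous int[]; the row store is consulted only for the rows that matched.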
Preferably, to make repeated queries with the same cascade query condition against the same column data more efficient, the device further comprises:
Cache unit 5, for storing query data of the query engine 3; when the cascade query condition input by the data consumer is consistent with a stored cascade query condition and the column arrays that condition targets have not changed, the cache unit sends the stored column data position identifiers satisfying the condition to the row reading unit.
In practice a data consumer often runs the same query condition against the same column array. To improve query efficiency and avoid repeated work, the cache unit 5 keeps the data of each query. This query data has three parts: first, the query condition of the query; second, the attribute information of the column array the query targets; third, the column data position identifiers obtained by applying the condition to the column array. When the first two items are unchanged, the stored column data position identifiers are sent directly to the row reading unit, which fetches the corresponding rows and sends them to the data consumer. Checking these two items for consistency is more efficient than re-evaluating every value in the column array, so the cache unit improves the working efficiency of the device.
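One way to picture the cache lookup is the sketch below, which keys stored results by the condition string and reuses them only when the data table version also matches. QueryCacheSketch and its methods are hypothetical names, and the sketch leaves out the check for non-idempotent functions that the summary also requires.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the cache unit: results are reused only when both the
// condition string and the data table version match the stored entry.
class QueryCacheSketch {
    private static final class Entry {
        final long tableVersion;
        final int[] positions;
        Entry(long tableVersion, int[] positions) {
            this.tableVersion = tableVersion;
            this.positions = positions;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    void put(String condition, long tableVersion, int[] positions) {
        entries.put(condition, new Entry(tableVersion, positions));
    }

    // Returns the cached positions, or null when the query must be re-run.
    int[] lookup(String condition, long currentTableVersion) {
        Entry e = entries.get(condition);
        if (e == null || e.tableVersion != currentTableVersion) {
            return null;
        }
        return e.positions;
    }
}
```

A null result tells the caller to run the query engine again and store the fresh positions.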
Preferably, to use and control memory effectively and keep the system running normally, the device further comprises:
Period control unit 6, for setting an in-memory retention period for the memory storage unit and deleting column data in the arrays that exceeds that period, and also for setting a cache period for the cache unit and deleting cached data that exceeds the cache period.
Because memory is limited, continually loading data into the memory storage unit when data volumes are huge eventually consumes a large amount of resources and lowers efficiency. Some data, moreover, is rarely used once it has been queried for a while, and can be cleared from memory and from the cache. An in-memory retention period and a cache period are therefore set, and data that exceeds them is deleted, keeping the device running normally.
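Period-based cleanup could be sketched as follows for cached entries. The class PeriodControlSketch, the millisecond timestamps and the explicit nowMillis parameter are assumptions of the example; the patent does not say how periods are measured or when cleanup runs.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the period control unit applied to cached results.
class PeriodControlSketch {
    private final long maxAgeMillis;                                   // the configured cache period
    private final Map<String, Long> storedAt = new LinkedHashMap<>();  // key -> time it was stored

    PeriodControlSketch(long maxAgeMillis) { this.maxAgeMillis = maxAgeMillis; }

    void record(String key, long nowMillis) { storedAt.put(key, nowMillis); }

    // Delete every entry older than the configured period; returns how many were removed.
    int evictExpired(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = storedAt.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > maxAgeMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    boolean contains(String key) { return storedAt.containsKey(key); }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a real implementation would more likely read a clock on a timer.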
Preferably, to guarantee correctness when the memory storage unit 1 is read and written, the device further comprises:
Lock control module 7, for setting a read lock and a write lock on the memory storage unit 1 and controlling read-write locking and unlocking of the memory storage unit 1.
The read lock and write lock exist to guarantee the correctness of read and write operations. The locking granularity can be chosen to suit the situation: locks can be placed on the data tables in the memory storage unit or on the individual column arrays. The locking principle of the lock control module is that only one loading thread can be active at any moment; until it finishes, all other loading threads and all reading threads wait. Any number of query operations may run concurrently, but a loading thread waits until the queries in progress have finished.
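This locking principle matches the behaviour of a standard read-write lock. The sketch below uses java.util.concurrent.locks.ReentrantReadWriteLock at column-array granularity; the patent does not name a lock implementation, and the class LockedColumnSketch is a hypothetical name.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the lock control module guarding one column array.
class LockedColumnSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int[] column = new int[0];

    // Loading takes the write lock: one loader at a time, and no readers meanwhile.
    void load(int[] newData) {
        lock.writeLock().lock();
        try {
            column = newData.clone();
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Queries take the read lock: any number may run concurrently.
    int countEquals(int wanted) {
        lock.readLock().lock();
        try {
            int count = 0;
            for (int value : column) {
                if (value == wanted) count++;
            }
            return count;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Locking per data table instead would mean holding one lock object per table and taking it around every access to any of its column arrays.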
Building on Embodiment 1, Embodiment 2 of the present invention, shown in Fig. 2, is provided to describe the working principle of the device in more detail.
Memory storage unit 1 further comprises:
Column data storage module 11, for creating column arrays that store the data of different columns; each column array is named after the field name of its corresponding column in the row storage structure, and the array index serves as the column data position identifier.
The memory storage unit may comprise multiple column data storage modules; each column data storage module holds the data of one column of the row storage structure, and the column data position identifier is the same as the row number in the corresponding row storage structure.
The column data position identifier is the array index of the position the column data occupies. Array indexes start at 0, and the row number in the corresponding row storage structure is derived from the index: if row numbers also start at 0, the row number equals the index; if row numbers start at 1, the row number is index + 1.
Column data is preferably stored in the column data storage module using simple types.
Even with column storage there are many options. For int data, for example, commonly used structures such as int[], Integer[] or List<Integer> could be chosen.
Initializing an int[] of 10 million elements takes only 15 ms, whereas an Integer[] takes 400 ms, a difference of more than 20 times. More important is the memory footprint: the int[] needs only 49 MB, while the Integer[] needs 228 MB. An ArrayList<Integer> is internally equivalent to an Object[], so its overhead and storage can only exceed those of Integer[]. The conclusion is self-evident: arrays of simple types are the more efficient storage structure. Common simple types include int, bool and double.
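The structural difference behind these figures can be seen in a short sketch. The class ColumnTypeSketch is a hypothetical name; the sketch makes no attempt to reproduce the timings or sizes quoted above, which will vary with the JVM and the machine.

```java
// Hypothetical sketch contrasting a primitive column array with a boxed one.
class ColumnTypeSketch {
    // A primitive array stores the values inline, 4 bytes per int.
    static int[] primitiveColumn(int size) {
        int[] column = new int[size];
        for (int i = 0; i < size; i++) column[i] = i;
        return column;
    }

    // A boxed array stores references; each element refers to an Integer object on the heap.
    static Integer[] boxedColumn(int size) {
        Integer[] column = new Integer[size];
        for (int i = 0; i < size; i++) column[i] = i;   // autoboxing on every element
        return column;
    }

    // Lower bound on the payload of an int[] of the given length, in bytes.
    static long minPayloadBytes(int size) {
        return 4L * size;
    }
}
```

For 10 million ints the payload lower bound is 40,000,000 bytes; the boxed form adds a reference per element plus an object header per distinct Integer, which is where the extra memory goes.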
Data table storage module 12, for creating data tables and storing the column arrays of the column data storage modules 11 that belong to the same row storage structure in the same data table.
The memory storage unit may comprise multiple data table storage modules, and each data table storage module may comprise multiple column data storage modules.
A version number is set for the data table held in each data table storage module 12, and the version number is updated when the data of the corresponding row storage structure changes.
When several row-stored tables need to be imported into memory, data tables are preferably used so that the columns of each row-stored table are clearly identified. Column arrays belong to data tables: one data table can comprise multiple column arrays, and one row-stored table corresponds to one data table. The data table contains the data of all the columns of that row-stored table, the data of each column being stored as one column array.
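The version number can be as simple as a counter that the loader increments on every reload, as in the sketch below. VersionedTableSketch is a hypothetical name, and a single age column stands in for the full set of column arrays.

```java
// Hypothetical sketch: a data table that bumps its version whenever it is reloaded.
class VersionedTableSketch {
    private int[] ageColumn = new int[0];
    private long version = 0;

    // Called by the loader when the underlying row storage structure has changed.
    void reload(int[] newAges) {
        ageColumn = newAges.clone();
        version++;                    // any result cached for an older version is now stale
    }

    long version() { return version; }

    int rowCount() { return ageColumn.length; }
}
```

A cache that records the version alongside each stored result can then detect stale entries with a single comparison.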
Data loading unit 2, for loading the target column data of the row storage structure into the column arrays in the memory storage unit 1.
Query engine 3 further comprises:
Rule definition and storage module 31, for defining and storing the parsing rules of the cascade query condition.
Defining the parsing rules of the cascade query condition means defining the correspondence between the operators used in cascade query conditions and general operators.
Defining this correspondence makes it more convenient for data consumers to write cascade query conditions. The correspondence is user-defined, and the names of the cascade operators can be changed to suit the situation.
The invention provides the example correspondence shown in Table 2.
Table 2: Correspondence between cascade operators and general operators
Using the correspondence recorded in Table 2, a cascade query condition can be written as follows:
eq('class', 2).bw('age', 18, 20).eq('sex', 0).like('address', '%Haidian%')
Cascaded query condition parsing module 32 is configured to receive the cascaded query condition entered by the data consumer and to parse it according to the parsing rules defined by rule definition storage module 31.
The cascaded query condition above is parsed according to the correspondence in Table 2.
The meaning of the cascaded query condition above is: records whose class is 2, whose age is from 18 (inclusive, >=18) up to 20 (exclusive, <20), whose sex is male (0 denotes male), and whose address contains "Haidian".
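One way to parse such a chained expression is sketched below. Since Table 2 is not reproduced here, the operator mapping in OPERATORS is an assumption inferred from the worked example above, and the use of Python's ast module is an illustrative choice rather than the parsing method of the patent.

```python
import ast

# Assumed mapping, inferred from the example: eq is "=", bw is ">= low and < high".
OPERATORS = {"eq": "=", "bw": ">= and <", "like": "LIKE"}


def parse_cascade(expression):
    """Split a chained condition into (operator, field, args) steps, left to right."""
    node = ast.parse(expression.strip(), mode="eval").body
    steps = []
    while True:
        if not isinstance(node, ast.Call):
            raise ValueError("not a cascaded query condition")
        args = [ast.literal_eval(arg) for arg in node.args]
        func = node.func
        if isinstance(func, ast.Attribute):    # ...previous(...).op(...)
            name, inner = func.attr, func.value
        elif isinstance(func, ast.Name):       # first call in the chain
            name, inner = func.id, None
        else:
            raise ValueError("unsupported expression")
        if name not in OPERATORS:
            raise ValueError(f"unknown cascade operator: {name}")
        steps.append((name, args[0], tuple(args[1:])))
        if inner is None:
            break
        node = inner
    steps.reverse()                            # the outermost call is the last step
    return steps


steps = parse_cascade(
    "eq('class', 2).bw('age', 18, 20).eq('sex', 0).like('address', '%Haidian%')"
)
```

The syntax tree nests the chain from the outside in, which is why the collected steps are reversed before being returned.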
Query module 33 queries the column arrays stored by column data storage module 11 step by step, according to the cascaded query condition parsed by cascaded query condition parsing module 32, and obtains the column data position identifiers that satisfy the cascaded query condition.
The cascaded query condition is executed from left to right. Each step shrinks the result set, so by the time the most expensive operation, like, is reached, the number of candidate records has dropped sharply and like is evaluated far fewer times, which improves overall efficiency considerably.
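The narrowing effect of left-to-right execution can be sketched as follows. The sample column data is hypothetical, the operator semantics are those assumed from the example above, and the like match is simplified to a substring test, so this is a sketch rather than a full pattern matcher.

```python
def matches(op, value, args):
    """Evaluate one cascade operator against one column value."""
    if op == "eq":
        return value == args[0]
    if op == "bw":
        return args[0] <= value < args[1]     # lower bound inclusive, upper exclusive
    if op == "like":
        return args[0].strip("%") in value    # simplification: '%x%' means "contains x"
    raise ValueError(f"unknown operator: {op}")


def run_cascade(columns, steps):
    """Apply each step only to the positions that survived the previous steps."""
    row_count = len(next(iter(columns.values())))
    candidates = list(range(row_count))
    evaluations = {}                          # operator -> number of values tested
    for op, field_name, args in steps:
        column = columns[field_name]
        evaluations[op] = evaluations.get(op, 0) + len(candidates)
        candidates = [i for i in candidates if matches(op, column[i], args)]
    return candidates, evaluations


# Hypothetical column arrays; index i in every array belongs to the same row.
columns = {
    "class": [1, 2, 2, 2],
    "age": [18, 19, 21, 18],
    "sex": [0, 0, 0, 1],
    "address": ["Haidian A", "Haidian District", "Chaoyang", "Haidian B"],
}
steps = [
    ("eq", "class", (2,)),
    ("bw", "age", (18, 20)),
    ("eq", "sex", (0,)),
    ("like", "address", ("%Haidian%",)),
]
positions, evaluations = run_cascade(columns, steps)
```

With this sample data the like operator is evaluated on a single candidate instead of on all four rows.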
Row reading unit 4 obtains from the row-oriented storage structure, according to the column data position identifiers obtained by query engine 3, the full row data whose row numbers correspond to those identifiers, and sends it to the data consumer.
Cache unit 5 further comprises:
Query data storage module 51 is configured to store the cascaded query condition, the version information of the data table targeted by that condition, and the column data position identifiers obtained by the query.
To avoid re-running a query with the same condition against the same target data, the query data storage module stores, for every query, the query condition, the target data, and the column data subscripts obtained. When a data consumer later queries with the same condition against the same target data, the stored subscripts that satisfy the condition are sent directly, which improves efficiency.
Consistency check module 52 is configured to check whether the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition, and whether the data table version information that the data consumer asks to query is consistent with the stored data table version information.
The consistency check module checks whether the query condition entered by the data consumer already exists in the cache and whether the target data has changed. Only after both checks pass can the stored column data subscripts that satisfy the query condition be sent directly. Consistency check module 52 therefore performs two checks and further comprises:
Cascaded query condition check module 521 is configured to check whether the cascaded query condition entered by the data consumer is consistent with the stored one, under the rule that two cascaded query conditions are judged consistent if their character strings are identical and the condition contains no non-idempotent function.
Whether the cascaded query condition entered by the data consumer is the same as the stored one is judged on two criteria: first, whether the character strings are identical; second, whether the condition contains a non-idempotent function. Identical strings are determined by comparing the strings character by character. A non-idempotent function is a function whose result differs between repeated executions. Although the character string is the same, the result of each execution differs, so a query condition containing a non-idempotent function has in effect changed and should be treated as not identical. For example, the now() function returns the current time, and its result differs on every execution. If a query expression contains now(), the query result cannot be reused and the query must be executed each time.
There is another way to handle non-idempotent functions: when storing a query condition, query data storage module 51 checks whether the condition contains a non-idempotent function and, if it does, stores neither the query condition nor the information associated with it.
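The store-time alternative just described can be sketched as follows. The set of non-idempotent function names, the lt operator in the usage lines, and the QueryStore class are assumptions made for illustration; the patent names only now() as an example.

```python
import re

# Assumed list of non-idempotent function names; only now() appears in the text.
NON_IDEMPOTENT = {"now", "rand", "random", "uuid"}
_CALL_NAME = re.compile(r"([A-Za-z_]\w*)\s*\(")


def has_non_idempotent(condition):
    """Return True if any function called in the condition is non-idempotent."""
    return any(name.lower() in NON_IDEMPOTENT
               for name in _CALL_NAME.findall(condition))


class QueryStore:
    """Stores query data, skipping conditions whose result cannot be reused."""

    def __init__(self):
        self.entries = {}   # condition -> (table version, position identifiers)

    def put(self, condition, table_version, positions):
        if has_non_idempotent(condition):
            return False    # nothing is stored for this condition
        self.entries[condition] = (table_version, list(positions))
        return True


store = QueryStore()
kept = store.put("eq('class', 2).bw('age', 18, 20)", "v1", [3])
skipped = store.put("lt('created', now())", "v1", [0, 1])   # 'lt' is hypothetical
```

With this approach a later lookup never finds an entry for such a condition, so no separate check for non-idempotent functions is needed at lookup time.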
Target data check module 522 is configured to check whether the data table version information that the data consumer asks to query is consistent with the stored data table version information.
Even when the query condition is the same, a change in the target data also changes the query result, so the target data must be checked for consistency as well. Because each data table stores version information that is updated whenever its data changes, simply comparing the data table version information completes the check.
Query result sending module 53 is configured to send the column data position identifiers that query data storage module 51 stores for this cascaded query condition to the row reading unit when consistency check module 52 judges the result to be consistent.
Period control unit 6 is configured to set a memory storage period for the memory storage unit and delete column data in the arrays that exceeds that period, and also to set a cache period for the cache unit and delete cached data in the cache unit that exceeds that period.
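Period-based deletion can be sketched as a store that records when each item was written and purges items older than the configured period. The class name, the use of seconds, and the injectable clock are illustrative assumptions; the patent does not specify how the period is measured.

```python
import time


class ExpiringStore:
    """Keeps items together with their store time and purges expired ones."""

    def __init__(self, period_seconds, clock=time.monotonic):
        self.period = period_seconds
        self.clock = clock          # injectable so the behaviour can be tested
        self.items = {}             # key -> (stored_at, value)

    def put(self, key, value):
        self.items[key] = (self.clock(), value)

    def purge(self):
        """Delete every item older than the period; return the deleted keys."""
        now = self.clock()
        expired = [key for key, (stored_at, _) in self.items.items()
                   if now - stored_at > self.period]
        for key in expired:
            del self.items[key]
        return expired


# Usage with a fake clock so that the example is deterministic.
ticks = [0.0]
cache = ExpiringStore(period_seconds=10, clock=lambda: ticks[0])
cache.put("eq('class', 2)", [1, 3])
ticks[0] = 5.0
cache.put("bw('age', 18, 20)", [3])
ticks[0] = 12.0
removed = cache.purge()
```

The same mechanism could serve both the memory storage period and the cache period, each with its own period value.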
Lock control module 7 is configured to set a read lock and a write lock on memory storage unit 1 and to control read/write locking and unlocking of memory storage unit 1.
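A read lock and a write lock of this kind can be sketched with a condition variable: many readers may hold the lock at once, while a writer needs exclusive access. The patent does not describe the locking policy, so the ReadWriteLock class below is only one possible design, and it does not protect writers from starvation.

```python
import threading


class ReadWriteLock:
    """Minimal reader/writer lock: shared reads, exclusive writes."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:                  # wait until no writer is active
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()           # a waiting writer may proceed

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()               # wake waiting readers and writers


lock = ReadWriteLock()
lock.acquire_read()       # a query reads the column arrays
lock.acquire_read()       # a second query may read at the same time
lock.release_read()
lock.release_read()
lock.acquire_write()      # loading or deleting column data needs exclusive access
lock.release_write()
```

Queries would take the read lock, while the data loading unit and the period control unit would take the write lock before modifying column arrays.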
To better illustrate the device of the present invention and its working principle, embodiment three is provided on the basis of embodiments one and two, using Table 1 of the present invention as an example, and is described in detail below, as shown in Fig. 3.
Step S101: Column data storage module 11 creates column arrays according to the row-oriented storage structure.
These comprise a name column array, an age column array, and a class column array. The data in each column array is stored using a simple data type, as shown in Table 3.
Table 3: Schematic of column array storage
Step S102: Data table storage module 12 creates a data table that holds the column arrays created in step S101; its version number is studentV1.0.
Step S103: The data consumer writes a cascaded query condition according to the rules defined by rule definition storage module 31.
The cascaded query condition written is "eq('class', 2).bw('age', 18, 20)".
Step S104: Cascaded query condition parsing module 32 parses the cascaded query condition according to the rules defined by rule definition storage module 31.
The parsed result is a query for data whose class is 2 and whose age is between 18 and 20.
Step S105: Query module 33 queries the class column array and the age column array stored in data table storage module 12 step by step, according to the query condition parsed by cascaded query condition parsing module 32.
First the class column array is queried, yielding the subscripts of the data equal to 2, which are 1,3,
The age column array is then queried on that basis, yielding the array subscripts whose age is between 18 and 20, which are 3 ...
Step S106: Row reading unit 4 reads the full row data of the corresponding rows from the row-oriented storage structure according to the column data subscripts obtained, and sends it to the data consumer.
Row reading unit 4 reads the row data whose row numbers are 3 ... and sends it to the data consumer.
Step S107: Query data storage module 51 stores the query condition of this query, the target data version information corresponding to the query condition, and the column data array subscripts.
The query condition stored is: "eq('class', 2).bw('age', 18, 20)";
The target data version information stored is: student1.0;
The column data subscripts stored are: 3 ...
Step S108: Consistency check module 52 receives a new cascaded query condition and checks it for consistency. If the check passes, the flow goes to step S109; if it does not, the flow goes to step S104.
Step S109: The column data subscripts corresponding to this cascaded condition are obtained from query data storage module 51, and the flow goes to step S106.
Embodiment four of the present invention discloses a columnar in-memory query method. As shown in Fig. 4, the method comprises:
Step S201: Create column arrays in memory for storing each column of data in the row-oriented storage structure.
Note that the column arrays created here are empty; their purpose is to hold the column data of the row-oriented storage structure.
Step S202: Load the column data in the row-oriented storage structure into the created column arrays.
Step S203: According to the cascaded query condition entered by the data consumer, query the column data loaded into the column arrays and obtain the column data position identifiers that satisfy the cascaded query condition.
Define and store the parsing rules for the cascaded query condition; receive the cascaded query condition entered by the data consumer and parse it according to the stored parsing rules; according to the parsed cascaded query condition, query the data stored in the column arrays step by step and obtain the column data position identifiers that satisfy the cascaded query condition.
Step S204: According to the column data position identifiers obtained, obtain from the row-oriented storage structure the full row data whose row numbers correspond to those identifiers, and send it to the data consumer.
Because column arrays are stored in the row order of each column's data in the row-oriented storage structure, the column data position identifier (that is, the array subscript of the column data) corresponds to the number of the row that the data occupies in the row-oriented storage structure. Note that array subscripts start from 0. If row numbers in the row-oriented storage structure also start from 0, the array subscript of the column data is used as its row number; if row numbers start from 1, the array subscript plus 1 is used as its row number.
Once the row number of the column data is obtained, the full row can be fetched from the row-oriented storage structure by row number and sent to the data consumer, which improves the data consumer's query efficiency.
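The mapping from array subscript to row number can be sketched as follows. The function name read_rows, the dictionary standing in for the row-oriented storage structure, and the sample rows are hypothetical.

```python
def read_rows(row_store, positions, first_row_number=0):
    """Fetch full rows for the given array subscripts.

    row_store maps row number -> full row; first_row_number is 0 or 1
    depending on how the row-oriented storage structure numbers its rows.
    """
    return [row_store[position + first_row_number] for position in positions]


# Hypothetical row-oriented store whose row numbers start from 1.
row_store = {
    1: ("A", 18, 1),
    2: ("B", 19, 2),
    3: ("C", 21, 2),
}
rows = read_rows(row_store, positions=[0, 2], first_row_number=1)
```

Subscripts 0 and 2 are read as row numbers 1 and 3 here; with a store numbered from 0, the subscripts would be used unchanged.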
To further improve query efficiency for the same query condition against the same target column array, to use and control memory resources sensibly, and to guarantee correct reads and writes of the column arrays, preferably the cascaded query condition used by a query, the column array it targets, and the column data position identifiers obtained are stored in a cache. When the cascaded query condition entered by the data consumer is identical to the stored cascaded query condition and the column array corresponding to that condition has not changed, the full row data whose row numbers correspond to the stored column data position identifiers is obtained according to those identifiers.
A memory storage period is set, and column data exceeding that period is deleted from the column arrays.
A cache period is set, and data exceeding that period is deleted from the cache.
A read lock and a write lock are set to control read/write locking and unlocking of the column arrays.
Embodiment five of the present invention is given below. As shown in Fig. 5, the method comprises the following steps:
Step S301: Create column arrays in memory for storing each column of data in the row-oriented storage structure.
Column arrays are created to store the data of the different columns. Each column array is named after the field name of the corresponding column in the row-oriented storage structure, and the column array subscript serves as the column data position identifier.
Simple types are preferred when storing column data in the column data storage module.
Storing column data in simple types reduces the memory consumed by data storage and also markedly increases computation speed.
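In Python, the idea of storing a column in a simple type can be illustrated with the standard array module, which packs values as raw C numbers instead of separate objects. This is an analogy rather than the patent's implementation, the sample ages are hypothetical, and the per-item sizes depend on the interpreter and platform.

```python
import sys
from array import array

ages_list = [18, 19, 21, 18]          # each element is a separate int object
ages_array = array("i", ages_list)    # "i" packs the values as signed C ints

# Per-item footprint: a boxed int object versus one packed C int.
boxed_item_bytes = sys.getsizeof(ages_list[0])
packed_item_bytes = ages_array.itemsize

# The packed array is indexed exactly like the list, so the array subscript
# can still serve as the column data position identifier.
second_age = ages_array[1]
```

On CPython the packed representation is several times smaller per value, which is consistent with the memory saving described above.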
Step S302: Create data tables, and store the column arrays that belong to the same row-oriented storage structure in the same data table.
Multiple data tables may be created. Each data table contains multiple column arrays, each column array holds the data of one column of the row-oriented storage structure, and the column data position identifier is the same as the row number in the corresponding row-oriented storage structure.
The version number information of each data table is stored; when the column data in a data table changes, its version number information is updated.
Step S303: Load the column data in the row-oriented storage structure into the column arrays in the created data table.
The column data of one row-oriented storage structure is loaded into one data table. That data table may contain multiple column arrays for loading each column of data in the row-oriented storage structure. Each column of data corresponds to one column array.
Step S304: According to the cascaded query condition entered by the data consumer, determine whether an identical cascaded query condition exists in the cache. If it does not exist, go to step S305; if it exists, go to step S306.
Within the cache period, the cache stores information on previous queries made by data consumers against target column data. If the condition of the current query has been used before, it is further determined whether the target column data is the same; if the condition of the current query has never appeared, the query is run with the new condition.
Step S305: According to the cascaded query condition entered by the data consumer, query the data loaded into the column arrays, obtain the column data position identifiers that satisfy the cascaded query condition, store the query data of this query in the cache, and go to step S308.
Because this is a new query, its query data is stored in the cache once the query completes. When an identical query appears later, the column data identifiers obtained can be taken out directly and the row data read, which improves efficiency.
The query data of this query comprises the cascaded query condition used, the version information of the data table targeted by the query, and the column data position identifiers obtained by the query.
Step S306: Determine whether the data table version information corresponding to this cascaded query condition in the cache has changed. If it has changed, go to step S305; if it has not changed, go to step S307.
If the data table version information has changed, the target column data may have been updated, reloaded, or otherwise changed, so the query result has changed and the earlier query result cannot be used. The flow therefore returns to step S305.
Step S307: Obtain from the cache the column data position identifiers corresponding to the cascaded query condition.
Step S308: According to the column data position identifiers obtained, obtain from the row-oriented storage structure the full row data whose row numbers correspond to those identifiers, and send it to the data consumer.
If the data table version information that the data consumer asks to query is consistent with the stored data table version information, the column array corresponding to the cascaded query condition is judged not to have changed.
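Steps S304 to S308 can be sketched as a single function. The dictionary-based cache and table, and the run_query and read_rows callables, are hypothetical stand-ins for the cache unit, the data table, the query engine, and the row reading unit; the separate check for non-idempotent functions is left out of this sketch.

```python
def query_with_cache(cache, table, condition, run_query, read_rows):
    """Serve a query from the cache when condition and table version both match."""
    entry = cache.get(condition)                                    # S304
    if entry is not None and entry["version"] == table["version"]:  # S306
        positions = entry["positions"]                              # S307
    else:
        positions = run_query(table, condition)                     # S305
        cache[condition] = {"version": table["version"],
                            "positions": positions}
    return read_rows(table, positions)                              # S308


# Hypothetical stand-ins that record how often the real query runs.
executed = []
table = {"version": 1, "rows": ["row0", "row1", "row2", "row3"]}


def run_query(tbl, condition):
    executed.append(condition)
    return [3]                       # pretend position 3 satisfies the condition


def read_rows(tbl, positions):
    return [tbl["rows"][p] for p in positions]


cache = {}
condition = "eq('class', 2).bw('age', 18, 20)"
first = query_with_cache(cache, table, condition, run_query, read_rows)
second = query_with_cache(cache, table, condition, run_query, read_rows)
table["version"] = 2                 # the source data changed
third = query_with_cache(cache, table, condition, run_query, read_rows)
```

The second call is served from the cache; the third runs the query again because the table version no longer matches the cached one.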
To explain more clearly how rule definition and parsing are performed for cascaded query conditions, and how it is judged whether query conditions are consistent, embodiment six of the present invention is provided, as shown in Fig. 6.
A single query often has multiple conditions, for example "students in class 2, aged 18 to 20, who are male". Query parsing has to execute the conditions one by one. When no field is indexed, this is straightforward for a program, which simply executes each condition in turn. However, the three conditions are clearly in an AND relationship, so once the first condition has been evaluated, the second condition should be applied to the result of the first rather than to all rows. Describing the cascaded query condition with a cascaded expression makes it possible to query condition by condition and further improves query efficiency. The way cascaded expressions are written is therefore defined here.
Step S401: Define and store the parsing rules for the cascaded query condition.
Defining the parsing rules means defining the correspondence between the operators used in a cascaded query condition and general-purpose operators, as shown in Table 2 above.
Defining this correspondence makes it convenient for data consumers to enter cascaded query conditions. The correspondence is user-defined, and the names of the cascade operators can be changed to suit actual needs.
Step S402: Receive the cascaded query condition entered by the data consumer, and parse it according to the stored parsing rules.
Cascaded query conditions are written on the principle that simple query conditions are placed first and complex query conditions are placed last.
The cascaded query condition is executed from left to right. Each step shrinks the result set, so by the time the most expensive operation is reached, the number of records has dropped sharply and like is evaluated far fewer times, which improves overall efficiency considerably, as shown in Fig. 7.
Following this approach, when entering a cascaded query condition the user can place inexpensive operations (such as equals) first and expensive operations (such as like) last, which yields a larger gain in overall efficiency.
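The text leaves this ordering to the person writing the condition. Because the conditions are combined with AND, the same ordering could also be applied automatically, as in the sketch below; the cost ranks in COST_RANK are assumed values chosen for illustration, not figures from the patent.

```python
# Assumed relative costs: equality is cheapest, pattern matching most expensive.
COST_RANK = {"eq": 0, "bw": 1, "like": 2}


def order_by_cost(steps):
    """Return the steps with cheap operators first.

    sorted() is stable, so steps with the same rank keep their written order.
    """
    return sorted(steps, key=lambda step: COST_RANK.get(step[0], len(COST_RANK)))


written = [
    ("like", "address", ("%Haidian%",)),
    ("eq", "class", (2,)),
    ("bw", "age", (18, 20)),
    ("eq", "sex", (0,)),
]
ordered = order_by_cost(written)
```

Reordering does not change which rows match, only how many values the expensive operators have to test.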
Step S403: According to the parsed cascaded query condition, determine whether the character string of the cascaded query condition is identical to the character string stored in the cache. If identical, go to step S404; if not identical, go to step S405.
Whether the strings are identical can be determined by comparing them character by character, which is not elaborated further here.
Step S404: Determine whether the cascaded query condition contains a non-idempotent function. If it does, go to step S405; if it does not, go to step S406.
When the character strings are identical, a non-idempotent function in the string means that the cascaded query condition described by the string has very likely changed.
For example, the non-idempotent function now() returns the current state, whereas the query result data held in the cache is the result of an earlier query and not the current result, so it cannot be used directly.
Step S405: Determine that the cascaded query condition entered by the data consumer is inconsistent with the cascaded query condition in the cache.
Step S406: Determine that the cascaded query condition entered by the data consumer is consistent with the cascaded query condition in the cache.
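The decision in steps S403 to S406 can be sketched as one function. The default list of non-idempotent function names contains only now, the single example named in the text, and the lt operator in the usage lines is hypothetical.

```python
import re


def conditions_consistent(new_condition, cached_condition,
                          non_idempotent=("now",)):
    """Return True only if the strings match and no non-idempotent call appears."""
    if new_condition != cached_condition:                      # S403
        return False                                           # S405
    for name in non_idempotent:                                # S404
        pattern = rf"\b{re.escape(name)}\s*\("
        if re.search(pattern, new_condition, re.IGNORECASE):
            return False                                       # S405
    return True                                                # S406


same = conditions_consistent("eq('class', 2)", "eq('class', 2)")
different = conditions_consistent("eq('class', 2)", "eq('class', 3)")
volatile = conditions_consistent("lt('created', now())",
                                 "lt('created', now())")       # 'lt' is hypothetical
```

The word boundary in the pattern keeps names that merely end in "now" from being treated as calls to now().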
For the detailed implementation of each step of this method, refer to the description of the device above; it is not repeated here.
Although the embodiments disclosed by the present invention are as described above, the content described is not intended to directly limit the scope of protection of the present invention. Any person skilled in the art to which the present invention belongs may, without departing from the spirit and scope disclosed by the present invention, make minor changes to the form and details of implementation. The scope of protection of the present invention shall still be as defined by the appended claims.

Claims (13)

1. A columnar in-memory query device, characterized in that the device comprises:
a memory storage unit, configured to create column arrays in memory for storing each column of data in a row-oriented storage structure;
a data loading unit, configured to load target column data in the row-oriented storage structure into the memory storage unit;
a query engine, configured to query, according to a cascaded query condition entered by a data consumer, the column data loaded into the memory storage unit by the data loading unit, obtain the column data position identifiers that satisfy the cascaded query condition, and send them to a row reading unit;
the row reading unit, configured to obtain from the row-oriented storage structure, according to the column data position identifiers obtained by the query engine, the full row data whose row numbers correspond to the column data position identifiers, and send it to the data consumer.
2. The device according to claim 1, characterized in that the device further comprises:
a cache unit, configured to store the query data of the query engine and, when the cascaded query condition entered by the data consumer is consistent with a stored cascaded query condition and the column array corresponding to that cascaded query condition has not changed, send the stored column data position identifiers that satisfy the cascaded query condition to the row reading unit;
a period control unit, configured to set a memory storage period for the memory storage unit and delete column data in the arrays that exceeds the memory storage period, and also to set a cache period for the cache unit and delete cached data in the cache unit that exceeds the cache period;
a lock control module, configured to set a read lock and a write lock on the memory storage unit and to control read/write locking and unlocking of the memory storage unit.
3. The device according to claim 2, characterized in that the memory storage unit further comprises:
a column data storage module, configured to create column arrays for storing the data of different columns, each column array being named after the field name of its corresponding column in the row-oriented storage structure, and the array subscript being the column data position identifier;
a data table storage module, configured to create data tables and store the column arrays held by the column data storage module that belong to the same row-oriented storage structure in the same data table;
wherein the memory storage unit may comprise multiple data table storage modules, each data table storage module may comprise multiple column data storage modules, each column data storage module holds the data of one column of the row-oriented storage structure, and the column data position identifier is the same as the row number in the corresponding row-oriented storage structure;
and simple types are preferred when storing the column data in the column data storage module.
4. The device according to claim 3, characterized in that:
a version number is set for the data table held in each data table storage module, and when the row-oriented storage structure data corresponding to the data table changes, the version number is updated.
5. The device according to claim 4, characterized in that the query engine further comprises:
a rule definition storage module, configured to define and store the parsing rules for the cascaded query condition;
a cascaded query condition parsing module, configured to receive the cascaded query condition entered by the data consumer and parse it according to the parsing rules defined by the rule definition storage module;
a query module, configured to query the column arrays stored by the column data storage module step by step, according to the cascaded query condition parsed by the cascaded query condition parsing module, and obtain the column data position identifiers that satisfy the cascaded query condition.
6. The device according to claim 5, characterized in that the cache unit further comprises:
a query data storage module, configured to store the cascaded query condition, the version information of the data table targeted by that cascaded query condition, and the column data position identifiers obtained by the query;
a consistency check module, configured to check whether the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition, and whether the data table version information that the data consumer asks to query is consistent with the stored data table version information;
a query result sending module, configured to send the column data position identifiers that the query data storage module stores for this cascaded query condition to the row reading unit when the check result of the consistency check module is consistent.
7. The device according to claim 6, characterized in that the consistency check module further comprises:
a cascaded query condition check module, configured to check whether the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition, under the rule that cascaded query conditions are judged consistent if their character strings are identical and the cascaded query condition contains no non-idempotent function;
a target data check module, configured to check whether the data table version information that the data consumer asks to query is consistent with the stored data table version information.
8. A columnar in-memory query method, characterized in that the method comprises:
creating column arrays in memory for storing each column of data in a row-oriented storage structure;
loading the column data in the row-oriented storage structure into the created column arrays;
querying, according to a cascaded query condition entered by a data consumer, the column data loaded into the column arrays, and obtaining the column data position identifiers that satisfy the cascaded query condition;
obtaining from the row-oriented storage structure, according to the column data position identifiers obtained, the full row data whose row numbers correspond to the column data position identifiers, and sending it to the data consumer.
9. The method according to claim 8, characterized in that the method further comprises:
storing in a cache the cascaded query condition used by a query, the column array it targets, and the column data position identifiers obtained, and, when the cascaded query condition entered by the data consumer is identical to the stored cascaded query condition and the column array corresponding to that cascaded query condition has not changed, obtaining, according to the stored column data position identifiers that satisfy the cascaded query condition, the full row data whose row numbers correspond to the column data position identifiers;
setting a memory storage period, and deleting column data in the column arrays that exceeds the memory storage period;
setting a cache period, and deleting data in the cache that exceeds the cache period;
setting a read lock and a write lock, and controlling read/write locking and unlocking of the column arrays.
10. The method according to claim 9, characterized in that creating column arrays in memory for storing each column of data in the row-oriented storage structure specifically comprises:
creating column arrays for storing the data of different columns, each column array being named after the field name of the corresponding column in the row-oriented storage structure, and the column array subscript being the column data position identifier;
creating data tables, and storing the column arrays that belong to the same row-oriented storage structure in the same data table;
wherein multiple data tables may be created, each data table comprises multiple column arrays, each column array holds the data of one column of the row-oriented storage structure, and the column data position identifier is the same as the row number in the corresponding row-oriented storage structure;
and simple types are preferred when storing column data in the column data storage module.
11. The method according to claim 10, characterized in that:
the version number information of each data table is stored, and when the column data in a data table changes, the version number information is updated.
12. The method according to claim 11, characterized in that querying, according to the cascaded query condition entered by the data consumer, the column data loaded into the column arrays and obtaining the column data position identifiers that satisfy the cascaded query condition specifically comprises:
defining and storing the parsing rules for the cascaded query condition;
receiving the cascaded query condition entered by the data consumer, and parsing it according to the stored parsing rules;
querying, according to the parsed cascaded query condition, the data stored in the column arrays step by step, and obtaining the column data position identifiers that satisfy the cascaded query condition.
13. The method according to claim 12, characterized in that judging that the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition and that the column array corresponding to that cascaded query condition has not changed specifically comprises:
checking whether the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition, under the rule that cascaded query conditions are judged consistent if their character strings are identical and the cascaded query condition contains no non-idempotent function;
if the data table version information that the data consumer asks to query is consistent with the stored data table version information, judging that the column array corresponding to the cascaded query condition has not changed.
CN201310744231.9A 2013-12-30 2013-12-30 A kind of column memory storage inquiry unit and column memory storage querying method Active CN104750727B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310744231.9A CN104750727B (en) 2013-12-30 2013-12-30 A kind of column memory storage inquiry unit and column memory storage querying method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310744231.9A CN104750727B (en) 2013-12-30 2013-12-30 Column type memory storage and query device and column type memory storage and query method

Publications (2)

Publication Number Publication Date
CN104750727A true CN104750727A (en) 2015-07-01
CN104750727B CN104750727B (en) 2019-03-26

Family

ID=53590426

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310744231.9A Active CN104750727B (en) 2013-12-30 2013-12-30 Column type memory storage and query device and column type memory storage and query method

Country Status (1)

Country Link
CN (1) CN104750727B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102136005B (en) * 2011-03-29 2013-07-17 北京航空航天大学 Data searching method and device
CN103246498A (en) * 2013-05-13 2013-08-14 浪潮集团山东通用软件有限公司 Memory storage structures supporting relational data parallel processing and achieving method thereof

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589969A (en) * 2015-12-23 2016-05-18 浙江大华技术股份有限公司 Data processing method and device
CN107092624A (en) * 2016-12-28 2017-08-25 北京小度信息科技有限公司 Date storage method, apparatus and system
CN107436767A (en) * 2017-07-31 2017-12-05 杭州安恒信息技术有限公司 The optimization method that idempotent operates in a kind of asynchronous framework
CN110069487A (en) * 2017-09-28 2019-07-30 北京国双科技有限公司 A kind of data processing method, apparatus and system
CN109101516A (en) * 2017-11-30 2018-12-28 新华三大数据技术有限公司 A kind of data query method and server
US11269881B2 (en) 2017-11-30 2022-03-08 New H3C Big Data Technologies Co., Ltd. Data query
CN109165378A (en) * 2018-08-15 2019-01-08 北京天安智慧信息技术有限公司 Sophisticated functions Report Customization method and system
CN113168413A (en) * 2018-10-09 2021-07-23 塔谱软件公司 Correlated incremental loading of multiple data sets for interactive data preparation applications
CN113168413B (en) * 2018-10-09 2022-07-01 塔谱软件公司 Correlated incremental loading of multiple data sets for interactive data preparation applications
CN109445945A (en) * 2018-10-29 2019-03-08 努比亚技术有限公司 Memory allocation method, mobile terminal, server and the storage medium of application program
CN109445945B (en) * 2018-10-29 2023-09-19 努比亚技术有限公司 Memory allocation method of application program, mobile terminal, server and storage medium
US11386089B2 (en) 2020-01-13 2022-07-12 The Toronto-Dominion Bank Scan optimization of column oriented storage

Also Published As

Publication number Publication date
CN104750727B (en) 2019-03-26

Similar Documents

Publication Publication Date Title
CN104750727A (en) Column type memory storage and query device and column type memory storage and query method
CN107391653B (en) Distributed NewSQL database system and picture data storage method
CN102799634B (en) Data storage method and device
CN109299100B (en) Managing internal memory data and the method and system for safeguarding data in memory
CN103309958B (en) The star-like Connection inquiring optimization method of OLAP under GPU and CPU mixed architecture
CN104765731B (en) Database inquiry optimization method and apparatus
CN104361113B (en) A kind of OLAP query optimization method under internal memory flash memory mixing memory module
US20150261793A1 (en) Method for implementing database
CN107577436A (en) A kind of date storage method and device
CN110196847A (en) Data processing method and device, storage medium and electronic device
CN106326475A (en) High-efficiency static hash table implement method and system
CN103678519A (en) Mixed storage system and mixed storage method for supporting Hive DML (data manipulation language) enhancement
CN105608214B (en) The method that fast search is carried out to the number-plate number of deploying to ensure effective monitoring and control of illegal activities
CN103186622A (en) Updating method of index information in full text retrieval system and device thereof
US8756208B2 (en) Encoded data processing
CN103019691A (en) Transformation method for extract, transform and load (ETL) operation relation graph and implementation system thereof
CN109815240A (en) For managing method, apparatus, equipment and the storage medium of index
CN103336828B (en) Real-time data base is read and wiring method
CN101441654B (en) Database retrieving method and system
CN109697068A (en) One kind dividing logic SQL statement interpretation method and device under the table mode of library
CN111126461A (en) Intelligent auditing method based on machine learning model explanation
CN104462080A (en) Index structure creating method and system with group statistics for search results
CN101963993A (en) Method for fast searching database sheet table record
CN110119410A (en) Processing method and processing device, computer equipment and the storage medium of reference book data
CN111581212A (en) Data storage method, system, server and storage medium of relational database

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20180705

Address after: 110179 room 519, 2-1 Gao Ge Road, Hunnan New District, Shenyang, Liaoning.

Applicant after: Yiyang Computer Technology Co.,Ltd. Shenyang

Address before: No. 1 building, hi tech Development Zone, Songshan Road, Nangang District, Harbin, Heilongjiang

Applicant before: BOCO INTER-TELECOM Co.,Ltd.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230704

Address after: 150001 Building 1, High tech Development Zone, Songshan Road, Nangang District, Harbin City, Heilongjiang Province

Patentee after: BOCO INTER-TELECOM Co.,Ltd.

Address before: 110179 room 519, 2-1 Gao Ge Road, Hunnan New District, Shenyang, Liaoning.

Patentee before: Yiyang Computer Technology Co.,Ltd. Shenyang