CN104750727B - Columnar in-memory storage and query device and columnar in-memory storage and query method - Google Patents

Publication number
CN104750727B
Authority
CN
China
Prior art keywords
data
cascade
querying condition
column
storage
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310744231.9A
Other languages
Chinese (zh)
Other versions
CN104750727A (en
Inventor
杜海亮
陈晓峰
贲福才
户顺义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bright Oceans Inter Telecom Co Ltd
Original Assignee
YIYANG COMPUTER TECHNOLOGY Co Ltd SHENYANG
Application filed by YIYANG COMPUTER TECHNOLOGY Co Ltd SHENYANG
Priority to CN201310744231.9A
Publication of CN104750727A
Application granted
Publication of CN104750727B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a columnar in-memory storage and query device, characterized in that the device comprises: a memory storage unit, configured to create column arrays in memory that store each column of data of a row-oriented storage structure; a data loading unit, configured to load target column data from the row-oriented storage structure into the memory storage unit; a query engine, configured to query, according to a cascaded query condition input by a data consumer, the column data that the data loading unit has loaded into the memory storage unit, obtain the column data position identifiers that satisfy the cascaded query condition, and send them to a row reading unit; and the row reading unit, configured to obtain from the row-oriented storage structure, according to the column data position identifiers obtained by the query engine, all row data whose row numbers correspond to those identifiers, and send it to the data consumer. The invention enables simple storage in memory together with efficient querying. The invention also provides a columnar in-memory storage and query method.

Description

Columnar in-memory storage and query device and columnar in-memory storage and query method
Technical field
The present invention relates to the field of data storage, and in particular to a columnar in-memory storage and query device and method.
Background art
With the rapid development of the Internet, the data volume of all kinds of services has increased sharply. The way data is stored directly affects the efficiency of data queries, and query efficiency in turn directly affects the processing efficiency of those services. Storing data in a well-designed way and guaranteeing efficient reads and writes is therefore the basis for improving service processing efficiency.
The choice of data structure for in-memory storage matters greatly for both memory footprint and computational efficiency. Because memory is a limited resource, data should be stored in the way that occupies the least of it. The first question is how small "small" can be: an int occupies 4 bytes in memory, so N ints should occupy close to 4*N bytes at minimum, and memory consumption at that level can be considered small. Going below that requires compression or a special encoding, and compressed or encoded data must be decompressed and decoded during computation, which inevitably affects efficiency. Except in special cases, an in-memory table should therefore store data in its original form to avoid unnecessary computational overhead.
Among current storage methods, the most common is row storage. The storage units of every row share the same data structure, and a row must support multiple columns of multiple data types. Most applications insert and access data in units of rows, so the overwhelming majority of data is stored by row. Queries, however, usually operate on columns. A query must traverse every row to obtain its row object pointer and then fetch the value of the relevant column through that pointer; if a generic structure is used, the value may have to be looked up by column index or column name. After the value is fetched, each column cell is compared to decide whether the row is selected, and if so the row pointer is placed in the result queue. Performing this lookup, with a type conversion or value comparison on every row, occupies a large amount of memory resources and lowers query efficiency.
In the prior art, in-memory data is mostly stored as row objects or in complex data structures, which gives low computational efficiency and occupies a large amount of memory. Another approach is to use an in-memory database, but this also consumes a large amount of memory, depends on indexes for performance, exposes a fairly complex interface, and parses queries with SQL expressions and a parser. These characteristics mean that query efficiency and loading efficiency are not high, query conditions are not flexible, and the API is relatively complex, so such approaches are not suitable for application scenarios that require efficient in-memory queries.
A storage and query approach that supports high-speed, efficient querying in memory is therefore needed.
Summary of the invention
The present invention provides a columnar in-memory storage and query device, characterized in that the device comprises:
a memory storage unit, configured to create column arrays in memory that store each column of data of a row-oriented storage structure;
a data loading unit, configured to load target column data from the row-oriented storage structure into the memory storage unit;
a query engine, configured to query, according to a cascaded query condition input by a data consumer, the column data that the data loading unit has loaded into the memory storage unit, obtain the column data position identifiers that satisfy the cascaded query condition, and send them to a row reading unit;
a row reading unit, configured to obtain from the row-oriented storage structure, according to the column data position identifiers obtained by the query engine, all row data whose row numbers correspond to those identifiers, and send it to the data consumer.
Preferably, the device further comprises:
a cache unit, configured to store the query data of the query engine and, when the cascaded query condition input by the data consumer is consistent with a stored cascaded query condition and the column arrays targeted by that condition have not changed, send the stored column data position identifiers that satisfy the cascaded query condition to the row reading unit;
a period control unit, configured to set a memory storage period for the memory storage unit and delete column data in the arrays that exceeds the memory storage period, and further configured to set a cache period for the cache unit and delete cached data in the cache unit that exceeds the cache period;
a lock control unit, configured to set a read lock and a write lock on the memory storage unit and control read and write locking and unlocking of the memory storage unit.
In detail, the memory storage unit further comprises:
a column data storage module, configured to create column arrays that store the data of different columns, each column array being named after the field name of the corresponding column in the row-oriented storage structure, with the array subscript serving as the column data position identifier;
a data table storage module, configured to create data tables and to store the column arrays of the column data storage modules that belong to the same row-oriented storage structure in the same data table;
The memory storage unit may include multiple data table storage modules, and each data table storage module may include multiple column data storage modules. Each column data storage module holds the data of one column of the row-oriented storage structure, and the column data position identifier is the same as the row number in the corresponding row-oriented storage structure;
Column data is preferably stored in the column data storage module using simple types.
Preferably, each data table storage module sets and stores a version number for its data table, and the version number is updated when the data of the row-oriented storage structure corresponding to the data table changes.
Specifically, the query engine further comprises:
a rule definition and storage module, configured to define and store the parsing rules of the cascaded query condition;
a cascaded query condition parsing module, configured to receive the cascaded query condition input by the data consumer and parse it according to the parsing rules defined by the rule definition and storage module;
a query module, configured to query the column arrays stored by the column data storage module level by level, according to the cascaded query condition parsed by the cascaded query condition parsing module, and obtain the column data position identifiers that satisfy the cascaded query condition.
Specifically, the cache unit further comprises:
a query data storage module, configured to store the cascaded query condition, the version information of the data table targeted by the cascaded query condition, and the column data position identifiers obtained by the query;
a consistency check module, configured to check whether the cascaded query condition input by the data consumer is consistent with the stored cascaded query condition, and whether the data table version information of the query requested by the data consumer is consistent with the stored data table version information;
a query result sending module, configured to send to the row reading unit, when the consistency check module finds both to be consistent, the column data position identifiers stored by the query data storage module for that cascaded query condition.
More specifically, the consistency check module further comprises:
a cascaded query condition check module, configured to check whether the cascaded query condition input by the data consumer is consistent with the stored cascaded query condition, under the rule that two cascaded query conditions are judged consistent if their strings are identical and the condition contains no non-idempotent function;
a target data check module, configured to check whether the data table version information of the query requested by the data consumer is consistent with the stored data table version information.
The invention also discloses a columnar in-memory storage and query method, the method comprising:
creating column arrays in memory that store each column of data of a row-oriented storage structure;
loading the column data of the row-oriented storage structure into the created column arrays;
querying the column data loaded into the column arrays according to a cascaded query condition input by a data consumer, and obtaining the column data position identifiers that satisfy the cascaded query condition;
obtaining from the row-oriented storage structure, according to the obtained column data position identifiers, all row data whose row numbers correspond to those identifiers, and sending it to the data consumer.
Preferably, the method further comprises:
storing in a cache the cascaded query condition used by a query, the column arrays it targeted, and the column data position identifiers obtained; when the cascaded query condition input by the data consumer is identical to a stored cascaded query condition and the column arrays targeted by that condition have not changed, obtaining, according to the stored column data position identifiers that satisfy the cascaded query condition, all row data whose row numbers correspond to those identifiers;
setting a memory storage period and deleting column data in the column arrays that exceeds the memory storage period;
setting a cache period and deleting data in the cache that exceeds the cache period;
setting a read lock and a write lock, and controlling read and write locking and unlocking of the column arrays.
In detail, creating column arrays in memory that store each column of data of the row-oriented storage structure specifically comprises:
creating column arrays that store the data of different columns, each column array being named after the field name of the corresponding column in the row-oriented storage structure, with the array subscript serving as the column data position identifier;
creating data tables and storing the column arrays that belong to the same row-oriented storage structure in the same data table;
Multiple data tables can be created. Each data table includes multiple column arrays, each column array holds the data of one column of the row-oriented storage structure, and the column data position identifier is the same as the row number in the corresponding row-oriented storage structure;
Column data is preferably stored in the column data storage module using simple types.
Preferably, version number information is stored for each data table, and the version number information is updated when the column data in the data table changes.
In detail, querying the column data loaded into the column arrays according to the cascaded query condition input by the data consumer, and obtaining the column data position identifiers that satisfy the cascaded query condition, specifically comprises:
defining and storing the parsing rules of the cascaded query condition;
receiving the cascaded query condition input by the data consumer and parsing it according to the stored parsing rules;
querying the data stored in the column arrays level by level according to the parsed cascaded query condition, and obtaining the column data position identifiers that satisfy the cascaded query condition.
In more detail, judging that the cascaded query condition input by the data consumer is consistent with the stored cascaded query condition, and that the column arrays targeted by the condition have not changed, specifically comprises:
checking whether the cascaded query condition input by the data consumer is consistent with the stored cascaded query condition, under the rule that two cascaded query conditions are judged consistent if their strings are identical and the condition contains no non-idempotent function;
if the data table version information of the query requested by the data consumer is consistent with the stored data table version information, determining that the column arrays targeted by the cascaded query condition have not changed.
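The two checks above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class `ConsistencyCheck`, its method names, and the list of non-idempotent function names (`now(`, `random(`) are hypothetical, since the text does not say which functions count as non-idempotent or how they are detected.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the consistency rule: identical condition strings,
// no non-idempotent function in the condition, and an unchanged table version.
class ConsistencyCheck {
    // Assumed examples of functions whose result may differ between calls.
    static final List<String> NON_IDEMPOTENT = Arrays.asList("now(", "random(");

    // Conditions are consistent only if the strings match exactly and
    // the condition contains none of the non-idempotent function names.
    static boolean sameCondition(String incoming, String stored) {
        if (!incoming.equals(stored)) {
            return false;
        }
        for (String f : NON_IDEMPOTENT) {
            if (incoming.contains(f)) {
                return false;
            }
        }
        return true;
    }

    // A stored result may be reused only if the condition is consistent
    // and the data table version has not changed.
    static boolean canReuse(String incoming, String stored,
                            long requestedVersion, long storedVersion) {
        return sameCondition(incoming, stored) && requestedVersion == storedVersion;
    }
}
```

A plain substring test is the simplest possible detection; a real parser would identify function calls from the parsed condition instead.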
The present invention loads the data of a row-oriented storage structure into memory, stores it as column arrays, queries for the data that satisfies the conditions, and obtains the column data position identifier of that data (that is, the array subscript at which the data is located). The identifier yields the row number that satisfies the query condition, and from it the full row is obtained. The advantage is that a query no longer has to traverse every row, fetch its information, and then compare the target column. Instead, all the data of a column in the row-oriented storage structure is placed in an array and compared in one pass, the row numbers that satisfy the condition are obtained, and the rows with those numbers are then fetched. This saves much of the time spent traversing rows and comparing values row by row, and so improves query efficiency. The invention also lets data consumers freely compose cascaded query conditions, so that the query scope narrows level by level, which further improves query efficiency. Preferably, the invention also provides a cache that saves the state and results of earlier queries. When the next cascaded query condition arrives, if neither the query condition nor the target column arrays have changed, row information is read directly according to the stored column array position identifiers and sent to the data consumer, which improves query efficiency further. In summary, the invention provides a columnar in-memory storage and query approach that supports high-speed, efficient querying.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the columnar in-memory storage and query device of Embodiment 1 of the present invention;
Fig. 2 is a structural schematic diagram of the device of Embodiment 2 of the present invention;
Fig. 3 is a schematic workflow diagram of the columnar in-memory storage and query device of Embodiment 3 of the present invention, illustrated with an example;
Fig. 4 is a flowchart of the columnar in-memory storage and query method of Embodiment 4 of the present invention;
Fig. 5 is a flowchart of the method of Embodiment 5 of the present invention;
Fig. 6 is a flowchart of how Embodiment 6 of the present invention parses a cascaded query condition and performs the consistency check;
Fig. 7 is a schematic diagram of the cascaded query levels in Embodiment 6 of the present invention.
Detailed description of the embodiments
The embodiments of the present invention are described in detail below with reference to the drawings and examples, so that the process by which the invention applies technical means to solve the technical problem and achieve the technical effect can be fully understood and implemented.
As shown in Fig. 1, Embodiment 1 of the present invention discloses a columnar in-memory storage and query device comprising the following structures:
Memory storage unit 1, configured to create column arrays in memory that store each column of data of a row-oriented storage structure.
Because memory space is limited, each column of data is stored in a simple array. An array has only subscripts and values, and because its structure is simple it occupies little memory.
Data loading unit 2, configured to load target column data from the row-oriented storage structure into the memory storage unit 1.
The data loading unit obtains the target column data to be queried from the data source and loads it. The loading period and the amount loaded are determined by the data source. The data source can announce to data consumers which data it has loaded, and data consumers can query the loaded data according to that announcement. The data loading unit loads data according to load commands from the data source.
A typical row-oriented storage structure is shown in Table 1 below: a student table with 10,000 rows.
Table 1: Example of a row-oriented storage structure
Row number    Name (name)    Age (age)    Class (class)
0             Zhang          17           1
1             Lee            22           2
2             Mr. Wang       21           3
3             Zhao           19           2
...
9999          Qian           21           1
For Table 1 above, finding all rows with age=29 using row storage proceeds as follows:
(1) Traverse every row to obtain the row object pointer of each row.
(2) Fetch the value of the age column through the row object pointer; if a generic structure is used, the value may have to be looked up by the index or name of the age column.
(3) Because columns are generally stored in a generic structure, each age cell must undergo an (int) type conversion or a comparison lookup. The converted int is compared with 29 for equality to decide whether the row is selected, and if so the row pointer is placed in the result queue.
(4) The search completes and all result row pointers are obtained.
With column arrays storing each column of the row-oriented storage structure, name in Table 1 becomes one column array, age becomes one column array, and class becomes one column array.
The data loading unit loads the name, age, and class columns into the memory storage unit as the name column array, the age column array, and the class column array.
When the data loading unit loads data from multiple row-oriented storage structures into the memory storage unit, the concept of a data table (DataTable) can be introduced. A set of column arrays created by the memory storage unit is called a DataTable. One DataTable contains multiple column arrays, and each column array holds the data of one column of the row-oriented storage structure. The data loading unit loads each column of one row-oriented storage structure into the corresponding column array of one DataTable.
During loading, the data table can be named after the row-oriented storage structure and each column array after its column.
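The column-array layout described above can be sketched in Java as follows. This is a minimal illustration rather than the patented implementation: the class `StudentTable`, its fields, and the `load` method are hypothetical names, and the sample rows are the first four rows of Table 1.

```java
import java.util.Arrays;

// Hypothetical sketch: one data table holding one array per column.
// The array subscript doubles as the column data position identifier.
class StudentTable {
    final String[] name;
    final int[] age;      // primitive arrays keep per-element overhead low
    final int[] clazz;    // "class" is a Java keyword, hence the spelling

    StudentTable(int rows) {
        name = new String[rows];
        age = new int[rows];
        clazz = new int[rows];
    }

    // Copy row-oriented records of the form (name, age, class) into column arrays.
    static StudentTable load(Object[][] rows) {
        StudentTable t = new StudentTable(rows.length);
        for (int i = 0; i < rows.length; i++) {
            t.name[i] = (String) rows[i][0];
            t.age[i] = (Integer) rows[i][1];
            t.clazz[i] = (Integer) rows[i][2];
        }
        return t;
    }

    public static void main(String[] args) {
        Object[][] rows = {
            {"Zhang", 17, 1}, {"Lee", 22, 2}, {"Mr. Wang", 21, 3}, {"Zhao", 19, 2}
        };
        StudentTable t = load(rows);
        // The whole age column is now contiguous and can be scanned directly.
        System.out.println(Arrays.toString(t.age));
    }
}
```

A generic DataTable would more likely keep its column arrays in a map keyed by column name; fixed fields are used here only to keep the sketch short.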
Query engine 3, configured to query, according to the cascaded query condition input by the data consumer, the column data that the data loading unit 2 has loaded into the memory storage unit 1, obtain the column data position identifiers that satisfy the cascaded query condition, and send them to the row reading unit.
A cascaded query condition is written by the data consumer according to its own query needs and consists of individual query conditions chained together. The query engine can evaluate these conditions directly against the data in the column arrays. This avoids the cumbersome procedure required with a row-oriented storage structure: traversing the rows, fetching the target column from each row, evaluating the condition, and recording the row number when the condition is met.
After the query engine has evaluated the query conditions against all the column data in the column arrays, it records the column data position identifiers that satisfy the conditions and sends them to the row reading unit.
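A single query condition evaluated directly on a column array might look like the sketch below. The class `ColumnScan` and the method `positionsEqual` are hypothetical names chosen for illustration; the text does not prescribe how the scan is written.

```java
import java.util.Arrays;

// Hypothetical sketch: scan one column array and collect the subscripts
// (column data position identifiers) whose value equals the target.
class ColumnScan {
    static int[] positionsEqual(int[] column, int value) {
        int[] hits = new int[column.length];
        int n = 0;
        for (int i = 0; i < column.length; i++) {
            if (column[i] == value) {
                hits[n++] = i;   // record the position, not the value
            }
        }
        return Arrays.copyOf(hits, n);   // trim to the number of matches
    }
}
```

No row objects are touched and no type conversion is needed, because the column is already an int[].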
Row reading unit 4, configured to obtain from the row-oriented storage structure, according to the column data position identifiers obtained by the query engine 3, all row data whose row numbers correspond to those identifiers, and send it to the data consumer.
Once the column data position identifiers are obtained, the corresponding row numbers in the row-oriented storage structure are determined from them. All field information of each such row is then fetched from the row-oriented storage structure by row number and sent to the data consumer.
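The row reading step can be sketched as follows, assuming the row-oriented storage structure is available as a list of rows whose numbering starts at 0. The class `RowReader` and the `List<Object[]>` representation of the row store are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: fetch the full rows whose row numbers match
// the position identifiers produced by the query engine.
class RowReader {
    // Assumes row numbers start at 0, like array subscripts.
    static List<Object[]> readRows(List<Object[]> rowStore, int[] positions) {
        List<Object[]> result = new ArrayList<>(positions.length);
        for (int p : positions) {
            result.add(rowStore.get(p));
        }
        return result;
    }
}
```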
Preferably, to improve query efficiency for identical cascaded query conditions on the same column data, the device further includes:
Cache unit 5, configured to store the query data of the query engine 3 and, when the cascaded query condition input by the data consumer is consistent with a stored cascaded query condition and the column arrays targeted by that condition have not changed, send the stored column data position identifiers that satisfy the cascaded query condition to the row reading unit.
In practice, a data consumer may run a query with the same condition against the same column arrays more than once. To improve query efficiency and avoid repeated work, the cache unit 5 saves the data of each query. The query data comprises three items: first, the query condition of the query; second, the attribute information of the column arrays the query targeted; third, the column data position identifiers obtained by applying the query condition to the column arrays. When the first two items have not changed, the column data position identifiers are taken directly from the cache and sent to the row reading unit, which fetches the corresponding rows and sends them to the data consumer. Checking the consistency of these two items is far more efficient than recomputing over every value in the column arrays, so the cache unit helps improve the working efficiency of the device.
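A minimal sketch of such a cache follows. It assumes the "attribute information" of the column arrays is reduced to a single data table version number, as the description does elsewhere for data tables. The class `QueryCache` and its methods are hypothetical, and the check for non-idempotent functions is omitted for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: cache query results keyed by the condition string,
// and reuse them only while the data table version is unchanged.
class QueryCache {
    private static final class Entry {
        final long tableVersion;
        final int[] positions;

        Entry(long tableVersion, int[] positions) {
            this.tableVersion = tableVersion;
            this.positions = positions;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    void put(String condition, long tableVersion, int[] positions) {
        entries.put(condition, new Entry(tableVersion, positions));
    }

    // Returns the cached positions, or null when the condition is unknown
    // or the table version differs from the one stored with the result.
    int[] get(String condition, long currentVersion) {
        Entry e = entries.get(condition);
        if (e == null || e.tableVersion != currentVersion) {
            return null;
        }
        return e.positions;
    }
}
```

A lookup costs one hash probe and one integer comparison, compared with a full scan of every targeted column array on a miss.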
Preferably, to use and control memory resources effectively and keep the system running normally, the device further includes:
Period control unit 6, configured to set a memory storage period for the memory storage unit and delete column data in the arrays that exceeds the memory storage period, and further configured to set a cache period for the cache unit and delete cached data in the cache unit that exceeds the cache period.
Because data is continually loaded into the memory storage unit and memory is limited, a very large data volume will eventually consume a great deal of resources and reduce efficiency. Some data is also rarely used once a certain time has passed since it was queried, and can be cleared from memory and from the cache. A memory storage period and a cache period are therefore set, and data that exceeds its period is deleted to keep the device running normally.
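Period-based deletion can be sketched as below for the cache case. The class `ExpiringCache`, the millisecond timestamps, and the explicit `nowMillis` parameter are all assumptions made for illustration; the text does not say how periods are measured or when deletion is triggered.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: remember when each entry was stored and delete
// entries that have been held longer than the configured period.
class ExpiringCache {
    private final Map<String, Long> storedAtMillis = new HashMap<>();
    private final long periodMillis;

    ExpiringCache(long periodMillis) {
        this.periodMillis = periodMillis;
    }

    void put(String key, long nowMillis) {
        storedAtMillis.put(key, nowMillis);
    }

    boolean contains(String key) {
        return storedAtMillis.containsKey(key);
    }

    // Delete entries older than the period; return how many were removed.
    int evict(long nowMillis) {
        int before = storedAtMillis.size();
        storedAtMillis.values().removeIf(t -> nowMillis - t > periodMillis);
        return before - storedAtMillis.size();
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a running system would typically call `evict(System.currentTimeMillis())` from a scheduled task.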
Preferably, to guarantee correctness when the memory storage unit 1 is read and written, the device further includes:
Lock control unit 7, configured to set a read lock and a write lock on the memory storage unit 1 and control read and write locking and unlocking of the memory storage unit 1.
The read lock and the write lock ensure the correctness of read and write operations. The locking granularity can be chosen according to the actual situation: for example, the data tables in the memory storage unit can be locked, or the column arrays in the memory storage unit can be locked. The locking principle of the lock control unit is that at any moment only one load thread can be active, and until that load thread finishes, all other load threads and read threads wait. Any number of query operations can execute concurrently, but while a query is unfinished, load threads wait.
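The locking principle described above matches the behavior of a standard read-write lock. The sketch below uses `java.util.concurrent.locks.ReentrantReadWriteLock` from the Java standard library as a stand-in; the patent does not name a specific lock implementation, and the class `LockedColumn` is hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: guard one column array with a read-write lock so that
// loading is exclusive while queries may run concurrently with each other.
class LockedColumn {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int[] ages = new int[0];

    // Loading takes the write lock: only one loader is active at a time,
    // and queries wait until it finishes.
    void load(int[] fresh) {
        lock.writeLock().lock();
        try {
            ages = fresh.clone();
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Queries take the read lock: any number may hold it at once,
    // and a loader waits until all of them release it.
    int countEqual(int value) {
        lock.readLock().lock();
        try {
            int n = 0;
            for (int a : ages) {
                if (a == value) {
                    n++;
                }
            }
            return n;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Locking per column array, as here, allows loads of unrelated columns to proceed in parallel; locking per data table is simpler when all columns of a table are always loaded together.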
On the basis of Embodiment 1, and to further describe the working principle of the device, Embodiment 2 of the present invention is provided, as shown in Fig. 2.
Memory storage unit 1 further comprises:
Column data storage module 11, configured to create column arrays that store the data of different columns, each column array being named after the field name of the corresponding column in the row-oriented storage structure, with the array subscript serving as the column data position identifier.
The memory storage unit may include multiple column data storage modules. Each column data storage module holds the data of one column of the row-oriented storage structure, and the column data position identifier is the same as the row number in the corresponding row-oriented storage structure.
The column data position identifier is the array subscript of the position of the column data. Array subscripts start at 0, and the row number in the corresponding row-oriented storage structure is derived from the subscript: if row numbers also start at 0, the row number equals the subscript; if row numbers start at 1, the row number is subscript + 1.
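The mapping from subscript to row number stated above reduces to one addition. The helper below is a hypothetical illustration; `firstRowNumber` stands for whichever base (0 or 1) the row-oriented storage structure uses.

```java
// Hypothetical sketch: convert an array subscript into a row number.
class RowNumbering {
    // firstRowNumber is 0 or 1, depending on how the row store numbers its rows.
    static int toRowNumber(int subscript, int firstRowNumber) {
        return subscript + firstRowNumber;
    }
}
```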
Column data is preferably stored in the column data storage module using simple types.
There are many options for storing data by column. For int data, for example, int[], Integer[], List<Integer>, and other common storage structures can be used.
Initializing an int[] of 10,000,000 elements takes only 15 ms, whereas an Integer[] takes 400 ms, a difference of more than 20 times. More importantly, in terms of memory footprint, the int[] needs only 49 MB while the Integer[] needs 228 MB. If ArrayList<Integer> is used, its internal structure is equivalent to an Object[], so its overhead and storage can only be greater than those of Integer[]. The conclusion is self-evident: storing arrays of simple types is the more effective storage structure. Common simple types include int, bool, and double.
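The three storage options compared above are shown side by side in the sketch below. It does not reproduce the measurements quoted in the text, which depend on the JVM and hardware; it only computes the raw payload of an int[] (4 bytes per element, excluding the array header). The class `StorageChoices` and the method `intArrayPayloadBytes` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the same values held in a primitive array,
// a boxed array, and an ArrayList.
class StorageChoices {
    // Raw payload of an int[] with n elements, ignoring the array header.
    static long intArrayPayloadBytes(int n) {
        return (long) Integer.BYTES * n;
    }

    public static void main(String[] args) {
        int n = 1_000;
        int[] primitive = new int[n];             // values stored inline
        Integer[] boxed = new Integer[n];         // references to Integer objects
        List<Integer> list = new ArrayList<>(n);  // backed by an Object[]
        for (int i = 0; i < n; i++) {
            primitive[i] = i;
            boxed[i] = i;    // autoboxing allocates or reuses an Integer
            list.add(i);
        }
        System.out.println(intArrayPayloadBytes(10_000_000)); // prints 40000000
    }
}
```

The boxed variants store a reference per element plus a separate object per value, which is why their footprint is several times that of the primitive array.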
Data table storage module 12, configured to create data tables and to store the column arrays of the column data storage modules 11 that belong to the same row-oriented storage structure in the same data table.
The memory storage unit may include multiple data table storage modules, and each data table storage module may include multiple column data storage modules.
Each data table storage module 12 sets and stores a version number for its data table, and the version number is updated when the data of the row-oriented storage structure corresponding to the data table changes.
When multiple row-oriented tables need to be imported into memory, data tables are preferably used so that the columns of each row-oriented table are represented clearly. Column arrays belong to data tables: one data table may contain multiple column arrays, and one row-oriented table corresponds to one data table. The data table contains the data of all columns of the row-oriented table, with each column stored as one column array.
Data loading unit 2, configured to load target column data from the row-oriented storage structure into the column arrays in the memory storage unit 1.
Query engine 3 further comprises:
Rule definition and storage module 31, configured to define and store the parsing rules of the cascaded query condition.
The parsing rules for cascaded query conditions define the correspondence between the operators used in cascaded query conditions and general operators.
Defining this correspondence for cascaded operators makes it convenient for data consumers to enter cascaded query conditions. The correspondence is user-defined, and the names of the cascaded operators can be changed according to the actual situation.
The present invention provides an example correspondence, shown in Table 2.
Table 2: Correspondence between cascaded operators and general operators
Using the correspondence recorded in Table 2, a cascaded query condition can be written, for example:
eq('class', 2).bw('age', 18, 20).eq('sex', 0).like('address', '%Haidian%')
Cascaded query condition parsing module 32, configured to receive the cascaded query condition entered by the data consumer and to parse it according to the parsing rules defined by rule definition and storage module 31.
The cascaded query condition above is parsed according to the correspondence in Table 2.
The condition above means: records in which the class is 2, the age is at least 18 and less than 20, the gender is male (0 denotes male), and the address contains Haidian.
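As a non-limiting illustration, a chainable condition and its translation into generic operators might look like the following Python sketch. The body of Table 2 is not reproduced in this text, so the mapping `GENERIC` is an assumption inferred from the stated meaning of the example (eq as equality, bw as a half-open range, like as pattern matching); the names `Cascade` and `parse` are likewise hypothetical.

```python
class Cascade:
    """Chainable builder that records each condition in the order written."""

    def __init__(self):
        self.steps = []  # list of (operator, field, arguments)

    def eq(self, field, value):
        self.steps.append(("eq", field, (value,)))
        return self

    def bw(self, field, low, high):
        self.steps.append(("bw", field, (low, high)))
        return self

    def like(self, field, pattern):
        self.steps.append(("like", field, (pattern,)))
        return self


# Assumed mapping from cascade operators to generic operators.
GENERIC = {
    "eq": "{f} = {0!r}",
    "bw": "{f} >= {0!r} AND {f} < {1!r}",
    "like": "{f} LIKE {0!r}",
}


def parse(cascade):
    """Translate each recorded step into its generic-operator form."""
    return [GENERIC[op].format(*args, f=field)
            for op, field, args in cascade.steps]


cond = (Cascade().eq("class", 2).bw("age", 18, 20)
        .eq("sex", 0).like("address", "%Haidian%"))
print(parse(cond))
```

Because every builder method returns the builder itself, the conditions keep the left-to-right order in which the data consumer wrote them.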
Query module 33, configured to query, step by step, the column arrays stored by column data storage module 11 according to the cascaded query condition produced by cascaded query condition parsing module 32, and to obtain the column data position identifiers that satisfy the cascaded query condition.
The cascaded query condition is executed from left to right, and the result set shrinks with each step. By the time the most time-consuming like operation is reached, the number of records has already dropped sharply, so like is executed far fewer times and overall efficiency improves considerably.
Row reading unit 4, configured to use the column data position identifiers obtained by query engine 3 to fetch, from the row-based storage structure, the complete data of every row whose row number matches a column data position identifier, and to send that data to the data consumer.
Cache unit 5 further comprises:
Query data storage module 51, configured to store the cascaded query condition, the version information of the data table targeted by that condition, and the column data position identifiers obtained by the query.
To avoid repeating a query with the same condition against the same target data, the query data storage module stores, for every query, the query condition, the target data, and the column data indices obtained. When a data consumer later queries the same target data with the same condition, the stored indices that satisfy the condition are sent directly, which improves efficiency.
Consistency check module 52, configured to check whether the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition, and whether the version information of the data table the data consumer asks to query is consistent with the stored data table version information.
The consistency check module checks whether the query condition entered by the data consumer already exists in the cache and whether the target data has changed. Once both checks pass, the stored column data indices that satisfy the query condition can be sent directly. Consistency check module 52 therefore performs two checks and further comprises:
Cascaded query condition check module 521, configured to check whether the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition, under the rule that two cascaded query conditions are consistent if their character strings are identical and the condition contains no non-idempotent function.
Whether the cascaded query condition entered by the data consumer is the same as the stored one is decided by two criteria: first, whether the character strings are identical; second, whether the condition contains a non-idempotent function. String identity is established by comparing the strings character by character. A non-idempotent function is one that returns different results when executed repeatedly. Even when the strings are identical, such a function yields different results on each execution, so a query condition containing one has in effect changed and must be treated as not identical. For example, the now() function returns the current time, and repeated executions return different results. If a query expression contains now(), its query result cannot be reused and the query must be executed every time.
There is another way to handle non-idempotent functions: when query data storage module 51 stores a query condition, it checks whether the condition contains a non-idempotent function and, if so, stores neither the condition nor the information associated with it.
Target data check module 522, configured to check whether the version information of the data table the data consumer asks to query is consistent with the stored data table version information.
Even when the query condition is consistent, a change in the target data changes the query result, so the target data must also be checked for consistency. Because the data table stores version information and that information is updated whenever data in the table changes, the consistency check only requires comparing data table version information.
Query result sending module 53, configured to send the column data position identifiers that query data storage module 51 stores for the cascaded query condition to the row reading unit when the result of consistency check module 52 is consistent.
Period control unit 6, configured to set an in-memory storage period for the memory storage unit and delete column data in the arrays that exceeds that period, and also to set a cache period for the cache unit and delete cached data in the cache unit that exceeds that period.
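As a non-limiting illustration, deleting data that has outlived a configured period can be sketched as follows. The class `ExpiringStore`, the injectable `clock` parameter, and the 60-second period are assumptions made for this example; the original text does not specify how the period is measured or enforced.

```python
import time


class ExpiringStore:
    """Keeps entries with a timestamp and drops those older than the period."""

    def __init__(self, period_seconds, clock=time.monotonic):
        self.period = period_seconds
        self.clock = clock     # injectable so the sketch can be exercised
        self.entries = {}      # key -> (stored_at, value)

    def put(self, key, value):
        self.entries[key] = (self.clock(), value)

    def purge(self):
        """Delete every entry whose age exceeds the configured period."""
        now = self.clock()
        expired = [k for k, (t, _) in self.entries.items()
                   if now - t > self.period]
        for key in expired:
            del self.entries[key]
        return expired


ticks = [0.0]                                  # fake clock for the example
store = ExpiringStore(period_seconds=60, clock=lambda: ticks[0])
store.put("old", [3])
ticks[0] = 50.0
store.put("new", [1, 3])
ticks[0] = 100.0
removed = store.purge()                        # "old" is now 100 s old
```

The same structure could hold either cached query results or column arrays, each with its own period.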
Lock control unit 7, configured to set a read lock and a write lock on memory storage unit 1 and to control locking and unlocking for reads and writes of memory storage unit 1.
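As a non-limiting illustration, a read lock and a write lock over a column array can be sketched with Python's standard `threading.Condition`. The class `ReadWriteLock` is an assumption made for this example: it allows many concurrent readers or a single writer and gives writers no priority, which the original text neither requires nor excludes.

```python
import threading


class ReadWriteLock:
    """Many concurrent readers or one writer (simple sketch)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()


lock = ReadWriteLock()
ages = [17, 21, 19]            # hypothetical column array

lock.acquire_write()           # loading or refreshing a column array
try:
    ages.append(18)
finally:
    lock.release_write()

lock.acquire_read()            # querying a column array
try:
    matches = [i for i, age in enumerate(ages) if 18 <= age < 20]
finally:
    lock.release_read()
```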
To better illustrate the device of the present invention and its working principle, embodiment three is provided on the basis of embodiments one and two, using Table 1 of the present invention as the example, and is described in detail below, as shown in Fig. 3.
Step S101: Column data storage module 11 creates column arrays according to the row-based storage structure.
These include a name column array, an age column array, and a class column array. Each column array stores its data using a simple data type, as shown in Table 3.
Table 3: Schematic of column array storage
Step S102: Data table storage module 12 creates a data table that holds the column arrays created in step S101; its version number is studentV1.0.
Step S103: The data consumer writes a cascaded query condition according to the rules defined by rule definition and storage module 31.
The cascaded query condition written is "eq('class', 2).bw('age', 18, 20)".
Step S104: Cascaded query condition parsing module 32 parses the cascaded query condition according to the rules defined by rule definition and storage module 31.
The parsed result is: query the data whose class is 2 and whose age is between 18 and 20.
Step S105: Query module 33 queries, step by step, the class column array and the age column array stored in data table storage module 12, according to the query condition parsed by cascaded query condition parsing module 32.
The class column array is queried first, yielding the indices of the entries equal to 2: 1, 3, ...;
The age column array is then queried on that basis, yielding the array indices of the entries between 18 and 20: 3, ....
Step S106: Row reading unit 4 uses the column data indices obtained to read all the data of the corresponding rows from the row-based storage structure and sends it to the data consumer.
Row reading unit 4 reads the row data of row numbers 3, ... and sends it to the data consumer.
Step S107: Query data storage module 51 stores the query condition of this query, the version information of the target data for that condition, and the column data array indices.
The stored query condition is: "eq('class', 2).bw('age', 18, 20)";
The stored target data version information is: studentV1.0;
The stored column data indices are: 3, ....
Step S108: Consistency check module 52 receives a new cascaded query condition and performs the consistency check on it. If the check passes, go to step S109; if it fails, go to step S104.
Step S109: The column data indices corresponding to the cascaded condition are obtained from query data storage module 51, and the flow proceeds to step S106.
Embodiment four of the present invention discloses a columnar in-memory storage query method. As shown in Fig. 4, the method comprises:
Step S201: Create column arrays in memory to store each column of data in the row-based storage structure.
Note that the column arrays created here are empty; their purpose is to store the data of each column of the row-based storage structure.
Step S202: Load the column data of the row-based storage structure into the created column arrays.
Step S203: According to the cascaded query condition entered by the data consumer, query the column data loaded into the column arrays and obtain the column data position identifiers that satisfy the cascaded query condition.
Define and store the parsing rules for cascaded query conditions; receive the cascaded query condition entered by the data consumer and parse it according to the stored parsing rules; then, according to the parsed cascaded query condition, query the data stored in the column arrays step by step and obtain the column data position identifiers that satisfy the condition.
Step S204: Use the obtained column data position identifiers to fetch, from the row-based storage structure, the complete data of every row whose row number matches a column data position identifier, and send it to the data consumer.
Because the column arrays are filled in the row order of each column's data in the row-based storage structure, a column data position identifier (that is, the array index of the column data) corresponds to the row number of the row in which that data sits in the row-based storage structure. Note that array indices start at 0. If the row numbers of the row-based storage structure also start at 0, the array index of the column data is its row number; if the row numbers start at 1, the row number is the array index plus 1.
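As a non-limiting illustration, the conversion from array index to row number, and the reading of complete rows, can be sketched as follows. The function names `row_number` and `read_rows` and the sample rows are assumptions made for this example.

```python
def row_number(array_index, first_row_number=0):
    """Convert a column array index into the row number of the source table."""
    if first_row_number not in (0, 1):
        raise ValueError("row numbering is assumed to start at 0 or 1")
    return array_index + first_row_number


def read_rows(rows, positions):
    """Fetch the complete rows (held here in a 0-based list) for the indices."""
    return [rows[i] for i in positions]


# Hypothetical rows of the form (name, age, class).
rows = [("Ann", 17, 1), ("Bob", 21, 2), ("Cai", 19, 2)]
print(row_number(2), row_number(2, first_row_number=1), read_rows(rows, [2]))
```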
Once the row number of the column data is known, the complete row can be fetched from the row-based storage structure by row number and sent to the data consumer, which improves query efficiency for the data consumer.
To further improve efficiency when the same query condition is applied to the same target column arrays, to use and control memory resources sensibly, and to guarantee correct reads and writes of the column arrays, the following measures are preferred. The cascaded query condition used by a query, the column arrays it targets, and the column data position identifiers obtained can be stored in a cache. When the cascaded query condition entered by a data consumer is the same as a stored one and the column arrays corresponding to that condition have not changed, the complete data of every row whose row number matches the stored column data position identifiers is obtained directly from those stored identifiers.
Set an in-memory storage period, and delete column data in the column arrays that exceeds that period.
Set a cache period, and delete cached data that exceeds that period.
Set a read lock and a write lock, and control locking and unlocking for reads and writes of the column arrays.
Embodiment five of the present invention is given below. As shown in Fig. 5, the method comprises the following steps:
Step S301: Create column arrays in memory to store each column of data in the row-based storage structure.
Column arrays are created to store the column data of the different columns. Each column array is named after the field name of its column in the row-based storage structure, and the index into the column array serves as the column data position identifier.
Column data is preferably stored in the column data storage module using simple types.
Storing column data as simple types reduces the memory occupied by the data and also markedly improves computation speed.
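As a non-limiting illustration of simple-type storage, Python's standard `array` module holds values in a packed, fixed-width form. The type code "B" (unsigned 8-bit integer) and the sample ages are assumptions made for this example, and the sketch shows only the memory side of the argument, not the speed claim.

```python
from array import array

ages_list = [17, 21, 19, 18] * 1000      # hypothetical age column
ages_array = array("B", ages_list)       # "B": unsigned 8-bit integers

# The typed array stores one byte per value instead of one object reference.
bytes_used = ages_array.itemsize * len(ages_array)

# The array index still serves as the column data position identifier.
positions = [i for i, age in enumerate(ages_array) if 18 <= age < 20]
```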
Step S302: Create data tables, and store the column arrays that belong to the same row-based storage structure in the same data table.
Multiple data tables can be created. Each data table contains multiple column arrays, and each column array holds the data of one column of the row-based storage structure. The column data position identifier is the same as the row number in the corresponding row-based storage structure.
Version number information is stored for each data table and is updated when the column data in that data table changes.
Step S303: Load the column data of the row-based storage structure into the column arrays of the created data table.
The column data of one row-based storage structure is loaded into one data table. The data table may contain multiple column arrays for loading each column of the row-based storage structure, with one column array per column.
Step S304: According to the cascaded query condition entered by the data consumer, determine whether the same cascaded query condition exists in the cache. If it does not exist, go to step S305; if it exists, go to step S306.
The cache holds information about the queries that data consumers have run against target column data within the cache period. If the query condition of the current query has been used before, a further check determines whether the target column data is the same; if the query condition has never been seen, the query is executed with the new condition.
Step S305: According to the cascaded query condition entered by the data consumer, query the data loaded into the column arrays, obtain the column data position identifiers that satisfy the condition, store the query data of this query in the cache, and go to step S308.
Because this is a new query, its query data is stored in the cache once the query completes. When an identical query arrives later, the column data identifiers obtained by this query can be taken directly from the cache and used to read the row data, which improves efficiency.
The query data of a query comprises the cascaded query condition used, the version information of the data table targeted, and the column data position identifiers obtained.
Step S306: Determine whether the data table version information stored in the cache for the cascaded query condition has changed. If it has changed, go to step S305; if it has not changed, go to step S307.
A change in the data table version information means that the target column data may have been updated, reloaded, or otherwise changed. The query target has therefore changed and the earlier query result cannot be used, so the flow returns to step S305.
Step S307: Obtain from the cache the column data position identifiers corresponding to the cascaded query condition.
Step S308: Use the obtained column data position identifiers to fetch, from the row-based storage structure, the complete data of every row whose row number matches a column data position identifier, and send it to the data consumer.
If the version information of the data table the data consumer asks to query is consistent with the stored data table version information, the column arrays corresponding to the cascaded query condition are judged not to have changed.
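As a non-limiting illustration, steps S304 to S308 can be sketched as a single lookup function. The names `cache`, `runs`, `execute`, and `query`, the dictionary layout of `table`, and the sample rows are assumptions made for this example; `execute` is only a stand-in for step S305 and handles just one hard-coded condition.

```python
cache = {}   # condition text -> (table version, positions)
runs = []    # records which conditions were actually executed


def execute(condition, table):
    """Stand-in for step S305; a real engine would evaluate the condition."""
    runs.append(condition)
    return [i for i, c in enumerate(table["columns"]["class"]) if c == 2]


def query(condition, table, rows):
    entry = cache.get(condition)                              # S304
    if entry is not None and entry[0] == table["version"]:    # S306
        positions = entry[1]                                  # S307
    else:
        positions = execute(condition, table)                 # S305
        cache[condition] = (table["version"], positions)
    return [rows[i] for i in positions]                       # S308


rows = [("r0", 1), ("r1", 2), ("r2", 2)]                      # hypothetical rows
table = {"version": "V1.0", "columns": {"class": [1, 2, 2]}}
first = query("eq('class', 2)", table, rows)
second = query("eq('class', 2)", table, rows)   # served from the cache
table["version"] = "V1.1"                       # source data changed
third = query("eq('class', 2)", table, rows)    # executed again
```

Only the first and third calls reach `execute`; the second is answered from the cached position identifiers because neither the condition text nor the table version has changed.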
To explain more clearly how rules for cascaded query conditions are defined and parsed, and how the consistency of query conditions is decided, embodiment six of the present invention is provided, as shown in Fig. 6.
A single query often contains several conditions, for example "find the students who are in class 2, aged between 18 and 20, and male". Query parsing has to execute the conditions one by one. When no field is indexed, this is straightforward for a program: it simply executes each condition in turn. These three conditions are clearly joined by AND, so once the first condition has been executed, the second should be applied only to the result of the first rather than to all rows. Describing the query condition as a cascaded expression makes it possible to query each condition step by step and further improves query efficiency. A cascaded way of writing expressions is therefore adopted here.
Step S401: Define and store the parsing rules for cascaded query conditions.
Defining the parsing rules for cascaded query conditions means defining the correspondence between the operators used in a cascaded query condition and generic operators, as shown in Table 2 above.
Defining this correspondence makes it convenient for data consumers to enter cascaded query conditions. The correspondence is user-defined, and the names of the cascade operators can be changed to suit the actual situation.
Step S402: Receive the cascaded query condition entered by the data consumer and parse it according to the stored parsing rules.
Cascaded query conditions are written on the principle that simple conditions go first and complex conditions go last.
A cascaded query condition is executed from left to right, and the result set shrinks with each step. By the time the most time-consuming operation is reached, the number of records has already dropped sharply, so like is executed far fewer times and overall efficiency improves considerably, as shown in Fig. 7 below.
Following this idea, when a user enters a cascaded query condition, inexpensive operations (such as equal) can be placed first and expensive operations (such as like) last, which improves overall efficiency even further.
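As a non-limiting illustration of why ordering matters, the following sketch counts how many values each test examines under the two possible orders. The function `count_evaluations`, the synthetic columns, and the use of a substring test in place of like are assumptions made for this example.

```python
def count_evaluations(column_tests, n_rows):
    """Run tests in order; return survivors and how many values each test saw."""
    candidates = list(range(n_rows))
    calls = []
    for name, column, test in column_tests:
        calls.append((name, len(candidates)))
        candidates = [i for i in candidates if test(column[i])]
    return candidates, calls


# Synthetic data: 1000 rows, ten classes, alternating addresses.
classes = [i % 10 for i in range(1000)]
addresses = ["Haidian" if i % 2 == 0 else "Chaoyang" for i in range(1000)]

cheap = ("eq", classes, lambda v: v == 2)
costly = ("like", addresses, lambda v: "Haidian" in v)

result_a, calls_a = count_evaluations([cheap, costly], 1000)
result_b, calls_b = count_evaluations([costly, cheap], 1000)
```

Both orders return the same rows, but with the equality test first the expensive test examines 100 values, whereas with the expensive test first it examines all 1000.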
Step S403: For the parsed cascaded query condition, determine whether its character string is identical to a character string stored in the cache. If identical, go to step S404; if not, go to step S405.
Whether two character strings are identical can be determined simply by comparing them character by character, so no further detail is given.
Step S404: Determine whether the cascaded query condition contains a non-idempotent function. If it does, go to step S405; if it does not, go to step S406.
Even when the character strings are identical, a non-idempotent function in the string means that the cascaded query condition described by the string may have changed.
For example, the non-idempotent function now() returns the current state. The query result held in the cache was obtained by an earlier query and is not the current result, so it cannot be used directly.
Step S405: Determine that the cascaded query condition entered by the data consumer is inconsistent with the cascaded query condition in the cache.
Step S406: Determine that the cascaded query condition entered by the data consumer is consistent with the cascaded query condition in the cache.
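As a non-limiting illustration, steps S403 to S406 can be sketched as one decision function. The registry `NON_IDEMPOTENT`, the regular expression used to find function names, and the operator `lt` in the usage lines are assumptions made for this example; only now() is named in the description. The detection is deliberately conservative: it also flags a function name that merely appears inside a quoted value.

```python
import re

NON_IDEMPOTENT = {"now"}   # assumed registry; now() is the description's example


def is_consistent(new_condition, cached_condition):
    """Steps S403 to S406: same text and no non-idempotent function."""
    if new_condition != cached_condition:                        # S403
        return False                                             # S405
    called = re.findall(r"([A-Za-z_]\w*)\s*\(", new_condition)
    if any(name.lower() in NON_IDEMPOTENT for name in called):   # S404
        return False                                             # S405
    return True                                                  # S406


same = is_consistent("eq('class', 2).bw('age', 18, 20)",
                     "eq('class', 2).bw('age', 18, 20)")
differs = is_consistent("eq('class', 2)", "eq('class', 3)")
uses_now = is_consistent("lt('created', now())", "lt('created', now())")
```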
For the detailed implementation of each step of this method, refer to the description of the device above; it is not repeated here.
Although embodiments are disclosed above, their content is not intended to directly limit the scope of protection of the present invention. Any person skilled in the art to which the invention belongs may make minor changes to the form and details of implementation without departing from the spirit and scope disclosed herein. The scope of protection of the present invention shall still be as defined by the appended claims.

Claims (13)

1. A columnar in-memory storage query device, characterized in that the device comprises:
a memory storage unit, configured to create column arrays in memory to store each column of data in a row-based storage structure;
a data loading unit, configured to load target column data of the row-based storage structure into the memory storage unit;
a query engine, configured to query, according to a cascaded query condition entered by a data consumer, the column data loaded into the memory storage unit by the data loading unit, obtain the column data position identifiers that satisfy the cascaded query condition, and send them to a row reading unit;
the row reading unit, configured to use the column data position identifiers obtained by the query engine to fetch, from the row-based storage structure, the complete data of every row whose row number matches a column data position identifier, and to send that data to the data consumer;
the device further comprising:
a cache unit, configured to store the query data of the query engine and, when the cascaded query condition entered by the data consumer is consistent with a stored cascaded query condition and the column arrays corresponding to that condition have not changed, to send the stored column data position identifiers that satisfy the cascaded query condition to the row reading unit.
2. The device according to claim 1, characterized in that the device further comprises:
a period control unit, configured to set an in-memory storage period for the memory storage unit and delete column data in the arrays that exceeds the in-memory storage period, and further configured to set a cache period for the cache unit and delete cached data in the cache unit that exceeds the cache period;
a lock control unit, configured to set a read lock and a write lock on the memory storage unit and to control locking and unlocking for reads and writes of the memory storage unit.
3. The device according to claim 2, characterized in that the memory storage unit further comprises:
a column data storage module, configured to create column arrays that store the column data of different columns, each column array being named after the field name of the corresponding column in the row-based storage structure, the array index being the column data position identifier;
a data table storage module, configured to create data tables and to store the column arrays of the column data storage modules that belong to the same row-based storage structure in the same data table;
the memory storage unit may comprise multiple data table storage modules, each data table storage module may comprise multiple column data storage modules, and each column data storage module holds the data of one column of the row-based storage structure, the column data position identifier being the same as the row number in the corresponding row-based storage structure;
the column data is preferably stored in the column data storage module using simple types.
4. The device according to claim 3, characterized in that:
each data table storage module assigns a version number to the data table it stores, and updates the version number when the data in the row-based storage structure corresponding to that data table changes.
5. The device according to claim 4, characterized in that the query engine further comprises:
a rule definition and storage module, configured to define and store the parsing rules for the cascaded query condition;
a cascaded query condition parsing module, configured to receive the cascaded query condition entered by the data consumer and to parse it according to the parsing rules defined by the rule definition and storage module;
a query module, configured to query, step by step, the column arrays stored by the column data storage module according to the cascaded query condition produced by the cascaded query condition parsing module, and to obtain the column data position identifiers that satisfy the cascaded query condition.
6. The device according to claim 5, characterized in that the cache unit further comprises:
a query data storage module, configured to store the cascaded query condition, the version information of the data table targeted by the cascaded query condition, and the column data position identifiers obtained by the query;
a consistency check module, configured to check whether the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition, and whether the version information of the data table the data consumer asks to query is consistent with the stored data table version information;
a query result sending module, configured to send the column data position identifiers that the query data storage module stores for the cascaded query condition to the row reading unit when the result of the consistency check module is consistent.
7. The device according to claim 6, characterized in that the consistency check module further comprises:
a cascaded query condition check module, configured to check whether the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition, under the rule that cascaded query conditions are consistent if their character strings are identical and the cascaded query condition contains no non-idempotent function;
a target data check module, configured to check whether the version information of the data table the data consumer asks to query is consistent with the stored data table version information.
8. A columnar in-memory storage query method, characterized in that the method comprises:
creating column arrays in memory to store each column of data in a row-based storage structure;
loading the column data of the row-based storage structure into the created column arrays;
querying, according to a cascaded query condition entered by a data consumer, the column data loaded into the column arrays, and obtaining the column data position identifiers that satisfy the cascaded query condition;
using the obtained column data position identifiers to fetch, from the row-based storage structure, the complete data of every row whose row number matches a column data position identifier, and sending that data to the data consumer;
the method further comprising:
storing in a cache the cascaded query condition used by a query, the column arrays it targets, and the column data position identifiers obtained; and, when the cascaded query condition entered by the data consumer is the same as a stored cascaded query condition and the column arrays corresponding to that condition have not changed, obtaining, from the stored column data position identifiers that satisfy the cascaded query condition, the complete data of every row whose row number matches those identifiers.
9. The method according to claim 8, characterized in that the method further comprises:
setting an in-memory storage period, and deleting column data in the column arrays that exceeds the in-memory storage period;
setting a cache period, and deleting cached data that exceeds the cache period;
setting a read lock and a write lock, and controlling locking and unlocking for reads and writes of the column arrays.
10. The method according to claim 9, characterized in that creating column arrays in memory to store each column of data in the row-based storage structure specifically comprises:
creating column arrays that store the column data of different columns, each column array being named after the field name of the corresponding column in the row-based storage structure, the index into the column array being the column data position identifier;
creating data tables, and storing the column arrays that belong to the same row-based storage structure in the same data table;
multiple data tables can be created, each data table comprises multiple column arrays, and each column array holds the column data of one column of the row-based storage structure, the column data position identifier being the same as the row number in the corresponding row-based storage structure;
the column data is preferably stored in the column data storage module using simple types.
11. The method according to claim 10, characterized in that:
version number information is stored for each data table, and the version number information is updated when the column data in that data table changes.
12. The method according to claim 11, characterized in that querying, according to the cascaded query condition entered by the data consumer, the column data loaded into the column arrays and obtaining the column data position identifiers that satisfy the cascaded query condition specifically comprises:
defining and storing the parsing rules for the cascaded query condition;
receiving the cascaded query condition entered by the data consumer and parsing it according to the stored parsing rules;
querying, step by step, the data stored in the column arrays according to the parsed cascaded query condition, and obtaining the column data position identifiers that satisfy the cascaded query condition.
13. The method according to claim 12, characterized in that determining that the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition and that the column arrays corresponding to the cascaded query condition have not changed specifically comprises:
checking whether the cascaded query condition entered by the data consumer is consistent with the stored cascaded query condition, under the rule that cascaded query conditions are consistent if their character strings are identical and the cascaded query condition contains no non-idempotent function;
if the version information of the data table the data consumer asks to query is consistent with the stored data table version information, judging that the column arrays corresponding to the cascaded query condition have not changed.
CN201310744231.9A 2013-12-30 2013-12-30 A kind of column memory storage inquiry unit and column memory storage querying method Active CN104750727B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310744231.9A CN104750727B (en) 2013-12-30 2013-12-30 A kind of column memory storage inquiry unit and column memory storage querying method

Publications (2)

Publication Number Publication Date
CN104750727A (en) 2015-07-01
CN104750727B (en) 2019-03-26

Family

ID=53590426

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310744231.9A Active CN104750727B (en) 2013-12-30 2013-12-30 A kind of column memory storage inquiry unit and column memory storage querying method

Country Status (1)

Country Link
CN (1) CN104750727B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589969A (en) * 2015-12-23 2016-05-18 浙江大华技术股份有限公司 Data processing method and device
CN107092624B (en) * 2016-12-28 2022-08-30 北京星选科技有限公司 Data storage method, device and system
CN107436767A (en) * 2017-07-31 2017-12-05 杭州安恒信息技术有限公司 The optimization method that idempotent operates in a kind of asynchronous framework
CN110069487A (en) * 2017-09-28 2019-07-30 北京国双科技有限公司 A kind of data processing method, apparatus and system
CN109101516B (en) 2017-11-30 2019-09-17 新华三大数据技术有限公司 A kind of data query method and server
CN109165378A (en) * 2018-08-15 2019-01-08 北京天安智慧信息技术有限公司 Sophisticated functions Report Customization method and system
JP7199522B2 (en) * 2018-10-09 2023-01-05 タブロー ソフトウェア,インコーポレイテッド Correlated incremental loading of multiple datasets for interactive data prep applications
CN109445945B (en) * 2018-10-29 2023-09-19 努比亚技术有限公司 Memory allocation method of application program, mobile terminal, server and storage medium
US11386089B2 (en) 2020-01-13 2022-07-12 The Toronto-Dominion Bank Scan optimization of column oriented storage
CN117931802B (en) * 2024-01-25 2024-09-13 南京雀翼信息科技有限公司 High-speed writing and reading system and method for multiple data sources

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102136005A (en) * 2011-03-29 2011-07-27 北京航空航天大学 Data searching method and device
CN103246498A (en) * 2013-05-13 2013-08-14 浪潮集团山东通用软件有限公司 Memory storage structures supporting relational data parallel processing and achieving method thereof

Also Published As

Publication number Publication date
CN104750727A (en) 2015-07-01

Similar Documents

Publication Publication Date Title
CN104750727B (en) A kind of column memory storage inquiry unit and column memory storage querying method
US9875280B2 (en) Efficient partitioned joins in a database with column-major layout
US8935575B2 (en) Test data generation
US9129002B2 (en) Dividing device, dividing method, and recording medium
CN104361113B (en) A kind of OLAP query optimization method under internal memory flash memory mixing memory module
CN107943952B (en) Method for realizing full-text retrieval based on Spark framework
EP2354921A1 (en) Hybrid evaluation of expressions in DBMS
US20170255673A1 (en) Batch Data Query Method and Apparatus
CN108664516A (en) Enquiring and optimizing method and relevant apparatus
CN101901265B (en) Objectification management system of virtual test data
CN102236672A (en) Method and device for importing data
CN107038222A (en) Database caches implementation method and its system
CN101251825A (en) Device and method for generating test use case
CN103019691A (en) Transformation method for extract, transform and load (ETL) operation relation graph and implementation system thereof
JP2007334627A (en) Service base software design support method, and device therefor
EP3293644B1 (en) Loading data for iterative evaluation through simd registers
CN106844320A (en) A kind of financial statement integration method and equipment
JP2014502756A (en) Apparatus and method for mass data storage based on tree structure
JP5936135B2 (en) Information processing apparatus, information processing method, and program
CN100395752C (en) Report data collection system and method
CN101719162A (en) Multi-version open geographic information service access method and system based on fragment pattern matching
CN115062028B (en) Method for multi-table join query in OLTP field
CN111126619A (en) Machine learning method and device
CN107888686B (en) User data validity verification method located at HBase client
CN105630997A (en) Data parallel processing method, device and equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20180705

Address after: 110179 room 519, 2-1 Gao Ge Road, Hunnan New District, Shenyang, Liaoning.

Applicant after: Yiyang Computer Technology Co.,Ltd. Shenyang

Address before: No. 1 building, hi tech Development Zone, Songshan Road, Nangang District, Harbin, Heilongjiang

Applicant before: BOCO INTER-TELECOM Co.,Ltd.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230704

Address after: 150001 Building 1, High tech Development Zone, Songshan Road, Nangang District, Harbin City, Heilongjiang Province

Patentee after: BOCO INTER-TELECOM Co.,Ltd.

Address before: 110179 room 519, 2-1 Gao Ge Road, Hunnan New District, Shenyang, Liaoning.

Patentee before: Yiyang Computer Technology Co.,Ltd. Shenyang
