A method and device for converting a data storage format
Technical field
The present invention relates to the technical field of database systems, and in particular to a method and device for converting a data storage format.
Background
As society becomes increasingly information-driven, database systems are used ever more widely, and the massive volume of accumulated data and its continuing growth place new demands on database systems.
Data in a database is stored in a particular storage format, and different storage formats affect the performance of the database system. A row storage structure can load data quickly and adapts well to dynamic loading, but it cannot support fast query processing, and its space utilization is hard to improve substantially. Although a good compression ratio can be obtained through entropy coding and by exploiting column correlation, the more complex storage implementation increases decompression overhead. A column storage structure stores the fields of a single record separately, so reconstructing records incurs considerable overhead; however, column storage avoids reading unnecessary columns, and compressing the similar data within one column can achieve a higher compression ratio.
At present, various hybrid row-column storage schemes have been developed that weigh the advantages and disadvantages of row storage and column storage, such as PAX and row-column hybrid storage (RCFile, Record Columnar File). These schemes further optimize system performance by optimizing the underlying storage format.
However, the row, column, or row-column hybrid storage format of an existing database depends entirely on the initial setting made when the database is initialized, that is, the underlying storage format the user specifies when creating the database. When a user needs to change the database storage format, a database administrator (DBA, Database Administrator) has to modify it manually and offline.
Summary of the invention
Embodiments of the present invention provide a method and device for converting a data storage format, which enable a database system to determine the underlying database storage format dynamically according to its load, implement automatic tuning of the system, reduce the throughput of query statements, and improve the storage space and utilization of the system.
To achieve the above objectives, the embodiments of the present invention adopt the following technical solutions:
In a first aspect, an embodiment of the present invention provides a controller, comprising:
a decision unit, configured to determine a second storage format required for storing data in a database if a system performance indicator of the database, which stores data in a first storage format, satisfies a conversion condition set by a user;
a storage format conversion unit, configured to convert the storage format of the data in the database from the first storage format to the second storage format;
a feedback unit, configured to judge whether the compression ratio of the database after storage format conversion satisfies a first preset condition, and to sort the data in the database after storage format conversion and test whether the sorting time satisfies a second preset condition; if the compression ratio satisfies the first preset condition and the sorting time satisfies the second preset condition, external service is provided; alternatively, if the compression ratio does not satisfy the first preset condition and/or the sorting time does not satisfy the second preset condition, feedback information is sent to the decision unit, so that the decision unit re-determines, according to the core indicator threshold to be set by the user in the feedback information, the second storage format required for storing data in the database.
In a first possible implementation of the first aspect, the decision unit is further configured to, before determining the second storage format required for storing data in the database, judge whether the system performance indicator satisfies a core indicator threshold set by the user; and if the system performance indicator satisfies the core indicator threshold, judge whether the system performance indicator satisfies the conversion condition.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the decision unit is further configured to keep the format of the data stored in the database as the first storage format if the system performance indicator does not satisfy the conversion condition.
With reference to the first aspect or any one of the first to second possible implementations of the first aspect, in a third possible implementation of the first aspect, the controller further comprises a data acquisition unit;
the data acquisition unit is configured to acquire the system performance indicator.
With reference to the first aspect or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the decision unit is further configured to, after determining the second storage format required for storing data in the database, determine, according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the storage format conversion unit is specifically configured to reorganize the data in the database in a buffer according to the second storage format and the conversion time, and to write the data in the buffer to disk if the data volume in the buffer reaches a disk write threshold.
With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the storage format conversion unit is specifically configured to: if the second storage format of the data in the buffer is single-column storage, convert the storage format of the data in the buffer from the first storage format to single-column storage, and compress and store the data in the buffer; alternatively, if the second storage format of the data in the buffer is row-column hybrid storage or row storage, convert the storage format of the data in the buffer from the first storage format to row-column hybrid storage or row storage, and store the data in the buffer.
With reference to the fifth possible implementation of the first aspect, in a seventh possible implementation of the first aspect, the controller further comprises a reading unit;
the reading unit is configured to, before the storage format conversion unit reorganizes the data in the database in the buffer according to the second storage format and the conversion time, read the data in the database row by row if the first storage format is row storage.
With reference to the first aspect or any one of the first to seventh possible implementations of the first aspect, in an eighth possible implementation of the first aspect, the data acquisition unit is further configured to, after external service is provided because the compression ratio satisfies the first preset condition and the sorting time satisfies the second preset condition, acquire the data throughput and the query statement response time of the database after storage format conversion.
With reference to the first aspect or any one of the first to eighth possible implementations of the first aspect, in a ninth possible implementation of the first aspect, the system performance indicator includes at least the data volume, the average data volume accessed per query, the ratio of the number of rows processed to the number of columns read, the proportion of columns accessed on average per query, and the proportion of query statements.
In a second aspect, an embodiment of the present invention provides a controller, comprising:
a processor, configured to determine a second storage format required for storing data in a database if a system performance indicator of the database, which stores data in a first storage format, satisfies a conversion condition set by a user; after a format converter converts the storage format of the data in the database from the first storage format to the second storage format, to judge whether the compression ratio of the database after storage format conversion satisfies a first preset condition, and to sort the data in the database after storage format conversion and test whether the sorting time satisfies a second preset condition; if the compression ratio satisfies the first preset condition and the sorting time satisfies the second preset condition, external service is provided; alternatively, if the compression ratio does not satisfy the first preset condition and/or the sorting time does not satisfy the second preset condition, the second storage format required for storing data in the database is re-determined according to the core indicator threshold to be set by the user;
a format converter, configured to convert the storage format of the data in the database from the first storage format to the second storage format determined by the processor.
In a first possible implementation of the second aspect, the processor is further configured to, before determining the second storage format required for storing data in the database, judge whether the system performance indicator satisfies a core indicator threshold set by the user; and if the system performance indicator satisfies the core indicator threshold, judge whether the system performance indicator satisfies the conversion condition.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the processor is further configured to keep the format of the data stored in the database as the first storage format if the system performance indicator does not satisfy the conversion condition.
With reference to the second aspect or any one of the first to second possible implementations of the second aspect, in a third possible implementation of the second aspect, the controller further comprises a data collector;
the data collector is configured to acquire the system performance indicator.
With reference to the second aspect or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the processor is further configured to, after determining the second storage format required for storing data in the database, determine, according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the format converter is specifically configured to reorganize the data in the database in a buffer according to the second storage format and the conversion time, and to write the data in the buffer to disk if the data volume in the buffer reaches a disk write threshold.
With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the format converter is specifically configured to: if the second storage format of the data in the buffer is single-column storage, convert the storage format of the data in the buffer from the first storage format to single-column storage, and compress and store the data in the buffer; alternatively, if the second storage format of the data in the buffer is row-column hybrid storage or row storage, convert the storage format of the data in the buffer from the first storage format to row-column hybrid storage or row storage, and store the data in the buffer.
With reference to the fifth possible implementation of the second aspect, in a seventh possible implementation of the second aspect, the processor is further configured to, before the format converter reorganizes the data in the database in the buffer according to the second storage format and the conversion time, read the data in the database row by row if the first storage format is row storage.
With reference to the second aspect or any one of the first to seventh possible implementations of the second aspect, in an eighth possible implementation of the second aspect, the data collector is further configured to, after external service is provided because the compression ratio satisfies the first preset condition and the sorting time satisfies the second preset condition, acquire the data throughput and the query statement response time of the database after storage format conversion.
With reference to the second aspect or any one of the first to eighth possible implementations of the second aspect, in a ninth possible implementation of the second aspect, the system performance indicator includes at least the data volume, the average data volume accessed per query, the ratio of the number of rows processed to the number of columns read, the proportion of columns accessed on average per query, and the proportion of query statements.
In a third aspect, an embodiment of the present invention provides a method for converting a data storage format, comprising:
Step a: if a system performance indicator of a database that stores data in a first storage format satisfies a conversion condition set by a user, determining a second storage format required for storing data in the database;
Step b: converting the storage format of the data in the database from the first storage format to the second storage format;
Step c: judging whether the compression ratio of the database after storage format conversion satisfies a first preset condition, and sorting the data in the database after storage format conversion and testing whether the sorting time satisfies a second preset condition;
Step d: if the compression ratio satisfies the first preset condition and the sorting time satisfies the second preset condition, providing external service; alternatively, if the compression ratio does not satisfy the first preset condition and/or the sorting time does not satisfy the second preset condition, re-executing the above steps according to the core indicator threshold to be set by the user.
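By way of illustration only, steps a to d can be sketched as a small control loop. The function and parameter names (`run_conversion_cycle`, `choose_format`, `convert`, `measure`, `max_attempts`) are hypothetical, and the sketch assumes that the first preset condition is a minimum compression ratio and the second a maximum sorting time; the specification does not fix either form.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Thresholds:
    min_compression_ratio: float  # assumed form of the first preset condition
    max_sort_seconds: float       # assumed form of the second preset condition


def run_conversion_cycle(
    current_format: str,
    choose_format: Callable[[str, int], Optional[str]],
    convert: Callable[[str, str], None],
    measure: Callable[[str], Tuple[float, float]],
    thresholds: Thresholds,
    max_attempts: int = 3,  # hypothetical bound so the retry loop terminates
) -> str:
    """Decide, convert, verify, then either serve or decide again."""
    for attempt in range(max_attempts):
        # Step a: pick a target format (None means "conversion condition not met").
        target = choose_format(current_format, attempt)
        if target is None or target == current_format:
            return current_format
        # Step b: convert the stored data to the target format.
        convert(current_format, target)
        current_format = target
        # Step c: measure compression ratio and sorting time after conversion.
        ratio, sort_seconds = measure(current_format)
        # Step d: serve if both conditions hold, otherwise feed back and retry.
        if (ratio >= thresholds.min_compression_ratio
                and sort_seconds <= thresholds.max_sort_seconds):
            return current_format
    return current_format
```

The decision, conversion, and measurement logic are passed in as callables so that the loop itself stays independent of any particular storage engine.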
In a first possible implementation of the third aspect, before the determining of the second storage format required for storing data in the database, the method further comprises:
judging whether the system performance indicator satisfies a core indicator threshold set by the user;
if the system performance indicator satisfies the core indicator threshold, judging whether the system performance indicator satisfies the conversion condition.
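The two-stage check described here can be sketched as follows. This is a minimal sketch under stated assumptions: the metric names are invented for the example, and "satisfies the core indicator threshold" is assumed to mean that each measured value is at least the configured threshold.

```python
from typing import Callable, Dict


def should_convert(
    metrics: Dict[str, float],
    core_thresholds: Dict[str, float],
    conversion_condition: Callable[[Dict[str, float]], bool],
) -> bool:
    """Check the core thresholds first, then the user-set conversion condition."""
    # Stage 1: every core indicator must reach its threshold (assumed ">=").
    meets_core = all(
        metrics.get(name, 0.0) >= limit for name, limit in core_thresholds.items()
    )
    if not meets_core:
        return False
    # Stage 2: only now is the conversion condition evaluated.
    return conversion_condition(metrics)
```

If either stage fails, the data simply stays in the first storage format.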
With reference to the first possible implementation of the third aspect, in a second possible implementation of the third aspect, if the system performance indicator does not satisfy the conversion condition, the format of the data stored in the database is kept as the first storage format.
With reference to the third aspect or any one of the first to second possible implementations of the third aspect, in a third possible implementation of the third aspect, the method further comprises:
acquiring the system performance indicator.
With reference to the third aspect or any one of the first to third possible implementations of the third aspect, in a fourth possible implementation of the third aspect, after the determining of the second storage format required for storing data in the database, the method further comprises:
determining, according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
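As one possible reading of "user configuration information", the sketch below assumes the user configures an hour of the day (for example, an off-peak hour) at which conversion may start. The function name and the hour-based configuration are assumptions for illustration; the specification does not state what the configuration contains.

```python
from datetime import datetime, timedelta


def next_conversion_time(now: datetime, configured_hour: int) -> datetime:
    """Return the next occurrence of the configured hour strictly after `now`."""
    candidate = now.replace(hour=configured_hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # The configured hour has already passed today; schedule for tomorrow.
        candidate += timedelta(days=1)
    return candidate
```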
With reference to the fourth possible implementation of the third aspect, in a fifth possible implementation of the third aspect, the converting of the storage format of the data in the database from the first storage format to the second storage format specifically comprises:
reorganizing the data in the database in a buffer according to the second storage format and the conversion time;
if the data volume in the buffer reaches a disk write threshold, writing the data in the buffer to disk.
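The buffering behaviour can be illustrated with a small class. This is a sketch only: the class name `ReorgBuffer` is hypothetical, the disk write threshold is assumed to be a row count rather than a byte size, and the reorganization shown is a row-to-column regrouping.

```python
from typing import Callable, List, Sequence


class ReorgBuffer:
    """Accumulates rows and writes them out, regrouped by column, at a threshold."""

    def __init__(self, flush_threshold: int,
                 write_to_disk: Callable[[List[list]], None]) -> None:
        self.flush_threshold = flush_threshold  # assumed: counted in rows
        self.write_to_disk = write_to_disk      # caller-supplied disk writer
        self.rows: List[tuple] = []

    def add(self, row: Sequence) -> None:
        self.rows.append(tuple(row))
        if len(self.rows) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        if not self.rows:
            return
        # Regroup buffered rows into per-column lists before writing.
        columns = [list(col) for col in zip(*self.rows)]
        self.write_to_disk(columns)
        self.rows = []
```

A final call to `flush()` writes whatever remains in the buffer once the input is exhausted.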
With reference to the fifth possible implementation of the third aspect, in a sixth possible implementation of the third aspect, the writing of the data in the buffer to disk specifically comprises:
if the second storage format of the data in the buffer is single-column storage, converting the storage format of the data in the buffer from the first storage format to single-column storage, and compressing and storing the data in the buffer; alternatively,
if the second storage format of the data in the buffer is row-column hybrid storage or row storage, converting the storage format of the data in the buffer from the first storage format to row-column hybrid storage or row storage, and storing the data in the buffer.
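The branch on the target format can be sketched as below. The format labels (`"column"`, `"hybrid"`, `"row"`), the JSON serialization, and the use of zlib are all assumptions chosen to keep the example self-contained; the specification does not name a serialization or compression scheme.

```python
import json
import zlib


def encode_for_disk(buffered, target_format: str) -> bytes:
    """Serialize buffered data; compress only for single-column storage."""
    payload = json.dumps(buffered).encode("utf-8")
    if target_format == "column":
        # Single-column storage: compress, then store.
        return zlib.compress(payload)
    if target_format in ("hybrid", "row"):
        # Row-column hybrid or row storage: store without compression.
        return payload
    raise ValueError(f"unknown storage format: {target_format}")
```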
With reference to the fifth possible implementation of the third aspect, in a seventh possible implementation of the third aspect, before the reorganizing of the data in the database in the buffer according to the second storage format and the conversion time, the method further comprises:
if the first storage format is row storage, reading the data in the database row by row.
With reference to the third aspect or any one of the first to seventh possible implementations of the third aspect, in an eighth possible implementation of the third aspect, after external service is provided because the compression ratio satisfies the first preset condition and the sorting time satisfies the second preset condition, the method further comprises:
acquiring the data throughput and the query statement response time of the database after storage format conversion.
With reference to the third aspect or any one of the first to eighth possible implementations of the third aspect, in a ninth possible implementation of the third aspect, the system performance indicator includes at least the data volume, the average data volume accessed per query, the ratio of the number of rows processed to the number of columns read, the proportion of columns accessed on average per query, and the proportion of query statements.
Embodiments of the present invention provide a method and device for converting a data storage format. If a system performance indicator of a database that stores data in a first storage format satisfies a conversion condition set by a user, a controller determines a second storage format required for storing data in the database and converts the storage format of the data in the database from the first storage format to the second storage format. The controller then judges whether the compression ratio of the database after storage format conversion satisfies a first preset condition, and sorts the data in the database after storage format conversion and tests whether the sorting time satisfies a second preset condition. If the compression ratio satisfies the first preset condition and the sorting time satisfies the second preset condition, external service is provided; alternatively, if the compression ratio does not satisfy the first preset condition and/or the sorting time does not satisfy the second preset condition, the determination of the second storage format required for storing data in the database is re-executed according to the core indicator threshold to be set by the user. With this solution, the controller monitors actual system operating data and continually determines the optimal storage format for the data in the database system; that is, the controller enables the database system to determine the storage format of its data dynamically according to its load. This solves the current problem that changing the database storage format requires a database administrator to modify it manually and offline, which results in low system storage space and utilization. By deciding the data storage format autonomously, the solution reduces the throughput of query statements and improves the storage space and utilization of the system.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a first schematic structural diagram of a controller according to an embodiment of the present invention;
Fig. 2 is a second schematic structural diagram of a controller according to an embodiment of the present invention;
Fig. 3 is a third schematic structural diagram of a controller according to an embodiment of the present invention;
Fig. 4 is a fourth schematic structural diagram of a controller according to an embodiment of the present invention;
Fig. 5 is a first schematic flowchart of a method for converting a data storage format according to an embodiment of the present invention;
Fig. 6 is a second schematic flowchart of a method for converting a data storage format according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The techniques described herein apply to the database field, for example: dynamic optimization of the underlying data storage format of a database, data distribution in a database cluster, database materialization strategies, database indexing strategies, and so on.
At present, database applications fall mainly into two classes: OLTP (On-Line Transaction Processing) and OLAP (On-Line Analytical Processing). The former handles transactional queries that involve frequent write operations, while the latter focuses on analytical queries that involve a large number of read operations. Column storage has a considerable advantage for read operations and is well suited to OLAP queries, but its support for write operations is unsatisfactory, so it is not suitable for OLTP queries. Row storage supports OLTP queries very well.
The advantages of a row storage structure are fast data loading and high adaptability to dynamic loading, because row storage guarantees that all fields of a given record are located on the same node. Its disadvantages are equally apparent. For example, it cannot support fast query processing, because when a query touches only a few columns of a multi-column table, it cannot skip reading the unnecessary columns. In addition, because columns with different data values are mixed together, row storage does not easily achieve a high compression ratio; that is, space utilization is hard to improve substantially. Although a good compression ratio can be obtained through entropy coding and by exploiting column correlation, the more complex storage implementation increases decompression overhead.
Column storage stores the fields of a single record separately, so reconstructing records incurs considerable overhead. However, column storage avoids reading unnecessary columns, and compressing the similar data within one column can achieve a higher compression ratio.
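The difference between the two layouts can be made concrete with a toy in-memory example. The table contents and helper names are invented for illustration and do not represent any on-disk format.

```python
# A small table held row by row: each tuple is one record (id, name, age).
rows = [(1, "alice", 30), (2, "bob", 25), (3, "carol", 41)]


def to_columns(records):
    """Row layout -> column layout: one list per field."""
    return [list(col) for col in zip(*records)]


def to_rows(cols):
    """Column layout -> row layout: this is the record reconstruction cost."""
    return [tuple(rec) for rec in zip(*cols)]


columns = to_columns(rows)

# Column layout: a query on "age" reads exactly one list.
ages = columns[2]

# Row layout: the same query has to visit every record and skip the other fields.
ages_from_rows = [record[2] for record in rows]
```

Reading `ages` touches a single list in the column layout, whereas `to_rows` shows the extra work needed to rebuild whole records from separately stored fields.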
At present, various hybrid row-column storage schemes have been developed that weigh the advantages and disadvantages of row storage and column storage, such as PAX and RCFile. These schemes further optimize system performance by optimizing the underlying storage format. However, the row, column, or row-column hybrid storage format of an existing database depends entirely on the initial setting made when the database is initialized, that is, the underlying storage format the user specifies when creating the database. When a user needs to change the database storage format, a database administrator (DBA, Database Administrator) has to modify it manually and offline, so the system lacks any automatic tuning capability.
The present invention provides a method and device for converting a data storage format, which enable a database system to determine the underlying database storage format dynamically according to its load, implement automatic tuning of the system, reduce the throughput of query statements, and improve the storage space and utilization of the system.
Embodiment one
An embodiment of the present invention provides a controller which, as shown in Fig. 1, comprises:
a decision unit 10, configured to determine a second storage format required for storing data in a database if a system performance indicator of the database, which stores data in a first storage format, satisfies a conversion condition set by a user;
a storage format conversion unit 11, configured to convert the storage format of the data in the database from the first storage format to the second storage format determined by the decision unit 10;
a feedback unit 12, configured to judge whether the compression ratio of the database after storage format conversion satisfies a first preset condition, and to sort the data in the database after storage format conversion and test whether the sorting time satisfies a second preset condition; if the compression ratio satisfies the first preset condition and the sorting time satisfies the second preset condition, external service is provided; alternatively, if the compression ratio does not satisfy the first preset condition and/or the sorting time does not satisfy the second preset condition, feedback information is sent to the decision unit 10, so that the decision unit re-determines, according to the core indicator threshold to be set by the user in the feedback information, the second storage format required for storing data in the database.
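The two checks performed by the feedback unit 12 can be sketched as follows. The function names are hypothetical, zlib stands in for whatever compression the database actually uses, and the conditions are assumed to be a minimum compression ratio and a maximum sorting time.

```python
import time
import zlib


def compression_ratio(raw: bytes) -> float:
    """Uncompressed size divided by compressed size (zlib used as a stand-in)."""
    return len(raw) / len(zlib.compress(raw))


def timed_sort(values) -> float:
    """Sort a copy of the values and return the elapsed time in seconds."""
    start = time.perf_counter()
    sorted(values)
    return time.perf_counter() - start


def feedback_ok(raw: bytes, values, min_ratio: float, max_sort_seconds: float) -> bool:
    """True means serve externally; False means send feedback to the decision unit."""
    return (compression_ratio(raw) >= min_ratio
            and timed_sort(values) <= max_sort_seconds)
```

When `feedback_ok` returns False, the decision unit would be asked to pick a different second storage format.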
Further, the decision unit 10 is also configured to, before determining the second storage format required for storing data in the database, judge whether the system performance indicator satisfies a core indicator threshold set by the user; and if the system performance indicator satisfies the core indicator threshold, judge whether the system performance indicator satisfies the conversion condition.
Further, the decision unit 10 is also configured to keep the format of the data stored in the database as the first storage format if the system performance indicator does not satisfy the conversion condition.
Further, as shown in Fig. 2, the controller also comprises a data acquisition unit 13;
the data acquisition unit 13 is configured to acquire the system performance indicator.
Further, the decision unit 10 is also configured to, after determining the second storage format required for storing data in the database, determine, according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
Further, the storage format conversion unit 11 is specifically configured to reorganize the data in the database in a buffer according to the second storage format and the conversion time determined by the decision unit 10, and to write the data in the buffer to disk if the data volume in the buffer reaches a disk write threshold.
Further, the storage format conversion unit 11 is specifically configured to: if the second storage format of the data in the buffer is single-column storage, convert the storage format of the data in the buffer from the first storage format to single-column storage, and compress and store the data in the buffer; alternatively, if the second storage format of the data in the buffer is row-column hybrid storage or row storage, convert the storage format of the data in the buffer from the first storage format to row-column hybrid storage or row storage, and store the data in the buffer.
Further, as shown in Fig. 2, the controller also comprises a reading unit 14;
the reading unit 14 is configured to, before the storage format conversion unit 11 reorganizes the data in the database in the buffer according to the second storage format and the conversion time, read the data in the database row by row if the first storage format is row storage.
Further, the data acquisition unit 13 is also configured to, after external service is provided because the compression ratio satisfies the first preset condition and the sorting time satisfies the second preset condition, acquire the data throughput and the query statement response time of the database after storage format conversion.
Further, the system performance indicator includes at least the data volume, the average data volume accessed per query, the ratio of the number of rows processed to the number of columns read, the proportion of columns accessed on average per query, and the proportion of query statements.
The present invention provides a controller that mainly comprises a decision unit, a storage format conversion unit, and a feedback unit. If a system performance indicator of a database that stores data in a first storage format satisfies a conversion condition set by a user, the decision unit determines a second storage format required for storing data in the database. The storage format conversion unit then converts the storage format of the data in the database from the first storage format to the second storage format. Finally, the feedback unit judges whether the compression ratio of the database after storage format conversion satisfies a first preset condition, and sorts the data in the database after storage format conversion and tests whether the sorting time satisfies a second preset condition. If the compression ratio satisfies the first preset condition and the sorting time satisfies the second preset condition, external service is provided; alternatively, if the compression ratio does not satisfy the first preset condition and/or the sorting time does not satisfy the second preset condition, feedback information is sent to the decision unit, so that the decision unit re-determines, according to the core indicator threshold to be set by the user in the feedback information, the second storage format required for storing data in the database. With this solution, the controller monitors actual system operating data and continually determines the optimal storage format for the data in the database system; that is, the controller enables the database system to determine the storage format of its data dynamically according to its load. This solves the current problem that changing the database storage format requires a database administrator to modify it manually and offline, which results in low system storage space and utilization. By deciding the data storage format autonomously, the solution reduces the throughput of query statements and improves the storage space and utilization of the system.
Embodiment two
The embodiment of the present invention provides a controller which, as shown in Fig. 3, includes:
A processor 20, configured to: determine the second storage format required for storing data in the database if the system performance index of the database that stores data in a first storage format meets the conversion condition set by the user; after a format converter 21 converts the storage format of the data in the database from the first storage format to the second storage format, judge whether the compression ratio of the database after the storage format conversion meets a first preset condition, sort the data in the converted database, and test whether the sorting time meets a second preset condition; provide external service if the compression ratio meets the first preset condition and the sorting time meets the second preset condition; or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, redetermine the second storage format required for storing data in the database according to the core index threshold to be set by the user;
A format converter 21, configured to convert the storage format of the data in the database from the first storage format to the second storage format determined by the processor 20.
Further, the processor 20 is also configured to judge, before determining the second storage format required for storing data in the database, whether the system performance index meets the core index threshold set by the user, and, if the system performance index meets the core index threshold, to judge whether the system performance index meets the conversion condition.
Further, the processor 20 is also configured to keep the format of the data stored in the database as the first storage format if the system performance index does not meet the conversion condition.
Further, as shown in Fig. 4, the controller further includes a data collector 22.
The data collector 22 is configured to collect the system performance index.
Further, the processor 20 is also configured to determine, after determining the second storage format required for storing data in the database and according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
Further, the format converter 21 is specifically configured to reorganize the data in the database in a buffer according to the second storage format and the conversion time, and to write the data in the buffer to disk if the data volume in the buffer reaches the disk write threshold.
Further, the format converter 21 is specifically configured to: if the second storage format of the data in the buffer is single-column storage, convert the storage format of the data in the buffer from the first storage format to the single-column storage, and compress and store the data in the buffer; or, if the second storage format of the data in the buffer is hybrid row-column storage or row storage, convert the storage format of the data in the buffer from the first storage format to the hybrid row-column storage or row storage, and store the data in the buffer.
Further, the processor 20 is also configured to read the data in the database row by row if the first storage format is row storage, before the format converter 21 reorganizes the data in the database in the buffer according to the second storage format and the conversion time.
Further, the data collector 22 is also configured to collect the data throughput and the query statement response time of the database after the storage format conversion, once external service has been provided because the compression ratio meets the first preset condition and the sorting time meets the second preset condition.
Further, the system performance index includes at least the data volume, the average data volume accessed per query, the ratio of processed rows to rows read, the column ratio of average query access, and the proportion of query statements.
The embodiment of the present invention provides a controller that mainly includes a processor and a format converter. If the system performance index of a database that stores data in a first storage format meets the conversion condition set by the user, the controller determines the second storage format required for storing data in the database and converts the storage format of the data in the database from the first storage format to the second storage format. The controller then judges whether the compression ratio of the database after the storage format conversion meets a first preset condition, sorts the data in the converted database, and tests whether the sorting time meets a second preset condition. If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, external service is provided. Alternatively, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the controller redetermines the second storage format required for storing data in the database according to the core index threshold to be set by the user. With this solution, the controller monitors the actual operating data of the system and continually determines the optimal storage format for the data in the database system; that is, the controller enables the database system to determine the storage format of its data dynamically according to the load. This solves the current problem that changing the database storage format requires a database administrator to modify it manually and offline, leaving the system with low storage space and utilization. By deciding the data storage format autonomously, the solution reduces the data throughput required by query statements while improving the system's storage space and utilization.
Embodiment three
The embodiment of the present invention provides a conversion method for a data storage format. As shown in Fig. 5, the method includes:
S101. If the system performance index of a database that stores data in a first storage format meets the conversion condition set by the user, the controller determines the second storage format required for storing data in the database.
The row, column, or hybrid row-column storage format of an existing database depends entirely on the initial setting made when the database is initialized, that is, the underlying storage format that the user specifies when creating the database. When the user needs to change the database storage format, a DBA has to modify it manually and offline; the system lacks any automatic tuning function.
To solve the problem that the system cannot automatically adjust and optimize the storage format of the data in the database, the present invention provides a conversion method for a data storage format that enables the database system to determine its underlying storage format dynamically according to the load, thereby providing automatic tuning.
In practical applications, the OLTP and OLAP applications of a database show their advantages in write operations and read operations, respectively. To combine the strengths of row storage and column storage, various hybrid row-column storage modes have emerged, that is, the fusion of OLTP and OLAP. In an application environment oriented toward OLTP and OLAP fusion, the database is initialized in row mode.
To complete the conversion of the data storage format in the database system, the controller first needs to determine the storage format required for storing data in the database, that is, to determine in which format the data stored in the database should be stored.
Specifically, if the system performance index of the database that stores data in the first storage format meets the conversion condition set by the user, the controller determines the second storage format required for storing data in the database.
The system performance index is collected by the controller over a certain period of time and includes at least the data volume, the average data volume accessed per query, the ratio of processed rows to rows read, the column ratio of average query access, and the proportion of query statements.
Further, before determining the second storage format required for storing data in the database, the controller needs to judge whether the system performance index of the database meets the conversion condition set by the user. If the system performance index does not meet the conversion condition set by the user, the controller takes no action, and the data in the database remains stored in the first storage format.
For example, suppose the conversion condition set by the user is that any column whose access frequency (number of accesses to the column / number of accesses to the table) reaches 80% can be converted to column storage. When the controller observes that the access frequency of the n-th column in the database reaches 80%, the controller determines that the data of the n-th column is to be stored in column storage format.
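The access-frequency condition in the example above can be sketched as follows. This is a minimal illustration only: the function name `columns_to_convert`, its parameters, and the use of per-column access counters are assumptions for the sketch and are not defined in this embodiment.

```python
def columns_to_convert(table_accesses, column_accesses, threshold=0.80):
    """Return the columns whose access frequency reaches the threshold.

    Access frequency is taken as
    (number of accesses to the column) / (number of accesses to the table),
    following the example in the text. The counters are hypothetical inputs.
    """
    if table_accesses == 0:
        return []  # no accesses observed yet, so nothing qualifies
    return sorted(
        column
        for column, count in column_accesses.items()
        if count / table_accesses >= threshold
    )


# Hypothetical counters collected over one observation period.
observed = {"id": 95, "name": 40, "price": 80}
print(columns_to_convert(100, observed))
```

With these counters, the columns `id` and `price` reach the 80% condition and `name` does not.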
Correspondingly, while determining the second storage format required for storing data in the database, the controller needs to work out which tables in the database require row-to-column conversion, and to determine which columns in the database need to be stored together and which columns need to be stored separately. For example, the column with the highest individual access frequency is stored separately in column mode.
Further, after determining the second storage format required for storing data in the database, the controller also needs to determine, according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
Specifically, the controller can perform the storage format conversion at a moment when the load is idle, as indicated by the system performance index. Alternatively, after determining the second storage format required for storing data in the database, the controller can prompt the user that a conversion is possible and display the second storage format to the user, and then perform the storage format conversion after the user enters a command.
S102. The controller converts the storage format of the data in the database from the first storage format to the second storage format.
After determining the second storage format required for storing data in the database, the controller converts the storage format of the data in the database from the first storage format to the second storage format.
Specifically, after determining the second storage format and the conversion time, the controller reorganizes the data in the database in a buffer according to the second storage format. When the data volume in the buffer reaches the disk write threshold, the controller writes the data in the buffer to disk.
Further, the controller handles the data in the buffer differently depending on its second storage format. If the second storage format of the data in the buffer is single-column storage, the controller converts the storage format of the data in the buffer from the first storage format to single-column storage, and compresses and stores the data in the buffer. Alternatively, if the second storage format of the data in the buffer is hybrid row-column storage or row storage, the controller converts the storage format of the data in the buffer from the first storage format to hybrid row-column storage or row storage, and stores the data in the buffer.
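The buffered reorganization described in S102 might look like the sketch below. It is only an illustration under stated assumptions: the names `convert_rows` and `flush_block` are hypothetical, the disk write threshold is expressed as a number of buffered rows, a Python list stands in for the disk, and JSON plus `zlib` stands in for whatever on-disk encoding and compression a real system would use.

```python
import json
import zlib


def flush_block(columns, single_column, sink):
    """Write one buffered block; compress it only for single-column storage."""
    payload = json.dumps(columns).encode("utf-8")
    if single_column:
        payload = zlib.compress(payload)  # assumption: zlib as the compressor
    sink.append(payload)  # the list `sink` stands in for the disk


def convert_rows(rows, column_names, write_threshold, single_column, sink):
    """Regroup row-oriented records by column, flushing at the write threshold."""
    columns = {name: [] for name in column_names}
    buffered = 0
    for row in rows:  # data is read row by row from the first storage format
        for name, value in zip(column_names, row):
            columns[name].append(value)
        buffered += 1
        if buffered >= write_threshold:  # buffer reached the disk write threshold
            flush_block(columns, single_column, sink)
            columns = {name: [] for name in column_names}
            buffered = 0
    if buffered:  # write whatever remains after the last row
        flush_block(columns, single_column, sink)


disk = []
convert_rows([(1, "a"), (2, "b"), (3, "c")], ["id", "tag"], 2, False, disk)
print(len(disk), json.loads(disk[0]))
```

The threshold check sits inside the read loop so that the buffer never holds more than `write_threshold` rows, which mirrors the "write to disk when the buffer reaches the threshold" behaviour in the text.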
S103. The controller judges whether the compression ratio of the database after the storage format conversion meets the first preset condition, sorts the data in the converted database, and tests whether the sorting time meets the second preset condition.
After converting the storage format of the data in the database from the first storage format to the second storage format, the controller needs to check whether the conversion is reasonable, that is, whether it optimizes the database.
Specifically, the controller judges the change in size of the tables that have been converted to column storage and tests the sorting time with a simple sort. The size of a table directly reflects the compression ratio of the database, and the higher the compression ratio, the higher the space utilization; the sorting time reflects how much CPU and memory is consumed, and column storage can improve sorting efficiency. Therefore, to check whether the storage format conversion is reasonable, the controller needs to judge whether the compression ratio of the database after the conversion meets the first preset condition, sort the data in the converted database, and test whether the sorting time meets the second preset condition.
The first preset condition is that the compression ratio of the database after the storage format conversion is less than or equal to a first preset threshold, and the second preset condition is that the sorting time is less than or equal to a second preset threshold.
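The two preset conditions can be checked with a few lines of code. The sketch below is hedged: the function names are hypothetical, and the compression ratio is assumed to be measured as converted size divided by original size, so that a smaller value is better and the "less than or equal to" comparison in the text applies directly. The embodiment does not define how the ratio is computed.

```python
import time


def timed_sort(values):
    """Run a simple sort and return the elapsed time in seconds."""
    start = time.perf_counter()
    sorted(values)
    return time.perf_counter() - start


def conversion_is_acceptable(compression_ratio, sort_seconds,
                             first_threshold, second_threshold):
    """Both preset conditions must hold before external service is provided."""
    first_ok = compression_ratio <= first_threshold   # first preset condition
    second_ok = sort_seconds <= second_threshold      # second preset condition
    return first_ok and second_ok


# Hypothetical sizes in bytes for a table before and after conversion.
ratio = 400_000 / 1_000_000  # assumption: converted size / original size
elapsed = timed_sort(list(range(10_000, 0, -1)))
print(conversion_is_acceptable(ratio, elapsed, 0.5, 5.0))
```

If either check fails, the flow falls through to the re-determination branch described in S105.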
S104. If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the controller provides external service.
Specifically, if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the current storage format conversion improves the performance of the database system, and the controller provides external service.
Further, after external service begins, the controller needs to keep observing and collecting the corresponding indices to judge whether the performance of the database system has been optimized, that is, whether columns in the database system need to be split further, or whether to avoid splitting columns that do not need it.
The indices collected by the controller include the data throughput and the query statement response time. Data throughput reflects whether redundant data reads have been reduced; if the change in data throughput is not obvious, the columns may need to be split further. The query statement response time is a direct indicator of whether column storage is effective.
S105. If the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the controller re-executes the next data storage format conversion according to the core index threshold to be set by the user.
If the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the second storage format determined by the controller cannot effectively improve the performance of the database system. The controller then needs to re-execute the next data storage format conversion according to the core index threshold to be set by the user, that is, to redetermine the second storage format of the data.
The embodiment of the present invention provides a conversion method for a data storage format. If the system performance index of a database that stores data in a first storage format meets the conversion condition set by the user, the controller determines the second storage format required for storing data in the database and converts the storage format of the data in the database from the first storage format to the second storage format. The controller then judges whether the compression ratio of the database after the storage format conversion meets a first preset condition, sorts the data in the converted database, and tests whether the sorting time meets a second preset condition. If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, external service is provided. Alternatively, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the controller redetermines the second storage format required for storing data in the database according to the core index threshold to be set by the user. With this solution, the controller monitors the actual operating data of the system and continually determines the optimal storage format for the data in the database system; that is, the controller enables the database system to determine the storage format of its data dynamically according to the load. This solves the current problem that changing the database storage format requires a database administrator to modify it manually and offline, leaving the system with low storage space and utilization. By deciding the data storage format autonomously, the solution reduces the data throughput required by query statements while improving the system's storage space and utilization.
Embodiment four
S201. The controller collects the system performance index of the database that stores data in the first storage format.
The row, column, or hybrid row-column storage format of an existing database depends entirely on the initial setting made when the database is initialized, that is, the underlying storage format that the user specifies when creating the database. When the user needs to change the database storage format, a DBA has to modify it manually and offline; the system lacks any automatic tuning function.
To solve the problem that the system cannot automatically adjust and optimize the storage format of the data in the database, the present invention provides a conversion method for a data storage format that enables the database system to determine its underlying storage format dynamically according to the load, thereby providing automatic tuning.
In practical applications, the OLTP and OLAP applications of a database show their advantages in write operations and read operations, respectively. To combine the strengths of row storage and column storage, various hybrid row-column storage modes have emerged, that is, the fusion of OLTP and OLAP. In an application environment oriented toward OLTP and OLAP fusion, the database is initialized in row mode.
Specifically, so that the underlying storage format of the database can be determined dynamically, the controller first collects the system performance index of the database that stores data in the first storage format, and then determines the storage format required for storing data in the database according to the collected performance index.
The system performance index includes at least the data volume, the average data volume accessed per query, the ratio of processed rows to rows read, the column ratio of average query access, and the proportion of query statements. Specifically:
The data volume is an important indicator of whether to use column storage. The larger the data volume, the more suitable queries are for column storage. The data volume here is the size of the entire database;
The average data volume accessed per query is the average number of data rows the database uses for each query. Scenarios in which this value is very large are suited to column storage;
The ratio of processed rows to rows read is the ratio of the number of data rows actually used per operation, on average, to the total number of rows read. When data in an analytical database is read, some of the data read from disk is never actually used by the system in any analysis. Ideally every row that is read would be processed by the system, so the higher this ratio, the more suitable column storage is;
The column ratio of average query access is the ratio of the number of columns accessed by an average query statement to the total number of columns. The smaller this ratio, the more suitable column storage is;
The proportion of query statements is the share of query operations among all database operations. The closer this proportion is to 100%, the more suitable column storage is.
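As an illustration of how these indices could be derived, the sketch below computes them from a list of logged operations. Everything about the log is an assumption made for the example: the `OperationRecord` fields, the function name `performance_indices`, and the choice to compute the row and column ratios over query operations only are not specified in this embodiment.

```python
from dataclasses import dataclass


@dataclass
class OperationRecord:
    """One logged database operation (hypothetical log format)."""
    is_query: bool          # True for a query, False for any other operation
    rows_read: int          # rows read from disk
    rows_processed: int     # rows actually used by the operation
    columns_accessed: int   # number of columns the statement touched


def performance_indices(records, total_columns, data_volume):
    """Compute the five indices named in the text from a list of records."""
    queries = [r for r in records if r.is_query]
    n = len(queries)
    rows_read = sum(r.rows_read for r in queries)
    rows_processed = sum(r.rows_processed for r in queries)
    columns = sum(r.columns_accessed for r in queries)
    return {
        "data_volume": data_volume,
        "avg_rows_per_query": rows_read / n if n else 0.0,
        "processed_to_read_ratio": rows_processed / rows_read if rows_read else 0.0,
        "avg_column_ratio": columns / (n * total_columns) if n and total_columns else 0.0,
        "query_proportion": n / len(records) if records else 0.0,
    }


log = [
    OperationRecord(True, 100, 50, 2),
    OperationRecord(True, 300, 150, 4),
    OperationRecord(False, 0, 0, 0),
    OperationRecord(False, 0, 0, 0),
]
print(performance_indices(log, total_columns=10, data_volume=1000))
```

A real controller would gather these counters over the collection period mentioned earlier rather than from an in-memory list.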
S202. The controller judges whether the system performance index meets the core index threshold set by the user.
After collecting the system performance index of the database that stores data in the first storage format, the controller analyzes the system performance index.
Specifically, the controller judges whether the system performance index meets the core index threshold set by the user; that is, the controller performs decision analysis on the system performance index according to the core index threshold and the decision algorithm set by the user.
Optionally, if the core index thresholds set by the user include a threshold Ta for the column ratio of average query access, a threshold Tq for the proportion of query statements, and a threshold Tp for the ratio of processed rows to rows read, the decision algorithm set by the user can be: (column ratio of average query access < Ta) and/or (proportion of query statements > Tq) and/or (ratio of processed rows to rows read > Tp).
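The decision algorithm above translates into a small predicate. In this sketch the function name and the `combine` parameter are hypothetical; passing Python's built-in `all` models the "and" variant and `any` models the "or" variant, since the text leaves the choice to the user.

```python
def meets_core_thresholds(avg_column_ratio, query_proportion,
                          processed_to_read_ratio, Ta, Tq, Tp, combine=all):
    """Evaluate the user-set decision algorithm over the three core indices.

    combine=all requires every comparison to hold ("and");
    combine=any requires at least one comparison to hold ("or").
    """
    checks = (
        avg_column_ratio < Ta,          # column ratio of average query access < Ta
        query_proportion > Tq,          # proportion of query statements > Tq
        processed_to_read_ratio > Tp,   # ratio of processed rows to rows read > Tp
    )
    return combine(checks)


# Hypothetical index values and thresholds.
print(meets_core_thresholds(0.2, 0.9, 0.8, Ta=0.3, Tq=0.8, Tp=0.7))
print(meets_core_thresholds(0.5, 0.9, 0.8, Ta=0.3, Tq=0.8, Tp=0.7, combine=any))
```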
S203. If the system performance index meets the core index threshold, the controller judges whether the system performance index meets the conversion condition set by the user.
If the controller determines that the system performance index meets the core index threshold, that is, the system performance index satisfies the decision algorithm set by the user, the controller judges whether the performance index meets the conversion condition set by the user. Only when the performance index meets the conversion condition set by the user can the controller determine the storage format required for storing data in the database.
For example, suppose the conversion condition set by the user is that any column whose access frequency (number of accesses to the column / number of accesses to the table) reaches 80% can be converted to column storage. Only when the controller observes that the access frequency of the n-th column in the database reaches 80% can it determine that the data of the n-th column is to be stored in column storage format.
S204. If the system performance index meets the conversion condition, the controller determines the second storage format required for storing data in the database.
Specifically, if the system performance index of the database that stores data in the first storage format meets the conversion condition set by the user, the storage format of the data in the database can be converted, and the controller can determine the second storage format required for storing data in the database.
Correspondingly, while determining the second storage format required for storing data in the database, the controller needs to work out which tables in the database require row-to-column conversion, and to determine which columns in the database need to be stored together and which columns need to be stored separately. For example, the column with the highest individual access frequency is stored separately in column mode.
S205. The controller determines, according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
After determining the second storage format required for storing data in the database, the controller also needs to determine, according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
Specifically, the controller can perform the storage format conversion at a moment when the load is idle, as indicated by the system performance index. Alternatively, after determining the second storage format required for storing data in the database, the controller can prompt the user that a conversion is possible and display the second storage format to the user, and then perform the storage format conversion after the user enters a command.
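The two ways of choosing the conversion time can be sketched as a simple gate. The mode names `"idle"` and `"prompt"`, the load measure, and the function name are all hypothetical; the embodiment only says that the choice follows user configuration information.

```python
def should_convert_now(mode, current_load, idle_load_threshold, user_confirmed):
    """Decide whether the conversion may start at this moment.

    mode == "idle":   convert when the observed load is at or below the
                      idle threshold (assumption: load is a 0..1 fraction).
    mode == "prompt": convert only after the user has entered a command.
    """
    if mode == "idle":
        return current_load <= idle_load_threshold
    if mode == "prompt":
        return user_confirmed
    raise ValueError(f"unknown conversion-time mode: {mode!r}")


# Hypothetical configuration: wait for an idle period.
print(should_convert_now("idle", current_load=0.05,
                         idle_load_threshold=0.10, user_confirmed=False))
```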
S206. The controller converts the storage format of the data in the database from the first storage format to the second storage format.
After determining the second storage format required for storing data in the database, the controller converts the storage format of the data in the database from the first storage format to the second storage format.
Specifically, after determining the second storage format and the conversion time, the controller reorganizes the data in the database in a buffer according to the second storage format. When the data volume in the buffer reaches the disk write threshold, the controller writes the data in the buffer to disk. If the first storage format is row storage, the controller first reads the data in the database row by row and only then reorganizes it in the buffer according to the second storage format.
Further, the controller handles the data in the buffer differently depending on its second storage format. If the second storage format of the data in the buffer is single-column storage, the controller converts the storage format of the data in the buffer from the first storage format to single-column storage, and compresses and stores the data in the buffer. Alternatively, if the second storage format of the data in the buffer is hybrid row-column storage or row storage, the controller converts the storage format of the data in the buffer from the first storage format to hybrid row-column storage or row storage, and stores the data in the buffer.
S207. The controller judges whether the compression ratio of the database after the storage format conversion meets the first preset condition, sorts the data in the converted database, and tests whether the sorting time meets the second preset condition.
After converting the storage format of the data in the database from the first storage format to the second storage format, the controller needs to check whether the conversion is reasonable, that is, whether it optimizes the database.
Specifically, the controller judges the change in size of the tables that have been converted to column storage and tests the sorting time with a simple sort. The size of a table directly reflects the compression ratio of the database, and the higher the compression ratio, the higher the space utilization; the sorting time reflects how much CPU and memory is consumed, and column storage can improve sorting efficiency. Therefore, to check whether the storage format conversion is reasonable, the controller needs to judge whether the compression ratio of the database after the conversion meets the first preset condition, sort the data in the converted database, and test whether the sorting time meets the second preset condition.
The first preset condition is that the compression ratio of the database after the storage format conversion is less than or equal to a first preset threshold, and the second preset condition is that the sorting time is less than or equal to a second preset threshold.
S208. If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the controller provides external service.
Specifically, if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the current storage format conversion improves the performance of the database system, and the controller provides external service.
Further, after external service begins, the controller needs to keep observing and collecting the corresponding indices to judge whether the performance of the database system has been optimized, that is, whether columns in the database system need to be split further, or whether to avoid splitting columns that do not need it.
The indices collected by the controller include the data throughput and the query statement response time. Data throughput reflects whether redundant data reads have been reduced; if the change in data throughput is not obvious, the columns may need to be split further. The query statement response time is a direct indicator of whether column storage is effective.
S209. If the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the controller re-executes the next data storage format conversion according to the core index threshold to be set by the user.
If the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the second storage format determined by the controller cannot effectively improve the performance of the database system. The controller then needs to re-execute the next data storage format conversion according to the core index threshold to be set by the user, that is, to redetermine the second storage format of the data.
For example, suppose that at system initialization the column access ratio is set to 90%, so that the controller converts a column to separate column storage when its access ratio reaches 90%, but finds after the conversion that system performance has not improved. The controller then feeds back a suggestion to raise the column access ratio from 90% to 91%. The user resets the value of the column access ratio according to this feedback information, and the controller redetermines the second storage format according to the column access ratio newly set by the user.
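The feedback step in this example can be sketched as a function that proposes the next threshold. The name `suggest_threshold`, the fixed step of one percentage point, and the cap at 100% are assumptions for illustration; in the text the user, not the controller, makes the final setting.

```python
def suggest_threshold(current, performance_improved, step=0.01, upper=1.0):
    """Propose the column access ratio to feed back to the user.

    If the conversion improved performance, the threshold is left alone.
    Otherwise the suggestion is raised by `step` (90% -> 91% in the example),
    never exceeding `upper`.
    """
    if performance_improved:
        return current
    return min(round(current + step, 4), upper)


# The controller converted at 90% but saw no improvement.
print(suggest_threshold(0.90, performance_improved=False))
```

Rounding to four decimal places keeps the suggested value free of floating-point noise before it is shown to the user.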
S210. If the system performance index does not meet the conversion condition, the controller keeps the format of the data stored in the database as the first storage format.
An embodiment of the present invention provides a method for converting a data storage format. If the
system performance index of a database that stores data in a first storage format satisfies the switch
condition set by the user, the controller determines the second storage format required for storing data
in the database and converts the storage format of the data in the database from the first storage format
to the second storage format. The controller then judges whether the compression ratio of the database
after the conversion satisfies the first preset condition, sorts the data in the converted database, and
tests whether the sorting time satisfies the second preset condition. If the compression ratio satisfies
the first preset condition and the sorting time satisfies the second preset condition, the database
provides service externally. Alternatively, if the compression ratio does not satisfy the first preset
condition and/or the sorting time does not satisfy the second preset condition, the step of determining
the second storage format required for storing data in the database is executed again according to the
core index threshold to be set by the user in the feedback information. With this solution, the controller
monitors the running data and continuously determines the optimal storage format for the data in the
database system; that is, the controller enables the database system to determine the storage format of
its data dynamically according to the load. This solves the current problems that changing the database
storage format requires a database administrator to modify it manually offline and that the storage space
and utilization of the system are low. By deciding the data storage format autonomously, the solution
reduces the throughput of query statements while improving the storage space and utilization of the system.
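The overall flow summarized above can be sketched as follows. This is a minimal sketch under stated
assumptions rather than the implementation of the embodiment: the function and parameter names are
hypothetical, the first preset condition is assumed to be a minimum compression ratio, the second preset
condition is assumed to be a maximum sorting time, and the conversion itself is passed in as a callable
because the text does not specify how it is performed.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Verification:
    """Measurements taken after a conversion (hypothetical structure)."""
    compression_ratio: float
    sort_seconds: float


def run_conversion_cycle(
    meets_switch_condition: bool,
    first_format: str,
    determine_second_format: Callable[[], str],
    convert_and_measure: Callable[[str], Verification],
    min_compression_ratio: float,
    max_sort_seconds: float,
) -> Tuple[str, str]:
    """Run one cycle and return (format now in use, next action)."""
    # S210: switch condition not met, keep the first storage format.
    if not meets_switch_condition:
        return first_format, "keep"

    # Determine the second storage format and convert to it.
    second_format = determine_second_format()
    result = convert_and_measure(second_format)

    # Assumed form of the two preset conditions.
    compression_ok = result.compression_ratio >= min_compression_ratio
    sorting_ok = result.sort_seconds <= max_sort_seconds

    if compression_ok and sorting_ok:
        return second_format, "serve"
    # S209: report back so the user can set a new core index threshold.
    return second_format, "feedback"


# Example with stand-in callables in place of a real database.
outcome = run_conversion_cycle(
    meets_switch_condition=True,
    first_format="row",
    determine_second_format=lambda: "column",
    convert_and_measure=lambda fmt: Verification(3.2, 1.5),
    min_compression_ratio=2.0,
    max_sort_seconds=2.0,
)
```

Passing the format decision and the conversion as callables keeps the sketch independent of any
particular database; the "feedback" result corresponds to the case in which the user is asked to set a
new core index threshold before the next cycle.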
It will be apparent to those skilled in the art that, for convenience and brevity of description, the
division into the functional modules above is used only as an example. In practical applications, the
above functions may be allocated to different functional modules as required; that is, the internal
structure of the device may be divided into different functional modules to perform all or part of the
functions described above. For the specific working processes of the systems, devices, and units described
above, reference may be made to the corresponding processes in the foregoing method embodiments, and
details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed device
and method may be implemented in other ways. For example, the device embodiments described above are
merely illustrative: the division into modules or units is only a division by logical function, and other
divisions are possible in actual implementation. For instance, multiple units or components may be
combined or integrated into another system, or some features may be omitted or not performed.
The above is merely a specific embodiment of the present invention, but the protection scope of the
present invention is not limited thereto. Any change or replacement that a person skilled in the art could
readily conceive within the technical scope disclosed by the present invention shall fall within the
protection scope of the present invention. Therefore, the protection scope of the present invention shall
be subject to the protection scope of the claims.