CN105378716B - Method and device for converting a data storage format - Google Patents

Method and device for converting a data storage format

Publication number
CN105378716B
Authority
CN
China
Prior art keywords
data
storage format
database
storage
format
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201480000190.5A
Other languages
Chinese (zh)
Other versions
CN105378716A (en)
Inventor
李怀洲
姜旭栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Luo Sanjie
Original Assignee
Huawei Technologies Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/217 Database tuning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Abstract

Embodiments of the present invention provide a method and device for converting a data storage format, relating to the field of database system technology, which enable a database system to dynamically determine its underlying storage format and thereby tune itself automatically. The method comprises: if a system performance indicator of a database that stores data in a first storage format meets a conversion condition set by a user, determining a second storage format required for storing the data in the database; converting the storage format of the data from the first storage format to the second storage format; judging whether the compression ratio of the database after the storage format conversion meets a first preset condition, sorting the data in the database, and testing whether the sorting time meets a second preset condition; if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, providing external services; or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, re-determining the second storage format.

Description

Method and device for converting a data storage format
Technical field
The present invention relates to the field of database system technology, and in particular to a method and device for converting a data storage format.
Background
As society becomes increasingly information-driven, database systems are used more and more widely, and the continuously accumulating mass of data and its ever-increasing growth place new demands on database systems.
Data stored in a database has a particular storage format, and different storage formats affect the performance of the database system. A row storage structure can complete data loading quickly and adapts well to dynamic loads, but it cannot support fast query processing, and its space utilization is hard to improve substantially. Although a fairly good compression ratio can be obtained by entropy coding and by exploiting column correlation, a complex data storage implementation increases decompression overhead. A column storage structure stores the different fields of one record separately, so reconstructing those records incurs considerable overhead; however, column storage avoids reading unnecessary columns, and compressing the similar data within a column can achieve a higher compression ratio.
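The trade-off between the two layouts can be illustrated with a minimal sketch. The table, the column names, and the helper functions below are hypothetical and are not taken from the patent; the sketch only shows that a column layout lets a scan touch just the needed columns, while rebuilding whole records requires an extra reconstruction step.

```python
from typing import Any

Row = tuple[Any, ...]


def rows_to_columns(rows: list[Row], names: list[str]) -> dict[str, list[Any]]:
    """Regroup row-ordered records so that each column's values are contiguous."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def columns_to_rows(columns: dict[str, list[Any]], names: list[str]) -> list[Row]:
    """Rebuild full records from a column layout (the reconstruction step)."""
    return list(zip(*(columns[name] for name in names)))


def values_touched(layout: str, n_rows: int, n_cols: int, cols_needed: int) -> int:
    """Count values read by a scan: a row layout reads whole records,
    a column layout reads only the columns the query needs."""
    return n_rows * (n_cols if layout == "row" else cols_needed)


# Hypothetical three-column table.
names = ["id", "city", "amount"]
rows = [(1, "Shenzhen", 10.0), (2, "Shenzhen", 12.5), (3, "Beijing", 7.0)]
columns = rows_to_columns(rows, names)
```

In this toy model a query on one column of the three-column table touches a third of the values under the column layout, at the price of calling `columns_to_rows` whenever complete records are needed.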
At present, various combined row-column storage modes, such as PAX or row-column hybrid storage (RCFile, Record Columnar File), have been developed to balance the strengths and weaknesses of row storage and column storage. By optimizing the underlying storage format, these storage modes further improve system performance.
However, the row, column, or row-column hybrid storage format of an existing database depends entirely on the initial setting made when the database is initialized, that is, on the underlying storage format that the user specifies when creating the database. When the user needs to change the database storage format, a database administrator (DBA, Database Administrator) must modify it manually and offline.
Summary of the invention
Embodiments of the present invention provide a method and device for converting a data storage format, which enable a database system to dynamically determine the underlying storage format of the database according to load conditions, realizing automatic tuning of the system, reducing the throughput of query statements, and improving the storage space and utilization of the system.
To achieve the above objectives, the embodiments of the present invention adopt the following technical solutions:
In a first aspect, an embodiment of the present invention provides a controller, comprising:
a decision unit, configured to determine, if a system performance indicator of a database that stores data in a first storage format meets a conversion condition set by a user, a second storage format required for storing the data in the database;
a storage format conversion unit, configured to convert the storage format of the data in the database from the first storage format to the second storage format;
a feedback unit, configured to judge whether the compression ratio of the database after the storage format conversion meets a first preset condition, to sort the data in the converted database, and to test whether the sorting time meets a second preset condition; if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, external services are provided; or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, feedback information is sent to the decision unit, so that the decision unit re-determines, according to the core indicator threshold to be set by the user in the feedback information, the second storage format required for storing the data in the database.
In a first possible implementation of the first aspect:
the decision unit is further configured to judge, before determining the second storage format required for storing the data in the database, whether the system performance indicator meets a core indicator threshold set by the user; and, if the system performance indicator meets the core indicator threshold, to judge whether the system performance indicator meets the conversion condition.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the decision unit is further configured to keep the first storage format as the format in which data is stored in the database if the system performance indicator does not meet the conversion condition.
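The two-stage check described above (core indicator threshold first, conversion condition second) might be sketched as follows. The specific indicators used here (a data volume and a query share), the `Limits` fields, and the choice to keep the current format when the core threshold is not met are all assumptions made for illustration; the text does not fix them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    core_min_data_volume: int   # hypothetical core indicator threshold (rows)
    min_query_share: float      # hypothetical conversion condition


def decide_format(current: str, proposed: str, data_volume: int,
                  query_share: float, limits: Limits) -> str:
    """Return the storage format to use after the two-stage check."""
    # Stage 1: core indicator threshold. Assumption: if it is not met,
    # the conversion condition is not evaluated and nothing changes.
    if data_volume < limits.core_min_data_volume:
        return current
    # Stage 2: conversion condition. If it is not met, keep the first format.
    if query_share < limits.min_query_share:
        return current
    return proposed


limits = Limits(core_min_data_volume=1_000_000, min_query_share=0.8)
```

For example, `decide_format("row", "column", 5_000_000, 0.9, limits)` proposes the column format, while a query share of 0.5 leaves the row format in place.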
With reference to the first aspect or any one of the first to second possible implementations of the first aspect, in a third possible implementation of the first aspect, the controller further includes a data acquisition unit;
the data acquisition unit is configured to acquire the system performance indicator.
With reference to the first aspect or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the decision unit is further configured to determine, after determining the second storage format required for storing the data in the database and according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the storage format conversion unit is specifically configured to reorganize the data in the database in a buffer according to the second storage format and the conversion time, and, if the amount of data in the buffer reaches a disk write threshold, to write the data in the buffer to disk.
With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the storage format conversion unit is specifically configured to: if the second storage format of the data in the buffer is single-column storage, convert the storage format of the data in the buffer from the first storage format to single-column storage, and compress and store the data in the buffer; or, if the second storage format of the data in the buffer is row-column hybrid storage or row storage, convert the storage format of the data in the buffer from the first storage format to row-column hybrid storage or row storage, and store the data in the buffer.
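The buffered reorganization and threshold-triggered write can be sketched roughly as below. This is an illustration under several assumptions: the threshold is counted in rows rather than bytes, a Python list stands in for the disk, JSON is used as a placeholder serialization, and the hybrid format is modeled simply as an uncompressed column-wise block per flush. The class and attribute names are hypothetical.

```python
import json
import zlib


class ConversionBuffer:
    """Accumulates rows, regroups them for the target format, and flushes at a threshold."""

    def __init__(self, target_format: str, write_threshold_rows: int):
        self.target_format = target_format          # "column", "hybrid", or "row"
        self.write_threshold_rows = write_threshold_rows
        self.rows: list[tuple] = []
        self.disk: list[bytes] = []                 # stand-in for blocks written to disk

    def add(self, row: tuple) -> None:
        self.rows.append(row)
        if len(self.rows) >= self.write_threshold_rows:
            self.flush()                            # disk write threshold reached

    def flush(self) -> None:
        if not self.rows:
            return
        if self.target_format == "row":
            body = [list(r) for r in self.rows]          # keep records together
        else:
            body = [list(c) for c in zip(*self.rows)]    # group values by column
        payload = json.dumps(body).encode("utf-8")
        if self.target_format == "column":
            payload = zlib.compress(payload)        # only column storage is compressed
        self.disk.append(payload)
        self.rows = []


buf = ConversionBuffer("column", write_threshold_rows=2)
for row in [(1, "a"), (2, "b"), (3, "c")]:
    buf.add(row)
buf.flush()  # write the final partial block
```

Compressing only in the column case mirrors the text, which mentions compression for single-column storage and plain storage for the hybrid and row formats.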
With reference to the fifth possible implementation of the first aspect, in a seventh possible implementation of the first aspect, the controller further includes a reading unit;
the reading unit is configured to read the data in the database row by row if the first storage format is row storage, before the storage format conversion unit reorganizes the data in the database in the buffer according to the second storage format and the conversion time.
With reference to the first aspect or any one of the first to seventh possible implementations of the first aspect, in an eighth possible implementation of the first aspect,
the data acquisition unit is further configured to acquire the data throughput and the query statement response time of the database after the storage format conversion, once external services are provided because the compression ratio meets the first preset condition and the sorting time meets the second preset condition.
With reference to the first aspect or any one of the first to eighth possible implementations of the first aspect, in a ninth possible implementation of the first aspect, the system performance indicator includes at least the data volume, the average amount of data accessed per query, the ratio of the number of rows processed to the number of columns read, the average proportion of columns accessed per query, and the proportion of query statements.
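As an illustration of how some of these indicators could be derived, the sketch below summarizes a small statement log. The `Statement` record, the log format, and the exact formulas are assumptions; the text names the indicators but does not define how they are computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    kind: str            # "query" or "write" (hypothetical classification)
    columns_read: int    # number of columns the statement accessed
    rows_processed: int  # number of rows the statement processed


def summarize(log: list[Statement], total_columns: int) -> dict[str, float]:
    """Compute a few workload indicators from a statement log."""
    queries = [s for s in log if s.kind == "query"]
    if not queries:
        return {"query_share": 0.0, "avg_column_ratio": 0.0, "avg_rows_per_query": 0.0}
    return {
        # proportion of query statements among all statements
        "query_share": len(queries) / len(log),
        # average fraction of the table's columns that a query accesses
        "avg_column_ratio": sum(s.columns_read for s in queries)
        / (len(queries) * total_columns),
        # average number of rows processed per query
        "avg_rows_per_query": sum(s.rows_processed for s in queries) / len(queries),
    }


log = [Statement("query", 2, 100), Statement("query", 4, 300), Statement("write", 8, 1)]
stats = summarize(log, total_columns=8)
```

A high query share combined with a low column ratio would be the kind of read-heavy, narrow-scan workload for which the description later notes that column storage has an advantage.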
In a second aspect, an embodiment of the present invention provides a controller, comprising:
a processor, configured to: determine, if a system performance indicator of a database that stores data in a first storage format meets a conversion condition set by a user, a second storage format required for storing the data in the database; after a format converter converts the storage format of the data in the database from the first storage format to the second storage format, judge whether the compression ratio of the converted database meets a first preset condition, sort the data in the converted database, and test whether the sorting time meets a second preset condition; and, if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, provide external services, or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, re-determine, according to the core indicator threshold to be set by the user, the second storage format required for storing the data in the database;
a format converter, configured to convert the storage format of the data in the database from the first storage format to the second storage format determined by the processor.
In a first possible implementation of the second aspect, the processor is further configured to judge, before determining the second storage format required for storing the data in the database, whether the system performance indicator meets a core indicator threshold set by the user; and, if the system performance indicator meets the core indicator threshold, to judge whether the system performance indicator meets the conversion condition.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the processor is further configured to keep the first storage format as the format in which data is stored in the database if the system performance indicator does not meet the conversion condition.
With reference to the second aspect or any one of the first to second possible implementations of the second aspect, in a third possible implementation of the second aspect, the controller further includes a data collector;
the data collector is configured to acquire the system performance indicator.
With reference to the second aspect or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the processor is further configured to determine, after determining the second storage format required for storing the data in the database and according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the format converter is specifically configured to reorganize the data in the database in a buffer according to the second storage format and the conversion time, and, if the amount of data in the buffer reaches a disk write threshold, to write the data in the buffer to disk.
With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the format converter is specifically configured to: if the second storage format of the data in the buffer is single-column storage, convert the storage format of the data in the buffer from the first storage format to single-column storage, and compress and store the data in the buffer; or, if the second storage format of the data in the buffer is row-column hybrid storage or row storage, convert the storage format of the data in the buffer from the first storage format to row-column hybrid storage or row storage, and store the data in the buffer.
With reference to the fifth possible implementation of the second aspect, in a seventh possible implementation of the second aspect, the processor is further configured to read the data in the database row by row if the first storage format is row storage, before the format converter reorganizes the data in the database in the buffer according to the second storage format and the conversion time.
With reference to the second aspect or any one of the first to seventh possible implementations of the second aspect, in an eighth possible implementation of the second aspect,
the data collector is further configured to acquire the data throughput and the query statement response time of the database after the storage format conversion, once external services are provided because the compression ratio meets the first preset condition and the sorting time meets the second preset condition.
With reference to the second aspect or any one of the first to eighth possible implementations of the second aspect, in a ninth possible implementation of the second aspect, the system performance indicator includes at least the data volume, the average amount of data accessed per query, the ratio of the number of rows processed to the number of columns read, the average proportion of columns accessed per query, and the proportion of query statements.
In a third aspect, an embodiment of the present invention provides a method for converting a data storage format, comprising:
Step a: if a system performance indicator of a database that stores data in a first storage format meets a conversion condition set by a user, determining a second storage format required for storing the data in the database;
Step b: converting the storage format of the data in the database from the first storage format to the second storage format;
Step c: judging whether the compression ratio of the database after the storage format conversion meets a first preset condition, sorting the data in the converted database, and testing whether the sorting time meets a second preset condition;
Step d: if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, providing external services; or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, re-executing the above steps according to the core indicator threshold to be set by the user.
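Steps a to d form a feedback loop, which might be sketched as follows. Several points are assumptions made for illustration: the decision step is reduced to trying an ordered list of candidate formats, the compression ratio is taken as uncompressed size divided by compressed size (so larger is better), the first preset condition is a minimum ratio, and the second preset condition is a maximum sorting time. The measurement callbacks are hypothetical stand-ins for real database operations.

```python
from typing import Callable, Iterable, Optional


def convert_with_feedback(
    candidates: Iterable[str],
    convert: Callable[[str], None],
    compression_ratio: Callable[[], float],
    sort_seconds: Callable[[], float],
    min_ratio: float,
    max_sort_seconds: float,
) -> Optional[str]:
    """Try candidate formats until one passes both checks; None if none does."""
    for fmt in candidates:                                # step a: choose (or re-choose) a format
        convert(fmt)                                      # step b: convert the stored data
        ratio_ok = compression_ratio() >= min_ratio       # step c: first preset condition
        sort_ok = sort_seconds() <= max_sort_seconds      # step c: second preset condition
        if ratio_ok and sort_ok:                          # step d: accept, serve externally
            return fmt
        # step d, otherwise: fall through and re-determine the format
    return None


# Hypothetical measurements per format, used as stubs.
state = {"fmt": "row"}
ratios = {"hybrid": 1.5, "column": 4.0}
sort_times = {"hybrid": 0.2, "column": 0.5}
chosen = convert_with_feedback(
    ["hybrid", "column"],
    convert=lambda fmt: state.update(fmt=fmt),
    compression_ratio=lambda: ratios[state["fmt"]],
    sort_seconds=lambda: sort_times[state["fmt"]],
    min_ratio=2.0,
    max_sort_seconds=1.0,
)
```

With these stub values the hybrid format fails the compression check, so the loop moves on and settles on the column format.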
In a first possible implementation of the third aspect, before the determining of the second storage format required for storing the data in the database, the method further includes:
judging whether the system performance indicator meets a core indicator threshold set by the user;
if the system performance indicator meets the core indicator threshold, judging whether the system performance indicator meets the conversion condition.
With reference to the first possible implementation of the third aspect, in a second possible implementation of the third aspect, if the system performance indicator does not meet the conversion condition, the first storage format is kept as the format in which data is stored in the database.
With reference to the third aspect or any one of the first to second possible implementations of the third aspect, in a third possible implementation of the third aspect, the method further includes:
acquiring the system performance indicator.
With reference to the third aspect or any one of the first to third possible implementations of the third aspect, in a fourth possible implementation of the third aspect, after the second storage format required for storing the data in the database is determined, the method further includes:
determining, according to user configuration information, the conversion time at which the storage format of the data in the database is converted from the first storage format to the second storage format.
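The text does not say what the user configuration information contains. As one hedged possibility, the sketch below assumes the user configures a daily start time (for instance, the beginning of a low-load window) and computes the next occurrence of that time; the function name and the configuration shape are hypothetical.

```python
from datetime import datetime, time, timedelta


def next_conversion_time(now: datetime, window_start: time) -> datetime:
    """Return the next occurrence of the configured start time at or after `now`."""
    candidate = datetime.combine(now.date(), window_start)
    if candidate < now:
        candidate += timedelta(days=1)  # today's window has passed, use tomorrow's
    return candidate


# Assumed configuration: conversions may start at 02:00 each day.
scheduled = next_conversion_time(datetime(2024, 1, 1, 10, 0), time(2, 0))
```

Passing `now` explicitly, rather than reading the system clock inside the function, keeps the scheduling logic easy to test.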
With reference to the fourth possible implementation of the third aspect, in a fifth possible implementation of the third aspect, the converting of the storage format of the data in the database from the first storage format to the second storage format specifically includes:
reorganizing the data in the database in a buffer according to the second storage format and the conversion time;
if the amount of data in the buffer reaches a disk write threshold, writing the data in the buffer to disk.
With reference to the fifth possible implementation of the third aspect, in a sixth possible implementation of the third aspect, the writing of the data in the buffer to disk specifically includes:
if the second storage format of the data in the buffer is single-column storage, converting the storage format of the data in the buffer from the first storage format to single-column storage, and compressing and storing the data in the buffer; or,
if the second storage format of the data in the buffer is row-column hybrid storage or row storage, converting the storage format of the data in the buffer from the first storage format to row-column hybrid storage or row storage, and storing the data in the buffer.
With reference to the fifth possible implementation of the third aspect, in a seventh possible implementation of the third aspect, before the reorganizing of the data in the database in the buffer according to the second storage format and the conversion time, the method further includes:
if the first storage format is row storage, reading the data in the database row by row.
With reference to the third aspect or any one of the first to seventh possible implementations of the third aspect, in an eighth possible implementation of the third aspect, after external services are provided because the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the method further includes:
acquiring the data throughput and the query statement response time of the database after the storage format conversion.
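Collecting these two post-conversion measurements could look roughly like the sketch below. It assumes that each query is represented by a callable returning the number of rows it produced, that throughput is measured in rows per second, and that response time is averaged per query; none of these definitions come from the text. The clock is injectable so the logic can be exercised deterministically.

```python
import time
from typing import Callable, Sequence


def measure(
    queries: Sequence[Callable[[], int]],
    clock: Callable[[], float] = time.perf_counter,
) -> tuple[float, float]:
    """Run each query once; return (rows per second, mean response time in seconds)."""
    total_rows = 0
    latencies: list[float] = []
    start = clock()
    for run in queries:
        t0 = clock()
        total_rows += run()             # each callable returns the rows it produced
        latencies.append(clock() - t0)  # response time of this query
    elapsed = clock() - start
    throughput = total_rows / elapsed if elapsed > 0 else 0.0
    mean_latency = sum(latencies) / len(latencies) if latencies else 0.0
    return throughput, mean_latency


# Deterministic demonstration with a fake clock instead of real timing.
ticks = iter([0.0, 0.0, 1.0, 1.0, 3.0, 3.0])
throughput, mean_latency = measure([lambda: 30, lambda: 60], clock=lambda: next(ticks))
```

In real use the default `time.perf_counter` clock would be kept and the callables would execute actual query statements against the converted database.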
With reference to the third aspect or any one of the first to eighth possible implementations of the third aspect, in a ninth possible implementation of the third aspect, the system performance indicator includes at least the data volume, the average amount of data accessed per query, the ratio of the number of rows processed to the number of columns read, the average proportion of columns accessed per query, and the proportion of query statements.
Embodiments of the present invention provide a method and device for converting a data storage format. If a system performance indicator of a database that stores data in a first storage format meets a conversion condition set by a user, the controller determines a second storage format required for storing the data in the database and converts the storage format of the data in the database from the first storage format to the second storage format. The controller then judges whether the compression ratio of the converted database meets a first preset condition, sorts the data in the converted database, and tests whether the sorting time meets a second preset condition. If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, external services are provided; or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the second storage format required for storing the data in the database is determined again according to the core indicator threshold to be set by the user. With this solution, the controller monitors the actual operating data of the system and continually determines the optimal storage format for the data in the database system; that is, the controller enables the database system to dynamically determine the storage format of its data according to load conditions. This solves the current problems that changing the database storage format requires a database administrator to modify it manually and offline, and that system storage space and utilization are low. By deciding the data storage format autonomously, the throughput of query statements is reduced while the storage space and utilization of the system are improved.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Apparently, the accompanying drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a first schematic structural diagram of a controller according to an embodiment of the present invention;
Fig. 2 is a second schematic structural diagram of a controller according to an embodiment of the present invention;
Fig. 3 is a third schematic structural diagram of a controller according to an embodiment of the present invention;
Fig. 4 is a fourth schematic structural diagram of a controller according to an embodiment of the present invention;
Fig. 5 is a first schematic flowchart of a method for converting a data storage format according to an embodiment of the present invention;
Fig. 6 is a second schematic flowchart of a method for converting a data storage format according to an embodiment of the present invention.
Detailed description of the embodiments
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The techniques described herein are applicable to the database field, for example: dynamic optimization of the underlying data storage format of a database, data distribution in a database cluster, database materialization strategies, database indexing strategies, and the like.
At present, database-based applications fall mainly into two classes: OLTP (On-Line Transaction Processing) and OLAP (On-Line Analytical Processing). The former must handle transactional queries that involve frequent write operations, while the latter focuses on analytical queries that involve a large number of read operations. Column storage has a significant advantage for read operations and is well suited to OLAP queries, but its support for write operations is unsatisfactory, so it is not suitable for OLTP queries. Row storage supports OLTP queries very well.
The advantages of the row storage structure are fast data loading and high adaptability to dynamic loads, because row storage guarantees that all fields of the same record reside on the same node. The disadvantages of row storage are also obvious. For example, it cannot support fast query processing, because when a query targets only a few columns of a multi-column table it cannot skip reading the unnecessary columns. In addition, because columns holding different kinds of data values are mixed together, row storage does not readily achieve a high compression ratio, that is, space utilization is hard to improve substantially. Although a fairly good compression ratio can be obtained by entropy coding and by exploiting column correlation, a complex data storage implementation increases decompression overhead.
Column storage stores the different fields of one record separately, and reconstructing those records incurs considerable overhead. However, column storage avoids reading unnecessary columns, and compressing the similar data within a column can achieve a higher compression ratio.
At present, various combined row-column storage modes, such as PAX or RCFile, have been developed to balance the strengths and weaknesses of row storage and column storage. By optimizing the underlying storage format, these storage modes further improve system performance. However, the row, column, or row-column hybrid storage format of an existing database depends entirely on the initial setting made when the database is initialized, that is, on the underlying storage format that the user specifies when creating the database. When the user needs to change the database storage format, a database administrator (DBA, Database Administrator) must modify it manually and offline, so the system lacks an automatic tuning capability.
The present invention provides a method and device for converting a data storage format, which enable a database system to dynamically determine the underlying storage format of the database according to load conditions, realizing automatic tuning of the system, reducing the throughput of query statements, and improving the storage space and utilization of the system.
Embodiment one
The embodiment of the present invention provides a kind of controller, as shown in Figure 1, comprising:
Decision package 10, if for meeting user with the system performance index of the database of the first storage format storing data The switch condition of setting, it is determined that the second storage format needed for storing data in the database;
Storage format converting unit 11, for the storage format of the data in the database to be stored lattice from described first Formula is converted to second storage format that the decision package 10 determines;
Feedback unit 12, for judging whether the compression ratio of the database after storage format is converted meets the first default item Part, and the data in the database after storage format conversion are ranked up, it is pre- whether the test sequencing time meets second If condition;If the compression ratio meets the first preset condition, and the sorting time meets the second preset condition, then carries out external Service, alternatively, if the compression ratio is unsatisfactory for the first preset condition and/or the sorting time is unsatisfactory for the second preset condition, Feedback information is then sent to the decision package 10, so that the decision package waits setting according to the user in the feedback information Fixed core index threshold value redefines the second storage format needed for storing data in the database.
Further, the decision unit 10 is also configured to: before determining the second storage format required for storing data in the database, judge whether the system performance indicator meets a core indicator threshold set by the user; and if the system performance indicator meets the core indicator threshold, judge whether the system performance indicator meets the conversion condition.
Further, the decision unit 10 is also configured to: if the system performance indicator does not meet the conversion condition, keep the first storage format as the format for storing data in the database.
Further, as shown in Figure 2, the controller further comprises a data acquisition unit 13;
The data acquisition unit 13 is configured to acquire the system performance indicator.
Further, the decision unit 10 is also configured to: after determining the second storage format required for storing data in the database, determine, according to user configuration information, the conversion moment at which the storage format of the data in the database is converted from the first storage format to the second storage format.
Further, the storage format conversion unit 11 is specifically configured to: according to the second storage format determined by the decision unit 10 and the conversion moment, recombine the data in the database in a buffer; and if the amount of data in the buffer reaches a disk write threshold, write the data in the buffer to disk.
Further, the storage format conversion unit 11 is specifically configured to: if the second storage format of the data in the buffer is single-column storage, convert the storage format of the data in the buffer from the first storage format to the single-column storage, and compress and store the data in the buffer; or, if the second storage format of the data in the buffer is row-column hybrid storage or row storage, convert the storage format of the data in the buffer from the first storage format to the row-column hybrid storage or row storage, and store the data in the buffer.
Further, as shown in Figure 2, the controller further comprises a reading unit 14;
The reading unit 14 is configured to: before the storage format conversion unit 11 recombines the data in the database in the buffer according to the second storage format and the conversion moment, read the data in the database by row if the first storage format is row storage.
Further, the data acquisition unit 13 is also configured to: if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, acquire, after external service is provided, the data throughput and the query statement response time of the database after the storage format conversion.
Further, the system performance indicators include at least the data volume, the average data volume accessed per query, the proportion of processed rows to read rows, the proportion of columns accessed per query on average, and the proportion of query statements.
The present invention provides a controller that mainly comprises a decision unit, a storage format conversion unit, and a feedback unit. If the system performance indicator of a database that stores data in a first storage format meets the conversion condition set by the user, the decision unit determines the second storage format required for storing data in the database. The storage format conversion unit then converts the storage format of the data in the database from the first storage format to the second storage format. Finally, the feedback unit judges whether the compression ratio of the database after the storage format conversion meets the first preset condition, and sorts the data in the converted database to test whether the sorting time meets the second preset condition. If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, external service is provided; or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, feedback information is sent to the decision unit, so that the decision unit re-determines, according to the core indicator threshold to be set by the user in the feedback information, the second storage format required for storing data in the database. With this solution, the controller monitors the actual running data of the system and continuously determines the optimal storage format of the data in the database system; that is, the controller enables the database system to dynamically determine the storage format of its data according to the load. This solves the current problems that changing the database storage format requires manual offline modification by a database administrator and that system storage space and utilization are low. By deciding the data storage format autonomously, the solution reduces the throughput of query statements and improves the storage space and utilization of the system.
Embodiment two
An embodiment of the present invention provides a controller. As shown in Figure 3, the controller comprises:
A processor 20, configured to: if a system performance indicator of a database that stores data in a first storage format meets a conversion condition set by a user, determine a second storage format required for storing data in the database; after a format converter 21 converts the storage format of the data in the database from the first storage format to the second storage format, judge whether the compression ratio of the database after the storage format conversion meets a first preset condition, and sort the data in the database after the storage format conversion to test whether the sorting time meets a second preset condition; if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, provide external service; or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, re-determine, according to the core indicator threshold to be set by the user, the second storage format required for storing data in the database;
A format converter 21, configured to convert the storage format of the data in the database from the first storage format to the second storage format determined by the processor 20.
Further, the processor 20 is also configured to: before determining the second storage format required for storing data in the database, judge whether the system performance indicator meets a core indicator threshold set by the user; and if the system performance indicator meets the core indicator threshold, judge whether the system performance indicator meets the conversion condition.
Further, the processor 20 is also configured to: if the system performance indicator does not meet the conversion condition, keep the first storage format as the format for storing data in the database.
Further, as shown in Figure 4, the controller further comprises a data collector 22;
The data collector 22 is configured to acquire the system performance indicator.
Further, the processor 20 is also configured to: after determining the second storage format required for storing data in the database, determine, according to user configuration information, the conversion moment at which the storage format of the data in the database is converted from the first storage format to the second storage format.
Further, the format converter 21 is specifically configured to: according to the second storage format and the conversion moment, recombine the data in the database in a buffer; and if the amount of data in the buffer reaches a disk write threshold, write the data in the buffer to disk.
Further, the format converter 21 is specifically configured to: if the second storage format of the data in the buffer is single-column storage, convert the storage format of the data in the buffer from the first storage format to the single-column storage, and compress and store the data in the buffer; or, if the second storage format of the data in the buffer is row-column hybrid storage or row storage, convert the storage format of the data in the buffer from the first storage format to the row-column hybrid storage or row storage, and store the data in the buffer.
Further, the processor 20 is also configured to: before the format converter 21 recombines the data in the database in the buffer according to the second storage format and the conversion moment, read the data in the database by row if the first storage format is row storage.
Further, the data collector 22 is also configured to: if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, acquire, after external service is provided, the data throughput and the query statement response time of the database after the storage format conversion.
Further, the system performance indicators include at least the data volume, the average data volume accessed per query, the proportion of processed rows to read rows, the proportion of columns accessed per query on average, and the proportion of query statements.
An embodiment of the present invention provides a controller that mainly comprises a processor and a format converter. If the system performance indicator of a database that stores data in a first storage format meets the conversion condition set by the user, the controller determines the second storage format required for storing data in the database and converts the storage format of the data in the database from the first storage format to the second storage format. The controller then judges whether the compression ratio of the database after the storage format conversion meets the first preset condition, and sorts the data in the converted database to test whether the sorting time meets the second preset condition. If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, external service is provided; or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the controller re-determines, according to the core indicator threshold to be set by the user, the second storage format required for storing data in the database. With this solution, the controller monitors the actual running data of the system and continuously determines the optimal storage format of the data in the database system; that is, the controller enables the database system to dynamically determine the storage format of its data according to the load. This solves the current problems that changing the database storage format requires manual offline modification by a database administrator and that system storage space and utilization are low. By deciding the data storage format autonomously, the solution reduces the throughput of query statements and improves the storage space and utilization of the system.
Embodiment three
An embodiment of the present invention provides a method for converting a data storage format. As shown in Figure 5, the method comprises:
S101. If the system performance indicator of a database that stores data in a first storage format meets the conversion condition set by the user, the controller determines the second storage format required for storing data in the database.
The row, column, or row-column hybrid storage format of an existing database depends entirely on the initial setting made when the database is initialized; that is, the underlying storage format is specified by the user when the database is created. When the user needs to change the storage format of the database, a DBA has to modify it manually and offline, and the system lacks an automatic tuning function.
To solve the problem that the system cannot automatically adjust and optimize the storage format of the data in the database, the present invention provides a method for converting a data storage format, which enables the database system to dynamically determine the underlying storage format of the database according to the load, thereby implementing an automatic tuning function of the system.
In practical applications, OLTP applications and OLAP applications of a database show their advantages on write operations and read operations, respectively. To combine the strengths of row storage and column storage, various row-column hybrid storage modes have been produced, that is, a fusion of OLTP and OLAP. In an application environment oriented to the fusion of OLTP and OLAP, the data exists in row form when the database is initialized.
To complete the conversion of the data storage format in the database system, the controller first needs to determine the storage format required for storing data in the database, that is, to determine in which format the data stored in the database needs to be stored.
Specifically, if the system performance indicator of the database that stores data in the first storage format meets the conversion condition set by the user, the controller determines the second storage format required for storing data in the database.
The system performance indicators are collected by the controller over a certain period of time and include at least the data volume, the average data volume accessed per query, the proportion of processed rows to read rows, the proportion of columns accessed per query on average, and the proportion of query statements.
Further, before determining the second storage format required for storing data in the database, the controller needs to judge whether the system performance indicator of the database meets the conversion condition set by the user. If the system performance indicator does not meet the conversion condition set by the user, the controller performs no processing, and the data in the database is still stored in the first storage format.
For example, suppose the conversion condition set by the user is that a column can be converted to column storage when its access frequency (number of accesses to the column / number of accesses to the table) reaches 80%. When the controller finds that the access frequency of the n-th column in the database reaches 80%, the controller determines that the data of the n-th column is to be stored in the column storage format.
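The access-frequency condition in this example can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `columns_to_convert`, the dictionary of per-column access counts, and the default threshold of 0.8 are assumptions chosen to mirror the 80% example above.

```python
def columns_to_convert(column_accesses, table_accesses, threshold=0.8):
    """Return the columns whose access frequency reaches the threshold.

    Access frequency = accesses to the column / accesses to the table.
    All names here are hypothetical; the patent only states the ratio.
    """
    if table_accesses <= 0:
        return []  # no accesses observed, so nothing qualifies
    return sorted(
        col
        for col, count in column_accesses.items()
        if count / table_accesses >= threshold
    )


# Hypothetical counters collected during one monitoring period.
accesses = {"id": 95, "name": 40, "price": 80}
candidates = columns_to_convert(accesses, table_accesses=100)
```

With these sample counts, `id` (95%) and `price` (80%) reach the threshold, while `name` (40%) stays in the first storage format.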
Correspondingly, while determining the second storage format required for storing data in the database, the controller needs to calculate which tables in the database need row-column conversion, and to determine which columns in the database need to be stored together and which columns need to be stored separately. For example, the column that is accessed on its own most frequently is stored separately in column mode.
Further, after determining the second storage format required for storing data in the database, the controller also needs to determine, according to user configuration information, the conversion moment at which the storage format of the data in the database is converted from the first storage format to the second storage format.
Specifically, the controller may perform the storage format conversion at a moment when the load is idle, as determined from the system performance indicators. Alternatively, after determining the second storage format required for storing data in the database, the controller may prompt the user that a conversion is possible and display the second storage format to the user, and then perform the conversion after the user enters a command.
S102. The controller converts the storage format of the data in the database from the first storage format to the second storage format.
After determining the second storage format required for storing data in the database, the controller converts the storage format of the data in the database from the first storage format to the second storage format.
Specifically, after determining the second storage format and the conversion moment, the controller recombines the data in the database in a buffer according to the second storage format. When the amount of data in the buffer reaches the disk write threshold, the controller writes the data in the buffer to disk.
Further, the controller processes the data in the buffer differently depending on the second storage format of that data. If the second storage format of the data in the buffer is single-column storage, the controller converts the storage format of the data in the buffer from the first storage format to single-column storage, and compresses and stores the data in the buffer. Alternatively, if the second storage format of the data in the buffer is row-column hybrid storage or row storage, the controller converts the storage format of the data in the buffer from the first storage format to row-column hybrid storage or row storage, and stores the data in the buffer.
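The buffering behaviour of S102 can be illustrated with a small sketch. It assumes a byte-count write threshold, uses an in-memory list as a stand-in for the disk, and uses `zlib` as the compressor; the patent names none of these, so the class `ConversionBuffer` and all of its attributes are hypothetical.

```python
import zlib


class ConversionBuffer:
    """Collects recombined data and flushes it once a write threshold is hit."""

    def __init__(self, write_threshold, single_column):
        self.write_threshold = write_threshold  # bytes; assumed unit
        self.single_column = single_column      # True: compress before storing
        self.pending = []
        self.pending_bytes = 0
        self.disk = []  # stand-in for disk: one entry per written block

    def add(self, values):
        # Serialise one recombined chunk and account for its size.
        chunk = "\n".join(str(v) for v in values).encode("utf-8")
        self.pending.append(chunk)
        self.pending_bytes += len(chunk)
        if self.pending_bytes >= self.write_threshold:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        block = b"\n".join(self.pending)
        if self.single_column:
            # Single-column data is compressed before it is stored.
            block = zlib.compress(block)
        self.disk.append(block)
        self.pending = []
        self.pending_bytes = 0
```

Row-column hybrid or row data would be created with `single_column=False`, so that it is written without the compression step, matching the two branches described above.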
S103. The controller judges whether the compression ratio of the database after the storage format conversion meets the first preset condition, and sorts the data in the database after the storage format conversion to test whether the sorting time meets the second preset condition.
After converting the storage format of the data in the database from the first storage format to the second storage format, the controller needs to check whether the conversion is reasonable, that is, whether it can optimize the database.
Specifically, the controller judges the change in size of the tables in the database that have been converted to columns, and measures the sorting time with a simple sort. The size of a table directly reflects the compression ratio of the database, and the higher the compression ratio, the higher the space utilization. The length of the sorting time reflects how much CPU and memory resource is consumed, and column storage can improve sorting efficiency. Therefore, to check whether the conversion of the storage format is reasonable, the controller needs to judge whether the compression ratio of the database after the storage format conversion meets the first preset condition, and to sort the data in the converted database to test whether the sorting time meets the second preset condition.
The first preset condition is that the compression ratio of the database after the storage format conversion is less than or equal to a first preset threshold; the second preset condition is that the sorting time is less than or equal to a second preset threshold.
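The two checks of S103 can be sketched as a single predicate. The text does not define how the compression ratio is computed, so this sketch assumes it is the converted size divided by the original size, which is one reading under which "less than or equal to the first preset threshold" is the favourable direction. The function name and parameters are hypothetical.

```python
import time


def conversion_is_acceptable(size_before, size_after, sort_column,
                             max_ratio, max_sort_seconds):
    """Check the first and second preset conditions after a conversion.

    Assumption: compression ratio = size after conversion / size before.
    size_before is expected to be positive.
    """
    ratio = size_after / size_before

    # Time a simple sort over one converted column.
    start = time.perf_counter()
    sorted(sort_column)
    sort_seconds = time.perf_counter() - start

    return ratio <= max_ratio and sort_seconds <= max_sort_seconds
```

When the predicate is true, the controller would go on to provide external service; when it is false, the second storage format would be re-determined.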
S104. If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the controller provides external service.
Specifically, if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the current storage format conversion can improve the performance of the database system, and the controller therefore provides external service.
Further, after external service is provided, the controller needs to continue to observe and acquire the corresponding indicators to judge whether the performance of the database system has been optimized, that is, whether columns in the database system need to be split further, while avoiding the splitting of columns that do not need to be split.
The indicators acquired by the controller include the data throughput and the query statement response time. The data throughput reflects whether redundant data reads have been reduced; if the change in data throughput is not obvious, the columns may need to be split further. The query statement response time is a direct indicator of whether column storage is effective.
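As a rough illustration of the throughput check, the sketch below flags a table for further column splitting when the data throughput dropped by less than a chosen fraction after conversion. The 10% default and the function name `needs_further_split` are assumptions; the text only says that an "unobvious" change suggests further splitting.

```python
def needs_further_split(throughput_before, throughput_after,
                        min_relative_drop=0.1):
    """Return True when the drop in data throughput is too small to count.

    min_relative_drop is a hypothetical tuning parameter.
    """
    if throughput_before <= 0:
        return False  # no baseline to compare against
    drop = (throughput_before - throughput_after) / throughput_before
    return drop < min_relative_drop
```

A drop from 100 to 95 units (5%) would be flagged for further splitting, while a drop from 100 to 50 units would not.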
S105. If the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the controller re-executes the next conversion of the data storage format according to the core indicator threshold to be set by the user.
If the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the second storage format determined by the controller cannot effectively improve the performance of the database system. The controller then needs to re-execute the next conversion of the data storage format according to the core indicator threshold to be set by the user, that is, to re-determine the second storage format of the data.
An embodiment of the present invention provides a method for converting a data storage format. If the system performance indicator of a database that stores data in a first storage format meets the conversion condition set by the user, the controller determines the second storage format required for storing data in the database and converts the storage format of the data in the database from the first storage format to the second storage format. The controller then judges whether the compression ratio of the database after the storage format conversion meets the first preset condition, and sorts the data in the converted database to test whether the sorting time meets the second preset condition. If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, external service is provided; or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the controller re-determines, according to the core indicator threshold to be set by the user, the second storage format required for storing data in the database. With this solution, the controller monitors the actual running data of the system and continuously determines the optimal storage format of the data in the database system; that is, the controller enables the database system to dynamically determine the storage format of its data according to the load. This solves the current problems that changing the database storage format requires manual offline modification by a database administrator and that system storage space and utilization are low. By deciding the data storage format autonomously, the solution reduces the throughput of query statements and improves the storage space and utilization of the system.
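The overall flow of S101 to S105 can be summarised as a small control loop. This is only a sketch under stated assumptions: the four callables stand in for the controller's decision, conversion, and feedback steps, and `max_attempts` is a hypothetical bound that the text does not specify.

```python
def run_conversion_cycle(indicators, meets_condition, choose_format,
                         convert, verify, max_attempts=3):
    """Return the accepted second storage format, or None if none is accepted.

    meets_condition(indicators) -> bool       : S101 conversion condition
    choose_format(indicators, attempt) -> fmt : determine the second format
    convert(fmt) -> converted                 : S102 conversion
    verify(converted) -> bool                 : S103 compression/sort checks
    """
    if not meets_condition(indicators):
        return None  # keep the first storage format
    for attempt in range(max_attempts):
        fmt = choose_format(indicators, attempt)
        converted = convert(fmt)
        if verify(converted):
            return fmt  # S104: external service can be provided
        # S105: feedback, so the second format is re-determined
    return None
```

Passing the steps in as callables keeps the sketch independent of any particular storage engine; a real controller would bind them to its own decision, conversion, and feedback components.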
Embodiment four
S201. The controller acquires the system performance indicators of the database that stores data in the first storage format.
The row, column, or row-column hybrid storage format of an existing database depends entirely on the initial setting made when the database is initialized; that is, the underlying storage format is specified by the user when the database is created. When the user needs to change the storage format of the database, a DBA has to modify it manually and offline, and the system lacks an automatic tuning function.
To solve the problem that the system cannot automatically adjust and optimize the storage format of the data in the database, the present invention provides a method for converting a data storage format, which enables the database system to dynamically determine the underlying storage format of the database according to the load, thereby implementing an automatic tuning function of the system.
In practical applications, OLTP applications and OLAP applications of a database show their advantages on write operations and read operations, respectively. To combine the strengths of row storage and column storage, various row-column hybrid storage modes have been produced, that is, a fusion of OLTP and OLAP. In an application environment oriented to the fusion of OLTP and OLAP, the data exists in row form when the database is initialized.
Specifically, to be able to determine the underlying storage format of the database dynamically, the controller first acquires the system performance indicators of the database that stores data in the first storage format, so that it can determine, from the acquired indicators, the storage format required for storing data in the database.
The system performance indicators include at least the data volume, the average data volume accessed per query, the proportion of processed rows to read rows, the proportion of columns accessed per query on average, and the proportion of query statements. Specifically:
The data volume is an important indicator of whether column storage should be used: the larger the data volume, the more suitable the queries are for column storage. The data volume here is the data volume of the entire database;
The average data volume accessed per query is the number of data rows that a database query uses on average; scenarios in which this value is very large are suitable for column storage;
The proportion of processed rows to read rows is the ratio of the number of data rows actually used per operation on average to the total number of rows read. When data in an analytical database is read, some data is read in from disk although the system performs no related analysis on it. Ideally, all rows that are read in are processed by the system, so the higher this ratio, the more suitable column storage is;
The proportion of columns accessed per query on average is the ratio of the number of columns accessed by an average query statement to the total number of columns; the smaller this ratio, the more suitable column storage is;
The proportion of query statements is the share of query operations among all database operations; the closer this ratio is to 100%, the more suitable column storage is.
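The indicators listed above could be derived from a log of database operations roughly as follows. The `Operation` record, its fields, and the dictionary keys are hypothetical, and the row-based ratios are computed over query operations only, which is a simplifying assumption rather than something the text prescribes.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    is_query: bool         # True for a query, False for a write
    rows_read: int         # rows read from disk
    rows_processed: int    # rows actually used by the operation
    columns_accessed: int  # number of columns touched


def performance_indicators(ops, total_columns, data_volume):
    """Aggregate one monitoring period into the five indicators."""
    queries = [op for op in ops if op.is_query]
    n = len(queries)
    rows_read = sum(op.rows_read for op in queries)
    rows_processed = sum(op.rows_processed for op in queries)
    columns = sum(op.columns_accessed for op in queries)
    return {
        "data_volume": data_volume,
        "avg_rows_per_query": rows_read / n if n else 0.0,
        "processed_row_ratio": rows_processed / rows_read if rows_read else 0.0,
        "avg_column_ratio": columns / (n * total_columns)
        if n and total_columns else 0.0,
        "query_ratio": n / len(ops) if ops else 0.0,
    }
```

The resulting dictionary is the kind of input that the threshold and conversion-condition checks in the following steps would consume.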
S202. The controller judges whether the system performance indicators meet the core indicator thresholds set by the user.
After acquiring the system performance indicators of the database that stores data in the first storage format, the controller analyzes these indicators.
Specifically, the controller judges whether the system performance indicators meet the core indicator thresholds set by the user; that is, the controller performs decision analysis on the system performance indicators according to the core indicator thresholds and the decision algorithm set by the user.
Optionally, if the core indicator thresholds set by the user include a threshold Ta for the proportion of columns accessed per query on average, a threshold Tq for the proportion of query statements, and a threshold Tp for the proportion of processed rows to read rows, the decision algorithm set by the user may be: (proportion of columns accessed per query on average < Ta) and/or (proportion of query statements > Tq) and/or (proportion of processed rows to read rows > Tp).
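The decision algorithm can be sketched as below. Because the text allows the three comparisons to be combined with "and/or", the combination is exposed as a flag; the flag name `require_all` and the dictionary keys are assumptions made for this illustration.

```python
def meets_core_thresholds(indicators, ta, tq, tp, require_all=True):
    """Evaluate the three threshold comparisons named in the text.

    require_all=True combines them with AND, False with OR.
    """
    checks = [
        indicators["avg_column_ratio"] < ta,     # columns accessed per query
        indicators["query_ratio"] > tq,          # share of query statements
        indicators["processed_row_ratio"] > tp,  # processed rows / read rows
    ]
    return all(checks) if require_all else any(checks)
```

For example, a workload that reads 20% of the columns, consists of 90% queries, and processes 70% of the rows it reads would pass with Ta = 0.3, Tq = 0.8, and Tp = 0.6 under either combination.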
S203. If the system performance indicators meet the core indicator thresholds, the controller judges whether the system performance indicators meet the conversion condition set by the user.
If the controller determines that the system performance indicators meet the core indicator thresholds, that is, the indicators satisfy the decision algorithm set by the user, the controller judges whether the indicators meet the conversion condition set by the user. Only when the indicators meet the conversion condition set by the user does the controller determine the storage format required for storing data in the database.
For example, suppose the conversion condition set by the user is that a column can be converted to column storage when its access frequency (number of accesses to the column / number of accesses to the table) reaches 80%. Only when the controller finds that the access frequency of the n-th column in the database reaches 80% does it determine that the data of the n-th column is to be stored in the column storage format.
S204. If the system performance indicators meet the conversion condition, the controller determines the second storage format required for storing data in the database.
Specifically, if the system performance indicators of the database that stores data in the first storage format meet the conversion condition set by the user, the storage format of the data in the database can be converted, and the controller determines the second storage format required for storing data in the database.
Correspondingly, while determining the second storage format required for storing data in the database, the controller needs to calculate which tables in the database need row-column conversion, and to determine which columns in the database need to be stored together and which columns need to be stored separately. For example, the column that is accessed on its own most frequently is stored separately in column mode.
S205. The controller determines, according to user configuration information, the conversion moment at which the storage format of the data in the database is converted from the first storage format to the second storage format.
After determining the second storage format required for storing data in the database, the controller also needs to determine, according to user configuration information, the conversion moment at which the storage format of the data in the database is converted from the first storage format to the second storage format.
Specifically, the controller may perform the storage format conversion at a moment when the load is idle, as determined from the system performance indicators. Alternatively, after determining the second storage format required for storing data in the database, the controller may prompt the user that a conversion is possible and display the second storage format to the user, and then perform the conversion after the user enters a command.
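The two ways of choosing the conversion moment can be sketched as one predicate. The mode names `"idle"` and `"confirm"`, the normalised load value, and the idle threshold of 0.2 are all assumptions; the text only distinguishes between converting when the load is idle and converting after a user command.

```python
def should_convert_now(mode, current_load=0.0, idle_threshold=0.2,
                       user_confirmed=False):
    """Decide whether the conversion may start at this moment.

    mode "idle"   : convert when the load is at or below idle_threshold
    mode "confirm": convert only after the user has entered a command
    """
    if mode == "idle":
        return current_load <= idle_threshold
    if mode == "confirm":
        return user_confirmed
    raise ValueError(f"unknown mode: {mode!r}")
```

In a deployment the mode would presumably come from the user configuration information mentioned in S205.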
S206. The controller converts the storage format of the data in the database from the first storage format to the second storage format.
After determining the second storage format required for storing data in the database, the controller converts the storage format of the data in the database from the first storage format to the second storage format.
Specifically, after determining the second storage format and the conversion moment, the controller recombines the data in the database in a buffer according to the second storage format. When the amount of data in the buffer reaches the disk write threshold, the controller writes the data in the buffer to disk. If the first storage format is row storage, the controller first reads the data in the database by row, and only then recombines the data in the buffer according to the second storage format.
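Reading by row and then recombining into columns amounts to a transposition, which the following sketch shows for an in-memory table. The function name and the list-of-tuples row representation are assumptions made for illustration only.

```python
def rows_to_columns(rows, column_names):
    """Read row-stored records one by one and regroup their values by column."""
    columns = {name: [] for name in column_names}
    for row in rows:  # data is read by row, as in the first storage format
        for name, value in zip(column_names, row):
            columns[name].append(value)
    return columns


# Hypothetical row-stored table with three columns.
table = [(1, "pen", 2.5), (2, "ink", 4.0)]
by_column = rows_to_columns(table, ["id", "name", "price"])
```

Each list in `by_column` holds the values of one column in row order, which is the form that would be placed in the buffer before being written to disk.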
Further, the controller processes the data in the buffer differently depending on the second storage format of that data. If the second storage format of the data in the buffer is single-column storage, the controller converts the storage format of the data in the buffer from the first storage format to single-column storage, and compresses and stores the data in the buffer. Alternatively, if the second storage format of the data in the buffer is row-column hybrid storage or row storage, the controller converts the storage format of the data in the buffer from the first storage format to row-column hybrid storage or row storage, and stores the data in the buffer.
S207, controller judge whether the compression ratio of the database after storage format conversion meets the first preset condition, and Data in database after storage format conversion are ranked up, whether the test sequencing time meets the second preset condition.
After the storage format of the data in database is converted to the second storage format from the first storage format by controller, Whether the conversion that controller needs to detect storage format is reasonable, if can optimize database.
Specifically, controller and is passed through by carrying out the judgement of size variation to the table for having been converted into column in database The simple sequence test sequencing time.Since the size of table directly represents the compression ratio of database, and compression ratio gets over high spatial benefit It is higher with rate;The length of sorting time embody consumption CPU memory source number, column storage can be improved sequence efficiency.Cause This, whether the conversion that controller detects storage format is reasonable, and the compression ratio of the database after needing to judge storage format conversion is The first preset condition of no satisfaction, and the data in the database after storage format conversion are ranked up, the test sequencing time is The second preset condition of no satisfaction.
Wherein, the first preset condition is that the compression ratio of the database after storage format conversion is less than or equal to the first default threshold Value, the second preset condition are that sorting time is less than or equal to the second preset threshold.
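The check in S207 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the compression ratio is passed in as an already-computed number because the text does not give a formula for it, and the sort test simply times Python's built-in `sorted`. Only the two "less than or equal to" comparisons come from the text.

```python
import time


def time_simple_sort(values):
    """Run a simple sort test and return the elapsed time in seconds."""
    start = time.perf_counter()
    sorted(values)
    return time.perf_counter() - start


def conversion_is_reasonable(compression_ratio, sort_seconds,
                             first_threshold, second_threshold):
    """Return True only if both preset conditions from the text are met."""
    first_ok = compression_ratio <= first_threshold    # first preset condition
    second_ok = sort_seconds <= second_threshold       # second preset condition
    return first_ok and second_ok
```

A caller would measure `sort_seconds` with `time_simple_sort` on a sample of the converted data and then pass both measurements to `conversion_is_reasonable`; a `False` result corresponds to the S209 branch.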
S208: If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the controller provides external service.
Specifically, if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the current storage format conversion improves the performance of the database system, and the controller provides external service.
Further, after external service begins, the controller continues to collect the corresponding indexes to judge whether the performance of the database system has been optimized, that is, whether columns in the database system need to be split further, or whether columns that do not need splitting should be kept from being split.
The indexes collected by the controller include data throughput and query statement response time. Data throughput reflects whether redundant data reads have been reduced; if the change in data throughput is not significant, the columns may need to be split further. The query statement response time is a direct indicator of whether column storage is effective.
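One way to read the two post-conversion indexes is sketched below in Python. The text does not say what counts as a "significant" change in data throughput, so the 5% relative-change cutoff, the function name `suggest_follow_up`, and the returned labels are all hypothetical.

```python
def suggest_follow_up(throughput_before, throughput_after,
                      response_before, response_after,
                      min_relative_change=0.05):
    """Interpret data throughput and query response time after conversion.

    min_relative_change is an assumed cutoff; the source gives no value.
    """
    if throughput_before <= 0:
        raise ValueError("throughput_before must be positive")

    suggestions = []
    relative_change = abs(throughput_after - throughput_before) / throughput_before
    if relative_change < min_relative_change:
        # Little change in throughput: columns may need to be split further.
        suggestions.append("split_further")
    if response_after >= response_before:
        # Response time did not improve: column storage is not shown to be effective.
        suggestions.append("column_storage_ineffective")
    return suggestions
```

An empty list means that neither index points to a problem with the current layout.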
S209: If the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the controller re-executes the next conversion of the data storage format according to the core index threshold to be set by the user.
If the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the second storage format determined by the controller cannot effectively improve the performance of the database system. The controller then needs to re-execute the next conversion of the data storage format according to the core index threshold to be set by the user, that is, to redetermine the second storage format of the data.
For example, suppose that at system initialization the column access ratio threshold is 90%, and the controller converts a column to separate column storage but finds after the conversion that system performance has not improved. The controller then feeds back a suggestion to raise the column access ratio from 90% to 91%. The user resets the value of the column access ratio according to this feedback information, and the controller redetermines the second storage format according to the column access ratio newly set by the user.
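The feedback example above (90% raised to 91%) can be sketched in Python as follows. The decision rule, the function names, the feedback dictionary layout, and the label `keep_first_format` are assumptions for illustration; the text only states that the controller suggests a higher column access ratio and redetermines the second storage format once the user has reset it.

```python
def choose_second_format(observed_access_percent, threshold_percent):
    """Hypothetical rule: pick single-column storage when the observed
    column access ratio reaches the user-set threshold."""
    if observed_access_percent >= threshold_percent:
        return "single_column"
    return "keep_first_format"


def build_feedback(threshold_percent, step_percent=1):
    """Build feedback that suggests raising the threshold (for example 90 to 91)."""
    return {
        "index": "column_access_ratio",
        "current_percent": threshold_percent,
        "suggested_percent": threshold_percent + step_percent,
    }


# A column accessed by 90% of queries qualifies at a 90% threshold...
first_choice = choose_second_format(90, 90)
# ...but no longer qualifies once the user applies the suggested 91%.
feedback = build_feedback(90)
second_choice = choose_second_format(90, feedback["suggested_percent"])
```

Integer percentages are used here only to avoid floating-point rounding in the comparison.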
S210: If the system performance index does not meet the switch condition, the controller keeps the format of the data stored in the database as the first storage format.
An embodiment of the present invention provides a method for converting a data storage format. If the system performance index of a database that stores data in a first storage format meets a switch condition set by a user, the controller determines the second storage format required for storing data in the database and converts the storage format of the data in the database from the first storage format to the second storage format. The controller then judges whether the compression ratio of the database after the storage format conversion meets a first preset condition, sorts the data in the database after the storage format conversion, and tests whether the sorting time meets a second preset condition. If the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the controller provides external service. Alternatively, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, the controller redetermines the second storage format required for storing data in the database according to the core index threshold to be set by the user in the feedback information. With this solution, the controller monitors the running data and continually determines the optimal storage format for the data in the database system; that is, the controller enables the database system to determine the storage format of its data dynamically according to the load. This solves the problems that changing the storage format of a database currently requires a database administrator to modify it manually offline, and that the storage space and utilization of the system are low. By deciding the data storage format autonomously, the solution reduces the data throughput of query statements and improves the storage space and utilization of the system.
It is apparent to those skilled in the art that, for convenience and brevity of description, the division into the functional modules described above is used only as an example. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. For the specific working process of the system, device, and units described above, refer to the corresponding process in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
The above is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited to it. Any change or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (30)

1. a kind of controller characterized by comprising
Decision package, if set by user for being met with the system performance index of the database of the first storage format storing data Switch condition, it is determined that the second storage format needed for storing data in the database;
Storage format converting unit, for converting the storage format of the data in the database from first storage format Second storage format determined for the decision package;
Feedback unit, for judging whether the compression ratio of the database after storage format is converted meets the first preset condition, and it is right The data in database after the storage format conversion are ranked up, and whether the test sequencing time meets the second preset condition; If the compression ratio meets the first preset condition, and the sorting time meets the second preset condition, then is externally serviced, or Person sends if the compression ratio is unsatisfactory for the first preset condition and/or the sorting time is unsatisfactory for the second preset condition Feedback information is to the decision package, so that the decision package is according to user's core to be set in the feedback information Metrics-thresholds redefine the second storage format needed for storing data in the database.
2. The controller according to claim 1, characterized in that
the decision unit is further configured to: before determining the second storage format required for storing data in the database, judge whether the system performance index meets a core index threshold set by the user; and, if the system performance index meets the core index threshold, judge whether the system performance index meets the switch condition.
3. The controller according to claim 2, characterized in that
the decision unit is further configured to keep the format of the data stored in the database as the first storage format if the system performance index does not meet the switch condition.
4. The controller according to any one of claims 1-3, characterized in that the controller further comprises a data acquisition unit,
the data acquisition unit being configured to collect the system performance index.
5. The controller according to any one of claims 1-3, characterized in that
the decision unit is further configured to: after determining the second storage format required for storing data in the database, determine, according to user configuration information, the switch instant at which the storage format of the data in the database is converted from the first storage format to the second storage format.
6. The controller according to claim 5, characterized in that
the storage format conversion unit is specifically configured to reorganize the data in the database in a buffer according to the second storage format determined by the decision unit and the switch instant, and to write the data in the buffer to disk if the data volume in the buffer reaches a disk write threshold.
7. The controller according to claim 6, characterized in that
the storage format conversion unit is specifically configured to: if the second storage format of the data in the buffer is single-column storage, convert the storage format of the data in the buffer from the first storage format to the single-column storage, and compress and store the data in the buffer; or, if the second storage format of the data in the buffer is hybrid row-column storage or row storage, convert the storage format of the data in the buffer from the first storage format to the hybrid row-column storage or the row storage, and store the data in the buffer.
8. The controller according to claim 6, characterized in that the controller further comprises a reading unit,
the reading unit being configured to read the data in the database row by row if the first storage format is row storage, before the storage format conversion unit reorganizes the data in the database in the buffer according to the second storage format and the switch instant.
9. The controller according to any one of claims 1-3 and 6-8, characterized in that
the data acquisition unit is further configured to collect the data throughput and the query statement response time of the database after the storage format conversion, after external service is provided because the compression ratio meets the first preset condition and the sorting time meets the second preset condition.
10. The controller according to claim 9, characterized in that the system performance index includes at least: the data volume, the average data volume accessed per query, the ratio of the number of rows processed to the number of columns read, the ratio of columns accessed on average per query, and the proportion of query statements.
11. a kind of controller characterized by comprising
Processor, if for meeting set by user turn with the system performance index of the database of the first storage format storing data Change condition, it is determined that the second storage format needed for storing data in the database;In format converter by the database In data storage format be converted to second storage format from first storage format after, judge that storage format is converted Whether the compression ratio of database afterwards meets the first preset condition, and to the data in the database after storage format conversion It is ranked up, whether the test sequencing time meets the second preset condition;If the compression ratio meets the first preset condition, and described Sorting time meets the second preset condition, then is externally serviced, alternatively, if the compression ratio is unsatisfactory for the first preset condition, And/or the sorting time is unsatisfactory for the second preset condition, then redefines institute according to user's core index threshold value to be set State the second storage format needed for storing data in database;
Format converter, it is described for being converted to the storage format of the data in the database from first storage format Second storage format that processor determines.
12. The controller according to claim 11, characterized in that
the processor is further configured to: before determining the second storage format required for storing data in the database, judge whether the system performance index meets a core index threshold set by the user; and, if the system performance index meets the core index threshold, judge whether the system performance index meets the switch condition.
13. The controller according to claim 12, characterized in that
the processor is further configured to keep the format of the data stored in the database as the first storage format if the system performance index does not meet the switch condition.
14. The controller according to any one of claims 11-13, characterized in that the controller further comprises a data collector,
the data collector being configured to collect the system performance index.
15. The controller according to any one of claims 11-13, characterized in that
the processor is further configured to: after determining the second storage format required for storing data in the database, determine, according to user configuration information, the switch instant at which the storage format of the data in the database is converted from the first storage format to the second storage format.
16. The controller according to claim 15, characterized in that
the format converter is specifically configured to reorganize the data in the database in a buffer according to the second storage format and the switch instant, and to write the data in the buffer to disk if the data volume in the buffer reaches a disk write threshold.
17. The controller according to claim 16, characterized in that
the format converter is specifically configured to: if the second storage format of the data in the buffer is single-column storage, convert the storage format of the data in the buffer from the first storage format to the single-column storage, and compress and store the data in the buffer; or, if the second storage format of the data in the buffer is hybrid row-column storage or row storage, convert the storage format of the data in the buffer from the first storage format to the hybrid row-column storage or the row storage, and store the data in the buffer.
18. The controller according to claim 16, characterized in that
the processor is further configured to read the data in the database row by row if the first storage format is row storage, before the storage format converter reorganizes the data in the database in the buffer according to the second storage format and the switch instant.
19. The controller according to any one of claims 11-13 and 16-18, characterized in that
the data collector is further configured to collect the data throughput and the query statement response time of the database after the storage format conversion, after external service is provided because the compression ratio meets the first preset condition and the sorting time meets the second preset condition.
20. The controller according to claim 19, characterized in that the system performance index includes at least: the data volume, the average data volume accessed per query, the ratio of the number of rows processed to the number of columns read, the ratio of columns accessed on average per query, and the proportion of query statements.
21. A method for converting a data storage format, characterized by comprising:
Step a: if a system performance index of a database that stores data in a first storage format meets a switch condition set by a user, determining a second storage format required for storing data in the database;
Step b: converting the storage format of the data in the database from the first storage format to the second storage format;
Step c: judging whether the compression ratio of the database after the storage format conversion meets a first preset condition, sorting the data in the database after the storage format conversion, and testing whether the sorting time meets a second preset condition;
Step d: if the compression ratio meets the first preset condition and the sorting time meets the second preset condition, providing external service; or, if the compression ratio does not meet the first preset condition and/or the sorting time does not meet the second preset condition, re-executing the above steps according to the core index threshold to be set by the user.
22. The method for converting a data storage format according to claim 21, characterized in that, before the determining of the second storage format required for storing data in the database, the method further comprises:
judging whether the system performance index meets a core index threshold set by the user; and
if the system performance index meets the core index threshold, judging whether the system performance index meets the switch condition.
23. The method for converting a data storage format according to claim 22, characterized in that
if the system performance index does not meet the switch condition, the format of the data stored in the database is kept as the first storage format.
24. The method for converting a data storage format according to any one of claims 21-23, characterized in that the method further comprises:
collecting the system performance index.
25. The method for converting a data storage format according to any one of claims 21-23, characterized in that, after the determining of the second storage format required for storing data in the database, the method further comprises:
determining, according to user configuration information, the switch instant at which the storage format of the data in the database is converted from the first storage format to the second storage format.
26. The method for converting a data storage format according to claim 25, characterized in that the converting of the storage format of the data in the database from the first storage format to the second storage format specifically comprises:
reorganizing the data in the database in a buffer according to the second storage format and the switch instant; and
if the data volume in the buffer reaches a disk write threshold, writing the data in the buffer to disk.
27. The method for converting a data storage format according to claim 26, characterized in that the writing of the data in the buffer to disk specifically comprises:
if the second storage format of the data in the buffer is single-column storage, converting the storage format of the data in the buffer from the first storage format to the single-column storage, and compressing and storing the data in the buffer; or
if the second storage format of the data in the buffer is hybrid row-column storage or row storage, converting the storage format of the data in the buffer from the first storage format to the hybrid row-column storage or the row storage, and storing the data in the buffer.
28. The method for converting a data storage format according to claim 26, characterized in that, before the reorganizing of the data in the database in the buffer according to the second storage format and the switch instant, the method further comprises:
if the first storage format is row storage, reading the data in the database row by row.
29. The method for converting a data storage format according to any one of claims 21-23 and 26-28, characterized in that, after external service is provided because the compression ratio meets the first preset condition and the sorting time meets the second preset condition, the method further comprises:
collecting the data throughput and the query statement response time of the database after the storage format conversion.
30. The method for converting a data storage format according to claim 29, characterized in that the system performance index includes at least: the data volume, the average data volume accessed per query, the ratio of the number of rows processed to the number of columns read, the ratio of columns accessed on average per query, and the proportion of query statements.
CN201480000190.5A 2014-03-18 2014-03-18 A kind of conversion method and device of data memory format Active CN105378716B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/073576 WO2015139193A1 (en) 2014-03-18 2014-03-18 Method and apparatus for conversion of data storage formats

Publications (2)

Publication Number Publication Date
CN105378716A CN105378716A (en) 2016-03-02
CN105378716B true CN105378716B (en) 2019-03-26

Family

ID=54143626

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480000190.5A Active CN105378716B (en) 2014-03-18 2014-03-18 A kind of conversion method and device of data memory format

Country Status (2)

Country Link
CN (1) CN105378716B (en)
WO (1) WO2015139193A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110275677A (en) * 2019-05-22 2019-09-24 华为技术有限公司 Hard disk form conversion method, device and storage equipment
WO2022257575A1 (en) * 2021-06-11 2022-12-15 华为技术有限公司 Data processing method, apparatus, and device

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107092624B (en) * 2016-12-28 2022-08-30 北京星选科技有限公司 Data storage method, device and system
US10719508B2 (en) * 2018-04-19 2020-07-21 Risk Management Solutions, Inc. Data storage system for providing low latency search query responses
CN110196847A (en) 2018-08-16 2019-09-03 腾讯科技(深圳)有限公司 Data processing method and device, storage medium and electronic device
CN111064976B (en) * 2018-10-17 2022-01-04 武汉斗鱼网络科技有限公司 Method for sending live broadcast information and server
CN111198859B (en) * 2018-11-16 2023-11-03 北京微播视界科技有限公司 Data processing method, device, electronic equipment and computer readable storage medium
CN110162563B (en) * 2019-05-28 2023-11-17 深圳市网心科技有限公司 Data warehousing method and system, electronic equipment and storage medium
CN112579597B (en) * 2020-12-15 2023-03-21 西安邮电大学 Compression-sensitive database file storage method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102495905A (en) * 2011-12-23 2012-06-13 天津神舟通用数据技术有限公司 Packing method based on line storage database engine
CN103345518A (en) * 2013-07-11 2013-10-09 清华大学 Self-adaptive data storage management method and system based on data block

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102737033B (en) * 2011-03-31 2015-02-04 国际商业机器公司 Data processing equipment and data processing method thereof

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102495905A (en) * 2011-12-23 2012-06-13 天津神舟通用数据技术有限公司 Packing method based on line storage database engine
CN103345518A (en) * 2013-07-11 2013-10-09 清华大学 Self-adaptive data storage management method and system based on data block

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110275677A (en) * 2019-05-22 2019-09-24 华为技术有限公司 Hard disk form conversion method, device and storage equipment
CN110275677B (en) * 2019-05-22 2022-04-12 华为技术有限公司 Hard disk format conversion method and device and storage equipment
WO2022257575A1 (en) * 2021-06-11 2022-12-15 华为技术有限公司 Data processing method, apparatus, and device

Also Published As

Publication number Publication date
CN105378716A (en) 2016-03-02
WO2015139193A1 (en) 2015-09-24

Similar Documents

Publication Publication Date Title
CN105378716B (en) A kind of conversion method and device of data memory format
CN102521406B (en) Distributed query method and system for complex task of querying massive structured data
US20130275364A1 (en) Concurrent OLAP-Oriented Database Query Processing Method
CN103095805A (en) Cloud storage system of data intelligent and decentralized management
CN105630810B (en) A method of mass small documents are uploaded in distributed memory system
KR20150089538A (en) Apparatus for in-memory data management and method for in-memory data management
CN102073697A (en) Data processing method and data processing device
CN107402926A (en) A kind of querying method and query facility
CN106598501B (en) For storing the Data Migration device and method of AUTOMATIC ZONING
CN108776690B (en) Method for HDFS distributed and centralized mixed data storage system based on hierarchical governance
CN102480502B (en) I/O load equilibrium method and I/O server
CN112416960A (en) Data processing method, device and equipment under multiple scenes and storage medium
CN106302659A (en) A kind of based on cloud storage system promotes access data quick storage method
CN103945005A (en) Multiple evaluation indexes based dynamic load balancing framework
CN106326012A (en) Web application cluster buffer utilization method and system
CN111475507A (en) Key value data indexing method for workload self-adaptive single-layer L SMT
CN104717251A (en) Scheduling method and system for Cell nodes through OpenStack cloud computing management platform
CN103365923A (en) Method and device for assessing partition schemes of database
CN202093513U (en) Bulk data processing system
KR20150089544A (en) Apparatus of managing data and method of managing data for supporting mixed workload
CN113639305A (en) Heat exchange station load prediction system based on time sequence database platform
CN1904882A (en) Compression method of database near-line data
CN107179883A (en) Spark architecture optimization method of hybrid storage system based on SSD and HDD
CN105912621A (en) Area building energy consumption platform data storing and query method
KR20180077830A (en) Processing method for a relational query in distributed stream processing engine based on shared-nothing architecture, recording medium and device for performing the method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20201228

Address after: 518101 Baoan District Xin'an street, Shenzhen, Guangdong, No. 625, No. 625, Nuo platinum Plaza,

Patentee after: SHENZHEN SHANGGE INTELLECTUAL PROPERTY SERVICE Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.

Effective date of registration: 20201228

Address after: 313200 No.8 Yu'an South Road, Hongfeng village, Xin'an Town, Deqing County, Huzhou City, Zhejiang Province (Zhejiang Huazhuo Electromechanical Technology Co., Ltd.)

Patentee after: Luo Sanjie

Address before: 518101 Baoan District Xin'an street, Shenzhen, Guangdong, No. 625, No. 625, Nuo platinum Plaza,

Patentee before: SHENZHEN SHANGGE INTELLECTUAL PROPERTY SERVICE Co.,Ltd.

TR01 Transfer of patent right