CN104750682A - Buffering capacity allocation method for massive logs - Google Patents

Publication number
CN104750682A
Authority
CN
China
Application number
CN201310727354.1A
Other languages
Chinese (zh)
Other versions
CN104750682B (en)
Inventor
吕成云
唐新民
沈智杰
景晓军
Original Assignee
任子行网络技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 任子行网络技术股份有限公司
Priority to CN201310727354.1A
Publication of CN104750682A
Application granted
Publication of CN104750682B

Abstract

The invention discloses a buffering capacity allocation method for massive logs. The method comprises: step S11, reading logs into a sub-table in real time; step S12, counting the number of times the offset of each first-occurring field is referenced; step S13, establishing the reference amount of each segment and computing the total reference amount of the sub-table; step S14, performing a linear fit on the reference amounts of the segments; and step S15, allocating the preset total buffering capacity of the sub-table among the segments. The method allocates buffering capacity reasonably, occupies less memory, and reduces input/output (I/O) operations by referencing field offsets.

Description

A buffering capacity allocation method for massive logs

Technical field

The present invention relates to the field of log management and, more particularly, to a buffering capacity allocation method for massive logs.

Background

Systems such as an IDC (Internet Data Center) and DNS (Domain Name System) produce massive volumes of logs that must be imported quickly in real time (1 to 100,000 records per second) and searched in near real time. To meet such import and search targets, partitioning is generally used. Partitioning divides a very large table into multiple small tables according to some rule and stores them in separate regions. The table remains a single table logically, but physically it is stored as multiple tables in different locations, which simplifies database management and can also improve application performance.

Partitions are sized by data volume. Queries are generally limited to a window of two hours, so the size threshold can be set to match, letting a search hit a single partition where possible and at most two. Once a partition is complete, the values of each indexed field fall within a known range, which is used for fast filtering. In addition, if the time field of the logs can be kept increasing over time, it can be given special treatment so that its index occupies little space and filters quickly; the system makes it configurable whether this field is strictly increasing.

A query that accesses the table can then use only the relevant partitions directly, without touching the whole table, which naturally improves query performance. Because the external interface is still a single table, partitioning is transparent to users and applications, who are unaware that the partitions exist. For these reasons, large-table partitioning is widely used in massive data storage.
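The range-based filtering described above can be illustrated with a short sketch. It assumes, purely for illustration, that each partition records the minimum and maximum value of its time field; the function name `prune_partitions` and the tuple layout are hypothetical and do not come from the patent.

```python
def prune_partitions(partitions, start, end):
    """Return the names of partitions whose time range overlaps [start, end].

    partitions: list of (name, min_time, max_time) tuples (hypothetical layout).
    Only the returned partitions need to be searched; the rest are skipped.
    """
    return [name for name, lo, hi in partitions if lo <= end and hi >= start]


# Three partitions covering consecutive time ranges (made-up values).
partitions = [("p0", 0, 99), ("p1", 100, 199), ("p2", 200, 299)]
hits = prune_partitions(partitions, 150, 210)
```

With these made-up ranges, a query spanning 150 to 210 touches only `p1` and `p2`, which mirrors the goal of hitting one partition where possible and at most two.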

Among existing schemes, MySQL with the MyISAM storage engine imports only 1,000 to 2,000 records per second; MongoDB drops below 2,000 records per second once the data volume exceeds 10 million records, and keeps declining as the data grows; NoSQL stores are based on key-value pairs and cannot index multiple fields; Lucene's peak import speed is close to 10,000 records per second, and it adds a lot of unneeded extra content related to scoring, positions, and so on. Another important but less visible indicator for massive data is how small the data is after compression. The schemes above are all limited in import speed, compress poorly, and consume a lot of memory, CPU, and I/O. In addition, to save cost, a log system needs to run on existing devices that have few idle resources, so being lightweight is clearly the ultimate goal. The buffering capacity allocation strategy is an important aspect of such a system.

Summary of the invention

The technical problem to be solved by the present invention is the unreasonable allocation of buffering capacity in the prior art. To address this defect, a buffering capacity allocation method for massive logs is provided.

The technical solution adopted by the present invention is a buffering capacity allocation method for massive logs, which allocates the buffering capacity of a sub-table while massive logs are being read in. The method comprises the following steps:

S11: read logs into the sub-table in real time, and store the logs in a designated segment of the sub-table;

S12: divide all segments in the sub-table according to the time at which the logs are read in; if logs read into the sub-table contain identical fields, have every identical field reference the offset of the field's first occurrence in the sub-table, and count the number of times the offset of each first-occurring field is referenced;

S13: establish the reference amount $S_i$ of each segment, where $S_i$ is the sum of the numbers of times the offsets of all first-occurring fields in the $i$-th segment are referenced, $i$ is a positive integer in $[1, n]$, and $n$ is the number of segments in the sub-table; then calculate the total reference amount of the sub-table, $S_{sum} = \sum_{i=1}^{n} S_i$;

S14: arrange the segments and their reference amounts $S_i$ in the order in which the logs were read in, and perform a linear fit on the relation between each segment and its reference amount $S_i$ to obtain the straight line $y = ax + b$ that characterizes the correspondence between segment and reference amount, where the x-axis is the $x$-th segment of the sub-table and the y-axis is the reference amount;

S15: allocate the preset total buffering capacity $C_{sum}$ of the sub-table to the segments according to the correspondence given by the line $y = ax + b$; the buffering capacity allocated to the $i$-th segment is $C_i = C_{sum} \times (ai + b) / S_{sum}$.

In the buffering capacity allocation method of the present invention, in step S11, the fields of the log comprise user ID, access time, access IP, requested page, and requested function number.

In the buffering capacity allocation method of the present invention, step S12 comprises the following sub-steps:

S12A: divide all segments in the sub-table according to the time at which the logs are read in; if logs read into the sub-table contain identical fields, have every identical field reference the offset of the field's first occurrence in the sub-table;

S12B: count the number of times $z_{ij}$ that the offset of the $j$-th first-occurring field of the $i$-th segment is referenced, and sort these counts, where $i$ is a positive integer in $[1, n]$, $n$ is the total number of segments in the sub-table, $j$ is a positive integer in $[1, m]$, and $m$ is the total number of first-occurring fields in the $i$-th segment.

In the buffering capacity allocation method of the present invention, step S13 comprises the following sub-steps:

S13A: establish the reference amount $S_i$, which is the sum of the numbers of times the offsets of all first-occurring fields in the $i$-th segment are referenced: $S_i = \sum_{j=1}^{m} z_{ij}$, where $i$ is a positive integer in $[1, n]$, $n$ is the total number of segments in the sub-table, $j$ is a positive integer in $[1, m]$, and $m$ is the total number of first-occurring fields in the $i$-th segment;

S13B: calculate the total reference amount of the sub-table: $S_{sum} = \sum_{i=1}^{n} S_i$.

In the buffering capacity allocation method of the present invention, step S14 further comprises:

S14A: arrange the segments and their reference amounts $S_i$ in the order in which the logs were read in, take the segments containing the fields whose sorted reference counts fall within a preset range, and perform the linear fit on those segments to obtain the straight line $y = ax + b$ that characterizes the correspondence between segment and reference amount, where the x-axis is the segment of the sub-table and the y-axis is the reference amount.

In the buffering capacity allocation method of the present invention, the method further comprises:

S15A: before step S15, judge whether $ai + b$ is greater than 0; if $ai + b$ is greater than 0, perform step S15; if $ai + b$ is less than or equal to 0, perform step S15B;

S15B: translate the fitted line upward along the y-axis by $c$ units until $ai + b + c$ is greater than 0, and revise the fitted line to $y = ax + b + c$;

S15C: allocate the preset total buffering capacity $C_{sum}$ of the sub-table to the segments; the buffering capacity allocated to the $i$-th segment is

$$C_i = \frac{C_{sum} \times (ai + b + c)}{S_{sum} \times \dfrac{\int_0^{-(b+c)/a} (ax + b + c)\,dx}{\int_0^{-b/a} (ax + b)\,dx}}.$$
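The ratio of integrals in step S15C has a simple closed form. The following worked derivation is a sketch under the assumption that $a < 0$ and $b > 0$, that is, the fitted line decreases and crosses the x-axis at $x = -b/a$; this sign assumption is ours and is not stated in the text.

```latex
\int_0^{-b/a} (ax + b)\,dx
  = \left[ \tfrac{a}{2}x^2 + bx \right]_0^{-b/a}
  = \frac{b^2}{2a} - \frac{b^2}{a}
  = -\frac{b^2}{2a},
\qquad
\int_0^{-(b+c)/a} (ax + b + c)\,dx = -\frac{(b+c)^2}{2a}

\frac{\int_0^{-(b+c)/a} (ax + b + c)\,dx}{\int_0^{-b/a} (ax + b)\,dx}
  = \left( \frac{b+c}{b} \right)^2
\quad\Longrightarrow\quad
C_i = \frac{C_{sum}\,(ai + b + c)\,b^2}{S_{sum}\,(b+c)^2}
```

Under this assumption each integral is the area of the triangle enclosed by the line and the two axes, so shifting the line up by $c$ scales the denominator by $((b+c)/b)^2$. With $c = 0$ the expression reduces to the formula of step S15.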

Implementing the buffering capacity allocation method for massive logs of the present invention has the following beneficial effects: buffering capacity is allocated reasonably according to the reference amounts, less memory is occupied, and referencing field offsets reduces I/O operations.

Brief description of the drawings

The invention is further described below with reference to the drawings and embodiments. In the drawings:

Fig. 1 is a flowchart of the buffering capacity allocation method for massive logs provided by a preferred embodiment of the present invention;

Fig. 2 is a coordinate diagram of the linear fit provided by the preferred embodiment of the present invention;

Fig. 3 is a structural diagram of logs being read into a sub-table;

Fig. 4 is a flowchart of the buffering capacity allocation method for massive logs provided by another preferred embodiment of the present invention;

Fig. 5 is a coordinate diagram of the linear fit provided by the other preferred embodiment of the present invention.

Embodiments

To give a clearer understanding of the technical features, objects, and effects of the present invention, specific embodiments are now described in detail with reference to the drawings.

Fig. 1 shows the flowchart of the buffering capacity allocation method for massive logs provided by a preferred embodiment of the present invention. The method allocates the buffering capacity of a sub-table while massive logs are being read in. As shown in Fig. 3, the overall structure of the WEB server comprises a master table; the master table consists of multiple sub-tables, each sub-table consists of multiple layers, each layer consists of multiple segments, and the segment is the basic unit of processing. The method specifically comprises:

S11: read logs into the sub-table in real time, and store the logs in a segment of the sub-table. The fields of the log comprise at least user ID, access time, access IP, requested page, and requested function number.

S12: divide the segments in the sub-table according to the time at which the logs are read in; if an identical field appears in the logs read into the sub-table, have the identical field reference the offset of the field's first occurrence in the sub-table, and count the number of times the offset of each first-occurring field is referenced. While counting, a two-column table of the form <name, number> is used for storage and buffering; when a later field is identical to this field, it references it through number. This works because a first-occurring field can be referenced directly by all later segments. When a first-occurring field and its duplicates appear in the same segment, the duplicates are stored in the resource area of that segment and reference the first-occurring field directly. During counting, all counted records can be written to a file in advance in binlog form; when the statistics are written to file, this binlog can be used to complete the final write.

This step specifically comprises the following sub-steps:

S12A: divide the segments in the sub-table according to the time at which the logs are read in; if an identical field appears in the logs read into the sub-table, have the identical field reference the offset of the field's first occurrence in the sub-table;

S12B: count the number of times $z_{ij}$ that the offset of the $j$-th first-occurring field of the $i$-th segment is referenced, and sort these counts, where $i$ is a positive integer in $[1, n]$, $n$ is the total number of segments in the sub-table, $j$ is a positive integer in $[1, m]$, and $m$ is the total number of first-occurring fields in the $i$-th segment. Because each segment contains multiple first-occurring fields, and the number of times each field is referenced affects the speed at which logs are read in and stored, the reference counts in each segment and their total need to be tallied.
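The counting in step S12 can be sketched as follows. This is a minimal Python illustration, not the patented implementation: the function name `count_references`, the use of a (segment index, position) pair as a stand-in for the stored offset, and the in-memory dictionaries are all assumptions made for the example.

```python
def count_references(segments):
    """Count how often each first-occurring field value is referenced later.

    segments: list of segments in read-in order; each segment is a list of
              field values (hypothetical simplified layout).
    Returns (first_seen, counts) where
      first_seen[value] = (segment index, position) of the first occurrence,
                          standing in for the offset that duplicates reference;
      counts[i][value]  = number of times the first occurrence owned by
                          segment i was referenced (the z_ij of step S12B).
    """
    first_seen = {}
    counts = [dict() for _ in segments]
    for i, segment in enumerate(segments):
        for pos, value in enumerate(segment):
            if value in first_seen:
                # Duplicate: reference the first occurrence, bump its count.
                owner, _ = first_seen[value]
                counts[owner][value] += 1
            else:
                # First occurrence: remember where it lives, count starts at 0.
                first_seen[value] = (i, pos)
                counts[i][value] = 0
    return first_seen, counts


# Made-up field values spread over three segments.
first_seen, counts = count_references([["a", "b", "a"], ["a", "c", "b"], ["c"]])
```

In this made-up input, `"a"` and `"b"` first occur in the first segment and are referenced twice and once respectively, so those counts are credited to the first segment even when the duplicate appears in a later one.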

S13: establish the reference amount $S_i$ of each segment, where $S_i$ is the sum of the numbers of times the offsets of all first-occurring fields in the $i$-th segment are referenced, $i$ is a positive integer in $[1, n]$, and $n$ is the number of segments in the sub-table; then calculate the total reference amount $S_{sum}$ of the sub-table.

This step specifically comprises the following sub-steps:

S13A: establish the reference amount $S_i$, which is the sum of the numbers of times the offsets of all first-occurring fields in the $i$-th segment are referenced: $S_i = \sum_{j=1}^{m} z_{ij}$, where $i$ is a positive integer in $[1, n]$, $n$ is the total number of segments in the sub-table, $j$ is a positive integer in $[1, m]$, and $m$ is the total number of first-occurring fields in the $i$-th segment;

S13B: calculate the total reference amount of the sub-table: $S_{sum} = \sum_{i=1}^{n} S_i$.

S14: arrange the segments and their reference amounts $S_i$ in the order in which the logs were read in, and perform a linear fit on the relation between each segment and its reference amount $S_i$ to obtain the straight line $y = ax + b$, where the x-axis is the $x$-th segment of the sub-table and the y-axis is the reference amount. Fig. 2 shows the line $y = ax + b$ obtained by the linear fit. Because fitting the reference amounts of all segments is computationally expensive, this step may instead arrange the segments and their reference amounts $S_i$ in read-in order, take only the segments containing the fields whose sorted reference counts fall within a preset range, and perform the linear fit on those segments to obtain the line $y = ax + b$, where the x-axis is the segment of the sub-table and the y-axis is the reference amount. The selected segments can be ranked again. The segments read in earliest generally contain the highest reference amounts, so taking the high-ranking segments and their data for the linear fit not only reduces computation but also preserves the accuracy of the subsequent buffering capacity allocation.

Curve fitting uses a continuous curve to approximately describe the functional relation between the coordinates of a group of discrete points in the plane. For example, given some discrete values of a function, the undetermined coefficients of a candidate function are adjusted so that the difference between that function and the known point set is minimal in the least-squares sense; if the candidate function is linear, the procedure is called a linear fit. In numerical analysis, curve fitting means approximating discrete data with an analytical expression, that is, expressing the discrete data as a formula. In practice, the discrete points or data are often repeated observations or experimental values of quantities related to physical or statistical problems. They are scattered, which makes them hard to process and often unable to reveal their intrinsic regularity exactly and completely. A suitable analytical expression makes up for this defect.

A general linear fit $y = ax + b$ can be calculated with the following formulas:

$$a = \frac{l \sum_{k=1}^{l} x_k y_k - \sum_{k=1}^{l} x_k \sum_{k=1}^{l} y_k}{l \sum_{k=1}^{l} x_k^2 - \sum_{k=1}^{l} x_k \sum_{k=1}^{l} x_k}, \qquad b = \frac{\sum_{k=1}^{l} y_k - a \sum_{k=1}^{l} x_k}{l}$$

where $l$ is the number of discrete $(x, y)$ values, $x_k$ is the $k$-th $x$ value, and $y_k$ is the corresponding $k$-th $y$ value.

For example, suppose the linear fit is performed on the segments containing the fields ranked in the top five (the preset range), giving the reference amounts of the first five segments, $S_1$, $S_2$, $S_3$, $S_4$, and $S_5$. Then $l = 5$, $x_1 = 1$, $y_1 = S_1$, $x_2 = 2$, $y_2 = S_2$, $x_3 = 3$, $y_3 = S_3$, $x_4 = 4$, $y_4 = S_4$, $x_5 = 5$, $y_5 = S_5$. Substituting these values into the linear-fit formulas gives the values of $a$ and $b$.
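The least-squares formulas for $a$ and $b$ translate directly into code. The sketch below is illustrative only: the function name `linear_fit` is an assumption, and the five reference amounts are made-up numbers chosen so that the result is easy to check by hand.

```python
def linear_fit(xs, ys):
    """Least-squares fit of y = a*x + b using the closed-form sums.

    xs, ys: equally long sequences of l >= 2 observations.
    Returns (a, b).
    """
    l = len(xs)
    if l < 2 or l != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = l * sum_xx - sum_x * sum_x
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    a = (l * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - a * sum_x) / l
    return a, b


# Hypothetical reference amounts S_1..S_5 for the first five segments.
segments = [1, 2, 3, 4, 5]
amounts = [50, 40, 30, 20, 10]
a, b = linear_fit(segments, amounts)  # a = -10.0, b = 60.0
```

Here the sums are $\sum x_k = 15$, $\sum y_k = 150$, $\sum x_k y_k = 350$, and $\sum x_k^2 = 55$, so $a = (5 \cdot 350 - 15 \cdot 150)/(5 \cdot 55 - 15^2) = -10$ and $b = (150 + 10 \cdot 15)/5 = 60$.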

S15: allocate the preset total buffering capacity $C_{sum}$ of the sub-table to the segments according to the line $y = ax + b$; the buffering capacity allocated to the $i$-th segment is $C_i = C_{sum} \times (ai + b) / S_{sum}$. Allocating buffering capacity according to the reference amounts of the segments improves memory utilization: in this formula, each segment receives a share of the total buffering capacity equal to the ratio of its (fitted) reference amount to the total reference amount.
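The allocation formula of step S15 can be sketched in a few lines. The function name `allocate_buffers` and the numbers used are illustrative assumptions; the sketch also assumes that $ai + b$ is positive for every segment, since the non-positive case is handled separately by the second embodiment.

```python
def allocate_buffers(c_sum, a, b, s_sum, n):
    """Allocate c_sum across segments 1..n with C_i = c_sum * (a*i + b) / s_sum.

    Assumes a*i + b > 0 for every i in 1..n; otherwise a segment would
    receive a zero or negative share.
    """
    return [c_sum * (a * i + b) / s_sum for i in range(1, n + 1)]


# Hypothetical values: fitted line y = -10x + 60, total reference amount 150,
# total buffering capacity 1500 units, five segments.
shares = allocate_buffers(1500, -10.0, 60.0, 150, 5)
# shares == [500.0, 400.0, 300.0, 200.0, 100.0]
```

In this example the fitted values coincide with the actual reference amounts, so the shares add up to exactly $C_{sum}$. When the fit is not exact, the sum of the shares equals $C_{sum}$ only to the extent that $\sum_i (ai + b)$ matches $S_{sum}$.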

The beneficial effects of the method are:

1) used together with the inverted-list technique of search, and with the massive logs kept strictly increasing in time, the method supports high-speed import and real-time search while keeping the index compact;

2) since data compression covers the total of index and data, directly referencing offsets reduces the number of I/O operations, improves read speed, and lowers the compression ratio;

3) buffering capacity is allocated reasonably, in proportion to the reference amounts, so fewer system resources are occupied;

4) on systems with 2 GB of memory or less, the method achieves a better performance ratio.

Fig. 4 shows the flowchart of the buffering capacity allocation method for massive logs provided by another preferred embodiment of the present invention. This embodiment builds on the previous one: when the reference amount given by the fitted line $y = ax + b$ is less than or equal to 0, it provides a revised fitted line $y = ax + b + c$ and revises the buffering capacity allocation accordingly. The method is as follows:

S21: read logs into the sub-table in real time, and store the logs in a designated segment of the sub-table;

S22: divide all segments in the sub-table according to the time at which the logs are read in; if logs read into the sub-table contain identical fields, have every identical field reference the offset of the field's first occurrence in the sub-table, and count the number of times the offset of each first-occurring field is referenced. This step may also use step S12 of the previous embodiment.

S23: establish the reference amount $S_i$ of each segment, where $S_i$ is the sum of the numbers of times the offsets of all first-occurring fields in the $i$-th segment are referenced, $i$ is a positive integer in $[1, n]$, and $n$ is the number of segments in the sub-table; then calculate the total reference amount $S_{sum} = \sum_{i=1}^{n} S_i$ of the sub-table. This step may also use step S13 of the previous embodiment.

S24: arrange the segments and their reference amounts $S_i$ in the order in which the logs were read in, and perform a linear fit on the relation between each segment and its reference amount $S_i$ to obtain the straight line $y = ax + b$ that characterizes the correspondence between segment and reference amount, where the x-axis is the $x$-th segment of the sub-table and the y-axis is the reference amount. This step may also use step S14 of the previous embodiment.

S25: before allocating the buffering capacity, judge whether $ai + b$ is greater than 0; if $ai + b$ is greater than 0, perform step S26; if $ai + b$ is less than or equal to 0, perform steps S27 and S28;

S26: allocate the preset total buffering capacity $C_{sum}$ of the sub-table to the segments according to the correspondence given by the line $y = ax + b$; the buffering capacity allocated to the $i$-th segment is $C_i = C_{sum} \times (ai + b) / S_{sum}$.

S27: translate the fitted line upward along the y-axis by $c$ units until $ai + b + c$ is greater than 0, and revise the fitted line to $y = ax + b + c$, as shown in Fig. 5. The revision is needed because, for segments with few references, the reference amount given by the fitted line may be negative, which is not meaningful.

S28: allocate the preset total buffering capacity $C_{sum}$ of the sub-table to the segments; the buffering capacity allocated to the $i$-th segment is

$$C_i = \frac{C_{sum} \times (ai + b + c)}{S_{sum} \times \dfrac{\int_0^{-(b+c)/a} (ax + b + c)\,dx}{\int_0^{-b/a} (ax + b)\,dx}}.$$

Because the line has been translated upward along the y-axis by $c$ units, the area enclosed by the line, the y-axis, and the x-axis is used to characterize the proportion of the total reference amount, and the total buffering capacity is allocated to each segment according to this proportion.
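Steps S25 to S28 can be sketched as follows. Several points are assumptions made for this illustration and are not specified in the text: the function names, raising $c$ in fixed increments of `step` until every segment has a positive fitted value, and the requirement that $a < 0$ and $b > 0$ so that each integral is the area of the triangle between the line and the two axes.

```python
def triangle_area(a, intercept):
    """Integral of a*x + intercept from 0 to the x-intercept -intercept/a.

    For a < 0 and intercept > 0 this is the area of the triangle enclosed by
    the line and the two axes, which equals -intercept**2 / (2*a).
    """
    return -intercept ** 2 / (2 * a)


def allocate_with_shift(c_sum, a, b, s_sum, n, step=1.0):
    """Shift the fitted line up until a*i + b + c > 0 for all segments 1..n,
    then allocate c_sum with the area-corrected formula of step S28.

    Returns (c, shares). The search for c in increments of `step` is an
    assumption of this sketch; the text only requires a*i + b + c > 0.
    """
    if a >= 0 or b <= 0:
        raise ValueError("this sketch assumes a < 0 and b > 0")
    lowest = min(a * i + b for i in range(1, n + 1))
    c = 0.0
    while lowest + c <= 0:
        c += step
    # Ratio of the two integrals; equals 1 when no shift was needed.
    scale = triangle_area(a, b + c) / triangle_area(a, b)
    shares = [c_sum * (a * i + b + c) / (s_sum * scale) for i in range(1, n + 1)]
    return c, shares


# Hypothetical fit y = -10x + 45 over five segments: segment 5 maps to -5.
c, shares = allocate_with_shift(1500, -10, 45, 100, 5)
```

With these made-up numbers the line has to be raised by `c = 6.0` before the last segment's fitted value becomes positive. When no shift is needed, `c` stays 0, the scale factor is 1, and the result matches the unshifted formula of step S26.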

In addition to the beneficial effects of the previous embodiment, this embodiment revises the fitted line and the buffering capacity allocation, which ensures that the buffering capacity is allocated accurately and resources are used reasonably.

The embodiments of the present invention have been described above with reference to the drawings, but the present invention is not limited to these specific embodiments, which are illustrative rather than restrictive. Guided by the present invention, those of ordinary skill in the art can devise many other forms without departing from the spirit of the invention and the scope protected by the claims, and all of these fall within the protection of the present invention.

Claims (6)

1. A buffering capacity allocation method for massive logs, for allocating the buffering capacity of a sub-table while massive logs are read in, characterized in that the method comprises the following steps:
S11: read logs into the sub-table in real time, and store the logs in a designated segment of the sub-table;
S12: divide all segments in the sub-table according to the time at which the logs are read in; if logs read into the sub-table contain identical fields, have every identical field reference the offset of the field's first occurrence in the sub-table, and count the number of times the offset of each first-occurring field is referenced;
S13: establish the reference amount $S_i$ of each segment, where $S_i$ is the sum of the numbers of times the offsets of all first-occurring fields in the $i$-th segment are referenced, $i$ is a positive integer in $[1, n]$, and $n$ is the number of segments in the sub-table; then calculate the total reference amount of the sub-table, $S_{sum} = \sum_{i=1}^{n} S_i$;
S14: arrange the segments and their reference amounts $S_i$ in the order in which the logs were read in, and perform a linear fit on the relation between each segment and its reference amount $S_i$ to obtain the straight line $y = ax + b$ that characterizes the correspondence between segment and reference amount, where the x-axis is the $x$-th segment of the sub-table and the y-axis is the reference amount;
S15: allocate the preset total buffering capacity $C_{sum}$ of the sub-table to the segments according to the correspondence given by the line $y = ax + b$; the buffering capacity allocated to the $i$-th segment is $C_i = C_{sum} \times (ai + b) / S_{sum}$.
2. The buffering capacity allocation method as claimed in claim 1, characterized in that, in step S11, the fields of the log comprise user ID, access time, access IP, requested page, and requested function number.
3. The buffering capacity allocation method as claimed in claim 2, characterized in that step S12 comprises the following sub-steps:
S12A: divide all segments in the sub-table according to the time at which the logs are read in; if logs read into the sub-table contain identical fields, have every identical field reference the offset of the field's first occurrence in the sub-table;
S12B: count the number of times $z_{ij}$ that the offset of the $j$-th first-occurring field of the $i$-th segment is referenced, and sort these counts, where $i$ is a positive integer in $[1, n]$, $n$ is the total number of segments in the sub-table, $j$ is a positive integer in $[1, m]$, and $m$ is the total number of first-occurring fields in the $i$-th segment.
4. The buffering capacity allocation method as claimed in claim 3, characterized in that step S13 comprises the following sub-steps:
S13A: establish the reference amount $S_i$, which is the sum of the numbers of times the offsets of all first-occurring fields in the $i$-th segment are referenced: $S_i = \sum_{j=1}^{m} z_{ij}$, where $i$ is a positive integer in $[1, n]$, $n$ is the total number of segments in the sub-table, $j$ is a positive integer in $[1, m]$, and $m$ is the total number of first-occurring fields in the $i$-th segment;
S13B: calculate the total reference amount of the sub-table: $S_{sum} = \sum_{i=1}^{n} S_i$.
5. The buffering capacity allocation method as claimed in claim 4, characterized in that step S14 further comprises:
S14A: arrange the segments and their reference amounts $S_i$ in the order in which the logs were read in, take the segments containing the fields whose sorted reference counts fall within a preset range, and perform the linear fit on those segments to obtain the straight line $y = ax + b$ that characterizes the correspondence between segment and reference amount, where the x-axis is the segment of the sub-table and the y-axis is the reference amount.
6. The buffering capacity allocation method as claimed in claim 5, characterized in that the method further comprises:
S15A: before step S15, judge whether $ai + b$ is greater than 0; if $ai + b$ is greater than 0, perform step S15; if $ai + b$ is less than or equal to 0, perform step S15B;
S15B: translate the fitted line upward along the y-axis by $c$ units until $ai + b + c$ is greater than 0, and revise the fitted line to $y = ax + b + c$;
S15C: allocate the preset total buffering capacity $C_{sum}$ of the sub-table to the segments; the buffering capacity allocated to the $i$-th segment is $C_i = C_{sum} \times (ai + b + c) / \left( S_{sum} \times \int_0^{-(b+c)/a} (ax + b + c)\,dx \,/ \int_0^{-b/a} (ax + b)\,dx \right)$.
CN201310727354.1A 2013-12-25 2013-12-25 A kind of buffering capacity distribution method of massive logs CN104750682B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310727354.1A CN104750682B (en) 2013-12-25 2013-12-25 A kind of buffering capacity distribution method of massive logs

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310727354.1A CN104750682B (en) 2013-12-25 2013-12-25 A kind of buffering capacity distribution method of massive logs

Publications (2)

Publication Number Publication Date
CN104750682A true CN104750682A (en) 2015-07-01
CN104750682B CN104750682B (en) 2018-04-06

Family

ID=53590393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310727354.1A CN104750682B (en) 2013-12-25 2013-12-25 A kind of buffering capacity distribution method of massive logs

Country Status (1)

Country Link
CN (1) CN104750682B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7430741B2 (en) * 2004-01-20 2008-09-30 International Business Machines Corporation Application-aware system that dynamically partitions and allocates resources on demand
CN101667198A (en) * 2009-09-18 2010-03-10 浙江大学 Cache optimization method of real-time vertical search engine objects
US20120323870A1 (en) * 2009-06-10 2012-12-20 At&T Intellectual Property I, L.P. Incremental Maintenance of Inverted Indexes for Approximate String Matching
CN103336771A (en) * 2013-04-02 2013-10-02 江苏大学 Data similarity detection method based on sliding window

Also Published As

Publication number Publication date
CN104750682B (en) 2018-04-06

Legal Events

Date Code Title Description
PB01 Publication
C06 Publication
SE01 Entry into force of request for substantive examination
C10 Entry into substantive examination
GR01 Patent grant