CN104714976B - Data processing method and equipment - Google Patents
- Publication number
- CN104714976B (application number CN201310692643.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- histogram
- group
- away
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Complex Calculations (AREA)
Abstract
The present application provides a data processing method and device. The method includes: in response to an initial query request for a data set, obtaining a basic histogram by reading the data in the data set once; and, based on a predetermined target interval or target bin width, obtaining from the basic histogram a target histogram corresponding to the target interval or target bin width, and presenting the target histogram. With this method, the data need to be read only once no matter how many times the target histogram is transformed, because every transformation is carried out using the basic histogram alone. This greatly improves the computation speed and data processing capacity of the system, and allows histograms to be displayed quickly even for big data.
Description
Technical field
The present application relates to the technical field of data processing, and in particular to a histogram-based data processing method and device.
Background
Generally, when there are only a few dozen data items to be analyzed, the analysis result can be obtained by visual inspection. When the number of items reaches thousands, tens of thousands, ... hundreds of millions, or billions, however, a histogram can be used to analyze the data. A histogram is a statistical chart that shows the distribution characteristics of data: it represents the distribution with a set of contiguous, equal-width vertical line segments or bars of unequal height.
For example, Figs. 10A to 10E are schematic diagrams of an example of analyzing data with a histogram. Fig. 10A is the histogram obtained for certain data to be analyzed when the interval width (hereinafter also called the "bin width") is set to 80; the figure shows that most of the data are concentrated in the two intervals [480,560) and [0,80). When the bin width is changed from 80 to 20, as shown in Fig. 10B, the two intervals [500,520) and [0,20) clearly hold far more data than the others. Next, when attention is restricted to the interval with the most data, [500,520), and the bin width is changed to 2, as shown in Fig. 10C, it can be seen that the vast majority of the data are concentrated in the interval [510,512). For the same interval [500,520), when the bin width is adjusted to 0.1, as shown in Fig. 10D, it can be concluded that the data in this interval all lie near integers. In contrast, when attention is restricted to the interval [0,20), as shown in Fig. 10E, the distribution of the data in this interval is entirely different from that in [500,520) and takes the shape of a logarithmic distribution. As this example shows, a histogram helps in understanding the distribution of the data to be analyzed: changing the bin width (interval width) reveals more information about the distribution, and focusing on a few regions of interest in the histogram shows the distribution characteristics of each region more intuitively.
When the volume of data to be analyzed is small, each computation needed to obtain a histogram takes very little time, so the user can change the displayed bin width (that is, the bin width of the histogram) continuously and switch to each interval of interest without any noticeable pause. When the volume of data is large, however, the computation time grows, the display visibly stalls during each change, and the user experience deteriorates. Moreover, for massive data stored in a distributed system (that is, big data), each time the user changes the requirement and the displayed bin width, the computation needed to obtain the histogram takes several minutes before the new histogram can be shown.
Summary of the application
The main purpose of the present application is to provide a data processing method and device that solve the problems of the prior art, such as the long time needed to compute a histogram and the resulting decline in user experience.
According to one aspect of the present application, a data processing method is provided, characterized by comprising: in response to an initial query request for a data set, obtaining a basic histogram by reading the data in the data set once; and, based on a predetermined target interval or target bin width, obtaining from the basic histogram a target histogram corresponding to the target interval or target bin width, and presenting the target histogram.
According to another aspect of the present application, a data processing device is provided, characterized by comprising: a basic histogram obtaining means configured to obtain, in response to an initial query request for a data set, a basic histogram by reading the data in the data set once; and a target histogram obtaining means configured to obtain from the basic histogram, based on a predetermined target interval or target bin width, a target histogram corresponding to the target interval or target bin width, and to present the target histogram.
Compared with the prior art, the technical solution of the present application reads the data once to compute, as intermediate data, a histogram with a very small bin width (called the "basic histogram"), and then uses the basic histogram to obtain the target histogram corresponding to the user's requirement. As a result, however many times the target histogram is transformed, the data are read only once, and each transformation of the histogram is carried out using the basic histogram. This greatly improves the computation speed and data processing capacity of the system and allows histograms to be shown to the user quickly even for big data.
Brief description of the drawings
The drawings described here are provided for a further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their description serve to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a general flowchart of the data processing method of an embodiment of the present application;
Fig. 2 is a flowchart of obtaining the basic histogram in an embodiment of the present application;
Fig. 3 is a schematic diagram of an example of obtaining the basic histogram by distributed computing in an embodiment of the present application;
Fig. 4 is a flowchart of obtaining a node basic histogram in an embodiment of the present application;
Fig. 5 is a flowchart of merging node basic histograms in an embodiment of the present application;
Fig. 6 is a structural schematic diagram of the data processing device of an embodiment of the present application;
Fig. 7 is a structural schematic diagram of an example of the node basic histogram obtaining means of an embodiment of the present application;
Fig. 8 is another structural schematic diagram of the node basic histogram obtaining means of an embodiment of the present application;
Fig. 9 is a structural schematic diagram of an example of the histogram obtaining means of an embodiment of the present application;
Figs. 10A to 10E are schematic diagrams of an example of analyzing data with a histogram in the prior art.
Detailed description
The main idea of the present application is as follows. For big data, in order to give the user a smooth histogram viewing function while reading the data only once, a histogram with a very small bin width (called the "basic histogram") is first computed as intermediate data by reading the data once. Then, according to the user's requirement, the basic histogram is transformed into the target histogram corresponding to that requirement; that is, the target histogram is obtained from the basic histogram alone, based on the target interval or target bin width the user requires. As a result, the computation speed and data processing capacity of the system are greatly improved, and histograms can be displayed quickly even for big data, so that the user can view them quickly.
It should be noted that "big data" in this application may refer to a data set whose size is tens of GB or more, of any data type, such as web logs, video, pictures, or geographic location information. The scheme of the present application is particularly suitable for big data scenarios with huge data volumes; it is equally applicable to data processing scenarios of any other data size. To make the purpose, technical solution, and advantages of the application clearer, the application is described in further detail below with reference to the drawings and specific embodiments.
<Data processing method>
According to an embodiment of the present application, a data processing method is provided. With this method, the processing result is shown to the user in the form of a histogram.
In the prior art, the two elements that make up a histogram are the bin width and the frequency, and the usual algorithm for a histogram is: (1) read the data once and compute the maximum and minimum of the data to obtain the range, that is, the difference between the maximum and the minimum; (2) determine the number of groups of the histogram according to the user's requirement, then divide the range by this number to obtain the width of each group, that is, the bin width; (3) determine the boundary values of each group from the bin width; (4) read the data again and count the frequency of each group. With this method, whenever the user changes the requirement, for example the interval or bin width of the displayed histogram as in Figs. 10A to 10E, all of the data must be read twice and the computation redone before the histogram corresponding to the requirement is obtained. Because computing a histogram requires reading the data twice, the data processing time is long. As the data volume keeps growing, the processing time grows with it and the user experience drops sharply.
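The four prior-art steps above can be sketched as follows. This is a minimal illustration, not code from the patent: the function name `two_pass_histogram` and the `read_data` callable (which stands in for one full read of the data set) are hypothetical, and placing the maximum value in the last group is an assumption.

```python
import math

def two_pass_histogram(read_data, num_groups):
    """Prior-art style histogram; read_data() must return a fresh iterator on each call."""
    # Pass 1: find the minimum and maximum to obtain the range.
    lo, hi = math.inf, -math.inf
    for x in read_data():
        lo, hi = min(lo, x), max(hi, x)
    # Bin width = range / number of groups (falls back to 1.0 if the range is zero).
    width = (hi - lo) / num_groups or 1.0
    # Boundary values of each group.
    edges = [lo + i * width for i in range(num_groups + 1)]
    # Pass 2: count the frequency of each group.
    counts = [0] * num_groups
    for x in read_data():
        i = min(int((x - lo) / width), num_groups - 1)  # assumption: maximum goes in the last group
        counts[i] += 1
    return edges, counts

data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]
passes = []

def read_data():
    passes.append(1)  # record each full read of the data
    return iter(data)

edges, counts = two_pass_histogram(read_data, 5)
print(counts, "full reads:", len(passes))
```

Every change of interval or bin width repeats both reads, which is the cost the application sets out to avoid.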
In view of the above problems, in order to greatly improve the data processing capacity of the system, the present application starts from reducing the data processing time, that is, from reducing the number of times the data are read. The data processing method of the present application therefore consists mainly of two parts: first, a basic histogram with a very small bin width is obtained by an adaptive computation method, so that reading the data only once suffices to obtain the basic histogram; second, the basic histogram is transformed into the target histogram corresponding to the user's requirement, so that the data never need to be read again.
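The second part, transforming the basic histogram into a target histogram without touching the data again, might look like the sketch below. The details of the transformation are not given in this passage, so the sketch rests on stated assumptions: the target bin width is an integer multiple of the basic bin width, the target interval boundaries coincide with basic bin boundaries, and the name `rebin` and its parameters are hypothetical. `Decimal` is used only to keep boundaries such as 0.1 exact.

```python
from decimal import Decimal
import math

def rebin(base_left, base_width, base_counts, target_left, target_right, target_width):
    """Aggregate fine basic bins into coarser target bins over [target_left, target_right)."""
    ratio = target_width / base_width
    if ratio != ratio.to_integral_value():
        raise ValueError("target bin width must be a multiple of the basic bin width")
    n_target = math.ceil((target_right - target_left) / target_width)
    target_counts = [0] * n_target
    for i, c in enumerate(base_counts):
        start = base_left + i * base_width  # left edge of basic bin i
        if target_left <= start < target_right:
            target_counts[math.floor((start - target_left) / target_width)] += c
    return target_counts

D = Decimal
base_counts = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]                # basic bins [0,1), [1,2), ..., [9,10)
whole = rebin(D(0), D(1), base_counts, D(0), D(10), D(5))    # whole range, bin width 5
zoom = rebin(D(0), D(1), base_counts, D(2), D(8), D(2))      # interval [2,8), bin width 2
print(whole, zoom)
```

Both calls work only on the stored counts, so changing the interval or bin width costs time proportional to the number of basic bins rather than to the size of the data.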
Referring to Fig. 1, which is a general flowchart of the data processing method of an embodiment of the present application: step S101 is the step of processing the data adaptively, and step S102 is the step of transforming the basic histogram. Each is described in detail below.
(Adaptive data processing)
Specifically, in step S101, in response to an initial query request for a data set, a basic histogram is obtained by reading the data in the data set once.
When a user wants to examine the distribution of certain data, the user can initiate a query request for the corresponding data set. In this application, a user's first query on a given data set is called the initial query request. The initial query request may be a data query request that the user initiates by entering a query keyword; more specifically, it may also be a histogram display request that the user initiates for a data set.
When the user's initial query request is received, a data processing device such as a computer reads the data, processes them adaptively, and obtains the basic histogram, performing only one read of the data in the process. Here the data are the data to be processed. The basic histogram is a histogram with a very small bin width; in effect it is the intermediate data from which target histograms are built. In other words, the result of this processing is intermediate data rather than the actual histogram shown to the user, but for ease of description this intermediate data is called the basic histogram.
For big data, because the data volume is huge, the data are usually processed by distributed computing. That is, the data are divided into several parts and assigned to several compute nodes, each compute node processes only its part, and the results of the compute nodes are finally merged into the final result. In distributed computing, several compute nodes share information with one another; the nodes may run on the same computer or on several computers connected by a network. The method of obtaining the basic histogram is described below using distributed computing as an example.
Fig. 2 is a flowchart of obtaining the basic histogram in an embodiment of the present application. Fig. 3 is a schematic diagram of an example of obtaining the basic histogram by distributed computing in an embodiment of the present application. The method of obtaining the basic histogram is described with reference to Figs. 2 and 3.
In step S201, the data are assigned to multiple compute nodes, so that each compute node obtains its node data. As shown in Fig. 3, for example, a large data set is divided into three parts, node data 1, node data 2, and node data 3, which are sent to compute node 1, compute node 2, and compute node 3 respectively. Fig. 3 shows three compute nodes and one merge node, but the number of compute nodes is not limited to three. The merge node may also be one of the compute nodes.
In step S202, each compute node obtains a node basic histogram by reading its node data once. As shown in Fig. 3, compute node 1 reads the node data 1 it has received and obtains node basic histogram 1 through the processing shown in Fig. 4 and described below. Fig. 4 is a flowchart of obtaining a node basic histogram in an embodiment of the present application.
Specifically, in step S401, a part of the node data is read in one go as the initial data, the data range of the initial data is determined, and the initial bin width is determined from the predetermined number of groups and the data range of the initial data (the determining step). In other words, compute node 1 first reads a part of node data 1, determines from this read the data range M of the part it has read, and then obtains a very small initial bin width, M/N, from the predetermined number of groups N, where N is a positive integer. Here N is a preset suggested number of groups for the node basic histogram. It need not equal the number of groups of the node basic histogram actually computed, but the two should be of the same order of magnitude. In practice the suggested number of groups N is usually taken to be 10000, in which case the number of groups of the resulting node basic histogram lies between 10000 and 100000. Setting N large enough yields a sufficiently small bin width, which has the advantage of reducing the number of computations on the data.
In step S402, the initial data are divided into multiple initial intervals according to the initial bin width, and the frequency of each initial interval is computed (the dividing step). In other words, compute node 1 groups the portion of node data it has read according to the initial bin width M/N, dividing it into multiple initial intervals, and then computes the frequency of each initial interval, that is, the number of data items that fall in it. Because the initial bin width is very small, the frequency of each initial interval can be computed quickly from a single read of the data, and each frequency is small, for example less than or equal to 2. This greatly reduces the number of times the data are read and improves the data processing speed of the system.
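Steps S401 and S402 can be sketched as below. The text gives the initial bin width as M/N; this sketch additionally rounds M/N down to a power of ten and snaps the left boundary to a multiple of the bin width. That rounding is an assumption, motivated by the text's later preference for bin widths of the form 10^k and by its remark that the resulting number of groups lies between N and 10N. The function name `initial_histogram` is hypothetical.

```python
from decimal import Decimal
import math

def initial_histogram(sample, n_suggested):
    """Build the initial intervals and frequencies from the first portion of node data."""
    values = [Decimal(str(x)) for x in sample]
    lo, hi = min(values), max(values)
    data_range = hi - lo                              # M in the text
    if data_range == 0:
        width = Decimal(1)                            # arbitrary fallback for a single-valued sample
    else:
        k = (data_range / n_suggested).adjusted()     # floor(log10(M / N))
        width = Decimal(10) ** k                      # assumption: round M / N down to 10^k
    left = math.floor(lo / width) * width             # left boundary is a multiple of the bin width
    n_bins = math.floor((hi - left) / width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[math.floor((v - left) / width)] += 1
    return left, width, counts

left, width, counts = initial_histogram(["10.0", "55.5", "99.9"], 10)
print(left, width, len(counts))
```

With N = 10 and a sample spanning 10.0 to 99.9, M/N is 8.99, the bin width becomes 1, and the number of initial intervals falls between N and 10N.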
In step S403, the remaining node data are read as new data, the new data are processed adaptively, and the bin width and frequencies of the current data are determined, where the current data comprise the initial data and the new data (the adaptive processing step). In other words, as new data are added to the initial data, the data range and bin width are adjusted adaptively. Here the remaining node data are one or more data items other than the portion of node data read in step S401 (hereinafter called the "read data"). That is, the new data contain one or more data items, and may be either all of the data other than the read data or only a part of them.
The flow of the adaptive processing step is explained in detail below.
First, after the new data are read, it is judged whether every item in the new data lies within the data range of the initial data, that is, whether its value is greater than or equal to the minimum of the initial data and less than or equal to the maximum of the initial data. Whether the data range needs to be adjusted is decided from the result of this judgment.
If the result of the judgment is yes, that is, every item in the new data lies within the data range of the initial data, it is then determined which initial interval each item belongs to, and the frequency of that interval is increased accordingly, which gives the bin width and frequencies of the current data. The bin width of the current data is the initial bin width, and the frequencies of the current data are the result of increasing the frequencies of the initial intervals to which the new data belong. In other words, when every item in the new data lies within the data range of the initial data, the data range need not be adjusted; each item is simply counted in its interval.
Conversely, if the result of the judgment is no, that is, some or all of the items in the new data lie outside the data range of the initial data, the initial intervals are adjusted to multiple new intervals, it is then determined which new interval each item in the new data belongs to, and the frequency of that new interval is increased accordingly, which gives the bin width and frequencies of the current data. The bin width of the current data is the bin width after adjustment, and the frequencies of the current data are the result of increasing the frequencies of the new intervals to which the new data belong. In other words, when some or all of the items in the new data lie outside the data range of the initial data, the data range and, where necessary, the bin width must be adjusted.
Adjusting the initial intervals to new intervals falls into two main cases.
The first case is that only the number of intervals is increased and the bin width stays the same; that is, only the data range is adjusted. In this case, for the items in the new data that lie outside the data range of the initial data, intervals are added at the initial bin width, and each interval after the increase is taken as a new interval, so that the new data lie within the data range covered by all of the new intervals. That is, the items of the new data that lie outside the data range of the initial data fall within the added intervals.
With group away from the case where.Due to being continuously increased with new data, the number in section can exceed the accessible number of calculate node.This
Sample, it is necessary to the number in section be reduced away from by way of increase group, thus also need to adjust while adjusting data area
Whole group away from.In the latter case, first, for the data except the data area of primary data in new data,
According to initial group away from come when increasing section, when the sum of number and the number in initial section in increased section be more than a predetermined group number
Certain multiple when, by initial group away from prearranged multiple be adjusted to new group away from.Then, a left side for the data area of current data is adjusted
Boundary value so that the left boundary value after adjustment be new group away from multiple, and using the data area after adjustment as new data
Range.Then, according to new group away from new data range to be divided between multiple new districts so that the number between new district is more than or equal to predetermined
Group number and the certain multiple for being less than predetermined group number.For example, if the number of current interval is more than 10N, need to group away from rising
Grade that is, by initial group away from increasing to 10 times, 100 times or 1000 times etc., while will also adjust a left side for the data area of current data
Boundary value, make new group away from multiple.If for example, adjustment before left boundary value be 92, new group away from being 10, then adjust after
Left boundary value is 90.To meet condition between finally obtained new district is that the number between new district is more than or equal to N and is less than 10N.Then,
Determine the frequency between each new district.Due to new group away from be initial group away from prearranged multiple, the left boundary value of new data range is new
Group away from multiple, so being the equal of several adjacent regions in the histogram that will be made of current data between actually each new district
Between merge made of.In this way, the frequency between each new district is exactly the sum of each frequency of these adjacent intervals.
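The bin-width upgrade described in the second case can be sketched as follows. The function name `upgrade_bin_width` is hypothetical, and the loop that tries factors 10, 100, 1000, ... until fewer than 10N intervals remain is one plausible reading of the text, not a rule it states outright; the sketch also does not separately enforce the lower bound of N intervals.

```python
from decimal import Decimal
import math

def upgrade_bin_width(left, width, counts, n_suggested):
    """Merge adjacent intervals by a factor of 10^m until fewer than 10 * N remain."""
    factor = 1
    while True:
        factor *= 10
        new_width = width * factor
        # Snap the left boundary down to a multiple of the new bin width.
        new_left = math.floor(left / new_width) * new_width
        offset = int((left - new_left) / width)        # empty old intervals added on the left
        n_new = -(-(offset + len(counts)) // factor)   # ceiling division
        if n_new < 10 * n_suggested:
            break
    # Each new interval is the union of `factor` adjacent old intervals.
    new_counts = [0] * n_new
    for i, c in enumerate(counts):
        new_counts[(offset + i) // factor] += c
    return new_left, new_width, new_counts

# 108 intervals of width 1 starting at 92, with N = 10 (so the limit 10N = 100 is exceeded).
new_left, new_width, new_counts = upgrade_bin_width(Decimal(92), Decimal(1), [1] * 108, 10)
print(new_left, new_width, new_counts)
```

The left boundary moves from 92 to 90, as in the text's example, and no data are re-read: the new frequencies are sums of the old ones.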
In addition, the choice of bin width is particularly important when new data are processed adaptively. As new data keep being read, the data range keeps growing and the number of intervals keeps increasing. Once the data range has grown to a certain extent, the number of intervals exceeds what the compute node can handle, and it must be reduced by increasing the bin width. So that the new intervals can be formed simply by merging several old intervals, without any recomputation, the new bin width is set to an integer multiple of the old bin width; the number of data items in a new interval is then the sum of the numbers of data items in several old intervals, that is, the frequency of a new interval is the sum of the frequencies of several old intervals. The boundary values of the intervals should also be taken into account when the bin width is adjusted; they are preferably round numbers such as 0.1, 0.002, 1, or 100. For numeric data, the initial bin width is chosen as 10^k, where k is an integer, so the new bin width is necessarily 10^m times the initial bin width, where m is a positive integer.
Furthermore, the amount of new data read each time can be made as large as possible, which reduces the number of interval adjustments and improves the operating efficiency of the system.
In step S404, if node data that have not been read remain, the flow returns to step S403, until all of the node data have been read and the bin width and frequencies of the whole node data are obtained (the loop reading step). That is, as new data keep being read, the data range and bin width are adjusted and the frequencies change accordingly; once all of the node data have been read, the bin width and frequencies of the whole node data are obtained.
In step S405, the node basic histogram is obtained from the bin width and frequencies of the whole node data (the obtaining step). As shown in Fig. 3, once compute node 1 has read all of node data 1, the bin width and frequencies for node data 1 are available, which gives node basic histogram 1. In the same way, compute node 2 obtains node basic histogram 2 from node data 2, and compute node 3 obtains node basic histogram 3 from node data 3.
The process of adaptively processing new data is illustrated below with an example.
Suppose that a grouping has been obtained for a certain part of the data, forming the initial intervals [10,10.1), [10.1,10.2), [10.2,10.3), ..., [99.9,100), that the corresponding frequencies are {2,2,2,...,2}, and that the predetermined number of groups is N=10.
Next, a new data item 10.15 is read. Because it belongs to the existing interval [10.1,10.2), the data range and bin width need not change; only the corresponding frequency is increased, so the frequencies become {2,3,2,...,2}.
Next, a new data item 9.85 is read. Because it lies outside the original intervals, the two intervals [9.8,9.9) and [9.9,10) are added at the initial bin width 0.1. Here [9.9,10) is an empty interval added because the intervals of a histogram must be contiguous. The new intervals finally obtained are [9.8,9.9), [9.9,10), [10,10.1), [10.1,10.2), [10.2,10.3), ..., [99.9,100), and the corresponding frequencies become {1,0,2,3,2,...,2}.
Next, a new data item -1.0 is read, and the number of intervals is increased at the initial bin width 0.1, so that the intervals after the increase are [-0.1,0), [0,0.1), ..., [9.8,9.9), [9.9,10), ..., [99.9,100). The total number of intervals now exceeds 10N=100, so the initial bin width 0.1 is upgraded by a factor of 100 to the new bin width 10. So that the number of new intervals after the adjustment is greater than or equal to 10 and less than 100, the left boundary value of the data range of the current data is adjusted to a multiple of the new bin width 10, and the adjusted data range is taken as the new data range. For the interval [-0.1,0), the left boundary value before adjustment is -0.1 and the new bin width is 10, so the left boundary value after adjustment is -10. Dividing the new data range into intervals again gives the new intervals [-10,0), [0,10), [10,20), [20,30), ..., [90,100). Then, for each new interval, the frequencies of the old intervals it contains are added together, giving the frequencies {1,1,201,200,...,200} for the new intervals.
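The worked example above can be replayed step by step with the sketch below. The helper names `add_value` and `coarsen` are hypothetical. The upgrade by a factor of 100 is invoked explicitly at the third step, mirroring the example, rather than triggered by an automatic threshold check. `Decimal` keeps boundaries such as 0.1 and 9.8 exact.

```python
from decimal import Decimal
import math

def add_value(left, width, counts, x):
    """Count x, adding intervals at the current bin width if x is out of range."""
    x = Decimal(str(x))
    counts = list(counts)
    if x < left:                                   # extend to the left with empty intervals
        k = math.ceil((left - x) / width)
        left = left - k * width
        counts = [0] * k + counts
    idx = math.floor((x - left) / width)
    if idx >= len(counts):                         # extend to the right with empty intervals
        counts += [0] * (idx + 1 - len(counts))
    counts[idx] += 1
    return left, counts

def coarsen(left, width, counts, factor):
    """Multiply the bin width by factor and merge adjacent intervals."""
    new_width = width * factor
    new_left = math.floor(left / new_width) * new_width   # multiple of the new bin width
    offset = int((left - new_left) / width)
    new_counts = [0] * -(-(offset + len(counts)) // factor)
    for i, c in enumerate(counts):
        new_counts[(offset + i) // factor] += c
    return new_left, new_width, new_counts

# Initial intervals [10,10.1), ..., [99.9,100), each with frequency 2.
left, width = Decimal("10"), Decimal("0.1")
counts = [2] * 900

left, counts = add_value(left, width, counts, "10.15")
after_first = counts[:3]                           # in range: only a frequency changes

left, counts = add_value(left, width, counts, "9.85")
after_second = (left, counts[:5])                  # two intervals added on the left

left, counts = add_value(left, width, counts, "-1.0")
left, width, counts = coarsen(left, width, counts, 100)
print(left, width, counts)
```

The final state has left boundary -10, bin width 10, and eleven intervals from [-10,0) to [90,100), matching the frequencies given in the example.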
The above describes the process of adaptively processing data, in which a part of the data is read first and the data range and bin width are adjusted as more data are added. The process is not limited to this, however. When the amount of data is relatively small and the data processing capacity of the compute node is sufficiently large, the node basic histogram can also be obtained by reading all of the node data at once. In that case, all of the node data are read in one go and their data range is determined; the bin width of the whole node data is determined from the predetermined number of groups and that data range; the whole node data are divided into multiple intervals according to that bin width and the frequency of each interval is computed; and the node basic histogram is obtained from the bin width of the whole node data and the frequency of each interval. This data processing method is roughly the same as the one described with reference to Figs. 2 to 5, the only difference being whether all of the data are read at once, so its details are not described here.
Then, as shown in Fig. 2, in step S203 the multiple node basic histograms are combined into one basic histogram. As shown in Fig. 3, after the merge node has received node basic histogram 1, the result computed by compute node 1, node basic histogram 2, the result computed by compute node 2, and node basic histogram 3, the result computed by compute node 3, it merges node basic histograms 1, 2, and 3 into one basic histogram. Fig. 5 is a flowchart of merging node basic histograms in an embodiment of the present application.
Specifically, in step S501, the data range of the overall data is obtained by comparing the data ranges of the node basic histograms. That is, the left and right boundary values of the node basic histograms are compared to find the minimum left boundary value and the maximum right boundary value, which determine the data range of the overall data.
In step S502, the bin width of the overall data is determined from the data range of the overall data and the suggested number of groups. This step is processed in the same way as step S401 above. That is, merging the node basic histograms requires grouping the overall data, in other words establishing a unified standard for dividing the data, so the bin width of the overall data must be determined first.
In step S503, the data range of the overall data is divided into multiple overall intervals according to the bin width of the overall data. This step is processed in essentially the same way as step S402 above. That is, the bin width just determined yields the overall intervals that serve as the unified standard for dividing the data.
In step S504, the basic histogram of each node is divided into multiple portions area respectively according to multiple whole sections
Between, and determine the frequency of each partial section.That is, according to unified standard respectively to the basic histogram of each node again
It is divided.
It, will be corresponding with whole section in the basic histogram of multiple nodes for each whole section in step S505
The frequency of partial section summarize for the frequency in each whole section.That is, by the basic histogram of each node according to every
Summarized in a entirety section.
In step S506, according to the group of conceptual data away from the frequency with each whole section, obtain being directed to conceptual data
Basic histogram.
The method of merging node basic histograms is illustrated below with an example.
Suppose there are two compute nodes, compute node 1 and compute node 2. The node basic histogram of compute node 1 is divided into the intervals [9.8, 9.9), [9.9, 10), [10, 10.1), [10.1, 10.2), [10.2, 10.3), with corresponding frequencies {1, 1, 1, 1, 1}. The node basic histogram of compute node 2 is divided into the intervals [10, 20), [20, 30), …, [90, 100), with corresponding frequencies {200, 200, …, 200}. The suggested number of groups is 10.
First, the left and right boundary values of compute node 1 and compute node 2 are compared. The left boundary values of the two compute nodes are 9.8 and 10, and their right boundary values are 10.3 and 100, so the minimum left boundary is 9.8 and the maximum right boundary is 100. The data range of the overall data is therefore [9.8, 100).
Then, from the data range [9.8, 100) of the overall data and the suggested number of groups, 10, the bin width for the overall data is determined to be 10.
Next, the overall data is divided into intervals with a bin width of 10, so the interval division of the overall data is [0, 10), [10, 20), [20, 30), …, [90, 100).
Then, each node basic histogram is re-divided. The node basic histogram of compute node 1 has the intervals [9.8, 9.9), [9.9, 10), [10, 10.1), [10.1, 10.2), [10.2, 10.3) with frequencies {1, 1, 1, 1, 1}, and the overall data is divided into [0, 10), [10, 20), [20, 30), …, [90, 100). The node histogram of compute node 1 is therefore re-divided into [0, 10), [10, 20), [20, 30), …, [90, 100), and its frequencies become {2, 3, 0, …, 0}. Similarly, the node basic histogram of compute node 2 has the intervals [10, 20), [20, 30), …, [90, 100) with frequencies {200, 200, …, 200}, so it is re-divided into [0, 10), [10, 20), [20, 30), …, [90, 100) and its frequencies become {0, 200, 200, …, 200}. Summing the two gives the frequencies of the overall data: {2, 203, 200, …, 200}.
The interval division and frequencies of the basic histogram of the overall data are thus obtained: the intervals are [0, 10), [10, 20), [20, 30), …, [90, 100), and the corresponding frequencies are {2, 203, 200, …, 200}.
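The merge in steps S501 to S506 can be sketched in Python using the figures from the example above. This is a minimal illustration, not the patented implementation: the function name `merge_node_histograms` and the `(left, right, frequency)` tuple layout are hypothetical, the overall bin width is passed in rather than derived (step S401 is not reproduced here), and each node interval is assumed to lie inside a single overall interval so that it can be assigned by its left boundary.

```python
import math

def merge_node_histograms(node_hists, width):
    """Merge node histograms given as lists of (left, right, frequency) tuples.

    Assumption: every node interval lies inside one overall interval,
    so it is assigned by its left boundary.
    """
    # S501: overall data range from the node boundaries
    lo = min(hist[0][0] for hist in node_hists)
    hi = max(hist[-1][1] for hist in node_hists)
    # S503: overall intervals, left edge aligned to a multiple of the width
    start = math.floor(lo / width) * width
    n_bins = math.ceil((hi - start) / width)
    counts = [0] * n_bins
    # S504/S505: re-divide each node histogram and sum the frequencies
    for hist in node_hists:
        for left, _right, freq in hist:
            counts[int((left - start) // width)] += freq
    # S506: edges and frequencies of the merged basic histogram
    edges = [start + i * width for i in range(n_bins + 1)]
    return edges, counts

node1 = [(9.8, 9.9, 1), (9.9, 10.0, 1), (10.0, 10.1, 1),
         (10.1, 10.2, 1), (10.2, 10.3, 1)]
node2 = [(10 * i, 10 * (i + 1), 200) for i in range(1, 10)]
edges, counts = merge_node_histograms([node1, node2], width=10)
print(edges)   # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(counts)  # [2, 203, 200, 200, 200, 200, 200, 200, 200, 200]
```

The printed frequencies match the {2, 203, 200, …, 200} result of the worked example.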
(Basic histogram conversion processing)
To display a histogram according to the user's needs, the basic histogram obtained through the adaptive processing must be converted into a target histogram corresponding to those needs.
Therefore, as shown in Fig. 1, in step S102, based on a predetermined target interval or target bin width, a target histogram corresponding to the target interval or target bin width is obtained from the basic histogram and presented. That is, according to the target interval or target bin width, several adjacent intervals of the basic histogram are mapped to one new interval and their frequencies are summed. This yields the target histogram corresponding to the user's needs, which is displayed to the user for analyzing the data distribution.
In one embodiment of the present application, the target interval or target bin width may be a system default. That is, when an initial query request for a data set is received, the target histogram can be generated and presented according to the default target interval or target bin width (in this case, the system-default display interval or display bin width).
In another embodiment of the present application, the target interval or target bin width may be specified by the initial query request. That is, in the initial query request for the data set, the user specifies the target interval or target bin width of the target histogram to be displayed. Note that the target interval or target bin width may not be exactly the same as the actual display interval or display bin width of the target histogram that is finally shown, although the two are as close as possible. This is described in more detail later.
In actual computation, the suggested number of groups is usually chosen as 10000, so the computed basic histogram has tens of thousands of intervals, from which the user cannot directly obtain useful information. The histogram therefore needs to be displayed according to the user's needs, which may include the following: the user can select a data range of interest, that is, a display interval of interest, the default data range being all the data; the user can choose the default display instead of a specific display interval, in which case about a dozen intervals are generally shown and each display interval contains, as far as possible, the same number of basic-histogram intervals; the histogram can be displayed with the largest number of bins allowed by the current drawing precision, where drawing precision refers to the number of intervals in the displayed histogram; the histogram can be displayed with more bins or with a larger bin width, that is, zoomed in or zoomed out; the histogram can be displayed with a bin width entered by the user; for long-tailed data, some intervals can be hidden so that intervals with small frequencies are easier to inspect; and the frequency contained in each display interval, and the percentage of the frequencies in the current display range that each display interval accounts for, can be displayed.
As described above, the basic histogram can be converted according to the user's various needs. In short, whatever operation is performed in response to those needs, it ultimately reduces to one problem: computing a histogram from a target left boundary value, a target right boundary value and a target bin width. However, the target histogram is not displayed strictly according to the target values entered by the user, that is, the target interval or target bin width. The reason is that a histogram displayed according to the user's input is not necessarily the best histogram under the same conditions. Therefore, in the data processing method of the present application, the target values entered by the user are usually adjusted appropriately to obtain the best display bin width, display left boundary value and display right boundary value, and thus the best display histogram. The specific calculation steps are as follows:
1. Calculate the display bin width, which must satisfy two conditions: the display bin width is a positive integer multiple of the bin width of the basic histogram; and the display bin width is equal or close to the target bin width.
2. Calculate the display left boundary, which must satisfy two conditions: the display left boundary value is an integer multiple of the display bin width; and the display left boundary value is less than or equal to the target left boundary value.
3. Calculate the display right boundary, which must satisfy two conditions: the display right boundary value is an integer multiple of the display bin width; and the display right boundary value is greater than or equal to the target right boundary value.
4. Obtain the division into display intervals from the display bin width, the display left boundary value and the display right boundary value.
5. For each display interval, compute the sum of the frequencies of the basic-histogram intervals that belong to the display interval, and take it as the frequency of that display interval.
6. Output the whole display interval division and the frequencies.
Here, the target left boundary value, the target right boundary value and the target bin width may be values specified by the user.
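The six calculation steps above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the definitive procedure: the function name `display_histogram` and its parameters are hypothetical, "close to the target bin width" is interpreted as rounding to the nearest positive integer multiple, and the left boundary of the basic histogram is assumed to be a multiple of its bin width so that every basic interval falls inside exactly one display interval.

```python
import math

def display_histogram(base_left, base_width, base_counts,
                      target_left, target_right, target_width):
    # Step 1: display width = positive integer multiple of the base width,
    # as close as possible to the target width (nearest multiple, assumed)
    k = max(1, round(target_width / base_width))
    width = k * base_width
    # Step 2: display left boundary, a multiple of width, <= target left
    left = math.floor(target_left / width) * width
    # Step 3: display right boundary, a multiple of width, >= target right
    right = math.ceil(target_right / width) * width
    # Step 4: number of display intervals
    n = round((right - left) / width)
    # Step 5: sum the frequencies of the base intervals in each display interval
    counts = [0] * n
    for i, freq in enumerate(base_counts):
        j = math.floor((base_left + i * base_width - left) / width)
        if 0 <= j < n:  # base intervals outside the display range are skipped
            counts[j] += freq
    # Step 6: output the division (left boundary, width) and the frequencies
    return left, width, counts

base = [2, 203] + [200] * 8           # basic histogram on [0, 100), width 10
print(display_histogram(0, 10, base, 5, 95, 28))
# (0, 30, [405, 600, 600, 200])
```

In this usage example the requested bin width of 28 is adjusted to 30 and the requested range [5, 95] is widened to [0, 120), which shows why the displayed values may differ slightly from the values the user entered.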
The above describes the data processing method for displaying a target histogram in response to an initial query request for a data set. When different subsequent query requests are issued one after another for the same data set, it is only necessary to re-merge adjacent intervals of the basic histogram according to the other target interval or target bin width specified by the subsequent query request. That is, in response to a subsequent query request for the same data set, another target histogram corresponding to another target interval or another target bin width specified by the subsequent query request is obtained from the basic histogram. Because the merging of adjacent intervals of the basic histogram for a subsequent query request is the same as for the initial query request, a detailed description is omitted here.
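The reuse of one basic histogram across several queries can be illustrated with a small sketch. The class `HistogramSession`, its `reads` counter and the restriction of target bin widths to integer multiples (`factor`) of the basic bin width are assumptions made for this illustration only; the source does not prescribe this structure.

```python
class HistogramSession:
    """Builds the basic histogram once and answers every query from it."""

    def __init__(self, data, base_width):
        self._data = data
        self.base_width = base_width
        self._basic = None   # basic histogram: {bin index: frequency}
        self.reads = 0       # how many times the raw data was scanned

    def _basic_histogram(self):
        if self._basic is None:            # only the initial query reads data
            self.reads += 1
            self._basic = {}
            for x in self._data:
                k = x // self.base_width   # index of the basic interval
                self._basic[k] = self._basic.get(k, 0) + 1
        return self._basic

    def query(self, factor):
        """Target histogram with bin width factor * base_width."""
        merged = {}
        for k, freq in self._basic_histogram().items():
            merged[k // factor] = merged.get(k // factor, 0) + freq
        width = factor * self.base_width
        return {j * width: f for j, f in sorted(merged.items())}

session = HistogramSession([1, 2, 3, 11, 12, 25, 37], base_width=5)
print(session.query(2))   # {0: 3, 10: 2, 20: 1, 30: 1}
print(session.query(4))   # {0: 5, 20: 2}
print(session.reads)      # 1
```

Both queries are answered from the same basic histogram, so the raw data is scanned only once.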
<Data processing equipment>
According to an embodiment of the present application, data processing equipment is provided. Fig. 6 is a schematic structural diagram of data processing equipment according to an embodiment of the present application. As shown in Fig. 6, the data processing equipment 600 may include a basic histogram obtaining device 601, a target histogram obtaining device 602 and a changing device 603.
Specifically, the basic histogram obtaining device 601 is configured to, in response to an initial query request for a data set, obtain a basic histogram by reading the data in the data set.
The target histogram obtaining device 602 is configured to obtain from the basic histogram, based on a predetermined target interval or target bin width, a target histogram corresponding to the target interval or target bin width, and to present the target histogram.
The changing device 603 is configured to, in response to a subsequent query request for the data set, obtain from the basic histogram another target histogram corresponding to another target interval or another target bin width specified by the subsequent query request, and to present the other target histogram.
Further, the basic histogram obtaining device 601 may include a distribution device 611, a node basic histogram obtaining device 612 and a histogram summarizing device 613.
Specifically, the distribution device 611 is configured to distribute the data to multiple compute nodes so that each compute node obtains node data.
The node basic histogram obtaining device 612 is configured so that each compute node obtains a node basic histogram by reading its node data once.
The histogram summarizing device 613 is configured to merge the multiple node basic histograms into one basic histogram.
Further, Fig. 7 is a schematic structural diagram of an example of the node basic histogram obtaining device according to an embodiment of the present application. As shown in Fig. 7, the node basic histogram obtaining device 612 may include a node data range determining device 701, a node bin width determining device 702, a node interval dividing device 703 and a first node basic histogram forming device 704.
Specifically, the node data range determining device 701 is configured to read all the node data at once to determine the data range of all the node data.
The node bin width determining device 702 is configured to determine the bin width of all the node data from the predetermined number of groups and the data range of all the node data.
The node interval dividing device 703 is configured to divide all the node data into multiple intervals according to the bin width of all the node data and to calculate the frequency of each interval.
The first node basic histogram forming device 704 is configured to obtain the node basic histogram from the bin width of all the node data and the frequency of each interval.
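The processing performed by devices 701 to 704 can be sketched as below. This is a minimal sketch with assumptions: the function name `node_basic_histogram` is hypothetical, the bin width is taken simply as the data range divided by the predetermined number of groups (the source's own rule for choosing the bin width is not reproduced here), and the maximum value is counted in the last interval.

```python
def node_basic_histogram(data, groups):
    # 701: data range of all the node data
    lo, hi = min(data), max(data)
    # 702: bin width from the range and the predetermined number of groups
    # (falls back to 1 when all values are equal; an assumption)
    width = (hi - lo) / groups or 1
    # 703: divide the data into intervals and count the frequency of each
    counts = [0] * groups
    for x in data:
        idx = min(int((x - lo) / width), groups - 1)  # max value -> last bin
        counts[idx] += 1
    # 704: the node basic histogram is (left boundary, bin width, frequencies)
    return lo, width, counts

print(node_basic_histogram([0, 1, 2, 3, 4, 5, 6, 7, 8], groups=4))
# (0, 2.0, [2, 2, 2, 3])
```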
In addition, Fig. 8 is another schematic structural diagram of the node basic histogram obtaining device according to an embodiment of the present application. As shown in Fig. 8, the node basic histogram obtaining device 612 may include an initial bin width determining device 801, an initial interval dividing device 802, an adaptive processing device 803, a loop reading device 804 and a second node basic histogram forming device 805.
Specifically, the initial bin width determining device 801 is configured to read part of the node data in one pass as initial data, determine the data range of the initial data, and determine the initial bin width from the predetermined number of groups and the data range of the initial data.
The initial interval dividing device 802 is configured to divide the initial data into multiple initial intervals according to the initial bin width and to calculate the frequency of each initial interval.
The adaptive processing device 803 is configured to read remaining node data as new data, process the new data adaptively, and determine the bin width and frequencies of the current data, where the current data includes the initial data and the new data.
The loop reading device 804 is configured to, if node data that has not been read still remains, continue reading that node data and send it to the adaptive processing device 803 until all the node data has been read, so as to obtain the bin width and frequencies of all the node data.
The second node basic histogram forming device 805 is configured to obtain the node basic histogram from the bin width and frequencies of all the node data.
Further, the adaptive processing device 803 may include a reading device 811, a judging device 812, a first determining device 813 and a second determining device 814.
The reading device 811 is configured to read the new data, where the new data contains one or more data items.
The judging device 812 is configured to judge whether every data item in the new data lies within the data range of the initial data.
The first determining device 813 is configured to, if the judgment result of the judging device 812 is yes, determine which initial interval each data item in the new data belongs to and increase the frequency of that initial interval accordingly, so as to obtain the bin width and frequencies of the current data.
The second determining device 814 is configured to, if the judgment result of the judging device 812 is no, adjust the multiple initial intervals into multiple new intervals, determine which new interval each data item in the new data belongs to, and increase the frequency of that new interval accordingly, so as to obtain the bin width and frequencies of the current data.
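The adaptive handling of one new data item can be sketched as follows, covering both adjustment cases that the text goes on to describe (adding intervals at the same bin width, and enlarging the bin width when there are too many intervals). Everything named here is hypothetical: the function `add_value`, the dictionary layout, and the parameters `limit_factor` (the "certain multiple" of the predetermined number of groups) and `merge_factor` (the "predetermined multiple" of the bin width). For simplicity the sketch extends the intervals first and merges afterwards, which is one possible ordering rather than the one the source mandates.

```python
import math

def add_value(hist, x, groups, limit_factor=2, merge_factor=2):
    left, width = hist["left"], hist["width"]
    counts = list(hist["counts"])
    # Case 1: x lies outside the current range -> add intervals, same width
    if x < left:
        k = math.ceil((left - x) / width)
        counts = [0] * k + counts
        left -= k * width
    idx = math.floor((x - left) / width)
    if idx >= len(counts):
        counts += [0] * (idx - len(counts) + 1)
    counts[idx] += 1
    # Case 2: too many intervals -> enlarge the bin width and merge neighbours
    while len(counts) > limit_factor * groups:
        new_width = width * merge_factor
        # align the left boundary to a multiple of the new bin width
        new_left = math.floor(left / new_width) * new_width
        counts = [0] * round((left - new_left) / width) + counts
        counts += [0] * (-len(counts) % merge_factor)   # pad the right side
        counts = [sum(counts[i:i + merge_factor])
                  for i in range(0, len(counts), merge_factor)]
        left, width = new_left, new_width
    return {"left": left, "width": width, "counts": counts}

h = {"left": 0, "width": 1, "counts": [1, 1, 1, 1]}
h = add_value(h, 5, groups=4)    # range extended, bin width unchanged
print(h)  # {'left': 0, 'width': 1, 'counts': [1, 1, 1, 1, 0, 1]}
h = add_value(h, 9, groups=4)    # 10 intervals > 2 * 4, bin width doubled
print(h)  # {'left': 0, 'width': 2, 'counts': [2, 2, 1, 0, 1]}
```

Because the new bin width is a multiple of the old one and the left boundary is aligned to it, each new interval is simply the union of adjacent old intervals, so its frequency is the sum of their frequencies.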
Adjusting the multiple initial intervals into multiple new intervals falls into the following two cases:
In the first case, only the number of intervals is increased and the bin width remains unchanged. That is, only the data range is adjusted. In this case, for data items in the new data that lie outside the data range of the initial data, intervals are added according to the initial bin width, and each interval after the addition is taken as a new interval, so that the new data lies within the data range covered by all the new intervals. In other words, the data items in the new data that lie outside the data range of the initial data fall within the added intervals.
In the second case, the number of intervals is increased and the bin width is changed. That is, the data range and the bin width are adjusted together. As new data keeps arriving, the number of intervals may exceed the number that a compute node can handle. The number of intervals then has to be reduced by increasing the bin width, so the bin width must be adjusted along with the data range. In the second case, first, for data items in the new data that lie outside the data range of the initial data, when intervals are added according to the initial bin width and the sum of the number of added intervals and the number of initial intervals exceeds a certain multiple of the predetermined number of groups, a predetermined multiple of the initial bin width is taken as the new bin width. Then, the left boundary value of the current data range is adjusted so that the adjusted left boundary value is a multiple of the new bin width, and the adjusted data range is taken as the new data range. Next, the new data range is divided into multiple new intervals according to the new bin width, so that the number of new intervals is greater than or equal to the predetermined number of groups and less than the certain multiple of the predetermined number of groups. Then, the frequency of each new interval is determined. Because the new bin width is a predetermined multiple of the initial bin width and the left boundary value of the new data range is a multiple of the new bin width, each new interval is in effect formed by merging several adjacent intervals of the histogram built from the current data. The frequency of each new interval is therefore the sum of the frequencies of those adjacent intervals.
Fig. 9 is a schematic structural diagram of an example of the histogram summarizing device according to an embodiment of the present application. As shown in Fig. 9, the histogram summarizing device 613 may include a comparing device 901, an overall bin width determining device 902, an overall interval dividing device 903, a partial interval dividing device 904, a frequency summarizing device 905 and a basic histogram forming device 906.
The comparing device 901 is configured to obtain the data range of the overall data by comparing the data ranges of the multiple node basic histograms.
The overall bin width determining device 902 is configured to determine the bin width of the overall data from the data range of the overall data and the suggested number of groups.
The overall interval dividing device 903 is configured to divide the data range of the overall data into multiple overall intervals according to the bin width of the overall data.
The partial interval dividing device 904 is configured to divide each node basic histogram into multiple partial intervals according to the multiple overall intervals and to determine the frequency of each partial interval.
The frequency summarizing device 905 is configured to, for each overall interval, sum the frequencies of the partial intervals that correspond to that overall interval in the multiple node basic histograms to give the frequency of the overall interval.
The basic histogram forming device 906 is configured to obtain the basic histogram for the overall data from the bin width of the overall data and the frequency of each overall interval.
The specific implementation of each module included in the data processing equipment 600 of the present application corresponds to the specific implementation of the steps of the method of the present application. To avoid obscuring the present application, the details of each module are not described again here.
The method, equipment and system of the present application can be applied in any equipment capable of histogram-based data processing. Such equipment may include, but is not limited to, desktop computers, mobile terminal devices, laptop computers, tablet computers, personal digital assistants and the like.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
The memory may include computer-readable media in the form of volatile memory, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, commodity or device that includes the element.
Those skilled in the art will understand that the embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) that contain computer-usable program code.
The above is merely an embodiment of the present application and is not intended to limit the present application. For those skilled in the art, the present application may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.
Claims (18)
1. A data processing method, characterized by comprising:
in response to an initial query request for a data set, obtaining a basic histogram by reading the data in the data set once, wherein the basic histogram is intermediate data used to form a target histogram, and the bin width of the basic histogram is smaller than the bin width of the target histogram; and
based on a predetermined target interval or target bin width, obtaining from the basic histogram a target histogram corresponding to the target interval or target bin width, and presenting the target histogram;
wherein obtaining from the basic histogram, based on the predetermined target interval or target bin width, the target histogram corresponding to the target interval or target bin width comprises:
merging, according to the target interval or target bin width, several adjacent intervals of the basic histogram into one new interval and summing the corresponding frequencies, to obtain the corresponding target histogram.
2. The method according to claim 1, characterized in that the step of obtaining the basic histogram by reading the data in the data set once further comprises:
distributing the data to multiple compute nodes so that each compute node obtains node data;
obtaining, by each compute node, a node basic histogram by reading the node data once; and
merging the multiple node basic histograms into one basic histogram.
3. The method according to claim 2, characterized in that the step of merging the multiple node basic histograms into one basic histogram further comprises:
obtaining the data range of the data by comparing the data ranges of the multiple node basic histograms;
determining the bin width of the data from the data range of the data and a suggested number of groups;
dividing the data range of the data into multiple overall intervals according to the bin width;
dividing each node basic histogram into multiple partial intervals according to the multiple overall intervals, and determining the frequency of each partial interval;
for each overall interval, summing the frequencies of the partial intervals corresponding to the overall interval in the multiple node basic histograms to give the frequency of the overall interval; and
obtaining the basic histogram for the data from the bin width and the frequency of each overall interval.
4. The method according to claim 2 or 3, characterized in that the step of obtaining the node basic histogram by reading the node data once further comprises:
reading all the node data at once and determining the data range of all the node data;
determining the bin width of all the node data from a predetermined number of groups and the data range of all the node data;
dividing all the node data into multiple intervals according to the bin width of all the node data, and calculating the frequency of each interval; and
obtaining the node basic histogram from the bin width of all the node data and the frequency of each interval.
5. The method according to claim 2 or 3, characterized in that the step of obtaining the node basic histogram by reading the node data once further comprises:
a determining step of reading part of the node data in one pass as initial data, determining the data range of the initial data, and determining an initial bin width from a predetermined number of groups and the data range of the initial data;
a dividing step of dividing the initial data into multiple initial intervals according to the initial bin width, and calculating the frequency of each initial interval;
an adaptive processing step of reading remaining node data as new data, processing the new data adaptively, and determining the bin width and frequencies of current data, wherein the current data includes the initial data and the new data;
a loop reading step of returning to the adaptive processing step if node data that has not been read still remains, until all the node data has been read, so as to obtain the bin width and frequencies of all the node data; and
an obtaining step of obtaining the node basic histogram from the bin width and frequencies of all the node data.
6. The method according to claim 5, characterized in that the adaptive processing step further comprises:
reading the new data, wherein the new data contains one or more data items;
judging whether every data item in the new data lies within the data range of the initial data;
if the judgment result is yes, determining which initial interval each data item in the new data belongs to and increasing the frequency of that initial interval accordingly, so as to obtain the bin width and frequencies of the current data; and
if the judgment result is no, adjusting the multiple initial intervals into multiple new intervals, determining which new interval each data item in the new data belongs to, and increasing the frequency of that new interval accordingly, so as to obtain the bin width and frequencies of the current data.
7. The method according to claim 6, characterized in that adjusting the multiple initial intervals into multiple new intervals further comprises:
for data items in the new data that lie outside the data range of the initial data, adding one or more intervals according to the initial bin width to form multiple new intervals, so that the new data lies within the multiple new intervals.
8. The method according to claim 6, characterized in that the step of adjusting the multiple initial intervals into multiple new intervals further comprises:
for data items in the new data that lie outside the data range of the initial data, when intervals are added according to the initial bin width and the sum of the number of added intervals and the number of initial intervals exceeds a certain multiple of the predetermined number of groups, taking a predetermined multiple of the initial bin width as a new bin width;
adjusting the left boundary value of the data range of the current data to form a new data range, so that the adjusted left boundary value is a multiple of the new bin width; and
dividing the new data range into multiple new intervals according to the new bin width, so that the number of new intervals is greater than or equal to the predetermined number of groups and less than the certain multiple of the predetermined number of groups.
9. The method according to claim 1, characterized in that the step of obtaining from the basic histogram, based on the predetermined target interval or target bin width, the target histogram corresponding to the target interval or target bin width further comprises:
determining a display bin width based on the target bin width;
determining a display left boundary value and a display right boundary value based on the target interval;
determining each display interval based on the display bin width, the display left boundary value and the display right boundary value;
for each display interval, taking the sum of the frequencies of the intervals of the basic histogram that belong to the display interval as the frequency of the display interval; and
obtaining the target histogram from the display bin width and the frequencies of the display intervals.
10. The method according to claim 1, characterized by further comprising:
in response to a subsequent query request for the data set, obtaining from the basic histogram, based on another target interval or another target bin width specified by the subsequent query request, another target histogram corresponding to the other target interval or other target bin width, and presenting the other target histogram.
11. Data processing equipment, characterized by comprising:
a basic histogram obtaining device configured to, in response to an initial query request for a data set, obtain a basic histogram by reading the data in the data set once, wherein the basic histogram is intermediate data used to form a target histogram, and the bin width of the basic histogram is smaller than the bin width of the target histogram; and
a target histogram obtaining device configured to obtain from the basic histogram, based on a predetermined target interval or target bin width, a target histogram corresponding to the target interval or target bin width, and to present the target histogram;
wherein the target histogram obtaining device is specifically configured to merge, according to the target interval or target bin width, several adjacent intervals of the basic histogram into one new interval and sum the corresponding frequencies, to obtain the corresponding target histogram.
12. The device according to claim 11, characterized in that the basic histogram obtaining means further comprises:
a distribution means configured to distribute the data to a plurality of computing nodes so that each computing node obtains node data;
a node basic histogram obtaining means configured such that each computing node obtains a node basic histogram by reading its node data once; and
a histogram aggregation means configured to aggregate the plurality of node basic histograms into one basic histogram.
13. The device according to claim 12, characterized in that the histogram aggregation means further comprises:
a comparison means configured to obtain the data range of the data by comparing the data ranges of the plurality of node basic histograms;
an overall class width determining means configured to determine the class width of the data according to the data range of the data and a suggested number of groups;
an overall interval dividing means configured to divide the data range of the data into a plurality of overall intervals according to the class width;
a partial interval dividing means configured to divide each node basic histogram into a plurality of partial intervals according to the plurality of overall intervals, and to determine the frequency of each partial interval;
a frequency aggregation means configured to, for each overall interval, aggregate the frequencies of the partial intervals of the plurality of node basic histograms that correspond to that overall interval into the frequency of that overall interval; and
a basic histogram constructing means configured to obtain the basic histogram for the data according to the class width and the frequency of each overall interval.
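The aggregation described in claim 13 can be sketched in Python as follows. The claim does not say how a node interval that straddles two overall intervals is split, so this sketch assumes, purely for illustration, that each node interval's frequency is credited to the overall interval containing its midpoint. The function name `merge_node_histograms` and the `(left, width, counts)` tuple layout are likewise hypothetical, and every node histogram is assumed to have at least one interval of positive width.

```python
def merge_node_histograms(nodes, suggested_bins):
    """Aggregate node basic histograms into one basic histogram.

    nodes: list of (left, width, counts) tuples, one per computing node
    (hypothetical layout). Returns (left, width, counts) for the whole data.
    """
    # Compare the node ranges to obtain the data range of the whole data.
    lo = min(left for left, _, _ in nodes)
    hi = max(left + width * len(counts) for left, width, counts in nodes)

    # Class width of the whole data from its range and the suggested
    # number of groups.
    width = (hi - lo) / suggested_bins

    # Sum, per overall interval, the frequencies of the node intervals
    # that correspond to it.
    total = [0] * suggested_bins
    for left, w, counts in nodes:
        for i, c in enumerate(counts):
            mid = left + (i + 0.5) * w  # assumption: assign by midpoint
            j = min(int((mid - lo) / width), suggested_bins - 1)
            total[j] += c
    return lo, width, total
```

Midpoint assignment keeps the frequencies as whole numbers; splitting a straddling interval proportionally would be an equally plausible reading.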
14. The device according to claim 12 or 13, characterized in that the node basic histogram obtaining means further comprises:
a node data range determining means configured to read all of the node data in a single pass and determine the data range of all of the node data;
a first node class width determining means configured to determine the class width of all of the node data according to a predetermined number of groups and the data range of all of the node data;
a first node interval dividing means configured to divide all of the node data into a plurality of intervals according to the class width of all of the node data, and to calculate the frequency of each interval; and
a first node basic histogram constructing means configured to obtain the node basic histogram based on the class width of all of the node data and the frequency of each interval.
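A minimal Python sketch of the node-level procedure in claim 14 is given below, assuming the node data fits in memory as a list of numbers. The function name `node_basic_histogram`, the fallback class width of 1.0 when all values are equal, and the rule that the maximum value is counted in the last interval are illustrative assumptions.

```python
def node_basic_histogram(values, num_bins):
    """Build a node basic histogram from node data held in memory.

    Returns (left, width, counts): left edge, class width, and the
    frequency of each interval.
    """
    # Determine the data range of the node data.
    lo, hi = min(values), max(values)

    # Class width from the predetermined number of groups and the range.
    # Assumption: fall back to 1.0 when every value is identical.
    width = (hi - lo) / num_bins or 1.0

    # Divide the data into intervals and count the frequency of each.
    counts = [0] * num_bins
    for v in values:
        # Assumption: the maximum value is placed in the last interval.
        j = min(int((v - lo) / width), num_bins - 1)
        counts[j] += 1
    return lo, width, counts
```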
15. The device according to claim 12 or 13, characterized in that the node basic histogram obtaining means further comprises:
an initial class width determining means configured to read a portion of the node data at one time as initial data, determine the data range of the initial data, and determine an initial class width according to a predetermined number of groups and the data range of the initial data;
an initial interval dividing means configured to divide the initial data into a plurality of initial intervals according to the initial class width, and to calculate the frequency of each initial interval;
an adaptive processing means configured to read remaining node data as new data, process the new data adaptively, and determine the class width and frequencies of current data, wherein the current data comprises the initial data and the new data;
a loop reading means configured to, if node data that has not been read still remains, continue reading the node data and send it to the adaptive processing means until all of the node data has been read, so as to obtain the class width and frequencies of all of the node data; and
a second node basic histogram constructing means configured to obtain the node basic histogram based on the class width and frequencies of all of the node data.
16. The device according to claim 15, characterized in that the adaptive processing means further comprises:
a reading means configured to read the new data, wherein the new data comprises one or more data items;
a judging means configured to judge whether every data item in the new data lies within the data range of the initial data;
a first decision means configured to, if the judgment result is yes, further determine which initial interval each data item in the new data belongs to and increase the frequency of that initial interval accordingly, so as to obtain the class width and frequencies of the current data; and
a second decision means configured to, if the judgment result is no, adjust the plurality of initial intervals into a plurality of new intervals, further determine which new interval each data item in the new data belongs to, and increase the frequency of that new interval accordingly, so as to obtain the class width and frequencies of the current data.
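The adaptive processing of claims 15 and 16 can be sketched as a small Python class. The claims shown here do not state how the new class width is chosen, so the sketch makes several assumptions that should be read as illustrative only: the class width is doubled whenever the intervals spanned by the data would reach twice the predetermined number of groups, interval edges are always multiples of the class width, and the class name `AdaptiveHistogram` and its attributes are hypothetical.

```python
import math
from collections import Counter

class AdaptiveHistogram:
    """Single-pass histogram whose intervals widen as new data arrives."""

    def __init__(self, initial, num_bins):
        if num_bins < 2:
            raise ValueError("num_bins must be at least 2")
        self.num_bins = num_bins
        self.lo, self.hi = min(initial), max(initial)
        # Initial class width from the predetermined number of groups and
        # the range of the initial data (1.0 if all values are equal).
        self.width = (self.hi - self.lo) / num_bins or 1.0
        # Interval k covers [k * width, (k + 1) * width), so every left
        # boundary is a multiple of the class width.
        self.counts = Counter(math.floor(v / self.width) for v in initial)

    def _span(self):
        # Number of intervals between the smallest and largest value seen.
        return (math.floor(self.hi / self.width)
                - math.floor(self.lo / self.width) + 1)

    def _double(self):
        # Assumption: widen by doubling; old intervals 2j and 2j + 1 merge
        # into new interval j, so no raw data has to be read again.
        self.width *= 2
        merged = Counter()
        for idx, c in self.counts.items():
            merged[idx // 2] += c
        self.counts = merged

    def add(self, v):
        if v < self.lo or v > self.hi:
            # Out of range: extend the range and adjust the intervals.
            self.lo, self.hi = min(self.lo, v), max(self.hi, v)
            while self._span() >= 2 * self.num_bins:
                self._double()
        # In range (possibly after adjustment): increase the frequency of
        # the interval the value belongs to.
        self.counts[math.floor(v / self.width)] += 1
```

A caller would construct the object from the first portion of the node data and then call `add` for each remaining value; `width` and `counts` then describe the node basic histogram once all data has been read.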
17. The device according to claim 11, characterized in that the target histogram obtaining means further comprises:
a first determining means for determining a display class width based on the target class width;
a second determining means for determining a display left boundary value and a display right boundary value based on the target interval;
a third determining means for determining each display interval based on the display class width, the display left boundary value, and the display right boundary value;
a computing means for calculating, for each display interval, the sum of the frequencies of the basic-histogram intervals belonging to that display interval as the frequency of that display interval; and
an obtaining means for obtaining the target histogram according to the display class width and the frequencies of the display intervals.
18. The device according to claim 11, characterized by further comprising:
a transformation means configured to, in response to a subsequent query request for the data set, obtain from the basic histogram, based on another target interval or another target class width specified by the subsequent query request, another target histogram corresponding to the other target interval or the other target class width, and to present the other target histogram.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310692643.2A CN104714976B (en) | 2013-12-17 | 2013-12-17 | Data processing method and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104714976A CN104714976A (en) | 2015-06-17 |
CN104714976B true CN104714976B (en) | 2018-08-24 |
Family
ID=53414319
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310692643.2A Active CN104714976B (en) | 2013-12-17 | 2013-12-17 | Data processing method and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104714976B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105224590A (en) * | 2015-07-07 | 2016-01-06 | 北京挺软科技有限公司 | The instant discretize of a kind of data and the implementation method gathered |
CN109544473B (en) * | 2018-11-12 | 2020-12-29 | 中国资源卫星应用中心 | Method, system and medium for calculating relative radiation correction coefficient of optical satellite |
CN110647557B (en) * | 2019-09-04 | 2023-05-05 | 创新先进技术有限公司 | Mass data statistics method, device and equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101751556A (en) * | 2008-12-03 | 2010-06-23 | 财团法人工业技术研究院 | Method for creating appearance model of object, method for distinguishing object, and monitoring system |
CN102723089A (en) * | 2011-05-11 | 2012-10-10 | 新奥特(北京)视频技术有限公司 | Realization method and system for on-site data outputting and playing |
CN103218837A (en) * | 2013-04-22 | 2013-07-24 | 北京航空航天大学 | Unequal class interval histogram rendering method based on empirical distribution function |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW201143305A (en) * | 2009-12-29 | 2011-12-01 | Ibm | Data value occurrence information for data compression |
Also Published As
Publication number | Publication date |
---|---|
CN104714976A (en) | 2015-06-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102192863B1 (en) | Information recommendation method and device | |
US10701168B2 (en) | Method and apparatus for compaction of data received over a network | |
US9411659B2 (en) | Data processing method used in distributed system | |
CN103455531B (en) | A kind of parallel index method supporting high dimensional data to have inquiry partially in real time | |
CN107122126B (en) | Data migration method, device and system | |
KR102125119B1 (en) | Data handling method and device | |
CN110033247B (en) | Payment channel recommendation method and system | |
CN103747047A (en) | CDN file storage method, file distribution control center and system thereof | |
CN104702625A (en) | Method and device for scheduling access request in CDN (Content Delivery Network) | |
CN104714976B (en) | Data processing method and equipment | |
CN103096385A (en) | Method and device and terminal of flow control | |
CN108566660A (en) | A kind of method for switching network, device and computer readable storage medium | |
CN105320702A (en) | Analysis method and device for user behavior data and smart television | |
CN106202092A (en) | The method and system that data process | |
CN112672366A (en) | Vertical switching system based on personalized consumption preference in heterogeneous wireless network | |
US20170364833A1 (en) | Ranking video delivery problems | |
CN109800236A (en) | Support the distributed caching method and equipment of multinode | |
Dai et al. | Improving load balance for data-intensive computing on cloud platforms | |
CN111831891A (en) | Material recommendation method and system | |
CN103560974B (en) | Method and device for maintaining tokens | |
CN104717439B (en) | Data flow control method and its device in Video Storage System | |
CN104283934A (en) | WEB service pushing method and device based on reliability prediction and server | |
CN105939388A (en) | Method for pushing business content and content controller | |
US10803036B2 (en) | Non-transitory computer-readable storage medium, data distribution method, and data distribution device | |
CN110413579A (en) | Image cache method, equipment, storage medium and device based on caching value |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20211104. Address after: Room 507, floor 5, building 3, No. 969, Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province. Patentee after: ZHEJIANG TMALL TECHNOLOGY Co.,Ltd. Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands. Patentee before: ALIBABA GROUP HOLDING Ltd. |