EP1230587A1 - Data visualization - Google Patents

Data visualization

Info

Publication number
EP1230587A1
EP1230587A1 EP00990488A EP00990488A EP1230587A1 EP 1230587 A1 EP1230587 A1 EP 1230587A1 EP 00990488 A EP00990488 A EP 00990488A EP 00990488 A EP00990488 A EP 00990488A EP 1230587 A1 EP1230587 A1 EP 1230587A1
Authority
EP
European Patent Office
Prior art keywords
visualization
data
values
parameters
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP00990488A
Other languages
German (de)
French (fr)
Inventor
Georges Grinstein
Patrick Hoffman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Massachusetts UMass
Original Assignee
University of Massachusetts UMass
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Massachusetts UMass
Publication of EP1230587A1
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14 Digital output to display device; Cooperation and interconnection of the display device with other functional units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/20 Drawing from basic elements, e.g. lines or circles
    • G06T11/206 Drawing of charts or graphs

Definitions

  • This invention relates to data visualization.
  • Data visualizations have included, for example, line graphs, bar charts, histograms, pie charts, survey plots, scatter plots, star plots, forced field visualizations, and so forth.
  • Each of these data representations or visualizations has strengths and weaknesses. As the number of data increases, it becomes extremely difficult to identify patterns in these data visualizations.
  • the invention features a method of providing a visualization of data including receiving a first set of one or more values for a first set of one or more parameters controlling one or more characteristics of a first type of data visualization, receiving a second set of one or more values for a second set of one or more parameters controlling one or more characteristics of a second type of data visualization, the second type of data visualization being a different type of data visualization than the first type of data visualization, and producing a visualization of a set of data, the visualization having characteristics based on one or more values of the first set of parameters and one or more values of the second set of parameters.
  • the invention features a data visualization method including arranging dimensional anchors as an N-dimensional regular polygon, mapping data values to an axis of each of the N dimensions, and displaying the mapped data values.
  • Embodiments of the invention may have one or more of the following advantages.
  • a graphic primitive called a dimensional anchor (DA) is described that can help generate new visualizations and provide insight to analyzing information visualizations.
  • the DA represents a unified framework or model for constructing a variety of visualizations including Parallel Coordinates, Scatterplot Matrices, Radviz, Survey Plots, Circle Segments, and so forth.
  • a DA is constructed by assigning values to parameters associated with various geometric graphic elements that encode the basics of the above visualizations.
  • FIG. 1 is a graph illustrating a single dimensional anchor.
  • FIG. 2 is a graph illustrating a first scatterplot using oblique coordinates.
  • FIG. 3 is a graph illustrating a second scatterplot.
  • FIG. 4 is a graph illustrating a third scatterplot.
  • FIG. 5 is a graph illustrating a fourth scatterplot.
  • FIG. 6 is a graph illustrating a fifth scatterplot.
  • FIG. 7 is a graph illustrating a first spread polygon.
  • FIG. 8 is a graph illustrating a second spread polygon.
  • FIG. 9 is a graph illustrating a regular polygon.
  • FIG. 10 is a graph illustrating parallel coordinates with springs.
  • FIG. 11 is a graph illustrating an interpolating visualization between parallel coordinates and circular parallel coordinates.
  • FIG. 12 is a graph illustrating overlapped radial visualizations.
  • FIG. 13 is a block diagram of an exemplary data visualization system.
  • Radial Visualization involves taking n-dimensional data points as points equally spaced around the perimeter of a circle. One end of a spring is attached to each perimeter point. The other end of each spring is attached to a data point. A spring constant Ki equals the value of the i-th coordinate of the fixed point. Each data point is then displayed where the sum of the spring forces equals 0. All the data point values are typically normalized to have values between 0 and 1. For example, if all n coordinates have the same value the data point lies exactly in the center of the circle. If the point is a unit vector point, it lies exactly at the fixed point on the edge of the circle where the spring for that dimension is fixed. Many points can map to the same position.
  • a visualization is modeled as a function V defined as V : A → D, which maps an array of data A to a display D.
  • the visualizations are limited to the parameter space defined by P1, P2, ..., Pn.
  • by varying the parameters, classes of visualizations can be defined.
  • a visualization is also a function of the inherent geometry of the particular visualization. Data and parts of the visualization geometry are encoded into a primitive called a dimensional anchor (DA).
  • DA dimensional anchor
  • a dimensional anchor is a graphic primitive which can help generate new visualizations and provide insight to analyzing information visualizations.
  • the DA represents a unified framework or model for a variety of visualizations including
  • a dimensional anchor is constructed by assigning values to parameters associated with various geometric graphic elements that encode the basics of the above visualizations. Multiple DAs can be used to create all of the above visualizations and many new ones as well as interpolating visualizations. One column of the data is selected to be associated with the dimensional anchor. A number of parameters are associated with a DA. Nine are described below, as one embodiment of this model.
  • a definition or specification may be generated of a visualization that is simple but powerful enough to generate many of the standard multi-dimensional multi-variate visualizations used today. Additionally, many novel visualizations can also be generated.
  • the model has nine parameters which are: P1 - size of scatter plot point
  • P2 - length of perpendicular lines extending from Anchorpoints creating scatter plot point
  • a dimensional anchor is simply one of the axes in a two dimensional scatterplot. It is normally associated with a dimension or variate from a dataset or database. The data values for the associated dimension are mapped to the axis in the standard manner where the minimum and maximum values usually correspond to points near the end points of the axis. Labels and scale tick marks can also be associated with the dimensional anchor. Normally, these regular spacings along an axis are called the coordinate values. Mapped data points, which we call Anchorpoints, are the coordinate values (points along a dimensional anchor) corresponding to the dimensional data points.
  • an exemplary display having a one-dimensional anchor with lines extended from the Anchorpoints is shown.
  • the vertical lines may be colored to show the distribution of the data (e.g., a cars dataset) for the miles per gallon attribute. Additionally the color of the lines may show the type of car (e.g., American— red or dark, Japanese — green or light, and European — purple).
  • the nine parameters associated with a DA control how a DA interacts with other DAs to form graphical constructions such as points, lines and advanced visualizations.
  • one parameter of the DA is used to control the size of the scatter plot point (P1).
  • the point is formed by the intersection of a line from the Anchorpoint of two DAs.
  • An Anchorpoint associated with the same data point on another DA also has a perpendicular line extending outward (on both sides of the DA). If the two perpendicular lines meet, the point of intersection becomes the point of the scatterplot.
  • the first parameter (P1) controls the size of the scatterplot point, ranging from 0 (no point drawn) to 1 (a large point displayed). Often in a scatterplot the size, shape, or color of a point can be associated with other dimensions (variates or columns) in the dataset. In an embodiment, a selected column determines color, and shape is a circle. In another embodiment, a set of parameters controls the shape so that other classes of visualizations using icons, or color icons, can be represented. However, whether a scatter plot point is displayed or not is a basic parameter (P1) of the dimensional anchor. The intersection of perpendicular lines extending from Anchorpoints works with any arrangement or any number of dimensional anchors. A dimensional anchor, by definition, can be any sequence of line segments.
  • an alternative definition of a scatterplot can use Oblique Coordinates, i.e., not necessarily perpendicular lines from the coordinate axis.
  • Two additional parameters (P2, P3) control the display of the perpendicular lines from the Anchorpoints and of the lines connecting all the scatterplot points associated with one data point.
  • P2 controls the length of the perpendicular lines extending from the DA to the scatter plot point.
  • P2 is about .2, while referring to FIG. 4, the P2 parameter is 1.0.
  • the parameters are defined from 0 (no lines) to 1 (all intersecting lines displayed).
  • when N dimensional anchors are used (in a dataset with N variates or dimensions), there are typically up to N points generated for each data point in the visualization, and these N points may be connected.
  • P3 controls the length of the lines connecting these points; the examples use three dimensional anchors in an equilateral triangle pattern.
  • P4 a particular designated dataset dimension (DDD) (variate or column) is used for a DA to construct a Survey Plot or visualization similar to Circle Segments.
  • DDD dataset dimension
  • the P4 parameter controls the size of a rectangle extending from an Anchorpoint. The size also depends on the dimensional value at the Anchorpoint.
  • a CCCViz Color Correlated Column Visualization
  • Alternate dimensions of a data set with a special classification dimension. Sort the columns (i.e., designated data set dimension) according to the classification dimension. Use a gray scale mapping for the dimension of the data set and use a rainbow color mapping for the classification dimension.
  • P4 parameter, i.e., the length of the rectangle in a survey plot
  • a color correlated column visualization is generated. The visualization illustrates whether a dimension (gray scales) correlates with a particular classification dimension (color scale).
  • the CCCViz is useful when the number of dimensions is small, i.e., less than 30.
  • DAs can produce a partial ordering of the data.
  • the visualization arranged in a crisscross grid pattern uses only the spring parameters P7 and P9, and the display results in a simple diagonal pattern, since every "spring" has an opposite spring symmetrically across the diagonal.
  • the criss cross DA pattern performs a visual sorting of the data which can be used as a discriminant to separate classes of data.
  • Parallel Coordinates visualization is as follows. Simply connect a line from one DA Anchorpoint to another. The length of these connecting lines is controlled by the DA parameter (P5). However, if all Anchorpoints on all DAs are exhaustively connected (each Anchorpoint connected to N-1 other Anchorpoints) one gets additional interesting visualizations.
  • P5 DA parameter
  • P6 additional parameter
  • This blocking parameter when set to 0 produces the familiar Parallel Coordinate visualization.
  • Parallel Coordinates in a circle is also generated using the P5 and P6 parameters when the DAs are arranged as radial spokes from the center of a circle.
  • An "enhanced" radial visualization display is generated when dimensional anchors are arranged as an N-dimensional regular polygon.
  • One of the limitations of radial visualization was that many data points could overlap in the center of the circle although they had different coordinate values.
  • the original radial visualization display is generated.
  • Polyviz is a visualization that better utilizes the total area of the circle (or regular polygon).
  • An additional parameter (P8) is used to draw lines extending from the fixed spring points to the displayed point where the sum of the spring forces is 0.
  • An additional parameter (P9) is also used as a zoom factor in the display. Referring to FIG. 7 a polyviz visualization is shown with seven points.
  • FIG. 7 and FIG. 8 illustrate examples of a "spread" polygon, whereas FIG. 9 illustrates a regular polygon.
  • the zoom parameter P9 is designed such that when it is 0 all points lie at the geometric center, at .5 it corresponds to a normal physical spring
  • DAs and Polyviz provide a mechanism to generate linear combinations of visualizations.
  • V_pc = f(P_pc, G_pc(d)) where G_pc(d) is the geometry of the Parallel Coordinate arrangement of DAs.
  • the parameters to generate a radial visualization are:
  • V_rv = f(P_rv, G_rv(d)) where G_rv(d) is the geometry of the Radviz arrangement of DAs.
  • G_rv(d) is the geometry of the Radviz arrangement of DAs.
  • V_new = f(V_pc, V_rv)
  • V_new = .5 V_pc + .5 V_rv
  • a linear combination of the geometry of the DAs can be defined in many ways. For example, FIG. 11 shows one possible transformation on a Parallel Coordinate configuration of DAs. The DAs are smoothly wrapped around into a cross shape. If DAs are now progressively shortened to points at the outer endpoints of the cross, the normal radial visualization is achieved. The geometry of the transformation is not affine (parallel lines don't stay parallel), or projective. Although each individual DA transformation could be considered projective, the DAs taken together are not (the incidence of points and lines is not invariant). The transformations can be considered topological if we include the constraint that the endpoints of the DAs never actually meet.
  • G_new(d) = .5 G_pc(d) + .5 G_rv(d)
  • G_new(d) = .5 G_pc(d) + .5 G_rv(d), which is the transformation halfway from G_pc(d) to G_rv(d) (the bottom arrangement in
  • the linear combination of Parallel Coordinates and radial visualization is shown in FIG. 12.
  • the DA layout is Circular Parallel Coordinates and parameters for Radviz springs and Parallel Coordinates are positive values. Notice that the spring points are shown very close to the DAs and the Parallel Coordinates lines are not completely connected. First, the size of the points should be appropriate for the display (depending on the number of points) and the connecting Parallel Coordinate lines should not have gaps. This means that a more useful display might be
  • V_pc + V_rv where the radial visualization point size is normal, and where the Parallel Coordinate lines are fully connected.
  • an exemplary data visualization system 10 includes a computer system 12 connected via a link 14 to a display unit 16.
  • the computer system includes at least a memory 18, a central processing unit (CPU) 20, and a mass storage device 22.
  • the mass storage device contains data visualization instructions 24.
  • the data visualization instructions 24 are loaded into the memory 18 for processing in the CPU 20.
  • Output from the processing in the CPU 20 displays visualizations on the display unit 16.
  • DAs and their associated parameters can be arranged in any arbitrary size, shape and position.
  • regular arrangements like in Parallel Coordinates, Circle Segments, Regular Polygons, spread polygons and Criss Cross Patterns.
  • arcs, or curves arranged in the form of lens shaped configurations and hyperbolic circular displays.
  • Dimensional anchors shaped in the form of polynomial, or logarithmic functions will also have useful properties.
  • the various visualizations generated from Parallel Coordinate type lines, scatter plot lines, and spring lines in various configurations such as triangles or polygons we have categorized as mesh plots.
  • the visualizations are characterized by varying densities of a "mesh" of lines.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Generation (AREA)
  • Ultra Sonic Diagnosis Equipment (AREA)

Abstract

A method of providing a visualization of data including receiving a first set of one or more values for a first set of one or more parameters (P4) controlling one or more characteristics of a first type of data visualization (P5), receiving a second set of one or more values for a second set of one or more parameters controlling one or more characteristics of a second type of data visualization, the second type of data visualization being a different type of data visualization than the first type of data visualization (P8).

Description

DATA VISUALIZATION
TECHNICAL FIELD
This invention relates to data visualization.
BACKGROUND
Generating and analyzing data visualizations and information visualizations are difficult tasks. Data visualizations have included, for example, line graphs, bar charts, histograms, pie charts, survey plots, scatter plots, star plots, forced field visualizations, and so forth. Each of these data representations or visualizations has strengths and weaknesses. As the number of data increases, it becomes extremely difficult to identify patterns in these data visualizations.
SUMMARY
In an aspect, the invention features a method of providing a visualization of data including receiving a first set of one or more values for a first set of one or more parameters controlling one or more characteristics of a first type of data visualization, receiving a second set of one or more values for a second set of one or more parameters controlling one or more characteristics of a second type of data visualization, the second type of data visualization being a different type of data visualization than the first type of data visualization, and producing a visualization of a set of data, the visualization having characteristics based on one or more values of the first set of parameters and one or more values of the second set of parameters.
In another aspect, the invention features a data visualization method including arranging dimensional anchors as an N-dimensional regular polygon, mapping data values to an axis of each of the N dimensions, and displaying the mapped data values.
Embodiments of the invention may have one or more of the following advantages. A graphic primitive called a dimensional anchor (DA) is described that can help generate new visualizations and provide insight to analyzing information visualizations. The DA represents a unified framework or model for constructing a variety of visualizations including Parallel Coordinates, Scatterplot Matrices, Radviz, Survey Plots, Circle Segments, and so forth. A DA is constructed by assigning values to parameters associated with various geometric graphic elements that encode the basics of the above visualizations. Multiple DAs can be used to create all of the above visualizations and many new ones as well as interpolating visualizations.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
DESCRIPTION OF DRAWINGS
FIG. 1 is a graph illustrating a single dimensional anchor.
FIG. 2 is a graph illustrating a first scatterplot using oblique coordinates.
FIG. 3 is a graph illustrating a second scatterplot.
FIG. 4 is a graph illustrating a third scatterplot.
FIG. 5 is a graph illustrating a fourth scatterplot.
FIG. 6 is a graph illustrating a fifth scatterplot.
FIG. 7 is a graph illustrating a first spread polygon.
FIG. 8 is a graph illustrating a second spread polygon.
FIG. 9 is a graph illustrating a regular polygon.
FIG. 10 is a graph illustrating parallel coordinates with springs.
FIG. 11 is a graph illustrating an interpolating visualization between parallel coordinates and circular parallel coordinates.
FIG. 12 is a graph illustrating overlapped radial visualizations.
FIG. 13 is a block diagram of an exemplary data visualization system.
Like reference symbols in the various drawings indicate like elements.
DETAILED DESCRIPTION
Spring constants may be used to represent relational values between points.
For example, Radial Visualization (Radviz) involves taking n-dimensional data points as points equally spaced around the perimeter of a circle. One end of a spring is attached to each perimeter point. The other end of each spring is attached to a data point. A spring constant Ki equals the value of the i-th coordinate of the fixed point. Each data point is then displayed where the sum of the spring forces equals 0. All the data point values are typically normalized to have values between 0 and 1. For example, if all n coordinates have the same value the data point lies exactly in the center of the circle. If the point is a unit vector point, it lies exactly at the fixed point on the edge of the circle where the spring for that dimension is fixed. Many points can map to the same position. This represents a nonlinear transformation of the data which preserves certain symmetries and which produces an intuitive display. Some features of this visualization include: points with approximately equal coordinate values lie close to the center; points with similar values whose dimensions are opposite each other on the circle lie near the center; points which have one or two coordinate values greater than the others lie closer to those dimensions; the position of a point depends on the layout of the particular dimensions around the circle; the layout of the data can be understood because of the spring analogy; a line in n dimensions maps to a line; and other 2-dimensional geometric objects map to 2-dimensional objects in the plane.
A visualization is modeled as a function V defined as V : A → D, which maps an array of data A to a display D. This V is a function of the rows and columns of the (MxN) data array and additional parameters P1, P2, ..., Pn, which encode information about the particular visualization. Visualizations as functions of other visualizations are implemented: Vnew = f(V1, V2, V3, ..., Vn) with each Vi a visualization.
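The Radviz spring placement described above can be sketched as follows. This is a minimal illustration, assuming ideal springs of zero rest length, a unit circle, and a spring constant equal to the i-th normalized coordinate of the data point being placed; the function name `radviz_position` and the handling of an all-zero point are assumptions, not part of the described method. Under those assumptions the zero-force position is the weighted mean of the fixed perimeter points.

```python
import math

def radviz_position(values):
    """Place one normalized n-dimensional data point inside the unit circle."""
    n = len(values)
    # Fixed spring points, equally spaced around the perimeter of the circle.
    anchors = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
               for i in range(n)]
    total = sum(values)
    if total == 0:
        # No spring pulls on the point; using the center is an assumed convention.
        return (0.0, 0.0)
    # sum_i k_i * (anchor_i - p) = 0  implies  p = sum_i k_i * anchor_i / sum_i k_i
    x = sum(k * ax for k, (ax, _) in zip(values, anchors)) / total
    y = sum(k * ay for k, (_, ay) in zip(values, anchors)) / total
    return (x, y)
```

With this sketch, a point whose coordinates are all equal lands at the center, and a unit vector point lands on the fixed perimeter point of its dimension, matching the two cases noted in the text.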
The visualizations are limited to the parameter space defined by P1, P2, ..., Pn. By varying the parameters, classes of visualizations can be defined. We describe a set of parameters that includes all of the visualizations mentioned above and described below in more detail. A visualization is also a function of the inherent geometry of the particular visualization. Data and parts of the visualization geometry are encoded into a primitive called a dimensional anchor (DA).
A dimensional anchor (DA) is a graphic primitive which can help generate new visualizations and provide insight to analyzing information visualizations. The DA represents a unified framework or model for a variety of visualizations including Parallel Coordinates, Scatterplot Matrices, Radviz, Survey Plots, Circle Segments and others. A dimensional anchor is constructed by assigning values to parameters associated with various geometric graphic elements that encode the basics of the above visualizations. Multiple DAs can be used to create all of the above visualizations and many new ones as well as interpolating visualizations. One column of the data is selected to be associated with the dimensional anchor. A number of parameters are associated with a DA. Nine are described below, as one embodiment of this model. A visualization may thus, for our model, be defined as a function V=F(DA parameters, geometry of DAs). If all DAs share the same parameter values then V=F(P1, P2, ..., P9, DA geometry). If the geometry of the DAs consists of straight lines and simple curves defined by a sequence of points, a definition or specification may be generated of a visualization that is simple but powerful enough to generate many of the standard multi-dimensional multi-variate visualizations used today. Additionally, many novel visualizations can also be generated.
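One way to picture V=F(DA parameters, geometry of DAs) is as a small data structure. The sketch below is illustrative only: the class names, field names, default values, and the two presets are assumptions introduced here, and the comments paraphrase the nine parameters of this embodiment.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class DAParams:
    """The nine DA parameters of this embodiment (field names are hypothetical)."""
    p1: float = 0.0  # size of scatter plot point
    p2: float = 0.0  # length of perpendicular lines from Anchorpoints
    p3: float = 0.0  # length of lines connecting points of the same data point
    p4: float = 0.0  # length of rectangle in Survey Plot
    p5: float = 0.0  # length of Parallel Coordinate lines
    p6: int = 0      # blocking factor for Parallel Coordinate lines
    p7: float = 0.0  # size of Radviz/spring plot point
    p8: float = 0.0  # length of "spring" lines from Anchorpoints
    p9: float = 0.0  # zoom factor for the "spring" K constant

@dataclass(frozen=True)
class DimensionalAnchor:
    column: str      # the one data column associated with this DA
    polyline: tuple  # geometry: a sequence of (x, y) points
    params: DAParams = field(default_factory=DAParams)

# Assumed presets: only the parameters a visualization uses are non-zero.
PARALLEL_COORDINATES = DAParams(p5=1.0, p6=0)
SCATTERPLOT = DAParams(p1=0.5, p2=1.0)

def shared_params(anchors):
    """Return the common DAParams if every DA uses the same values, else None."""
    first = anchors[0].params
    return first if all(a.params == first for a in anchors) else None
```

When `shared_params` returns a value, the whole visualization is described by that one parameter set plus the list of polylines, which corresponds to the shared-parameter case in the text.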
In an embodiment, the model has nine parameters which are:
P1: size of scatter plot point
P2: length of perpendicular lines extending from Anchorpoints creating scatter plot point
P3: length of lines connecting points associated with the same data point in scatter plot
P4: length of rectangle in Survey Plot
P5: length of Parallel Coordinate lines
P6: blocking factor for Parallel Coordinate lines
P7: size of Radviz/spring plot point
P8: length of "spring" lines extending from Anchorpoints creating Radviz/spring plot point
P9: zoom factor for the "spring" K constant.
At a basic level, a dimensional anchor is simply one of the axes in a two dimensional scatterplot. It is normally associated with a dimension or variate from a dataset or database. The data values for the associated dimension are mapped to the axis in the standard manner where the minimum and maximum values usually correspond to points near the end points of the axis. Labels and scale tick marks can also be associated with the dimensional anchor. Normally, these regular spacings along an axis are called the coordinate values. Mapped data points, which we call Anchorpoints, are the coordinate values (points along a dimensional anchor) corresponding to the dimensional data points.
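The mapping of a data column to Anchorpoints can be sketched as a linear min-max mapping onto a straight axis. The function name `anchorpoints`, the `margin` argument (which keeps the minimum and maximum slightly inside the end points), and the treatment of a constant column are assumptions made for this illustration.

```python
def anchorpoints(column, start, end, margin=0.05):
    """Map each value of one data column to a point on a straight DA."""
    lo, hi = min(column), max(column)
    span = hi - lo
    points = []
    for v in column:
        # Normalize to [0, 1]; a constant column maps to the midpoint (assumed).
        t = (v - lo) / span if span else 0.5
        # Keep the minimum and maximum slightly inside the end points of the axis.
        t = margin + t * (1 - 2 * margin)
        points.append((start[0] + t * (end[0] - start[0]),
                       start[1] + t * (end[1] - start[1])))
    return points
```

For example, `anchorpoints([10, 20, 30], (0, 0), (10, 0), margin=0.0)` places the three values at the start, middle, and end of a horizontal axis of length 10.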
Referring to FIG. 1, an exemplary display having a one-dimensional anchor with lines extended from the Anchorpoints is shown. The vertical lines may be colored to show the distribution of the data (e.g., a cars dataset) for the miles per gallon attribute. Additionally the color of the lines may show the type of car (e.g., American— red or dark, Japanese — green or light, and European — purple).
The nine parameters associated with a DA control how a DA interacts with other DAs to form graphical constructions such as points, lines and advanced visualizations. For example, to generate a scatterplot, one parameter of the DA is used to control the size of the scatter plot point (P1). The point is formed by the intersection of lines from the Anchorpoints of two DAs. These nine parameters and how their implementations form visualizations will now be described. Three DA parameters associated with the construction of scatterplots have been defined. One possible construction of a scatterplot is that a perpendicular line extends outward from an Anchorpoint on a DA. An Anchorpoint associated with the same data point on another DA (another column of the same dataset) also has a perpendicular line extending outward (on both sides of the DA). If the two perpendicular lines meet, the point of intersection becomes the point of the scatterplot.
The first parameter (P1) controls the size of the scatterplot point, ranging from 0 (no point drawn) to 1 (a large point displayed). Often in a scatterplot the size, shape, or color of a point can be associated with other dimensions (variates or columns) in the dataset. In an embodiment, a selected column determines color, and shape is a circle. In another embodiment, a set of parameters controls the shape so that other classes of visualizations using icons, or color icons, can be represented. However, whether a scatter plot point is displayed or not is a basic parameter (P1) of the dimensional anchor. The intersection of perpendicular lines extending from Anchorpoints works with any arrangement or any number of dimensional anchors. A dimensional anchor, by definition, can be any sequence of line segments.
This allows a curve of any arbitrary shape, such as an arc, to be a dimensional anchor. With this definition, perpendicular line extensions for scatter plots can still easily be constructed. These additional arrangements can generate other visualizations and will be discussed below. Referring to FIG. 2, an alternative definition of a scatterplot can use Oblique Coordinates, i.e., not necessarily perpendicular lines from the coordinate axis. Two additional parameters (P2, P3) control the display of the perpendicular lines from the Anchorpoints and of the lines connecting all the scatterplot points associated with one data point. P2 controls the length of the perpendicular lines extending from the DA to the scatter plot point. For example, referring to FIG. 3, P2 is about 0.2, while referring to FIG. 4, the P2 parameter is 1.0. The parameters are defined from 0 (no lines) to 1 (all intersecting lines displayed). When N dimensional anchors are used (in a dataset with N variates or dimensions), there are typically up to N points generated for each data point in the visualization, and these N points may be connected. P3 controls the length of the connecting lines. The examples shown use three dimensional anchors in an equilateral triangle pattern.
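The perpendicular-line construction of a scatterplot point, and the effect of P2 on the drawn line, can be sketched for two straight DAs as follows. The function names are hypothetical, each DA is represented only by its direction vector, and treating P2 as a fraction of the distance from the Anchorpoint to the intersection is an assumption about how "length" is scaled.

```python
def perpendicular_intersection(a, da_dir, b, db_dir):
    """Intersect the perpendiculars raised at Anchorpoints a and b."""
    # Perpendicular direction of each DA: rotate its direction by 90 degrees.
    na = (-da_dir[1], da_dir[0])
    nb = (-db_dir[1], db_dir[0])
    # Solve a + s*na = b + t*nb for s using the 2D cross product.
    denom = na[0] * nb[1] - na[1] * nb[0]
    if abs(denom) < 1e-12:
        return None  # the two perpendiculars are parallel and never meet
    dx, dy = b[0] - a[0], b[1] - a[1]
    s = (dx * nb[1] - dy * nb[0]) / denom
    return (a[0] + s * na[0], a[1] + s * na[1])

def p2_segment(anchor, point, p2):
    """End points of the perpendicular line drawn for a P2 value in [0, 1]."""
    return (anchor, (anchor[0] + p2 * (point[0] - anchor[0]),
                     anchor[1] + p2 * (point[1] - anchor[1])))
```

With a horizontal DA carrying an Anchorpoint at (3, 0) and a vertical DA carrying one at (0, 4), the perpendiculars meet at (3, 4), the familiar Cartesian scatterplot point; with P2 = 0.5 only the first half of each perpendicular would be drawn.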
Referring to FIG. 5, the intersecting perpendicular lines that generate the scatter plot display points are shown (the parameter P2 = 1.0). In FIG. 6 the display points associated with the same data point are connected (P3 = 1.0). As one can see, in this visualization the triangles generated are very similar. In general, both P2 and P3 will generate N-sided polygons if the DAs are configured as regular polygons.
An additional parameter (P4), along with a particular designated dataset dimension (DDD) (variate or column), is used for a DA to construct a Survey Plot or a visualization similar to Circle Segments. The P4 parameter controls the size of a rectangle extending from an Anchorpoint. The size also depends on the dimensional value at the Anchorpoint. One DDD option is no sort, which uses the order in which the data was loaded. The last constraint is the P4 maximum possible value (needed to generate the Survey Plot). This maximum possible value is controlled so that P4 rectangles cannot touch P4 rectangles from any other DA. By using the P4 parameter, with its constraints and an appropriate arrangement of dimensional anchors, the Survey Plot and modified Circle Segments can be easily constructed. (The circle segment arcs become straight lines and extend out to a regular polygon instead of a circle, but the essentials of the visualization are still the same.)
In other embodiments, visualizations similar to circle segments are constructed. For example, in an embodiment, a CCCViz (Color Correlated Column Visualization) is generated as follows. Alternate dimensions of a data set with a special classification dimension. Sort the columns (i.e., designated data set dimension) according to the classification dimension. Use a gray scale mapping for the dimension of the data set and use a rainbow color mapping for the classification dimension. By varying the P4 parameter, i.e., the length of the rectangle in a survey plot, a color correlated column visualization is generated. The visualization illustrates whether a dimension (gray scales) correlates with a particular classification dimension (color scale). The CCCViz is useful when the number of dimensions is small, i.e., less than 30.
Various arrangements of DAs can produce a partial ordering of the data. For example, the visualization arranged in a crisscross grid pattern uses only the spring parameters P7 and P9, and the display results in a simple diagonal pattern, since every "spring" has an opposite spring symmetrically across the diagonal. The crisscross DA pattern performs a visual sorting of the data which can be used as a discriminant to separate classes of data.
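The CCCViz steps above can be sketched as a layout computation. This is one reading of the description, assuming that "sort the columns according to the classification dimension" means ordering the records by their class value; the function name `cccviz_layout`, the record format (a list of dicts), and the color-map labels are illustrative assumptions.

```python
def cccviz_layout(rows, dims, class_dim):
    """Order records and columns for a color correlated column display."""
    # Sort the records by the classification dimension (sorted() is stable).
    ordered = sorted(rows, key=lambda r: r[class_dim])
    # Alternate each data dimension with the classification dimension.
    columns = []
    for d in dims:
        columns.append((d, "gray"))             # gray scale for the data dimension
        columns.append((class_dim, "rainbow"))  # rainbow scale for the class
    return ordered, columns
```

Each returned column would then be drawn as a strip of P4 rectangles, so that a gray column whose shading changes in step with its neighboring rainbow column suggests a correlation with the class.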
The construction of the Parallel Coordinates visualization is as follows. Simply connect a line from one DA Anchorpoint to another. The length of these connecting lines is controlled by the DA parameter (P5). However, if all Anchorpoints on all DAs are exhaustively connected (each Anchorpoint connected to N-1 other Anchorpoints), one gets additional interesting visualizations. We define an additional parameter (P6), which represents how many DAs a P5 connecting line can cross. This blocking parameter, when set to 0, produces the familiar Parallel Coordinate visualization. Parallel Coordinates in a circle is also generated using the P5 and P6 parameters when the DAs are arranged as radial spokes from the center of a circle. If the Anchorpoints along a dimensional anchor are considered to be fixed points where imaginary springs are attached to a movable data point, then a visualization similar to radial visualizations can be generated. One parameter (P7) is used to control the size of the point placed in the display where the sum of the spring forces is 0. An "enhanced" radial visualization display, referred to as polyviz, is generated when dimensional anchors are arranged as an N-dimensional regular polygon. One of the limitations of radial visualization was that many data points could overlap in the center of the circle although they had different coordinate values. When the fixed spring points are spread out along the DA in a regular polygon, the chance of points overlapping is much less than in the original radial visualization. If the DAs are compressed to points uniformly distributed around the circumference of a circle, the original radial visualization display is generated. Polyviz is a visualization that better utilizes the total area of the circle (or regular polygon). An additional parameter (P8) is used to draw lines extending from the fixed spring points to the displayed point where the sum of the spring forces is 0.
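The effect of the blocking parameter P6 on the Parallel Coordinates connections described in this passage can be illustrated with a short Python sketch. It assumes, for illustration only, that the DAs are laid out side by side in index order, so that a connecting line from DA i to DA j crosses the DAs lying between them; the function name `connected_da_pairs` is hypothetical.

```python
from itertools import combinations


def connected_da_pairs(n_das, p6):
    """Pairs (i, j) of DA indices whose Anchorpoints are joined by P5 lines.

    Assumes side-by-side DAs in index order, so a line from DA i to DA j
    (with i < j) crosses j - i - 1 intermediate DAs. A pair is kept only
    if that count does not exceed the blocking parameter p6.
    """
    return [(i, j) for i, j in combinations(range(n_das), 2)
            if j - i - 1 <= p6]


# p6 = 0 keeps only neighboring DAs, as in ordinary Parallel Coordinates.
adjacent_only = connected_da_pairs(4, p6=0)
# A large enough p6 connects every DA to the N-1 others.
exhaustive = connected_da_pairs(4, p6=2)
```

With four DAs, `adjacent_only` holds the three neighboring pairs and `exhaustive` holds all six pairs.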
An additional parameter (P9) is also used as a zoom factor in the display. Referring to FIG. 7, a polyviz visualization is shown with seven points. FIG. 7 and FIG. 8 illustrate examples of a "spread" polygon, whereas FIG. 9 illustrates a regular polygon. FIG. 7 and FIG. 8 have the spring lines (P8=1) shown as line extensions from the Anchorpoints.
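The spring placement used by the radial visualization and polyviz can be sketched as follows. The sketch rests on assumptions that the text does not spell out: each DA is a line segment, the Anchorpoint for a record slides along that segment in proportion to the normalized value, and the same normalized value serves as the spring constant. The point where the forces sum to zero is then the value-weighted average of the Anchorpoints. The names `spring_equilibrium`, `RADVIZ` and `POLYVIZ` are hypothetical.

```python
def spring_equilibrium(values, anchors):
    """Return the (x, y) point where the spring forces sum to zero.

    values  -- normalized values in [0, 1], one per DA (assumed to act
               as the spring constants)
    anchors -- one ((x0, y0), (x1, y1)) segment per DA
    """
    total = sum(values)
    if total == 0:
        return None  # no spring pulls; the caller picks the center
    x = 0.0
    y = 0.0
    for v, ((x0, y0), (x1, y1)) in zip(values, anchors):
        # Anchorpoint position along the DA, proportional to the value.
        ax = x0 + v * (x1 - x0)
        ay = y0 + v * (y1 - y0)
        x += v * ax
        y += v * ay
    return (x / total, y / total)


# DAs compressed to points on a circle (original radial visualization).
RADVIZ = [((1, 0), (1, 0)), ((0, 1), (0, 1)),
          ((-1, 0), (-1, 0)), ((0, -1), (0, -1))]
# DAs spread along the sides of a square (a 4-sided polyviz layout).
POLYVIZ = [((1, -1), (1, 1)), ((1, 1), (-1, 1)),
           ((-1, 1), (-1, -1)), ((-1, -1), (1, -1))]

record_a = [0.2, 0.4, 0, 0]
record_b = [0.4, 0.8, 0, 0]
radviz_a = spring_equilibrium(record_a, RADVIZ)
radviz_b = spring_equilibrium(record_b, RADVIZ)
polyviz_a = spring_equilibrium(record_a, POLYVIZ)
polyviz_b = spring_equilibrium(record_b, POLYVIZ)
```

Under these assumptions the two records, which differ only by a scale factor, land on the same spot in the compressed layout but on different spots in the spread layout, which is the overlap reduction the text attributes to polyviz.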
In an embodiment, referring to FIG. 10, re-arranging the DAs into a Parallel Coordinate arrangement and setting the appropriate P values produces a nice discrimination using the spring forces. Note that in the Parallel Coordinate arrangement in FIG. 10, the Anchorpoints and lines extending to the points look very similar to a Parallel Coordinate arrangement except for the first point (1,1,1,1). Because of the normalization (all values in a column are normalized to between 0 and 1), the spring forces on the first point are all zero, and the point is displayed wherever the geometric center of the display is defined. Normally, this will be in the center of the display, but it could be defined at some other point depending on the arrangement of the DAs. If one wanted to detect outlier points that have minimum values for all dimensions, the geometric center should be defined differently from an equal spring force center. The zoom parameter P9 is designed such that when it is 0 all points lie at the geometric center, at .5 it corresponds to a normal physical spring (Force = P9 x 2 x K x DX), and at higher values it amplifies the spring K value. Slowly increasing P9 shows all points moving away from the geometric center (except for points with minimum values for all dimensions).
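The normalization and the zoom-scaled spring force mentioned above can be checked with a small Python sketch. It assumes min-max normalization per column and, as in the usual radial visualization, that the spring constant K of each spring is the normalized value; both are assumptions for illustration, and the names `normalize_columns` and `spring_force` are hypothetical.

```python
def normalize_columns(rows):
    """Min-max normalize every column of rows to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [[(v - lo) / span if span else 0.0
             for v, lo, span in zip(row, lows, spans)]
            for row in rows]


def spring_force(p9, k, dx):
    """Force = P9 x 2 x K x DX, the expression given in the text."""
    return p9 * 2 * k * dx


# Hypothetical data whose first record holds the minimum of every column.
ROWS = [[1, 1, 1, 1], [2, 4, 3, 5], [3, 2, 5, 1]]
NORMALIZED = normalize_columns(ROWS)

# Every normalized value of the first record is 0, so with K taken as the
# normalized value each of its springs exerts no force at any stretch DX.
first_record_forces = [spring_force(0.5, k, 1.0) for k in NORMALIZED[0]]
```

At P9 = .5 the factor P9 x 2 equals 1, so the expression reduces to the ordinary spring law K x DX, and at P9 = 0 every force vanishes.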
DAs and Polyviz provide a mechanism to generate linear combinations of visualizations.
To illustrate the idea of "a linear combination of visualizations" we shall give several examples. Let the total number of dimensions used in a visualization be d. Usually this will also be the total number of dimensional anchors used; however, for some visualizations the number of DAs will be 2d. The parameters to generate a Parallel Coordinate visualization are:
Ppc = {P1, P2, ... P9} = [0,0,0,0,1,0,0,0,0]
The Parallel Coordinate visualization can then be defined as
Vpc = f(Ppc, Gpc(d))
where Gpc(d) is the geometry of the Parallel Coordinate arrangement of DAs. The parameters to generate a radial visualization are:
Prv = {P1, P2, ... P9} = [0,0,0,0,0,0,.5,0,.5]
The radial visualization can be defined as
Vrv = f(Prv, Grv(d))
where Grv(d) is the geometry of the Radviz arrangement of DAs. A new visualization defined in terms of Parallel Coordinates and polyviz is
Vnew = f(Vpc, Vrv)
One example of a linear combination would be
Vnew = .5Vpc + .5Vrv
Multiplication by a scalar and addition is easily defined for the parameter vectors giving the new visualization parameter vector as
Pnew = [0,0,0,0,.5,0,.25,0,.25] = .5Ppc + .5Prv
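The scalar multiplication and addition of parameter vectors can be written out directly. The following Python sketch is illustrative only; it takes Ppc and Prv to be the nine-component vectors that are consistent with the result Pnew given above, and the function name `combine` is hypothetical.

```python
def combine(weights, vectors):
    """Weighted sum of equal-length parameter vectors."""
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("parameter vectors must have the same length")
    return [sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(length)]


# Parameters P1..P9 for Parallel Coordinates (only P5 is set) and for the
# radial visualization (P7 and P9 are set).
P_PC = [0, 0, 0, 0, 1, 0, 0, 0, 0]
P_RV = [0, 0, 0, 0, 0, 0, 0.5, 0, 0.5]

P_NEW = combine([0.5, 0.5], [P_PC, P_RV])
```

Other weights, or more than two parameter vectors, can be combined the same way.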
A linear combination of the geometry of the DAs can be defined in many ways. For example, FIG. 11 shows one possible transformation on a Parallel Coordinate configuration of DAs. The DAs are smoothly wrapped around into a cross shape. If the DAs are now progressively shortened to points at the outer endpoints of the cross, the normal radial visualization is achieved. The geometry of the transformation is not affine (parallel lines don't stay parallel) or projective. Although each individual DA transformation could be considered projective, the DAs taken together are not (the incidence of points and lines is not invariant). The transformations can be considered topological if we include the constraint that the endpoints of the DAs never actually meet. If we define our distance function as having equal values between the transformations in FIG. 11, we can define a new visualization geometry:
Gnew(d) = .5Gpc(d) + .5Grv(d)
Technically, Gpc(d) and Grv(d) are transformations on an initial arrangement of DAs. If we set Gpc(d) = 0 (our initial arrangement) we get
Gnew(d) = .5Grv(d)
which is the transformation halfway from Gpc(d) to Grv(d) (the bottom arrangement in FIG. 11).
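One simple way to realize a combination of two DA geometries is to interpolate the endpoints of corresponding DAs. The Python sketch below does this; it is an assumed stand-in for the smooth wrapping transformation of FIG. 11, whose exact form is not given in the text, and the names `blend_geometry`, `G_PC` and `G_RV` and the coordinates used are hypothetical.

```python
def blend_geometry(g_a, g_b, t):
    """Interpolate two DA layouts endpoint by endpoint.

    g_a, g_b -- lists of ((x0, y0), (x1, y1)) segments, one per DA
    t        -- 0 gives g_a, 1 gives g_b, 0.5 the halfway layout
    """
    if len(g_a) != len(g_b):
        raise ValueError("layouts must have the same number of DAs")

    def lerp(p, q):
        return tuple((1 - t) * a + t * b for a, b in zip(p, q))

    return [(lerp(start_a, start_b), lerp(end_a, end_b))
            for (start_a, end_a), (start_b, end_b) in zip(g_a, g_b)]


# Two vertical DAs (a Parallel Coordinate style layout) ...
G_PC = [((0.0, 0.0), (0.0, 1.0)), ((1.0, 0.0), (1.0, 1.0))]
# ... and the same two DAs compressed to points (a Radviz style layout).
G_RV = [((-1.0, 0.0), (-1.0, 0.0)), ((1.0, 0.0), (1.0, 0.0))]

G_HALF = blend_geometry(G_PC, G_RV, 0.5)
```

Stepping `t` from 0 to 1 in small increments gives a sequence of layouts that could be drawn one after another to move from the first arrangement to the second.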
The linear combination of Parallel Coordinates and radial visualization is shown in FIG. 12. The DA layout is Circular Parallel Coordinates, and the parameters for the Radviz springs and Parallel Coordinates are positive values. Notice that the spring points are shown very close to the DAs and the Parallel Coordinates lines are not completely connected. The size of the points should be appropriate for the display (depending on the number of points), and the connecting PC lines should not have gaps. This means that a more useful display might be
Vpc + Vrv
where the radial visualization point size is normal and where the Parallel Coordinate lines are fully connected.
There is some useful information from both parts of the visualization, such as seeing the relative values in each dimension from the Parallel Coordinate lines and seeing the more important relative dimension effect from the radial visualization spring points.
It was shown above that dimensional anchors allow the generation of new visualizations as linear combinations of any visualization described by an arrangement of dimensional anchors. With the same algebra of dimensional anchors (parameters and geometry) we can also smoothly transform one visualization to another. As an example we show how Parallel Coordinates can easily be transformed to Circle Parallel Coordinates. It helps to use a construction called dimensional bounds that encloses the dimensional anchors.
Consider the dimensional layout of Parallel Coordinates (PC) to be equally spaced straight lines (the DAs) connected to 2 lines (top and bottom dimensional bounds) that can be any arbitrary shape and length defined by a few parameters. It is easy to see that the dimensional bounds can move from straight lines to arcs and become an outer circle and an inner circle. The inner circle can get smaller and smaller until it is a single point. This then becomes the Circle Parallel Coordinates or the star glyph. Using the Conic Sections or more general quadratic polynomials we can provide a large number of transformations of similar visualizations. The transformations of the dimensional bounds can be simple paths of a plane through these conic sections.
Using the nine parameters defined above, we see that this visualization space is minimally P^9 (that is, a 9-dimensional space). However, that is only for one dimensional anchor. A typical dataset might have 10 dimensions, thus requiring 10 dimensional anchors, each of which has 9 parameters (space = P^(9*10)). More importantly, the different arrangements of DAs will further increase the dimension of the visualization space. If one limits the layout arrangements to those similar to Parallel Coordinates and Radviz, the visualization space is greatly reduced. Additionally, if one uses the same nine parameters for each dimensional anchor, the visualization space may be reduced to perhaps P^12. It is possible to take a "grand tour" through this visualization space. By varying the 9 parameters and animating the arrangement of the dimensional anchors, one can slowly (depending on the dataset size and computer speed) move from one visualization to another. The previously described visualizations demonstrate a limited manual tour that has proved useful in finding new visualizations.
Referring to FIG. 13, an exemplary data visualization system 10 includes a computer system 12 connected via a link 14 to a display unit 16. The computer system includes at least a memory 18, a central processing unit (CPU) 20, and a mass storage device 22. The mass storage device contains data visualization instructions 24. In operation the data visualization instructions 24 are loaded into the memory 18 for processing in the CPU 20. Output from the processing in the CPU 20 displays visualizations on the display unit 16.
A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. For example, DAs and their associated parameters can be arranged in any arbitrary size, shape and position. We have only investigated some of these "regular arrangements," such as those in Parallel Coordinates, Circle Segments, Regular Polygons, spread polygons and Crisscross Patterns. There are many possible additional arrangements, such as arcs or curves arranged in the form of lens-shaped configurations and hyperbolic circular displays. These configurations will make special "focused" visualizations. Dimensional anchors shaped in the form of polynomial or logarithmic functions will also have useful properties. We have categorized as mesh plots the various visualizations generated from Parallel Coordinate type lines, scatter plot lines, and spring lines in various configurations such as triangles or polygons. These visualizations are characterized by varying densities of a "mesh" of lines.
The spring paradigm for display of data has been quite successful. By spreading out the fixed spring "Anchorpoints" we have increased the efficiency and usefulness of the polyviz. This has reduced the point overlap problem significantly.
Additionally, by showing part of the "spring" lines extending from the Anchorpoints on the dimensional anchors, the understanding and usefulness of polyviz are increased. Since in many visualizations points cluster in the center, the additional information provided by the partial spring lines does not cost very much in terms of screen real estate.
It appears that various configurations of dimensional anchors using the spring parameter can place an ordering of the data in various ways, for example, with the crisscross pattern or DAs compressed along a straight line. The usefulness of the Circle Segment (modified) visualization has been investigated previously. One of its most useful features is the ability to give an overall correlation of all dimensions (or variates) with a particular classification dimension. The Color Correlated Column Viz is a modification of this idea that is probably more useful when the number of dimensions is small (less than 30) and might be easier to understand in some cases.
Accordingly, other embodiments are within the scope of the following claims.

Claims

WHAT IS CLAIMED IS:
1. A method of providing a visualization of data, the method comprising: receiving a first set of one or more values for a first set of one or more parameters controlling one or more characteristics of a first type of data visualization; receiving a second set of one or more values for a second set of one or more parameters controlling one or more characteristics of a second type of data visualization, the second type of data visualization being a different type of data visualization than the first type of data visualization; and producing a visualization of a set of data, the visualization having characteristics based on one or more values of the first set of parameters and one or more values of the second set of parameters.
2. The method of claim 1, wherein one of the types of data visualization comprises a scatter plot.
3. The method of claim 2, wherein one of the parameters comprises at least one of the following: a size of a scatter plot point, a length of a line extending from a dimension anchor to a scatter plot point, and a length of lines connecting points associated with the same data point in a scatter plot.
4. The method of claim 1, wherein one of the types of data visualization comprises a survey plot.
5. The method of claim 4, wherein one of the parameters comprises a length of a rectangle in a survey plot.
6. The method of claim 1, wherein one of the types of data visualization comprises a radial visualization plot.
7. The method of claim 6, wherein one of the parameters comprises at least one of the following: a size of a radial visualization spring plot point, a length of lines extending from a dimension anchor to a radial visualization spring plot point, and a spring constant.
8. The method of claim 1, wherein one of the types of data visualization comprises a parallel coordinate visualization.
9. The method of claim 8, wherein one of the parameters comprises at least one of the following: a length of parallel coordinate lines and a blocking factor for parallel coordinate lines.
10. The method of claim 1, wherein the first set and the second set comprise receiving from one or more graphical user interface controls.
11. The method of claim 1, wherein the first set and the second set comprises accessing a computer memory location.
12. The method of claim 1, wherein producing a visualization comprises at least one of the following: displaying, printing and generating a file describing the visualization.
13. A computer program product, disposed on a computer readable medium, for providing a visualization of data, the program comprising instructions for causing a processor to: receive a first set of one or more values for a first set of one or more parameters controlling one or more characteristics of a first type of data visualization; receive a second set of one or more values for a second set of one or more parameters controlling one or more characteristics of a second type of data visualization, the second type of data visualization being a different type of visualization than the first type of data visualization; and produce a visualization of a set of data, the visualization having characteristics based on one or more values from the first set of parameters and one or more values from the second set of parameters.
14. A data visualization method comprising: arranging dimensional anchors as an N-dimensional regular polygon; mapping to an axis of each of the N-dimensions data values; and displaying the mapped data values.
15. The method of claim 14 where the mapping comprises assigning minimum and maximum data values of each of the N-dimensions to points near end points of each of the axes representing the N-dimensions.
16. A data visualization method displaying data comprising: alternating a base set of dimensions for the data with a special classification dimension; sorting one of the dimensions of the base set according to the special classification dimension; mapping the sorted dimensions to gray scale representations; mapping the special classification dimension to color scale representations; and displaying the gray scale representation and the color scale representation.
17. A method of producing a hybrid visualization comprising: generating a first visualization of data; generating a second visualization of the data; interpolating between the first visualization and second visualization to produce the hybrid visualization of the data; and displaying the hybrid visualization of the data.
18. The method of claim 17 wherein the first visualization of data is parallel coordinate visualization.
19. The method of claim 17 wherein the second visualization of data is a Radviz visualization.
20. The method of claim 17 further comprising:
generating a plurality of visualizations of the data;
interpolating the first and second visualization of data with the plurality of visualization of the data to produce the hybrid visualization of data; and
smoothly animating the hybrid visualization of data.
EP00990488A 1999-11-05 2000-11-06 Data visualization Withdrawn EP1230587A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16389899P 1999-11-05 1999-11-05
US163898P 1999-11-05
PCT/US2000/041912 WO2001037072A1 (en) 1999-11-05 2000-11-06 Data visualization

Publications (1)

Publication Number Publication Date
EP1230587A1 true EP1230587A1 (en) 2002-08-14

Family

ID=22592073

Family Applications (1)

Application Number Title Priority Date Filing Date
EP00990488A Withdrawn EP1230587A1 (en) 1999-11-05 2000-11-06 Data visualization

Country Status (5)

Country Link
EP (1) EP1230587A1 (en)
JP (1) JP2003515211A (en)
KR (1) KR20020052200A (en)
CN (1) CN1409838A (en)
WO (1) WO2001037072A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10417700B2 (en) * 2005-03-03 2019-09-17 Refinitiv Us Organization Llc System and method for graphical display of multivariate data
US7779344B1 (en) 2006-10-31 2010-08-17 Hewlett-Packard Development Company, L.P. System and method for creating a value-based stacked bar chart
KR100993817B1 (en) * 2007-12-21 2010-11-12 한국과학기술정보연구원 System and Method for analysis of information
US8677235B2 (en) 2008-05-13 2014-03-18 Microsoft Corporation Ranking visualization types based upon fitness for visualizing a data set
US20160012115A1 (en) 2013-02-28 2016-01-14 Celal Korkut Vata Combinational data mining
CN103472978B (en) * 2013-09-26 2017-10-13 深圳市华傲数据技术有限公司 A kind of method for visualizing and system based on quartile figure display data
CN105612741B (en) * 2013-10-09 2017-10-31 慧与发展有限责任合伙企业 Multivariate data is shown in various dimensions
US9978114B2 (en) 2015-12-31 2018-05-22 General Electric Company Systems and methods for optimizing graphics processing for rapid large data visualization
CN107367755A (en) * 2016-05-11 2017-11-21 中国石油化工股份有限公司 A kind of improved multi-parameter crossplot method for drafting
KR102163155B1 (en) * 2018-09-21 2020-10-08 한국과학기술연구원 유럽연구소 Method and apparatus for converting data by using nested scatter plot

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5473736A (en) * 1992-06-08 1995-12-05 Chroma Graphics Method and apparatus for ordering and remapping colors in images of real two- and three-dimensional objects
US5590271A (en) * 1993-05-21 1996-12-31 Digital Equipment Corporation Interactive visualization environment with improved visual programming interface
US6188403B1 (en) * 1997-11-21 2001-02-13 Portola Dimensional Systems, Inc. User-friendly graphics generator using direct manipulation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0137072A1 *

Also Published As

Publication number Publication date
WO2001037072A8 (en) 2002-05-30
CN1409838A (en) 2003-04-09
JP2003515211A (en) 2003-04-22
WO2001037072A9 (en) 2002-11-14
KR20020052200A (en) 2002-07-02
WO2001037072A1 (en) 2001-05-25

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20020605

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20031029